* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  From: Eric Dumazet @ 2014-03-05  0:48 UTC
  To: Florian Fainelli
  Cc: linux-kernel@vger.kernel.org, linux-mm, paulmck, linux-nfs, trond.myklebust, netdev

On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
> Hi all,
>
> I am seeing the following RCU stall messages on an ARMv7 4-CPU system
> running 3.14-rc4:
>
> [   42.974327] INFO: rcu_sched detected stalls on CPUs/tasks:
> [   42.979839]  (detected by 0, t=2102 jiffies, g=4294967082,
> c=4294967081, q=516)
> [   42.987169] INFO: Stall ended before state dump start
>
> This is happening under the following conditions:
>
> - the attached bumper.c program alters various kernel thread
>   priorities based on the contents of bumpup.cfg
> - malloc_crazy is running from an NFS share
> - malloc_crazy.c runs in a loop allocating chunks of memory but never
>   freeing them
>
> When the priorities are altered, the RCU stalls happen instead of the
> OOM killer being invoked. Taking NFS out of the equation does not
> allow me to reproduce the problem, even with the priorities altered.
>
> This "problem" seems to have been there for quite a while: I was able
> to get 3.8.13 to trigger the bug as well, with a slightly more
> detailed RCU debugging trace which points the finger at kswapd0.
>
> You should be able to reproduce this under QEMU with the Versatile
> Express platform emulating a Cortex-A15 CPU and the attached files.
>
> Any help or suggestions would be greatly appreciated. Thanks!

Do you have a more complete trace, including stack traces?
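[Note: the bumper.c, bumpup.cfg, and malloc_crazy.c attachments referenced above are not preserved in this archive. As a rough guide to the setup, malloc_crazy.c presumably amounts to an allocate-and-touch loop along these lines; the 1 MiB chunk size and the fill byte are assumptions, not taken from the original file:

/* Hypothetical sketch of malloc_crazy.c as described above: allocate
 * chunks of memory in a loop and never free them. */
#include <stdlib.h>
#include <string.h>

int main(void)
{
	const size_t chunk = 1024 * 1024;	/* assumed chunk size */

	for (;;) {
		char *p = malloc(chunk);

		if (!p)
			continue;		/* keep pressuring reclaim */
		memset(p, 0x11, chunk);		/* fault in every page */
	}
	return 0;
}

Likewise, the priority change bumper.c applies from bumpup.cfg presumably boils down to sched_setscheduler() calls on the target kernel threads; a minimal sketch, with the config parsing omitted and the SCHED_FIFO policy and priority value being assumptions:

/* Hypothetical sketch of the bumper.c priority bump: move the task
 * with the given PID (e.g. a kernel thread) to a real-time priority. */
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
	struct sched_param sp = { .sched_priority = 50 };	/* assumed */
	pid_t pid;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	pid = (pid_t)atoi(argv[1]);
	if (sched_setscheduler(pid, SCHED_FIFO, &sp) < 0) {
		perror("sched_setscheduler");
		return 1;
	}
	return 0;
}

If reclaim threads such as kswapd0 end up at a real-time priority this way, they can monopolize a CPU and starve ksoftirqd and the RCU kthreads; the sched-debug dump later in the thread does show kswapd0 runnable at an RT priority.]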
* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  From: Florian Fainelli @ 2014-03-05  1:03 UTC
  To: Eric Dumazet
  Cc: linux-kernel@vger.kernel.org, linux-mm, paulmck, linux-nfs, trond.myklebust, netdev

2014-03-04 16:48 GMT-08:00 Eric Dumazet <eric.dumazet@gmail.com>:
> On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
>> [...]
>
> Do you have a more complete trace, including stack traces?

Attached is what I get out of SysRq-t, which is the only thing I have
(note that the kernel is built with CONFIG_RCU_CPU_STALL_INFO=y):

Thanks!
--
Florian

[-- Attachment #2: rcu_stall_3.14-rc4_arm_sysrq_t.txt --]
[-- Type: text/plain, Size: 40969 bytes --]

[ 3474.417333] INFO: Stall ended before state dump start
[ 3500.312946] SysRq : Show State
[ 3500.316015] task PC stack pid father
[ 3500.321244] init S c04bda98 0 1 0 0x00000000
[ 3500.327640] [<c04bda98>] (__schedule) from [<c0022c2c>] (do_wait+0x220/0x244)
[ 3500.334786] [<c0022c2c>] (do_wait) from [<c0022ff0>] (SyS_wait4+0x60/0xc4)
[ 3500.341672] [<c0022ff0>] (SyS_wait4) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30)
[ 3500.349247] kthreadd S c04bda98 0 2 0 0x00000000
[ 3500.355635] [<c04bda98>] (__schedule) from [<c003c084>] (kthreadd+0x168/0x16c)
[ 3500.362866] [<c003c084>] (kthreadd) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.370181] ksoftirqd/0 S c04bda98 0 3 2 0x00000000
[ 3500.376567] [<c04bda98>] (__schedule) from [<c0041ffc>] (smpboot_thread_fn+0xc4/0x17c)
[ 3500.384494] [<c0041ffc>] (smpboot_thread_fn) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.392072] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.399300] kworker/0:0 S c04bda98 0 4 2 0x00000000
[ 3500.405691] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3500.413357] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.420588] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.427817] kworker/0:0H S c04bda98 0 5 2 0x00000000
[ 3500.434205] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3500.441871] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.449102] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.456329] kworker/u8:0 S c04bda98 0 6 2 0x00000000
[ 3500.462718] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3500.470384] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.477615] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.484843] rcu_sched R running 0 7 2 0x00000000
[ 3500.491230] [<c04bda98>] (__schedule) from [<c04bd378>] (schedule_timeout+0x130/0x1ac)
[ 3500.499157] [<c04bd378>] (schedule_timeout) from [<c006553c>] (rcu_gp_kthread+0x26c/0x5f8)
[ 3500.507431] [<c006553c>] (rcu_gp_kthread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.514749] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.521977] rcu_bh S c04bda98 0 8 2 0x00000000
[ 3500.528363] [<c04bda98>] (__schedule) from [<c0065350>] (rcu_gp_kthread+0x80/0x5f8)
[ 3500.536028] [<c0065350>] (rcu_gp_kthread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.543346] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.550573] migration/0 S c04bda98 0 9 2 0x00000000
[ 3500.556959] [<c04bda98>] (__schedule) from [<c0041ffc>] (smpboot_thread_fn+0xc4/0x17c)
[ 3500.564885] [<c0041ffc>] (smpboot_thread_fn) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.572465] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.579692] watchdog/0 S c04bda98 0 10 2 0x00000000
[ 3500.586076] [<c04bda98>] (__schedule) from [<c0041ffc>] (smpboot_thread_fn+0xc4/0x17c)
[ 3500.594001] [<c0041ffc>] (smpboot_thread_fn) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.601581] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.608808] watchdog/1 P c04bda98 0 11 2 0x00000000
[ 3500.615195] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c)
[ 3500.622948] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec)
[ 3500.630439] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.637667] migration/1 P c04bda98 0 12 2 0x00000000
[ 3500.644055] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c)
[ 3500.651807] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec)
[ 3500.659299] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.666527] ksoftirqd/1 P c04bda98 0 13 2 0x00000000
[ 3500.672912] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c)
[ 3500.680665] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec)
[ 3500.688156] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.695384] kworker/1:0 S c04bda98 0 14 2 0x00000000
[ 3500.701772] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3500.709438] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.716669] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.723896] kworker/1:0H S c04bda98 0 15 2 0x00000000
[ 3500.730284] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3500.737950] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.745181] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.752408] watchdog/2 P c04bda98 0 16 2 0x00000000
[ 3500.758794] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c)
[ 3500.766546] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec)
[ 3500.774038] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.781266] migration/2 P c04bda98 0 17 2 0x00000000
[ 3500.787652] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c)
[ 3500.795403] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec)
[ 3500.802894] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.810121] ksoftirqd/2 P c04bda98 0 18 2 0x00000000
[ 3500.816508] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c)
[ 3500.824259] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec)
[ 3500.831751] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.838979] kworker/2:0 S c04bda98 0 19 2 0x00000000
[ 3500.845367] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3500.853033] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.860264] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.867491] kworker/2:0H S c04bda98 0 20 2 0x00000000
[ 3500.873880] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3500.881545] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3500.888776] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.896003] watchdog/3 P c04bda98 0 21 2 0x00000000
[ 3500.902389] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c)
[ 3500.910142] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec)
[ 3500.917633] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.924860] migration/3 P c04bda98 0 22 2 0x00000000
[ 3500.931246] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c)
[ 3500.938998] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec)
[ 3500.946489] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.953716] ksoftirqd/3 P c04bda98 0 23 2 0x00000000
[ 3500.960102] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c)
[ 3500.967855] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec)
[ 3500.975338] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3500.982565] kworker/3:0 S c04bda98 0 24 2 0x00000000
[ 3500.988954] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3500.996620] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.003851] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.011079] kworker/3:0H S c04bda98 0 25 2 0x00000000
[ 3501.017466] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3501.025133] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.032364] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.039591] khelper S c04bda98 0 26 2 0x00000000
[ 3501.045979] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.053730] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.061048] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.068276] kdevtmpfs S c04bda98 0 27 2 0x00000000
[ 3501.074665] [<c04bda98>] (__schedule) from [<c028ba34>] (devtmpfsd+0x258/0x34c)
[ 3501.081984] [<c028ba34>] (devtmpfsd) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.088867] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.096095] writeback S c04bda98 0 28 2 0x00000000
[ 3501.102484] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.110237] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.117555] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.124783] bioset S c04bda98 0 29 2 0x00000000
[ 3501.131170] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.138923] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.146240] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.153469] kblockd S c04bda98 0 30 2 0x00000000
[ 3501.159857] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.167610] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.174929] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.182156] ata_sff S c04bda98 0 31 2 0x00000000
[ 3501.188545] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.196297] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.203615] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.210843] khubd S c04bda98 0 32 2 0x00000000
[ 3501.217230] [<c04bda98>] (__schedule) from [<c0328cb8>] (hub_thread+0xf74/0x119c)
[ 3501.224722] [<c0328cb8>] (hub_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.231692] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.238920] edac-poller S c04bda98 0 33 2 0x00000000
[ 3501.245308] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.253061] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.260379] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.267606] rpciod S c04bda98 0 34 2 0x00000000
[ 3501.273995] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.281748] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.289066] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.296293] kworker/0:1 R running 0 35 2 0x00000000
[ 3501.302679] Workqueue: nfsiod rpc_async_release
[ 3501.307230] [<c04bda98>] (__schedule) from [<c00450c4>] (__cond_resched+0x24/0x34)
[ 3501.314809] [<c00450c4>] (__cond_resched) from [<c04be150>] (_cond_resched+0x3c/0x44)
[ 3501.322648] [<c04be150>] (_cond_resched) from [<c0035490>] (process_one_work+0x120/0x378)
[ 3501.330836] [<c0035490>] (process_one_work) from [<c0036198>] (worker_thread+0x13c/0x404)
[ 3501.339022] [<c0036198>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.346253] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.353481] khungtaskd R running 0 36 2 0x00000000
[ 3501.359868] [<c04bda98>] (__schedule) from [<c04bd378>] (schedule_timeout+0x130/0x1ac)
[ 3501.367795] [<c04bd378>] (schedule_timeout) from [<c007cf8c>] (watchdog+0x68/0x2e8)
[ 3501.375461] [<c007cf8c>] (watchdog) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.382257] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.389485] kswapd0 R running 0 37 2 0x00000000
[ 3501.395875] [<c001519c>] (unwind_backtrace) from [<c00111a4>] (show_stack+0x10/0x14)
[ 3501.403630] [<c00111a4>] (show_stack) from [<c0046f68>] (show_state_filter+0x64/0x90)
[ 3501.411470] [<c0046f68>] (show_state_filter) from [<c0249d90>] (__handle_sysrq+0xb0/0x17c)
[ 3501.419746] [<c0249d90>] (__handle_sysrq) from [<c025b6fc>] (serial8250_rx_chars+0xf8/0x208)
[ 3501.428195] [<c025b6fc>] (serial8250_rx_chars) from [<c025d360>] (serial8250_handle_irq.part.18+0x68/0x9c)
[ 3501.437860] [<c025d360>] (serial8250_handle_irq.part.18) from [<c025c418>] (serial8250_interrupt+0x3c/0xc0)
[ 3501.447613] [<c025c418>] (serial8250_interrupt) from [<c005e300>] (handle_irq_event_percpu+0x54/0x180)
[ 3501.456930] [<c005e300>] (handle_irq_event_percpu) from [<c005e46c>] (handle_irq_event+0x40/0x60)
[ 3501.465814] [<c005e46c>] (handle_irq_event) from [<c0061334>] (handle_fasteoi_irq+0x80/0x158)
[ 3501.474349] [<c0061334>] (handle_fasteoi_irq) from [<c005dac8>] (generic_handle_irq+0x2c/0x3c)
[ 3501.482971] [<c005dac8>] (generic_handle_irq) from [<c000eb7c>] (handle_IRQ+0x40/0x90)
[ 3501.490897] [<c000eb7c>] (handle_IRQ) from [<c0008568>] (gic_handle_irq+0x2c/0x5c)
[ 3501.498475] [<c0008568>] (gic_handle_irq) from [<c0011d00>] (__irq_svc+0x40/0x50)
[ 3501.505964] Exception stack(0xcd21bdd8 to 0xcd21be20)
[ 3501.511021] bdc0: 00000000 00000000
[ 3501.519207] bde0: 00004451 00004452 00000000 cd0e9940 cd0e9940 cd21bf00 00000000 00000000
[ 3501.527393] be00: 00000020 00000001 00000000 cd21be20 c009a5b0 c009a5d4 60000113 ffffffff
[ 3501.535583] [<c0011d00>] (__irq_svc) from [<c009a5d4>] (list_lru_count_node+0x3c/0x74)
[ 3501.543513] [<c009a5d4>] (list_lru_count_node) from [<c00c00b8>] (super_cache_count+0x60/0xc4)
[ 3501.552137] [<c00c00b8>] (super_cache_count) from [<c008bbbc>] (shrink_slab_node+0x34/0x1e4)
[ 3501.560585] [<c008bbbc>] (shrink_slab_node) from [<c008c53c>] (shrink_slab+0xc0/0xec)
[ 3501.568424] [<c008c53c>] (shrink_slab) from [<c008ef14>] (kswapd+0x57c/0x994)
[ 3501.575568] [<c008ef14>] (kswapd) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.582190] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.589418] fsnotify_mark S c04bda98 0 38 2 0x00000000
[ 3501.595807] [<c04bda98>] (__schedule) from [<c00f5a40>] (fsnotify_mark_destroy+0xf8/0x12c)
[ 3501.604081] [<c00f5a40>] (fsnotify_mark_destroy) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.612007] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.619234] nfsiod S c04bda98 0 39 2 0x00000000
[ 3501.625623] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.633376] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.640693] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.647921] crypto S c04bda98 0 40 2 0x00000000
[ 3501.654309] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.662062] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.669380] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.676608] kworker/u8:1 R running 0 44 2 0x00000000
[ 3501.682994] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3501.690661] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.697892] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.705120] kpsmoused S c04bda98 0 53 2 0x00000000
[ 3501.711508] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.719261] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.726579] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.733807] deferwq S c04bda98 0 54 2 0x00000000
[ 3501.740194] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324)
[ 3501.747946] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.755264] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.762492] udhcpc R running 0 92 1 0x00000000
[ 3501.768879] [<c04bda98>] (__schedule) from [<c04bd6c8>] (schedule_hrtimeout_range_clock+0xc0/0x150)
[ 3501.777938] [<c04bd6c8>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x)
[ 3501.787865] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638)
[ 3501.796140] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c)
[ 3501.803894] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110)
[ 3501.811648] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30)
[ 3501.819311] telnetd S c04bda98 0 100 1 0x00000000
[ 3501.825697] [<c04bda98>] (__schedule) from [<c04bd73c>] (schedule_hrtimeout_range_clock+0x134/0x150)
[ 3501.834841] [<c04bd73c>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x)
[ 3501.844768] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638)
[ 3501.853043] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c)
[ 3501.860797] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110)
[ 3501.868550] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30)
[ 3501.876212] sh S c04bda98 0 101 1 0x00000000
[ 3501.882600] [<c04bda98>] (__schedule) from [<c0022c2c>] (do_wait+0x220/0x244)
[ 3501.889746] [<c0022c2c>] (do_wait) from [<c0022ff0>] (SyS_wait4+0x60/0xc4)
[ 3501.896631] [<c0022ff0>] (SyS_wait4) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30)
[ 3501.904206] portmap S c04bda98 0 102 1 0x00000000
[ 3501.910593] [<c04bda98>] (__schedule) from [<c04bd73c>] (schedule_hrtimeout_range_clock+0x134/0x150)
[ 3501.919736] [<c04bd73c>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x)
[ 3501.929663] [<c00cde68>] (poll_schedule_timeout) from [<c00cf38c>] (do_sys_poll+0x3b8/0x478)
[ 3501.938112] [<c00cf38c>] (do_sys_poll) from [<c00cf4fc>] (SyS_poll+0x5c/0xd4)
[ 3501.945258] [<c00cf4fc>] (SyS_poll) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30)
[ 3501.952746] kworker/0:2 S c04bda98 0 122 2 0x00000000
[ 3501.959134] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404)
[ 3501.966799] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3501.974029] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3501.981257] udhcpc R running 0 132 1 0x00000000
[ 3501.987643] [<c04bda98>] (__schedule) from [<c04bd6c8>] (schedule_hrtimeout_range_clock+0xc0/0x150)
[ 3501.996701] [<c04bd6c8>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x)
[ 3502.006628] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638)
[ 3502.014903] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c)
[ 3502.022657] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110)
[ 3502.030411] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30)
[ 3502.038072] udhcpc R running 0 137 1 0x00000000
[ 3502.044459] [<c04bda98>] (__schedule) from [<c04bd6c8>] (schedule_hrtimeout_range_clock+0xc0/0x150)
[ 3502.053515] [<c04bd6c8>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x)
[ 3502.063443] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638)
[ 3502.071718] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c)
[ 3502.079472] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110)
[ 3502.087226] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30)
[ 3502.094887] lockd S c04bda98 0 143 2 0x00000000
[ 3502.101273] [<c04bda98>] (__schedule) from [<c04bd3b4>] (schedule_timeout+0x16c/0x1ac)
[ 3502.109201] [<c04bd3b4>] (schedule_timeout) from [<c04aad44>] (svc_recv+0x5ac/0x81c)
[ 3502.116956] [<c04aad44>] (svc_recv) from [<c019e6bc>] (lockd+0x98/0x148)
[ 3502.123666] [<c019e6bc>] (lockd) from [<c003b87c>] (kthread+0xd4/0xec)
[ 3502.130201] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 3502.137429] rcu.sh S c04bda98 0 153 101 0x00000000
[ 3502.143815] [<c04bda98>] (__schedule) from [<c0022c2c>] (do_wait+0x220/0x244)
[ 3502.150959] [<c0022c2c>] (do_wait) from [<c0022ff0>] (SyS_wait4+0x60/0xc4)
[ 3502.157843] [<c0022ff0>] (SyS_wait4) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30)
[ 3502.165418] malloc_test_bcm R running 0 155 153 0x00000000
[ 3502.171805] [<c04bda98>] (__schedule) from [<c00450c4>] (__cond_resched+0x24/0x34)
[ 3502.179384] [<c00450c4>] (__cond_resched) from [<c04be150>] (_cond_resched+0x3c/0x44)
[ 3502.187224] [<c04be150>] (_cond_resched) from [<c008c560>] (shrink_slab+0xe4/0xec)
[ 3502.194803] [<c008c560>] (shrink_slab) from [<c008e818>] (try_to_free_pages+0x310/0x490)
[ 3502.202906] [<c008e818>] (try_to_free_pages) from [<c0086184>] (__alloc_pages_nodemask+0x5a4/0x8f4)
[ 3502.211963] [<c0086184>] (__alloc_pages_nodemask) from [<c009cd60>] (__pte_alloc+0x24/0x168)
[ 3502.220411] [<c009cd60>] (__pte_alloc) from [<c00a0fdc>] (handle_mm_fault+0xc30/0xcdc)
[ 3502.228340] [<c00a0fdc>] (handle_mm_fault) from [<c001749c>] (do_page_fault+0x194/0x27c)
[ 3502.236441] [<c001749c>] (do_page_fault) from [<c000844c>] (do_DataAbort+0x30/0x90)
[ 3502.244107] [<c000844c>] (do_DataAbort) from [<c0011e34>] (__dabt_usr+0x34/0x40)
[ 3502.251509] Exception stack(0xcc9c7fb0 to 0xcc9c7ff8)
[ 3502.256566] 7fa0: 76388000 00101000 00101002 000aa280
[ 3502.264752] 7fc0: 76388008 b6fa9508 00101000 00100008 b6fa9538 00100000 00001000 bed52d24
[ 3502.272937] 7fe0: 00000000 bed52c80 b6efefa8 b6efefc4 40000010 ffffffff
[ 3502.279559] Sched Debug Version: v0.11, 3.14.0-rc4 #32
[ 3502.284702] ktime : 3502275.231136
[ 3502.291061] sched_clk : 3502279.556898
[ 3502.297420] cpu_clk : 3502279.557268
[ 3502.303778] jiffies : 320030
[ 3502.309441]
[ 3502.310931] sysctl_sched
[ 3502.313465] .sysctl_sched_latency : 6.000000
[ 3502.319563] .sysctl_sched_min_granularity : 0.750000
[ 3502.325662] .sysctl_sched_wakeup_granularity : 1.000000
[ 3502.331760] .sysctl_sched_child_runs_first : 0
[ 3502.337250] .sysctl_sched_features : 11899
[ 3502.343087] .sysctl_sched_tunable_scaling : 1 (logaritmic)
[ 3502.349706]
[ 3502.351198] cpu#0
[ 3502.353124] .nr_running : 9
[ 3502.357745] .load : 7168
[ 3502.362626] .nr_switches : 41007
[ 3502.367594] .nr_load_updates : 350030
[ 3502.372649] .nr_uninterruptible : 0
[ 3502.377269] .next_balance : 4294.942188
[ 3502.382758] .curr->pid : 37
[ 3502.387466] .clock : 3500304.328054
[ 3502.393216] .cpu_load[0] : 31
[ 3502.397922] .cpu_load[1] : 31
[ 3502.402628] .cpu_load[2] : 31
[ 3502.407335] .cpu_load[3] : 31
[ 3502.412043] .cpu_load[4] : 31
[ 3502.416750]
[ 3502.416750] cfs_rq[0]:
[ 3502.420589] .exec_clock : 0.000000
[ 3502.425818] .MIN_vruntime : 1392.857683
[ 3502.431308] .min_vruntime : 1395.857683
[ 3502.436798] .max_vruntime : 1392.895054
[ 3502.442287] .spread : 0.037371
[ 3502.447515] .spread0 : 0.000000
[ 3502.452744] .nr_spread_over : 0
[ 3502.457364] .nr_running : 7
[ 3502.461985] .load : 7168
[ 3502.466866] .runnable_load_avg : 31
[ 3502.471573] .blocked_load_avg : 0
[ 3502.476193]
[ 3502.476193] rt_rq[0]:
[ 3502.479945] .rt_nr_running : 2
[ 3502.484564] .rt_throttled : 0
[ 3502.489185] .rt_time : 0.000000
[ 3502.494414] .rt_runtime : 0.000001
[ 3502.499643]
[ 3502.499643] runnable tasks:
[ 3502.499643] task PID tree-key switches prio exec-runtime sum-exec p
[ 3502.499643] -----------------------------------------------------------------------------------------------
[ 3502.525299] init 1 1293.503936 967 120 0 0 0
[ 3502.540386] kthreadd 2 -3.000000 47 2 0 0 0
[ 3502.555474] ksoftirqd/0 3 -3.000000 411 2 0 0 0
[ 3502.570562] kworker/0:0 4 1212.395251 9 120 0 0 0
[ 3502.585647] kworker/0:0H 5 76.078793 3 100 0 0 0
[ 3502.600732] kworker/u8:0 6 474.674159 9 120 0 0 0
[ 3502.615820] rcu_sched 7 1392.871202 202 120 0 0 0
[ 3502.630906] rcu_bh 8 15.631059 2 120 0 0 0
[ 3502.645991] migration/0 9 0.000000 5 0 0 0 0
[ 3502.661079] watchdog/0 10 -3.000000 878 0 0 0 0
[ 3502.676164] watchdog/1 11 22.645905 2 120 0 0 0
[ 3502.691250] migration/1 12 0.000000 2 0 0 0 0
[ 3502.706336] ksoftirqd/1 13 28.653864 2 120 0 0 0
[ 3502.721422] kworker/1:0 14 395.389726 8 120 0 0 0
[ 3502.736508] kworker/1:0H 15 76.078608 3 100 0 0 0
[ 3502.751595] watchdog/2 16 36.663186 2 120 0 0 0
[ 3502.766680] migration/2 17 0.000000 2 0 0 0 0
[ 3502.781767] ksoftirqd/2 18 42.671219 2 120 0 0 0
[ 3502.796854] kworker/2:0 19 395.389431 8 120 0 0 0
[ 3502.811941] kworker/2:0H 20 76.078598 3 100 0 0 0
[ 3502.827027] watchdog/3 21 50.680315 2 120 0 0 0
[ 3502.842112] migration/3 22 0.000000 2 0 0 0 0
[ 3502.857198] ksoftirqd/3 23 56.688385 2 120 0 0 0
[ 3502.872286] kworker/3:0 24 395.389949 8 120 0 0 0
[ 3502.887372] kworker/3:0H 25 76.078597 3 100 0 0 0
[ 3502.902457] khelper 26 -3.000000 2 2 0 0 0
[ 3502.917543] kdevtmpfs 27 980.384584 647 120 0 0 0
[ 3502.932629] writeback 28 77.578808 2 100 0 0 0
[ 3502.947715] bioset 29 79.080205 2 100 0 0 0
[ 3502.962804] kblockd 30 80.583022 2 100 0 0 0
[ 3502.977890] ata_sff 31 82.086421 2 100 0 0 0
[ 3502.992977] khubd 32 -3.000000 49 3 0 0 0
[ 3503.008063] edac-poller 33 85.093351 2 100 0 0 0
[ 3503.023148] rpciod 34 88.314163 2 100 0 0 0
[ 3503.038233] kworker/0:1 35 1392.895054 589 120 0 0 0
[ 3503.053319] khungtaskd 36 1392.857683 2 120 0 0 0
[ 3503.068405] R kswapd0 37 -3.000000 17266 2 0 0 0
[ 3503.083491] fsnotify_mark 38 396.392655 2 120 0 0 0
[ 3503.098577] nfsiod 39 398.390829 2 100 0 0 0
[ 3503.113663] crypto 40 400.392267 2 100 0 0 0
[ 3503.128749] kworker/u8:1 44 1392.857683 18 120 0 0 0
[ 3503.143835] kpsmoused 53 956.219135 2 100 0 0 0
[ 3503.158921] deferwq 54 985.352494 2 100 0 0 0
[ 3503.174006] udhcpc 92 1392.857683 14 120 0 0 0
[ 3503.189092] telnetd 100 -3.000000 1 65 0 0 0
[ 3503.204178] sh 101 -3.000000 224 2 0 0 0
[ 3503.219265] portmap 102 -3.000000 13 2 0 0 0
[ 3503.234351] kworker/0:2 122 1235.968172 3 120 0 0 0
[ 3503.249436] udhcpc 132 1392.857683 1 120 0 0 0
[ 3503.264522] udhcpc 137 1392.857683 1 120 0 0 0
[ 3503.279608] lockd 143 1324.783814 2 120 0 0 0
[ 3503.294694] rcu.sh 153 -3.000000 8 2 0 0 0
[ 3503.309781] malloc_crazy 155 0.000000 18087 2 0 0 0
[ 3503.324868]
* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  From: Florian Fainelli @ 2014-03-05  1:16 UTC
  To: Eric Dumazet
  Cc: linux-kernel@vger.kernel.org, linux-mm, paulmck, linux-nfs, trond.myklebust, netdev

2014-03-04 17:03 GMT-08:00 Florian Fainelli <f.fainelli@gmail.com>:
> 2014-03-04 16:48 GMT-08:00 Eric Dumazet <eric.dumazet@gmail.com>:
>> [...]
>>
>> Do you have a more complete trace, including stack traces?
>
> Attached is what I get out of SysRq-t, which is the only thing I have
> (note that the kernel is built with CONFIG_RCU_CPU_STALL_INFO=y):

QEMU for Versatile Express w/ 2 CPUs yields something slightly
different from the real HW platform this is happening with, but it
does produce the RCU stall anyway:

[ 125.762946] BUG: soft lockup - CPU#1 stuck for 53s! [malloc_crazy:91]
[ 125.766841] Modules linked in:
[ 125.768389]
[ 125.769199] CPU: 1 PID: 91 Comm: malloc_crazy Not tainted 3.14.0-rc4 #39
[ 125.769883] task: edbded00 ti: c089c000 task.ti: c089c000
[ 125.771743] PC is at load_balance+0x4b0/0x760
[ 125.772069] LR is at cpumask_next_and+0x44/0x5c
[ 125.772387] pc : [<c004ff58>] lr : [<c01db940>] psr: 60000113
[ 125.772387] sp : c089db48 ip : 80000113 fp : edfd8cf4
[ 125.773128] r10: c0de871c r9 : ed893840 r8 : 00000000
[ 125.773452] r7 : c0de8458 r6 : edfd8840 r5 : edfd8840 r4 : ed89389c
[ 125.773825] r3 : 000012d8 r2 : 80000113 r1 : 00000023 r0 : 00000000
[ 125.774332] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 125.774753] Control: 30c7387d Table: 80835d40 DAC: 00000000
[ 125.775266] CPU: 1 PID: 91 Comm: malloc_test_bcm Not tainted 3.14.0-rc4 #39
[ 125.776392] [<c0015624>] (unwind_backtrace) from [<c00111a4>] (show_stack+0x10/0x14)
[ 125.777026] [<c00111a4>] (show_stack) from [<c04c1bd4>] (dump_stack+0x84/0x94)
[ 125.777429] [<c04c1bd4>] (dump_stack) from [<c007e7f0>] (watchdog_timer_fn+0x144/0x17c)
[ 125.777865] [<c007e7f0>] (watchdog_timer_fn) from [<c003f58c>] (__run_hrtimer.isra.32+0x54/0xe4)
[ 125.778333] [<c003f58c>] (__run_hrtimer.isra.32) from [<c003fea4>] (hrtimer_interrupt+0x11c/0x2d0)
[ 125.778814] [<c003fea4>] (hrtimer_interrupt) from [<c03c4f80>] (arch_timer_handler_virt+0x28/0x30)
[ 125.779297] [<c03c4f80>] (arch_timer_handler_virt) from [<c006280c>] (handle_percpu_devid_irq+0x6c/0x84)
[ 125.779734] [<c006280c>] (handle_percpu_devid_irq) from [<c005edec>] (generic_handle_irq+0x2c/0x3c)
[ 125.780145] [<c005edec>] (generic_handle_irq) from [<c000eb7c>] (handle_IRQ+0x40/0x90)
[ 125.780513] [<c000eb7c>] (handle_IRQ) from [<c0008568>] (gic_handle_irq+0x2c/0x5c)
[ 125.780867] [<c0008568>] (gic_handle_irq) from [<c0011d00>] (__irq_svc+0x40/0x50)
[ 125.781312] Exception stack(0xc089db00 to 0xc089db48)
[ 125.781787] db00: 00000000 00000023 80000113 000012d8 ed89389c edfd8840 edfd8840 c0de8458
[ 125.782234] db20: 00000000 ed893840 c0de871c edfd8cf4 80000113 c089db48 c01db940 c004ff58
[ 125.782594] db40: 60000113 ffffffff
[ 125.782864] [<c0011d00>] (__irq_svc) from [<c004ff58>] (load_balance+0x4b0/0x760)
[ 125.783215] [<c004ff58>] (load_balance) from [<c005035c>] (rebalance_domains+0x154/0x284)
[ 125.783595] [<c005035c>] (rebalance_domains) from [<c00504c0>] (run_rebalance_domains+0x34/0x164)
[ 125.784000] [<c00504c0>] (run_rebalance_domains) from [<c0025aac>] (__do_softirq+0x110/0x24c)
[ 125.784388] [<c0025aac>] (__do_softirq) from [<c0025e6c>] (irq_exit+0xac/0xf4)
[ 125.784726] [<c0025e6c>] (irq_exit) from [<c000eb80>] (handle_IRQ+0x44/0x90)
[ 125.785059] [<c000eb80>] (handle_IRQ) from [<c0008568>] (gic_handle_irq+0x2c/0x5c)
[ 125.785412] [<c0008568>] (gic_handle_irq) from [<c0011d00>] (__irq_svc+0x40/0x50)
[ 125.785742] Exception stack(0xc089dcf8 to 0xc089dd40)
[ 125.785983] dce0: ee4e38c0 00000000
[ 125.786360] dd00: 000200da 00000001 ee4e38a0 c0de2340 2d201000 edfe3358 c05c0c18 00000001
[ 125.786737] dd20: c05c0c2c c0e1e180 00000000 c089dd40 c0086050 c0086140 40000113 ffffffff
[ 125.787120] [<c0011d00>] (__irq_svc) from [<c0086140>] (get_page_from_freelist+0x2bc/0x638)
[ 125.787507] [<c0086140>] (get_page_from_freelist) from [<c0087018>] (__alloc_pages_nodemask+0x114/0x8f4)
[ 125.787949] [<c0087018>] (__alloc_pages_nodemask) from [<c00a20c8>] (handle_mm_fault+0x9f8/0xcdc)
[ 125.788357] [<c00a20c8>] (handle_mm_fault) from [<c001793c>] (do_page_fault+0x194/0x27c)
[ 125.788726] [<c001793c>] (do_page_fault) from [<c000844c>] (do_DataAbort+0x30/0x90)
[ 125.789080] [<c000844c>] (do_DataAbort) from [<c0011e34>] (__dabt_usr+0x34/0x40)
[ 125.789403] Exception stack(0xc089dfb0 to 0xc089dff8)
[ 125.789652] dfa0: ae55e008 11111111 000dc000 ae582000
[ 125.790029] dfc0: be967d18 00000000 00008390 00000000 00000000 00000000 b6f29000 be967d04
[ 125.790404] dfe0: 11111111 be967ce8 00008550 b6e5ce48 20000010 ffffffff
[ 125.791282] BUG: soft lockup - CPU#0 stuck for 53s! [kworker/0:1:41]
[ 125.791710] Modules linked in:
[ 125.792046]
[ 125.792836] CPU: 0 PID: 41 Comm: kworker/0:1 Not tainted 3.14.0-rc4 #39
[ 125.793832] task: ed893840 ti: c0826000 task.ti: c0826000
[ 125.795257] PC is at finish_task_switch+0x40/0xec
[ 125.795562] LR is at __schedule+0x1fc/0x51c
[ 125.795856] pc : [<c0044724>] lr : [<c04c3274>] psr: 600f0013
[ 125.795856] sp : c0827ec8 ip : 00000000 fp : c0827edc
[ 125.796416] r10: eda56e00 r9 : c0deac28 r8 : 00000000
[ 125.796698] r7 : c0de871c r6 : c0826020 r5 : ed893840 r4 : 00000001
[ 125.797028] r3 : eda56e00 r2 : 000012d7 r1 : ed854780 r0 : edfd8840
[ 125.797488] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 125.797866] Control: 30c7387d Table: 808353c0 DAC: fffffffd
[ 125.798321] CPU: 0 PID: 41 Comm: kworker/0:1 Not tainted 3.14.0-rc4 #39
[ 125.800604] [<c0015624>] (unwind_backtrace) from [<c00111a4>] (show_stack+0x10/0x14)
[ 125.801205] [<c00111a4>] (show_stack) from [<c04c1bd4>] (dump_stack+0x84/0x94)
[ 125.801786] [<c04c1bd4>] (dump_stack) from [<c007e7f0>] (watchdog_timer_fn+0x144/0x17c)
[ 125.802238] [<c007e7f0>] (watchdog_timer_fn) from [<c003f58c>] (__run_hrtimer.isra.32+0x54/0xe4)
[ 125.802679] [<c003f58c>] (__run_hrtimer.isra.32) from [<c003fea4>] (hrtimer_interrupt+0x11c/0x2d0)
[ 125.803108] [<c003fea4>] (hrtimer_interrupt) from [<c03c4f80>] (arch_timer_handler_virt+0x28/0x30)
[ 125.803530] [<c03c4f80>] (arch_timer_handler_virt) from [<c006280c>] (handle_percpu_devid_irq+0x6c/0x84)
[ 125.803965] [<c006280c>] (handle_percpu_devid_irq) from [<c005edec>] (generic_handle_irq+0x2c/0x3c)
[ 125.804380] [<c005edec>] (generic_handle_irq) from [<c000eb7c>] (handle_IRQ+0x40/0x90)
[ 125.804774] [<c000eb7c>] (handle_IRQ) from [<c0008568>] (gic_handle_irq+0x2c/0x5c)
[ 125.805181] [<c0008568>] (gic_handle_irq) from [<c0011d00>] (__irq_svc+0x40/0x50)
[ 125.805706] Exception stack(0xc0827e80 to 0xc0827ec8)
[ 125.806185] 7e80: edfd8840 ed854780 000012d7 eda56e00 00000001 ed893840 c0826020 c0de871c
[ 125.806623] 7ea0: 00000000 c0deac28 eda56e00 c0827edc 00000000 c0827ec8 c04c3274 c0044724
[ 125.807032] 7ec0: 600f0013 ffffffff
[ 125.807328] [<c0011d00>] (__irq_svc) from [<c0044724>] (finish_task_switch+0x40/0xec)
[ 125.807752] [<c0044724>] (finish_task_switch) from [<c04c3274>] (__schedule+0x1fc/0x51c)
[ 125.808175] [<c04c3274>] (__schedule) from [<c0037590>] (worker_thread+0x210/0x404)
[ 125.808570] [<c0037590>] (worker_thread) from [<c003cba0>] (kthread+0xd4/0xec)
[ 125.808952] [<c003cba0>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c)
[ 246.080014] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 246.080611] (detected by 0, t=6972 jiffies, g=4294967160, c=4294967159, q=127)
[ 246.081576] INFO: Stall ended before state dump start
[ 246.082510] BUG: soft lockup - CPU#1 stuck for 69s! [grep:93]
[ 246.082849] Modules linked in:
[ 246.083046]
[ 246.083179] CPU: 1 PID: 93 Comm: grep Not tainted 3.14.0-rc4 #39
[ 246.083548] task: edbdf480 ti: c09e6000 task.ti: c09e6000
[ 246.083897] PC is at rebalance_domains+0x0/0x284
[ 246.084154] LR is at run_rebalance_domains+0x34/0x164
[ 246.084430] pc : [<c0050208>] lr : [<c00504c0>] psr: 20000113
[ 246.084430] sp : c09e7ef8 ip : c0dde340 fp : 40000005
[ 246.084992] r10: 00000007 r9 : c0ddf840 r8 : c0ddf840
[ 246.085267] r7 : edfe0840 r6 : 00000100 r5 : c0de209c r4 : 00000001
[ 246.085606] r3 : c005048c r2 : 2d201000 r1 : 00000001 r0 : edfe0840
[ 246.085944] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 246.086315] Control: 30c7387d Table: 808351c0 DAC: 00000000
[ 246.086626] CPU: 1 PID: 93 Comm: grep Not tainted 3.14.0-rc4 #39
[ 246.086992] [<c0015624>] (unwind_backtrace) from [<c00111a4>] (show_stack+0x10/0x14)
[ 246.087420] [<c00111a4>] (show_stack) from [<c04c1bd4>] (dump_stack+0x84/0x94)
[ 246.087825] [<c04c1bd4>] (dump_stack) from [<c007e7f0>] (watchdog_timer_fn+0x144/0x17c)
[ 246.088260] [<c007e7f0>] (watchdog_timer_fn) from [<c003f58c>] (__run_hrtimer.isra.32+0x54/0xe4)
[ 246.088726] [<c003f58c>] (__run_hrtimer.isra.32) from [<c003fea4>] (hrtimer_interrupt+0x11c/0x2d0)
[ 246.089190] [<c003fea4>] (hrtimer_interrupt) from [<c03c4f80>] (arch_timer_handler_virt+0x28/0x30)
[ 246.089670] [<c03c4f80>] (arch_timer_handler_virt) from [<c006280c>] (handle_percpu_devid_irq+0x6c/0x84)
[ 246.090155] [<c006280c>] (handle_percpu_devid_irq) from [<c005edec>] (generic_handle_irq+0x2c/0x3c)
[ 246.090822] [<c005edec>] (generic_handle_irq) from [<c000eb7c>] (handle_IRQ+0x40/0x90)
[ 246.091233] [<c000eb7c>] (handle_IRQ) from [<c0008568>] (gic_handle_irq+0x2c/0x5c)
[ 246.091631] [<c0008568>] (gic_handle_irq) from [<c0011d00>] (__irq_svc+0x40/0x50)
[ 246.092007] Exception stack(0xc09e7eb0 to 0xc09e7ef8)
[ 246.092291] 7ea0: edfe0840 00000001 2d201000 c005048c
[ 246.092719] 7ec0: 00000001 c0de209c 00000100 edfe0840 c0ddf840 c0ddf840 00000007 40000005
[ 246.093142] 7ee0: c0dde340 c09e7ef8 c00504c0 c0050208 20000113 ffffffff
[ 246.093647] [<c0011d00>] (__irq_svc) from [<c0050208>] (rebalance_domains+0x0/0x284)
[ 246.094422] [<c0050208>] (rebalance_domains) from [<c09e6028>] (0xc09e6028)
* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  From: Paul E. McKenney @ 2014-03-05  1:43 UTC
  To: Florian Fainelli
  Cc: Eric Dumazet, linux-kernel@vger.kernel.org, linux-mm, linux-nfs, trond.myklebust, netdev

On Tue, Mar 04, 2014 at 05:16:27PM -0800, Florian Fainelli wrote:
> [...]
>
> QEMU for Versatile Express w/ 2 CPUs yields something slightly
> different from the real HW platform this is happening with, but it
> does produce the RCU stall anyway:
>
> [ 125.762946] BUG: soft lockup - CPU#1 stuck for 53s! [malloc_crazy:91]
> [...]

This soft-lockup condition can result in RCU CPU stall warnings.  Fix
the problem causing the soft lockup, and I bet that your RCU CPU stall
warnings go away.

							Thanx, Paul
* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  From: Florian Fainelli @ 2014-03-05  3:55 UTC
  To: Paul McKenney
  Cc: Eric Dumazet, linux-kernel@vger.kernel.org, linux-mm, linux-nfs, trond.myklebust, netdev

2014-03-04 17:43 GMT-08:00 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> On Tue, Mar 04, 2014 at 05:16:27PM -0800, Florian Fainelli wrote:
>> [...]
>>
>> [ 125.762946] BUG: soft lockup - CPU#1 stuck for 53s! [malloc_crazy:91]
>
> This soft-lockup condition can result in RCU CPU stall warnings.  Fix
> the problem causing the soft lockup, and I bet that your RCU CPU stall
> warnings go away.

I definitely agree, which is why I was asking for help: I think the
kernel thread priority change is what causes the soft lockup to
appear, but nothing obvious jumps to mind when looking at the trace.

Thanks!
--
Florian
* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  2014-03-05 3:55 ` Florian Fainelli
@ 2014-03-05 5:34   ` Paul E. McKenney
  [not found]          ` <20140305053440.GD3334-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2014-03-05 5:34 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Eric Dumazet, linux-kernel@vger.kernel.org, linux-mm, linux-nfs,
      trond.myklebust, netdev

On Tue, Mar 04, 2014 at 07:55:03PM -0800, Florian Fainelli wrote:
> 2014-03-04 17:43 GMT-08:00 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > On Tue, Mar 04, 2014 at 05:16:27PM -0800, Florian Fainelli wrote:
> >> 2014-03-04 17:03 GMT-08:00 Florian Fainelli <f.fainelli@gmail.com>:
> >> > 2014-03-04 16:48 GMT-08:00 Eric Dumazet <eric.dumazet@gmail.com>:
> >> >> On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
> >> >>> Hi all,
> >> >>> [...]
> >> >>> Any help or suggestions would be greatly appreciated. Thanks!
> >> >>
> >> >> Do you have a more complete trace, including stack traces ?
> >> >
> >> > Attached is what I get out of SysRq-t, which is the only thing I have
> >> > (note that the kernel is built with CONFIG_RCU_CPU_STALL_INFO=y):
> >>
> >> QEMU for Versatile Express w/ 2 CPUs yields something slightly
> >> different than the real HW platform this is happening with, but it
> >> does produce the RCU stall anyway:
> >>
> >> [ 125.762946] BUG: soft lockup - CPU#1 stuck for 53s! [malloc_crazy:91]
> >
> > This soft-lockup condition can result in RCU CPU stall warnings.  Fix
> > the problem causing the soft lockup, and I bet that your RCU CPU stall
> > warnings go away.
>
> I definitely agree, which is why I was asking for help, as I think the
> kernel thread priority change is what is causing the soft lockup to
> appear, but nothing obvious jumps to mind when looking at the trace.

Is your hardware able to make the malloc_crazy CPU periodically dump
its stack, perhaps in response to an NMI?  If not, another approach is
to use ftrace -- though this will require a very high-priority task to
turn tracing on and off reasonably quickly, unless you happen to have
a very large amount of storage to hold the trace.

What happens if you malloc() less intensively?  Does that avoid this
problem?  The reason I ask is that you mentioned that avoiding NFS
helped, and it is possible that NFS is increasing storage-access
latencies and thus triggering another problem.  It is quite possible
that slowing down the malloc()s would also help, and might allow you
to observe what is happening more easily than when the system is
driven fully to the lockup condition.

Finally, what are you trying to achieve with this workload?  Does your
production workload behave in this way?  Or is this an experimental
investigation of some sort?

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply [flat|nested] 10+ messages in thread
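The ftrace approach suggested above needs a task that outranks everything
else, so it can stop the trace before the ring buffer wraps. A minimal
sketch, assuming debugfs is mounted at /sys/kernel/debug; the priority
value and the 60-second sleep are stand-ins for real detection logic:

#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void tracing_set(const char *val)
{
    /* Assumes debugfs at /sys/kernel/debug. */
    int fd = open("/sys/kernel/debug/tracing/tracing_on", O_WRONLY);

    if (fd >= 0) {
        if (write(fd, val, strlen(val)) < 0)
            perror("write");
        close(fd);
    }
}

int main(void)
{
    struct sched_param sp = { .sched_priority = 99 };

    /* Run above the bumped kthreads so the trace can be stopped in time. */
    if (sched_setscheduler(0, SCHED_FIFO, &sp))
        perror("sched_setscheduler");

    tracing_set("1");
    sleep(60);          /* placeholder: wait for the stall to show up */
    tracing_set("0");
    return 0;
}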
[parent not found: <20140305053440.GD3334-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>]
* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  [not found]          ` <20140305053440.GD3334-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2014-03-06 0:42       ` Florian Fainelli
  2014-03-06 1:42         ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Florian Fainelli @ 2014-03-06 0:42 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Eric Dumazet, linux-kernel@vger.kernel.org, linux-mm, linux-nfs,
      trond.myklebust, netdev

2014-03-04 21:34 GMT-08:00 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> On Tue, Mar 04, 2014 at 07:55:03PM -0800, Florian Fainelli wrote:
>> [...]
>> I definitely agree, which is why I was asking for help, as I think the
>> kernel thread priority change is what is causing the soft lockup to
>> appear, but nothing obvious jumps to mind when looking at the trace.
>
> Is your hardware able to make the malloc_crazy CPU periodically dump
> its stack, perhaps in response to an NMI?  If not, another approach is
> to use ftrace -- though this will require a very high-priority task to
> turn tracing on and off reasonably quickly, unless you happen to have
> a very large amount of storage to hold the trace.
>
> What happens if you malloc() less intensively?  Does that avoid this
> problem?

It does, yes: putting some arbitrary delays between the malloc() calls
definitely helps.

> The reason I ask is that you mentioned that avoiding NFS helped,
> and it is possible that NFS is increasing storage-access latencies and
> thus triggering another problem.  It is quite possible that slowing
> down the malloc()s would also help, and might allow you to observe what
> is happening more easily than when the system is driven fully to the
> lockup condition.
>
> Finally, what are you trying to achieve with this workload?  Does your
> production workload behave in this way?  Or is this an experimental
> investigation of some sort?

This is an experimental investigation, part of the problem being that
there were some expectations that altering the priority of essential
kernel threads would "just work".

It seemed to me that even if we moved kthreadd to SCHED_RR with
priority 2 (as shown by /proc/*/sched), we should still be in a more
favorable scheduling class than 'rcu_bh' and 'rcu_sched', which are on
SCHED_NORMAL. Interestingly, the issue only appears with 1 or 2 CPUs
online; as soon as all 4 are online I am no longer able to reproduce
it...
--
Florian

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply [flat|nested] 10+ messages in thread
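The attached bumper.c is not reproduced in the archive, but the priority
change under discussion presumably comes down to a sched_setscheduler()
call per kernel thread; a hypothetical sketch:

#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
    struct sched_param sp;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <pid> <rt-priority>\n", argv[0]);
        return 1;
    }
    sp.sched_priority = atoi(argv[2]);  /* e.g. 2, as in the report */

    /*
     * Any SCHED_RR/SCHED_FIFO task outranks every SCHED_NORMAL task,
     * including the rcu_sched/rcu_bh grace-period kthreads, so a
     * CPU-bound RT task can starve them indefinitely.
     */
    if (sched_setscheduler((pid_t)atoi(argv[1]), SCHED_RR, &sp)) {
        perror("sched_setscheduler");
        return 1;
    }
    return 0;
}

Run as root, e.g. against kthreadd's PID 2 with priority 2 to match the
configuration described above.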
* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  2014-03-06 0:42 ` Florian Fainelli
@ 2014-03-06 1:42   ` Paul E. McKenney
  0 siblings, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2014-03-06 1:42 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Eric Dumazet, linux-kernel@vger.kernel.org, linux-mm, linux-nfs,
      trond.myklebust, netdev

On Wed, Mar 05, 2014 at 04:42:55PM -0800, Florian Fainelli wrote:
> 2014-03-04 21:34 GMT-08:00 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > [...]
> > What happens if you malloc() less intensively?  Does that avoid this
> > problem?
>
> It does, yes: putting some arbitrary delays between the malloc() calls
> definitely helps.

OK, good.  This might be helping because it is freeing up enough CPU
time that all the critical kthreads actually get to run (for but one
example, the OOM killer, of course assuming that you are delaying via
sleeping rather than via spinning), or it might be helping by placing
less pressure on the VM system.  Or by keeping the VM system out of a
CPU-bound mode, for that matter.

So one useful diagnostic approach would be to look at the CPU
consumption and the VM statistics as a function of the amount of delay
between malloc() calls.

> > Finally, what are you trying to achieve with this workload?  Does your
> > production workload behave in this way?  Or is this an experimental
> > investigation of some sort?
>
> This is an experimental investigation, part of the problem being that
> there were some expectations that altering the priority of essential
> kernel threads would "just work".

It might "just work" -- but only if you used extreme care in choosing
the altered set of priorities, especially when running a CPU-bound
workload!  ;-)

> It seemed to me that even if we moved kthreadd to SCHED_RR with
> priority 2 (as shown by /proc/*/sched), we should still be in a more
> favorable scheduling class than 'rcu_bh' and 'rcu_sched', which are on
> SCHED_NORMAL. Interestingly, the issue only appears with 1 or 2 CPUs
> online; as soon as all 4 are online I am no longer able to reproduce
> it...

Interesting!  That hints that the problem is that your system is unable
to supply the needed CPU time for some critical kthread.  On a 2-CPU
system, one approach would be to put all the normal-priority non-per-CPU
kthreads onto CPU 1 and the rest onto CPU 0.  Don't try moving the
per-CPU kthreads!  (One exception to this rule being the rcuo* no-CB
CPU kthreads.)

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply [flat|nested] 10+ messages in thread
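The CPU split suggested above can be done from userspace with
sched_setaffinity(), the same operation as 'taskset -pc 1 <pid>'. A
minimal sketch:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    cpu_set_t set;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <kthread-pid>\n", argv[0]);
        return 1;
    }
    CPU_ZERO(&set);
    CPU_SET(1, &set);   /* CPU 1 only; CPU 0 keeps everything else */

    /* The kernel rejects this for per-CPU kthreads, which must stay put. */
    if (sched_setaffinity(atoi(argv[1]), sizeof(set), &set)) {
        perror("sched_setaffinity");
        return 1;
    }
    return 0;
}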
* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  2014-03-05 1:03 ` Florian Fainelli
  2014-03-05 1:16   ` Florian Fainelli
@ 2014-03-05 1:41   ` Paul E. McKenney
  2014-03-05 1:43     ` Florian Fainelli
  1 sibling, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2014-03-05 1:41 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Eric Dumazet, linux-kernel@vger.kernel.org, linux-mm, linux-nfs,
      trond.myklebust, netdev

On Tue, Mar 04, 2014 at 05:03:24PM -0800, Florian Fainelli wrote:
> 2014-03-04 16:48 GMT-08:00 Eric Dumazet <eric.dumazet@gmail.com>:
> > On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
> >> Hi all,
> >>
> >> I am seeing the following RCU stalls messages appearing on an ARMv7
> >> 4xCPUs system running 3.14-rc4:
> >> [...]
> >> This "problem" seems to have been there for quite a while now since I
> >> was able to get 3.8.13 to trigger that bug as well, with a slightly
> >> more detailed RCU debugging trace which points the finger at kswapd0.

The 3.8 kernel was where RCU grace-period processing moved to kthreads.
Does 3.7 or earlier trigger?

In any case, if you starve RCU's grace-period kthreads (rcu_bh and
rcu_sched in your kernel configuration), then RCU CPU stall-warning
messages are expected behavior.  In 3.7 and earlier, you could get the
same effect by starving ksoftirqd.

> >> You should be able to get that reproduced under QEMU with the
> >> Versatile Express platform emulating a Cortex A15 CPU and the attached
> >> files.
> >>
> >> Any help or suggestions would be greatly appreciated. Thanks!
> >
> > Do you have a more complete trace, including stack traces ?
>
> Attached is what I get out of SysRq-t, which is the only thing I have
> (note that the kernel is built with CONFIG_RCU_CPU_STALL_INFO=y):
>
> Thanks!
> --
> Florian

> [ 3474.417333] INFO: Stall ended before state dump start

This was running on 3.14-rc4?
Thanx, Paul > [ 3500.312946] SysRq : Show State > [ 3500.316015] task PC stack pid father > [ 3500.321244] init S c04bda98 0 1 0 0x00000000 > [ 3500.327640] [<c04bda98>] (__schedule) from [<c0022c2c>] (do_wait+0x220/0x244) > [ 3500.334786] [<c0022c2c>] (do_wait) from [<c0022ff0>] (SyS_wait4+0x60/0xc4) > [ 3500.341672] [<c0022ff0>] (SyS_wait4) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) > [ 3500.349247] kthreadd S c04bda98 0 2 0 0x00000000 > [ 3500.355635] [<c04bda98>] (__schedule) from [<c003c084>] (kthreadd+0x168/0x16c) > [ 3500.362866] [<c003c084>] (kthreadd) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.370181] ksoftirqd/0 S c04bda98 0 3 2 0x00000000 > [ 3500.376567] [<c04bda98>] (__schedule) from [<c0041ffc>] (smpboot_thread_fn+0xc4/0x17c) > [ 3500.384494] [<c0041ffc>] (smpboot_thread_fn) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.392072] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.399300] kworker/0:0 S c04bda98 0 4 2 0x00000000 > [ 3500.405691] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3500.413357] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.420588] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.427817] kworker/0:0H S c04bda98 0 5 2 0x00000000 > [ 3500.434205] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3500.441871] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.449102] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.456329] kworker/u8:0 S c04bda98 0 6 2 0x00000000 > [ 3500.462718] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3500.470384] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.477615] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.484843] rcu_sched R running 0 7 2 0x00000000 > [ 3500.491230] [<c04bda98>] (__schedule) from [<c04bd378>] (schedule_timeout+0x130/0x1ac) > [ 3500.499157] [<c04bd378>] (schedule_timeout) from [<c006553c>] (rcu_gp_kthread+0x26c/0x5f8) > [ 3500.507431] [<c006553c>] (rcu_gp_kthread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.514749] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.521977] rcu_bh S c04bda98 0 8 2 0x00000000 > [ 3500.528363] [<c04bda98>] (__schedule) from [<c0065350>] (rcu_gp_kthread+0x80/0x5f8) > [ 3500.536028] [<c0065350>] (rcu_gp_kthread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.543346] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.550573] migration/0 S c04bda98 0 9 2 0x00000000 > [ 3500.556959] [<c04bda98>] (__schedule) from [<c0041ffc>] (smpboot_thread_fn+0xc4/0x17c) > [ 3500.564885] [<c0041ffc>] (smpboot_thread_fn) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.572465] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.579692] watchdog/0 S c04bda98 0 10 2 0x00000000 > [ 3500.586076] [<c04bda98>] (__schedule) from [<c0041ffc>] (smpboot_thread_fn+0xc4/0x17c) > [ 3500.594001] [<c0041ffc>] (smpboot_thread_fn) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.601581] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.608808] watchdog/1 P c04bda98 0 11 2 0x00000000 > [ 3500.615195] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c) > [ 3500.622948] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec) > [ 3500.630439] [<c003b874>] (kthread) from [<c000e338>] 
(ret_from_fork+0x14/0x3c) > [ 3500.637667] migration/1 P c04bda98 0 12 2 0x00000000 > [ 3500.644055] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c) > [ 3500.651807] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec) > [ 3500.659299] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.666527] ksoftirqd/1 P c04bda98 0 13 2 0x00000000 > [ 3500.672912] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c) > [ 3500.680665] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec) > [ 3500.688156] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.695384] kworker/1:0 S c04bda98 0 14 2 0x00000000 > [ 3500.701772] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3500.709438] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.716669] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.723896] kworker/1:0H S c04bda98 0 15 2 0x00000000 > [ 3500.730284] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3500.737950] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.745181] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.752408] watchdog/2 P c04bda98 0 16 2 0x00000000 > [ 3500.758794] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c) > [ 3500.766546] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec) > [ 3500.774038] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.781266] migration/2 P c04bda98 0 17 2 0x00000000 > [ 3500.787652] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c) > [ 3500.795403] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec) > [ 3500.802894] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.810121] ksoftirqd/2 P c04bda98 0 18 2 0x00000000 > [ 3500.816508] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c) > [ 3500.824259] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec) > [ 3500.831751] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.838979] kworker/2:0 S c04bda98 0 19 2 0x00000000 > [ 3500.845367] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3500.853033] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.860264] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.867491] kworker/2:0H S c04bda98 0 20 2 0x00000000 > [ 3500.873880] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3500.881545] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3500.888776] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.896003] watchdog/3 P c04bda98 0 21 2 0x00000000 > [ 3500.902389] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c) > [ 3500.910142] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec) > [ 3500.917633] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.924860] migration/3 P c04bda98 0 22 2 0x00000000 > [ 3500.931246] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c) > [ 3500.938998] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec) > [ 3500.946489] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.953716] ksoftirqd/3 P 
c04bda98 0 23 2 0x00000000 > [ 3500.960102] [<c04bda98>] (__schedule) from [<c003b5d8>] (__kthread_parkme+0x38/0x8c) > [ 3500.967855] [<c003b5d8>] (__kthread_parkme) from [<c003b874>] (kthread+0xcc/0xec) > [ 3500.975338] [<c003b874>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3500.982565] kworker/3:0 S c04bda98 0 24 2 0x00000000 > [ 3500.988954] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3500.996620] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.003851] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.011079] kworker/3:0H S c04bda98 0 25 2 0x00000000 > [ 3501.017466] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3501.025133] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.032364] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.039591] khelper S c04bda98 0 26 2 0x00000000 > [ 3501.045979] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.053730] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.061048] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.068276] kdevtmpfs S c04bda98 0 27 2 0x00000000 > [ 3501.074665] [<c04bda98>] (__schedule) from [<c028ba34>] (devtmpfsd+0x258/0x34c) > [ 3501.081984] [<c028ba34>] (devtmpfsd) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.088867] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.096095] writeback S c04bda98 0 28 2 0x00000000 > [ 3501.102484] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.110237] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.117555] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.124783] bioset S c04bda98 0 29 2 0x00000000 > [ 3501.131170] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.138923] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.146240] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.153469] kblockd S c04bda98 0 30 2 0x00000000 > [ 3501.159857] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.167610] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.174929] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.182156] ata_sff S c04bda98 0 31 2 0x00000000 > [ 3501.188545] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.196297] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.203615] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.210843] khubd S c04bda98 0 32 2 0x00000000 > [ 3501.217230] [<c04bda98>] (__schedule) from [<c0328cb8>] (hub_thread+0xf74/0x119c) > [ 3501.224722] [<c0328cb8>] (hub_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.231692] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.238920] edac-poller S c04bda98 0 33 2 0x00000000 > [ 3501.245308] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.253061] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.260379] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.267606] rpciod S c04bda98 0 34 2 0x00000000 > [ 3501.273995] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) 
> [ 3501.281748] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.289066] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.296293] kworker/0:1 R running 0 35 2 0x00000000 > [ 3501.302679] Workqueue: nfsiod rpc_async_release > [ 3501.307230] [<c04bda98>] (__schedule) from [<c00450c4>] (__cond_resched+0x24/0x34) > [ 3501.314809] [<c00450c4>] (__cond_resched) from [<c04be150>] (_cond_resched+0x3c/0x44) > [ 3501.322648] [<c04be150>] (_cond_resched) from [<c0035490>] (process_one_work+0x120/0x378) > [ 3501.330836] [<c0035490>] (process_one_work) from [<c0036198>] (worker_thread+0x13c/0x404) > [ 3501.339022] [<c0036198>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.346253] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.353481] khungtaskd R running 0 36 2 0x00000000 > [ 3501.359868] [<c04bda98>] (__schedule) from [<c04bd378>] (schedule_timeout+0x130/0x1ac) > [ 3501.367795] [<c04bd378>] (schedule_timeout) from [<c007cf8c>] (watchdog+0x68/0x2e8) > [ 3501.375461] [<c007cf8c>] (watchdog) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.382257] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.389485] kswapd0 R running 0 37 2 0x00000000 > [ 3501.395875] [<c001519c>] (unwind_backtrace) from [<c00111a4>] (show_stack+0x10/0x14) > [ 3501.403630] [<c00111a4>] (show_stack) from [<c0046f68>] (show_state_filter+0x64/0x90) > [ 3501.411470] [<c0046f68>] (show_state_filter) from [<c0249d90>] (__handle_sysrq+0xb0/0x17c) > [ 3501.419746] [<c0249d90>] (__handle_sysrq) from [<c025b6fc>] (serial8250_rx_chars+0xf8/0x208) > [ 3501.428195] [<c025b6fc>] (serial8250_rx_chars) from [<c025d360>] (serial8250_handle_irq.part.18+0x68/0x9c) > [ 3501.437860] [<c025d360>] (serial8250_handle_irq.part.18) from [<c025c418>] (serial8250_interrupt+0x3c/0xc0) > [ 3501.447613] [<c025c418>] (serial8250_interrupt) from [<c005e300>] (handle_irq_event_percpu+0x54/0x180) > [ 3501.456930] [<c005e300>] (handle_irq_event_percpu) from [<c005e46c>] (handle_irq_event+0x40/0x60) > [ 3501.465814] [<c005e46c>] (handle_irq_event) from [<c0061334>] (handle_fasteoi_irq+0x80/0x158) > [ 3501.474349] [<c0061334>] (handle_fasteoi_irq) from [<c005dac8>] (generic_handle_irq+0x2c/0x3c) > [ 3501.482971] [<c005dac8>] (generic_handle_irq) from [<c000eb7c>] (handle_IRQ+0x40/0x90) > [ 3501.490897] [<c000eb7c>] (handle_IRQ) from [<c0008568>] (gic_handle_irq+0x2c/0x5c) > [ 3501.498475] [<c0008568>] (gic_handle_irq) from [<c0011d00>] (__irq_svc+0x40/0x50) > [ 3501.505964] Exception stack(0xcd21bdd8 to 0xcd21be20) > [ 3501.511021] bdc0: 00000000 00000000 > [ 3501.519207] bde0: 00004451 00004452 00000000 cd0e9940 cd0e9940 cd21bf00 00000000 00000000 > [ 3501.527393] be00: 00000020 00000001 00000000 cd21be20 c009a5b0 c009a5d4 60000113 ffffffff > [ 3501.535583] [<c0011d00>] (__irq_svc) from [<c009a5d4>] (list_lru_count_node+0x3c/0x74) > [ 3501.543513] [<c009a5d4>] (list_lru_count_node) from [<c00c00b8>] (super_cache_count+0x60/0xc4) > [ 3501.552137] [<c00c00b8>] (super_cache_count) from [<c008bbbc>] (shrink_slab_node+0x34/0x1e4) > [ 3501.560585] [<c008bbbc>] (shrink_slab_node) from [<c008c53c>] (shrink_slab+0xc0/0xec) > [ 3501.568424] [<c008c53c>] (shrink_slab) from [<c008ef14>] (kswapd+0x57c/0x994) > [ 3501.575568] [<c008ef14>] (kswapd) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.582190] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.589418] fsnotify_mark S c04bda98 0 38 2 0x00000000 > [ 3501.595807] [<c04bda98>] 
(__schedule) from [<c00f5a40>] (fsnotify_mark_destroy+0xf8/0x12c) > [ 3501.604081] [<c00f5a40>] (fsnotify_mark_destroy) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.612007] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.619234] nfsiod S c04bda98 0 39 2 0x00000000 > [ 3501.625623] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.633376] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.640693] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.647921] crypto S c04bda98 0 40 2 0x00000000 > [ 3501.654309] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.662062] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.669380] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.676608] kworker/u8:1 R running 0 44 2 0x00000000 > [ 3501.682994] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3501.690661] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.697892] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.705120] kpsmoused S c04bda98 0 53 2 0x00000000 > [ 3501.711508] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.719261] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.726579] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.733807] deferwq S c04bda98 0 54 2 0x00000000 > [ 3501.740194] [<c04bda98>] (__schedule) from [<c003595c>] (rescuer_thread+0x274/0x324) > [ 3501.747946] [<c003595c>] (rescuer_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.755264] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.762492] udhcpc R running 0 92 1 0x00000000 > [ 3501.768879] [<c04bda98>] (__schedule) from [<c04bd6c8>] (schedule_hrtimeout_range_clock+0xc0/0x150) > [ 3501.777938] [<c04bd6c8>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x) > [ 3501.787865] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638) > [ 3501.796140] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c) > [ 3501.803894] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110) > [ 3501.811648] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) > [ 3501.819311] telnetd S c04bda98 0 100 1 0x00000000 > [ 3501.825697] [<c04bda98>] (__schedule) from [<c04bd73c>] (schedule_hrtimeout_range_clock+0x134/0x150) > [ 3501.834841] [<c04bd73c>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x) > [ 3501.844768] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638) > [ 3501.853043] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c) > [ 3501.860797] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110) > [ 3501.868550] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) > [ 3501.876212] sh S c04bda98 0 101 1 0x00000000 > [ 3501.882600] [<c04bda98>] (__schedule) from [<c0022c2c>] (do_wait+0x220/0x244) > [ 3501.889746] [<c0022c2c>] (do_wait) from [<c0022ff0>] (SyS_wait4+0x60/0xc4) > [ 3501.896631] [<c0022ff0>] (SyS_wait4) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) > [ 3501.904206] portmap S c04bda98 0 102 1 0x00000000 > [ 3501.910593] [<c04bda98>] (__schedule) from [<c04bd73c>] (schedule_hrtimeout_range_clock+0x134/0x150) > [ 
3501.919736] [<c04bd73c>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x) > [ 3501.929663] [<c00cde68>] (poll_schedule_timeout) from [<c00cf38c>] (do_sys_poll+0x3b8/0x478) > [ 3501.938112] [<c00cf38c>] (do_sys_poll) from [<c00cf4fc>] (SyS_poll+0x5c/0xd4) > [ 3501.945258] [<c00cf4fc>] (SyS_poll) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) > [ 3501.952746] kworker/0:2 S c04bda98 0 122 2 0x00000000 > [ 3501.959134] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) > [ 3501.966799] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3501.974029] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3501.981257] udhcpc R running 0 132 1 0x00000000 > [ 3501.987643] [<c04bda98>] (__schedule) from [<c04bd6c8>] (schedule_hrtimeout_range_clock+0xc0/0x150) > [ 3501.996701] [<c04bd6c8>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x) > [ 3502.006628] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638) > [ 3502.014903] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c) > [ 3502.022657] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110) > [ 3502.030411] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) > [ 3502.038072] udhcpc R running 0 137 1 0x00000000 > [ 3502.044459] [<c04bda98>] (__schedule) from [<c04bd6c8>] (schedule_hrtimeout_range_clock+0xc0/0x150) > [ 3502.053515] [<c04bd6c8>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x) > [ 3502.063443] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638) > [ 3502.071718] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c) > [ 3502.079472] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110) > [ 3502.087226] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) > [ 3502.094887] lockd S c04bda98 0 143 2 0x00000000 > [ 3502.101273] [<c04bda98>] (__schedule) from [<c04bd3b4>] (schedule_timeout+0x16c/0x1ac) > [ 3502.109201] [<c04bd3b4>] (schedule_timeout) from [<c04aad44>] (svc_recv+0x5ac/0x81c) > [ 3502.116956] [<c04aad44>] (svc_recv) from [<c019e6bc>] (lockd+0x98/0x148) > [ 3502.123666] [<c019e6bc>] (lockd) from [<c003b87c>] (kthread+0xd4/0xec) > [ 3502.130201] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) > [ 3502.137429] rcu.sh S c04bda98 0 153 101 0x00000000 > [ 3502.143815] [<c04bda98>] (__schedule) from [<c0022c2c>] (do_wait+0x220/0x244) > [ 3502.150959] [<c0022c2c>] (do_wait) from [<c0022ff0>] (SyS_wait4+0x60/0xc4) > [ 3502.157843] [<c0022ff0>] (SyS_wait4) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) > [ 3502.165418] malloc_test_bcm R running 0 155 153 0x00000000 > [ 3502.171805] [<c04bda98>] (__schedule) from [<c00450c4>] (__cond_resched+0x24/0x34) > [ 3502.179384] [<c00450c4>] (__cond_resched) from [<c04be150>] (_cond_resched+0x3c/0x44) > [ 3502.187224] [<c04be150>] (_cond_resched) from [<c008c560>] (shrink_slab+0xe4/0xec) > [ 3502.194803] [<c008c560>] (shrink_slab) from [<c008e818>] (try_to_free_pages+0x310/0x490) > [ 3502.202906] [<c008e818>] (try_to_free_pages) from [<c0086184>] (__alloc_pages_nodemask+0x5a4/0x8f4) > [ 3502.211963] [<c0086184>] (__alloc_pages_nodemask) from [<c009cd60>] (__pte_alloc+0x24/0x168) > [ 3502.220411] [<c009cd60>] (__pte_alloc) from [<c00a0fdc>] (handle_mm_fault+0xc30/0xcdc) > [ 3502.228340] [<c00a0fdc>] (handle_mm_fault) from [<c001749c>] 
(do_page_fault+0x194/0x27c)
> [ 3502.236441] [<c001749c>] (do_page_fault) from [<c000844c>] (do_DataAbort+0x30/0x90)
> [ 3502.244107] [<c000844c>] (do_DataAbort) from [<c0011e34>] (__dabt_usr+0x34/0x40)
> [ 3502.251509] Exception stack(0xcc9c7fb0 to 0xcc9c7ff8)
> [ 3502.256566] 7fa0: 76388000 00101000 00101002 000aa280
> [ 3502.264752] 7fc0: 76388008 b6fa9508 00101000 00100008 b6fa9538 00100000 00001000 bed52d24
> [ 3502.272937] 7fe0: 00000000 bed52c80 b6efefa8 b6efefc4 40000010 ffffffff
> [ 3502.279559] Sched Debug Version: v0.11, 3.14.0-rc4 #32
> [ 3502.284702] ktime : 3502275.231136
> [ 3502.291061] sched_clk : 3502279.556898
> [ 3502.297420] cpu_clk : 3502279.557268
> [ 3502.303778] jiffies : 320030
> [ 3502.309441]
> [ 3502.310931] sysctl_sched
> [ 3502.313465] .sysctl_sched_latency : 6.000000
> [ 3502.319563] .sysctl_sched_min_granularity : 0.750000
> [ 3502.325662] .sysctl_sched_wakeup_granularity : 1.000000
> [ 3502.331760] .sysctl_sched_child_runs_first : 0
> [ 3502.337250] .sysctl_sched_features : 11899
> [ 3502.343087] .sysctl_sched_tunable_scaling : 1 (logaritmic)
> [ 3502.349706]
> [ 3502.351198] cpu#0
> [ 3502.353124] .nr_running : 9
> [ 3502.357745] .load : 7168
> [ 3502.362626] .nr_switches : 41007
> [ 3502.367594] .nr_load_updates : 350030
> [ 3502.372649] .nr_uninterruptible : 0
> [ 3502.377269] .next_balance : 4294.942188
> [ 3502.382758] .curr->pid : 37
> [ 3502.387466] .clock : 3500304.328054
> [ 3502.393216] .cpu_load[0] : 31
> [ 3502.397922] .cpu_load[1] : 31
> [ 3502.402628] .cpu_load[2] : 31
> [ 3502.407335] .cpu_load[3] : 31
> [ 3502.412043] .cpu_load[4] : 31
> [ 3502.416750]
> [ 3502.416750] cfs_rq[0]:
> [ 3502.420589] .exec_clock : 0.000000
> [ 3502.425818] .MIN_vruntime : 1392.857683
> [ 3502.431308] .min_vruntime : 1395.857683
> [ 3502.436798] .max_vruntime : 1392.895054
> [ 3502.442287] .spread : 0.037371
> [ 3502.447515] .spread0 : 0.000000
> [ 3502.452744] .nr_spread_over : 0
> [ 3502.457364] .nr_running : 7
> [ 3502.461985] .load : 7168
> [ 3502.466866] .runnable_load_avg : 31
> [ 3502.471573] .blocked_load_avg : 0
> [ 3502.476193]
> [ 3502.476193] rt_rq[0]:
> [ 3502.479945] .rt_nr_running : 2
> [ 3502.484564] .rt_throttled : 0
> [ 3502.489185] .rt_time : 0.000000
> [ 3502.494414] .rt_runtime : 0.000001
> [ 3502.499643]
> [ 3502.499643] runnable tasks:
> [ 3502.499643] task PID tree-key switches prio exec-runtime sum-exec sum-sleep
> [ 3502.499643] -----------------------------------------------------------------------------------------------
> [ 3502.525299] init 1 1293.503936 967 120 0 0 0
> [ 3502.540386] kthreadd 2 -3.000000 47 2 0 0 0
> [ 3502.555474] ksoftirqd/0 3 -3.000000 411 2 0 0 0
> [ 3502.570562] kworker/0:0 4 1212.395251 9 120 0 0 0
> [ 3502.585647] kworker/0:0H 5 76.078793 3 100 0 0 0
> [ 3502.600732] kworker/u8:0 6 474.674159 9 120 0 0 0
> [ 3502.615820] rcu_sched 7 1392.871202 202 120 0 0 0
> [ 3502.630906] rcu_bh 8 15.631059 2 120 0 0 0

Keeping either of the above two kthreads from running can get you RCU
CPU stall warnings.

> [ 3502.645991] migration/0 9 0.000000 5 0 0 0 0
> [ 3502.661079] watchdog/0 10 -3.000000 878 0 0 0 0
> [ 3502.676164] watchdog/1 11 22.645905 2 120 0 0 0
> [ 3502.691250] migration/1 12 0.000000 2 0 0 0 0
> [ 3502.706336] ksoftirqd/1 13 28.653864 2 120 0 0 0
> [ 3502.721422] kworker/1:0 14 395.389726 8 120 0 0 0
> [ 3502.736508] kworker/1:0H 15 76.078608 3 100 0 0 0
> [ 3502.751595] watchdog/2 16 36.663186 2 120 0 0 0
> [ 3502.766680] migration/2 17 0.000000 2 0 0 0 0
> [ 3502.781767] ksoftirqd/2 18 42.671219 2 120 0 0 0
> [ 3502.796854] kworker/2:0 19 395.389431 8 120 0 0 0
> [ 3502.811941] kworker/2:0H 20 76.078598 3 100 0 0 0
> [ 3502.827027] watchdog/3 21 50.680315 2 120 0 0 0
> [ 3502.842112] migration/3 22 0.000000 2 0 0 0 0
> [ 3502.857198] ksoftirqd/3 23 56.688385 2 120 0 0 0
> [ 3502.872286] kworker/3:0 24 395.389949 8 120 0 0 0
> [ 3502.887372] kworker/3:0H 25 76.078597 3 100 0 0 0
> [ 3502.902457] khelper 26 -3.000000 2 2 0 0 0
> [ 3502.917543] kdevtmpfs 27 980.384584 647 120 0 0 0
> [ 3502.932629] writeback 28 77.578808 2 100 0 0 0
> [ 3502.947715] bioset 29 79.080205 2 100 0 0 0
> [ 3502.962804] kblockd 30 80.583022 2 100 0 0 0
> [ 3502.977890] ata_sff 31 82.086421 2 100 0 0 0
> [ 3502.992977] khubd 32 -3.000000 49 3 0 0 0
> [ 3503.008063] edac-poller 33 85.093351 2 100 0 0 0
> [ 3503.023148] rpciod 34 88.314163 2 100 0 0 0
> [ 3503.038233] kworker/0:1 35 1392.895054 589 120 0 0 0
> [ 3503.053319] khungtaskd 36 1392.857683 2 120 0 0 0
> [ 3503.068405] R kswapd0 37 -3.000000 17266 2 0 0 0
> [ 3503.083491] fsnotify_mark 38 396.392655 2 120 0 0 0
> [ 3503.098577] nfsiod 39 398.390829 2 100 0 0 0
> [ 3503.113663] crypto 40 400.392267 2 100 0 0 0
> [ 3503.128749] kworker/u8:1 44 1392.857683 18 120 0 0 0
> [ 3503.143835] kpsmoused 53 956.219135 2 100 0 0 0
> [ 3503.158921] deferwq 54 985.352494 2 100 0 0 0
> [ 3503.174006] udhcpc 92 1392.857683 14 120 0 0 0
> [ 3503.189092] telnetd 100 -3.000000 1 65 0 0 0
> [ 3503.204178] sh 101 -3.000000 224 2 0 0 0
> [ 3503.219265] portmap 102 -3.000000 13 2 0 0 0
> [ 3503.234351] kworker/0:2 122 1235.968172 3 120 0 0 0
> [ 3503.249436] udhcpc 132 1392.857683 1 120 0 0 0
> [ 3503.264522] udhcpc 137 1392.857683 1 120 0 0 0
> [ 3503.279608] lockd 143 1324.783814 2 120 0 0 0
> [ 3503.294694] rcu.sh 153 -3.000000 8 2 0 0 0
> [ 3503.309781] malloc_crazy 155 0.000000 18087 2 0 0 0
> [ 3503.324868]

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply [flat|nested] 10+ messages in thread
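Given the point above that starving rcu_sched and rcu_bh makes stall
warnings expected behavior, one obvious mitigation when bumping other
kthreads to RT is to bump the grace-period kthreads at least as high.
A hypothetical sketch, hard-coding PIDs 7 and 8 from the dump above (a
real tool would look them up by comm name under /proc):

#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

static int boost(pid_t pid, int prio)
{
    struct sched_param sp = { .sched_priority = prio };

    return sched_setscheduler(pid, SCHED_RR, &sp);
}

int main(void)
{
    /* Priority 3 is one step above the bumped kthreads' priority 2. */
    if (boost(7, 3) || boost(8, 3))     /* rcu_sched, rcu_bh */
        perror("sched_setscheduler");
    return 0;
}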
* Re: RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed
  2014-03-05 1:41 ` Paul E. McKenney
@ 2014-03-05 1:43   ` Florian Fainelli
  0 siblings, 0 replies; 10+ messages in thread
From: Florian Fainelli @ 2014-03-05 1:43 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Eric Dumazet, linux-kernel@vger.kernel.org, linux-mm, linux-nfs,
      trond.myklebust, netdev

2014-03-04 17:41 GMT-08:00 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> On Tue, Mar 04, 2014 at 05:03:24PM -0800, Florian Fainelli wrote:
>> 2014-03-04 16:48 GMT-08:00 Eric Dumazet <eric.dumazet@gmail.com>:
>> > On Tue, 2014-03-04 at 15:55 -0800, Florian Fainelli wrote:
>> >> Hi all,
>> >>
>> >> I am seeing the following RCU stalls messages appearing on an ARMv7
>> >> 4xCPUs system running 3.14-rc4:
>> >> [...]
>> >> This "problem" seems to have been there for quite a while now since I
>> >> was able to get 3.8.13 to trigger that bug as well, with a slightly
>> >> more detailed RCU debugging trace which points the finger at kswapd0.
>
> The 3.8 kernel was where RCU grace-period processing moved to kthreads.
> Does 3.7 or earlier trigger?

I will try to test on 3.7, thanks for the hint.

> In any case, if you starve RCU's grace-period kthreads (rcu_bh and
> rcu_sched in your kernel configuration), then RCU CPU stall-warning
> messages are expected behavior.  In 3.7 and earlier, you could get the
> same effect by starving ksoftirqd.
>
>> >> You should be able to get that reproduced under QEMU with the
>> >> Versatile Express platform emulating a Cortex A15 CPU and the attached
>> >> files.
>> >>
>> >> Any help or suggestions would be greatly appreciated. Thanks!
>> >
>> > Do you have a more complete trace, including stack traces ?
>>
>> Attached is what I get out of SysRq-t, which is the only thing I have
>> (note that the kernel is built with CONFIG_RCU_CPU_STALL_INFO=y):
>>
>> Thanks!
>> --
>> Florian
>
>> [ 3474.417333] INFO: Stall ended before state dump start
>
> This was running on 3.14-rc4?

Correct, this was observed on 3.14-rc4.
>
>							Thanx, Paul
>
>> [ 3500.312946] SysRq : Show State
>> [...]
[<c0022ff0>] (SyS_wait4) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) >> [ 3501.904206] portmap S c04bda98 0 102 1 0x00000000 >> [ 3501.910593] [<c04bda98>] (__schedule) from [<c04bd73c>] (schedule_hrtimeout_range_clock+0x134/0x150) >> [ 3501.919736] [<c04bd73c>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x) >> [ 3501.929663] [<c00cde68>] (poll_schedule_timeout) from [<c00cf38c>] (do_sys_poll+0x3b8/0x478) >> [ 3501.938112] [<c00cf38c>] (do_sys_poll) from [<c00cf4fc>] (SyS_poll+0x5c/0xd4) >> [ 3501.945258] [<c00cf4fc>] (SyS_poll) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) >> [ 3501.952746] kworker/0:2 S c04bda98 0 122 2 0x00000000 >> [ 3501.959134] [<c04bda98>] (__schedule) from [<c003626c>] (worker_thread+0x210/0x404) >> [ 3501.966799] [<c003626c>] (worker_thread) from [<c003b87c>] (kthread+0xd4/0xec) >> [ 3501.974029] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) >> [ 3501.981257] udhcpc R running 0 132 1 0x00000000 >> [ 3501.987643] [<c04bda98>] (__schedule) from [<c04bd6c8>] (schedule_hrtimeout_range_clock+0xc0/0x150) >> [ 3501.996701] [<c04bd6c8>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x) >> [ 3502.006628] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638) >> [ 3502.014903] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c) >> [ 3502.022657] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110) >> [ 3502.030411] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) >> [ 3502.038072] udhcpc R running 0 137 1 0x00000000 >> [ 3502.044459] [<c04bda98>] (__schedule) from [<c04bd6c8>] (schedule_hrtimeout_range_clock+0xc0/0x150) >> [ 3502.053515] [<c04bd6c8>] (schedule_hrtimeout_range_clock) from [<c00cde68>] (poll_schedule_timeout+0x3c/0x) >> [ 3502.063443] [<c00cde68>] (poll_schedule_timeout) from [<c00ce840>] (do_select+0x5c8/0x638) >> [ 3502.071718] [<c00ce840>] (do_select) from [<c00ce9d0>] (core_sys_select+0x120/0x31c) >> [ 3502.079472] [<c00ce9d0>] (core_sys_select) from [<c00cec90>] (SyS_select+0xc4/0x110) >> [ 3502.087226] [<c00cec90>] (SyS_select) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) >> [ 3502.094887] lockd S c04bda98 0 143 2 0x00000000 >> [ 3502.101273] [<c04bda98>] (__schedule) from [<c04bd3b4>] (schedule_timeout+0x16c/0x1ac) >> [ 3502.109201] [<c04bd3b4>] (schedule_timeout) from [<c04aad44>] (svc_recv+0x5ac/0x81c) >> [ 3502.116956] [<c04aad44>] (svc_recv) from [<c019e6bc>] (lockd+0x98/0x148) >> [ 3502.123666] [<c019e6bc>] (lockd) from [<c003b87c>] (kthread+0xd4/0xec) >> [ 3502.130201] [<c003b87c>] (kthread) from [<c000e338>] (ret_from_fork+0x14/0x3c) >> [ 3502.137429] rcu.sh S c04bda98 0 153 101 0x00000000 >> [ 3502.143815] [<c04bda98>] (__schedule) from [<c0022c2c>] (do_wait+0x220/0x244) >> [ 3502.150959] [<c0022c2c>] (do_wait) from [<c0022ff0>] (SyS_wait4+0x60/0xc4) >> [ 3502.157843] [<c0022ff0>] (SyS_wait4) from [<c000e2a0>] (ret_fast_syscall+0x0/0x30) >> [ 3502.165418] malloc_test_bcm R running 0 155 153 0x00000000 >> [ 3502.171805] [<c04bda98>] (__schedule) from [<c00450c4>] (__cond_resched+0x24/0x34) >> [ 3502.179384] [<c00450c4>] (__cond_resched) from [<c04be150>] (_cond_resched+0x3c/0x44) >> [ 3502.187224] [<c04be150>] (_cond_resched) from [<c008c560>] (shrink_slab+0xe4/0xec) >> [ 3502.194803] [<c008c560>] (shrink_slab) from [<c008e818>] (try_to_free_pages+0x310/0x490) >> [ 3502.202906] [<c008e818>] (try_to_free_pages) from [<c0086184>] (__alloc_pages_nodemask+0x5a4/0x8f4) >> 
>> [ 3502.211963] [<c0086184>] (__alloc_pages_nodemask) from [<c009cd60>] (__pte_alloc+0x24/0x168)
>> [ 3502.220411] [<c009cd60>] (__pte_alloc) from [<c00a0fdc>] (handle_mm_fault+0xc30/0xcdc)
>> [ 3502.228340] [<c00a0fdc>] (handle_mm_fault) from [<c001749c>] (do_page_fault+0x194/0x27c)
>> [ 3502.236441] [<c001749c>] (do_page_fault) from [<c000844c>] (do_DataAbort+0x30/0x90)
>> [ 3502.244107] [<c000844c>] (do_DataAbort) from [<c0011e34>] (__dabt_usr+0x34/0x40)
>> [ 3502.251509] Exception stack(0xcc9c7fb0 to 0xcc9c7ff8)
>> [ 3502.256566] 7fa0: 76388000 00101000 00101002 000aa280
>> [ 3502.264752] 7fc0: 76388008 b6fa9508 00101000 00100008 b6fa9538 00100000 00001000 bed52d24
>> [ 3502.272937] 7fe0: 00000000 bed52c80 b6efefa8 b6efefc4 40000010 ffffffff
>> [ 3502.279559] Sched Debug Version: v0.11, 3.14.0-rc4 #32
>> [ 3502.284702] ktime : 3502275.231136
>> [ 3502.291061] sched_clk : 3502279.556898
>> [ 3502.297420] cpu_clk : 3502279.557268
>> [ 3502.303778] jiffies : 320030
>> [ 3502.309441]
>> [ 3502.310931] sysctl_sched
>> [ 3502.313465] .sysctl_sched_latency : 6.000000
>> [ 3502.319563] .sysctl_sched_min_granularity : 0.750000
>> [ 3502.325662] .sysctl_sched_wakeup_granularity : 1.000000
>> [ 3502.331760] .sysctl_sched_child_runs_first : 0
>> [ 3502.337250] .sysctl_sched_features : 11899
>> [ 3502.343087] .sysctl_sched_tunable_scaling : 1 (logaritmic)
>> [ 3502.349706]
>> [ 3502.351198] cpu#0
>> [ 3502.353124] .nr_running : 9
>> [ 3502.357745] .load : 7168
>> [ 3502.362626] .nr_switches : 41007
>> [ 3502.367594] .nr_load_updates : 350030
>> [ 3502.372649] .nr_uninterruptible : 0
>> [ 3502.377269] .next_balance : 4294.942188
>> [ 3502.382758] .curr->pid : 37
>> [ 3502.387466] .clock : 3500304.328054
>> [ 3502.393216] .cpu_load[0] : 31
>> [ 3502.397922] .cpu_load[1] : 31
>> [ 3502.402628] .cpu_load[2] : 31
>> [ 3502.407335] .cpu_load[3] : 31
>> [ 3502.412043] .cpu_load[4] : 31
>> [ 3502.416750]
>> [ 3502.416750] cfs_rq[0]:
>> [ 3502.420589] .exec_clock : 0.000000
>> [ 3502.425818] .MIN_vruntime : 1392.857683
>> [ 3502.431308] .min_vruntime : 1395.857683
>> [ 3502.436798] .max_vruntime : 1392.895054
>> [ 3502.442287] .spread : 0.037371
>> [ 3502.447515] .spread0 : 0.000000
>> [ 3502.452744] .nr_spread_over : 0
>> [ 3502.457364] .nr_running : 7
>> [ 3502.461985] .load : 7168
>> [ 3502.466866] .runnable_load_avg : 31
>> [ 3502.471573] .blocked_load_avg : 0
>> [ 3502.476193]
>> [ 3502.476193] rt_rq[0]:
>> [ 3502.479945] .rt_nr_running : 2
>> [ 3502.484564] .rt_throttled : 0
>> [ 3502.489185] .rt_time : 0.000000
>> [ 3502.494414] .rt_runtime : 0.000001
>> [ 3502.499643]
>> [ 3502.499643] runnable tasks:
>> [ 3502.499643] task PID tree-key switches prio exec-runtime sum-exec p
>> [ 3502.499643] -----------------------------------------------------------------------------------------------
>> [ 3502.525299] init 1 1293.503936 967 120 0 0 0
>> [ 3502.540386] kthreadd 2 -3.000000 47 2 0 0 0
>> [ 3502.555474] ksoftirqd/0 3 -3.000000 411 2 0 0 0
>> [ 3502.570562] kworker/0:0 4 1212.395251 9 120 0 0 0
>> [ 3502.585647] kworker/0:0H 5 76.078793 3 100 0 0 0
>> [ 3502.600732] kworker/u8:0 6 474.674159 9 120 0 0 0
>> [ 3502.615820] rcu_sched 7 1392.871202 202 120 0 0 0
>> [ 3502.630906] rcu_bh 8 15.631059 2 120 0 0 0
>
> Keeping either of the above two kthreads from running can get you RCU CPU stall warnings.
>
>> [ 3502.645991] migration/0 9 0.000000 5 0 0 0 0
>> [ 3502.661079] watchdog/0 10 -3.000000 878 0 0 0 0
>> [ 3502.676164] watchdog/1 11 22.645905 2 120 0 0 0
>> [ 3502.691250] migration/1 12 0.000000 2 0 0 0 0
>> [ 3502.706336] ksoftirqd/1 13 28.653864 2 120 0 0 0
>> [ 3502.721422] kworker/1:0 14 395.389726 8 120 0 0 0
>> [ 3502.736508] kworker/1:0H 15 76.078608 3 100 0 0 0
>> [ 3502.751595] watchdog/2 16 36.663186 2 120 0 0 0
>> [ 3502.766680] migration/2 17 0.000000 2 0 0 0 0
>> [ 3502.781767] ksoftirqd/2 18 42.671219 2 120 0 0 0
>> [ 3502.796854] kworker/2:0 19 395.389431 8 120 0 0 0
>> [ 3502.811941] kworker/2:0H 20 76.078598 3 100 0 0 0
>> [ 3502.827027] watchdog/3 21 50.680315 2 120 0 0 0
>> [ 3502.842112] migration/3 22 0.000000 2 0 0 0 0
>> [ 3502.857198] ksoftirqd/3 23 56.688385 2 120 0 0 0
>> [ 3502.872286] kworker/3:0 24 395.389949 8 120 0 0 0
>> [ 3502.887372] kworker/3:0H 25 76.078597 3 100 0 0 0
>> [ 3502.902457] khelper 26 -3.000000 2 2 0 0 0
>> [ 3502.917543] kdevtmpfs 27 980.384584 647 120 0 0 0
>> [ 3502.932629] writeback 28 77.578808 2 100 0 0 0
>> [ 3502.947715] bioset 29 79.080205 2 100 0 0 0
>> [ 3502.962804] kblockd 30 80.583022 2 100 0 0 0
>> [ 3502.977890] ata_sff 31 82.086421 2 100 0 0 0
>> [ 3502.992977] khubd 32 -3.000000 49 3 0 0 0
>> [ 3503.008063] edac-poller 33 85.093351 2 100 0 0 0
>> [ 3503.023148] rpciod 34 88.314163 2 100 0 0 0
>> [ 3503.038233] kworker/0:1 35 1392.895054 589 120 0 0 0
>> [ 3503.053319] khungtaskd 36 1392.857683 2 120 0 0 0
>> [ 3503.068405] R kswapd0 37 -3.000000 17266 2 0 0 0
>> [ 3503.083491] fsnotify_mark 38 396.392655 2 120 0 0 0
>> [ 3503.098577] nfsiod 39 398.390829 2 100 0 0 0
>> [ 3503.113663] crypto 40 400.392267 2 100 0 0 0
>> [ 3503.128749] kworker/u8:1 44 1392.857683 18 120 0 0 0
>> [ 3503.143835] kpsmoused 53 956.219135 2 100 0 0 0
>> [ 3503.158921] deferwq 54 985.352494 2 100 0 0 0
>> [ 3503.174006] udhcpc 92 1392.857683 14 120 0 0 0
>> [ 3503.189092] telnetd 100 -3.000000 1 65 0 0 0
>> [ 3503.204178] sh 101 -3.000000 224 2 0 0 0
>> [ 3503.219265] portmap 102 -3.000000 13 2 0 0 0
>> [ 3503.234351] kworker/0:2 122 1235.968172 3 120 0 0 0
>> [ 3503.249436] udhcpc 132 1392.857683 1 120 0 0 0
>> [ 3503.264522] udhcpc 137 1392.857683 1 120 0 0 0
>> [ 3503.279608] lockd 143 1324.783814 2 120 0 0 0
>> [ 3503.294694] rcu.sh 153 -3.000000 8 2 0 0 0
>> [ 3503.309781] malloc_crazy 155 0.000000 18087 2 0 0 0
>> [ 3503.324868]
>
-- 
Florian

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread
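Paul's comment in the quoted trace above is the heart of the matter: rcu_sched and rcu_bh run as ordinary SCHED_OTHER tasks (prio 120 in the runnable-tasks table), while the boosted threads, kswapd0 among them, sit in a real-time class (prio 2) and spin without yielding. The following is only a minimal sketch of the kind of priority bump involved, assuming bumper.c uses sched_setscheduler(2) or something equivalent; the target PID argument and the priority value 50 are placeholders, not values taken from bumpup.cfg:

/*
 * Hedged sketch (not the actual bumper.c from this thread): move a
 * thread, identified by PID, into the SCHED_FIFO real-time class.
 * The priority 50 is a placeholder.
 */
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
	struct sched_param sp = { .sched_priority = 50 };
	pid_t pid;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	pid = (pid_t)atoi(argv[1]);

	/*
	 * A SCHED_FIFO task preempts every SCHED_OTHER task for as long
	 * as it stays runnable; if the boosted thread spins (as kswapd0
	 * does in shrink_slab above), a prio-120 kthread such as
	 * rcu_sched never runs again on that CPU.
	 */
	if (sched_setscheduler(pid, SCHED_FIFO, &sp) < 0) {
		perror("sched_setscheduler");
		return 1;
	}
	return 0;
}

Applied to a CPU-bound thread on a busy system, a bump like this is enough to keep the grace-period kthreads off the CPU indefinitely, which matches the stall pattern reported in this thread.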
end of thread, other threads:[~2014-03-06 1:42 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <CAGVrzcbsSV7h3qA3KuCTwKNFEeww_kSNcfUkfw3PPjeXQXBo6g@mail.gmail.com>
2014-03-05 0:48 ` RCU stalls when running out of memory on 3.14-rc4 w/ NFS and kernel threads priorities changed Eric Dumazet
2014-03-05 1:03 ` Florian Fainelli
2014-03-05 1:16 ` Florian Fainelli
2014-03-05 1:43 ` Paul E. McKenney
2014-03-05 3:55 ` Florian Fainelli
2014-03-05 5:34 ` Paul E. McKenney
[not found] ` <20140305053440.GD3334-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2014-03-06 0:42 ` Florian Fainelli
2014-03-06 1:42 ` Paul E. McKenney
2014-03-05 1:41 ` Paul E. McKenney
2014-03-05 1:43 ` Florian Fainelli