* 3.8.11-rt8 NFS triggered seizures
@ 2013-05-13 9:31 Mike Galbraith
2013-06-07 9:03 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 8+ messages in thread
From: Mike Galbraith @ 2013-05-13 9:31 UTC (permalink / raw)
To: RT
Letting my little Toshiba Satellite download an opensuse install DVD (at
20 KiB/s so I can use the other half of my wonderful bandwidth for
work), after it has been up overnight, if I mount my desktop box and
try to install a kernel to test later, the laptop goes comatose. It might
be workqueue related. Interrupts are still happening, I can ping and
poke sysrq-c, but box is completely useless.
I got kdump working again (finally), and crashed it this morning. Now, a
few hours later, it fails to repeat of course; it _seems_ to require
leaving the box alone for an extended period before it will repeat.
I've switched kernels a few times, and it _seems_ to only happen with
3.8-rt, though I can't really be rock solid about that, can only say
3.8-rt has jammed up a few times, and no other kernel has.
When make modules_install install hung first thing this morning, box was
responsive enough to fire up top, which displayed nada. Desktop then
froze, so I crashed it. Seems completion ain't gonna happen.
Trying to repeat, now that I can crashdump the thing again.
KERNEL: vmlinux
DUMPFILE: vmcore
CPUS: 2
DATE: Mon May 13 07:56:02 2013
UPTIME: 15:27:46
LOAD AVERAGE: 1.84, 0.89, 0.38
TASKS: 308
NODENAME: maggy
RELEASE: 3.8.11-rt8-smp
VERSION: #35 SMP PREEMPT RT Tue May 7 15:11:32 CEST 2013
MACHINE: x86_64 (1296 Mhz)
MEMORY: 3.8 GB
PANIC: "Oops: 0002 [#1] PREEMPT SMP " (check log for details)
PID: 44
COMMAND: "irq/1-i8042"
TASK: ffff880134f760c0 [THREAD_INFO: ffff8801349b2000]
CPU: 0
STATE: TASK_RUNNING (PANIC)
crash> ps|grep UN
3465 1 0 ffff880137d72600 UN 0.7 495440 37744 konsole
8743 3684 1 ffff8800a645c200 UN 0.0 9000 948 make
crash> bt 8743
PID: 8743 TASK: ffff8800a645c200 CPU: 1 COMMAND: "make"
#0 [ffff8800aace1828] __schedule at ffffffff8143b975
#1 [ffff8800aace18b0] schedule at ffffffff8143bfb9
#2 [ffff8800aace18c0] rpc_wait_bit_killable at ffffffffa06b6fa9 [sunrpc]
#3 [ffff8800aace18e0] __wait_on_bit at ffffffff8143ae7f
#4 [ffff8800aace1930] out_of_line_wait_on_bit at ffffffff8143af2c
#5 [ffff8800aace19a0] __rpc_wait_for_completion_task at ffffffffa06b6f6d [sunrpc]
#6 [ffff8800aace19b0] nfs4_run_open_task.isra.37 at ffffffffa07d3504 [nfsv4]
#7 [ffff8800aace1a40] _nfs4_proc_open at ffffffffa07d393b [nfsv4]
#8 [ffff8800aace1a70] _nfs4_do_open at ffffffffa07d5c58 [nfsv4]
#9 [ffff8800aace1b10] nfs4_do_open at ffffffffa07d5f82 [nfsv4]
#10 [ffff8800aace1bb0] nfs4_atomic_open at ffffffffa07d6070 [nfsv4]
#11 [ffff8800aace1be0] nfs4_file_open at ffffffffa07e2c62 [nfsv4]
#12 [ffff8800aace1c80] do_dentry_open.isra.16 at ffffffff81159676
#13 [ffff8800aace1cd0] finish_open at ffffffff81159722
#14 [ffff8800aace1cf0] do_last at ffffffff8116a1d9
#15 [ffff8800aace1da0] path_openat at ffffffff8116a5d3
#16 [ffff8800aace1e50] do_filp_open at ffffffff8116add2
#17 [ffff8800aace1f10] do_sys_open at ffffffff8115aade
#18 [ffff8800aace1f70] sys_open at ffffffff8115abe1
#19 [ffff8800aace1f80] system_call_fastpath at ffffffff814451c2
RIP: 00007f250a821fd0 RSP: 00007fff1755c4d8 RFLAGS: 00000202
RAX: 0000000000000002 RBX: ffffffff814451c2 RCX: ffffffffffffffff
RDX: 00000000000001b6 RSI: 0000000000000000 RDI: 0000000000647a0e
RBP: 00007fff1755c4c0 R8: 0000000000000008 R9: 0000000000000001
R10: 000000000041fef0 R11: 0000000000000246 R12: ffffffff8115abe1
R13: ffff8800aace1f78 R14: 00000000006474c0 R15: 0000000000000000
ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b
crash> bt 8748
PID: 8748 TASK: ffff8800a675a280 CPU: 1 COMMAND: "top"
#0 [ffff8800b148fd68] __schedule at ffffffff8143b975
#1 [ffff8800b148fdf0] schedule at ffffffff8143bfb9
#2 [ffff8800b148fe00] n_tty_write at ffffffff812c22bb
#3 [ffff8800b148fe90] tty_write at ffffffff812bf0e1
#4 [ffff8800b148ff00] vfs_write at ffffffff8115b49f
#5 [ffff8800b148ff30] sys_write at ffffffff8115b7a2
#6 [ffff8800b148ff80] system_call_fastpath at ffffffff814451c2
RIP: 00007f162d275220 RSP: 00007fff7fa2ef90 RFLAGS: 00000296
RAX: 0000000000000001 RBX: ffffffff814451c2 RCX: 0000000000000005
RDX: 0000000000000800 RSI: 0000000000616820 RDI: 0000000000000001
RBP: 0000000000616820 R8: 0000000000000020 R9: 00007f162db7b700
R10: 00007f162d1e626a R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000800 R14: 00007f162d53d140 R15: 0000000000000800
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
crash> bt 3465
PID: 3465 TASK: ffff880137d72600 CPU: 0 COMMAND: "konsole"
#0 [ffff880133361838] __schedule at ffffffff8143b975
#1 [ffff8801333618c0] schedule at ffffffff8143bfb9
#2 [ffff8801333618d0] schedule_timeout at ffffffff8143abed
#3 [ffff880133361980] wait_for_common at ffffffff8143b4af
#4 [ffff880133361a00] wait_for_completion at ffffffff8143b5ed
#5 [ffff880133361a10] flush_work at ffffffff8105bf79
#6 [ffff880133361a60] tty_flush_to_ldisc at ffffffff812c7d94
#7 [ffff880133361a70] n_tty_poll at ffffffff812c1eea
#8 [ffff880133361ab0] tty_poll at ffffffff812bed82
#9 [ffff880133361af0] do_poll.isra.7 at ffffffff8116e175
#10 [ffff880133361b80] do_sys_poll at ffffffff8116f149
#11 [ffff880133361f40] sys_poll at ffffffff8116f28b
#12 [ffff880133361f80] system_call_fastpath at ffffffff814451c2
RIP: 00007fefe00b913f RSP: 00007fff7a198f70 RFLAGS: 00000202
RAX: 0000000000000007 RBX: ffffffff814451c2 RCX: 0000000000e96368
RDX: 0000000000000007 RSI: 0000000000000020 RDI: 0000000000e69ed0
RBP: 0000000000000020 R8: 0000000000000000 R9: 0000000000000d89
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000007
R13: 0000000000e69ed0 R14: 0000000000612eb0 R15: 0000000000727cb8
ORIG_RAX: 0000000000000007 CS: 0033 SS: 002b
crash> ps|grep kworker
5 2 0 ffff88013b306140 IN 0.0 0 0 [kworker/0:0H]
7 2 0 ffff88013b3101c0 IN 0.0 0 0 [kworker/u:0H]
18 2 1 ffff88013b36c4c0 IN 0.0 0 0 [kworker/1:0]
19 2 1 ffff88013b370500 IN 0.0 0 0 [kworker/1:0H]
192 2 0 ffff88013476e380 IN 0.0 0 0 [kworker/0:1H]
194 2 1 ffff88013397e300 IN 0.0 0 0 [kworker/1:1H]
824 2 1 ffff880139662580 IN 0.0 0 0 [kworker/u:1H]
5371 2 0 ffff88009bae86c0 IN 0.0 0 0 [kworker/0:0]
5375 2 0 ffff8800a73b07c0 IN 0.0 0 0 [kworker/0:2]
8478 2 1 ffff8800378ac340 IN 0.0 0 0 [kworker/u:2]
8485 2 1 ffff88009c2b03c0 IN 0.0 0 0 [kworker/1:1]
8537 2 0 ffff8800b1660200 IN 0.0 0 0 [kworker/u:0]
8731 2 0 ffff8800a667c0c0 IN 0.0 0 0 [kworker/u:1]
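A side note on the filtering above: `ps|grep UN` matches "UN" anywhere on the line, so a command name containing that string would show up too. A minimal sketch of a stricter filter that looks only at the ST column, assuming the column order seen in the listings here (PID, PPID, CPU, TASK, ST, ...) and an optional leading `>` marker on active tasks; the function name is made up for illustration:

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if one line of crash "ps" output reports the given state in
 * its ST column, 0 otherwise (including lines that do not parse).
 * Assumed layout: [>] PID PPID CPU TASK ST %MEM VSZ RSS COMM
 */
static int ps_line_has_state(const char *line, const char *state)
{
    unsigned long pid, ppid;
    unsigned long long task;
    int cpu;
    char st[8];

    /* Skip leading blanks and the assumed active-task marker. */
    while (*line == ' ' || *line == '>')
        line++;

    if (sscanf(line, "%lu %lu %d %llx %7s",
               &pid, &ppid, &cpu, &task, st) != 5)
        return 0;

    return strcmp(st, state) == 0;
}
```

Fed the lines above, this would pick konsole and make for "UN" and every kworker for "IN".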
* Re: 3.8.11-rt8 NFS triggered seizures
2013-05-13 9:31 3.8.11-rt8 NFS triggered seizures Mike Galbraith
@ 2013-06-07 9:03 ` Sebastian Andrzej Siewior
2013-06-07 11:17 ` Mike Galbraith
0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-07 9:03 UTC (permalink / raw)
To: Mike Galbraith; +Cc: RT
* Mike Galbraith | 2013-05-13 11:31:05 [+0200]:
>Letting my little Toshiba Satellite download an opensuse install DVD (at
>20 KiB/s so I can use the other half of my wonderful bandwidth for
Didn't they say that they won't shape their customers for the next two
years?
>When make modules_install install hung first thing this morning, box was
>responsive enough to fire up top, which displayed nada. Desktop then
>froze, so I crashed it. Seems completion ain't gonna happen.
The "panic" says "Oops" so your kernel most likely hit a NULL pointer.
The completion you say is missing might run into the NULL pointer
problem. Any way to get the backtrace from the oops?
Sebastian
* Re: 3.8.11-rt8 NFS triggered seizures
2013-06-07 9:03 ` Sebastian Andrzej Siewior
@ 2013-06-07 11:17 ` Mike Galbraith
2013-06-07 11:36 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 8+ messages in thread
From: Mike Galbraith @ 2013-06-07 11:17 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: RT
On Fri, 2013-06-07 at 11:03 +0200, Sebastian Andrzej Siewior wrote:
> The "panic" says "Oops" so your kernel most likely hit a NULL pointer.
> The completion you say is missing might run into the NULL pointer
> problem. Any way to get the backtrace from the oops?
The oops is just me poking sysrq-c to crash the box.
-Mike
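
For anyone reading along: as far as I know, sysrq-c on kernels of this vintage crashes the box on purpose by storing through a NULL pointer, which fits the "Oops: 0002" in the crash summary. A minimal sketch of how that number reads, assuming the standard x86 page-fault error code layout (bit 0 = protection violation vs. not-present page, bit 1 = write, bit 2 = user mode); the struct and function names are invented for illustration and are not kernel interfaces:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper type, not a kernel structure. */
struct pf_error {
    bool protection; /* bit 0: set = protection violation, clear = page not present */
    bool write;      /* bit 1: set = write access, clear = read access */
    bool user;       /* bit 2: set = fault taken in user mode, clear = kernel mode */
};

/* Split the low three bits of an x86 page-fault error code. */
static struct pf_error decode_pf_error(unsigned long code)
{
    struct pf_error e;

    e.protection = (code & 0x1) != 0;
    e.write      = (code & 0x2) != 0;
    e.user       = (code & 0x4) != 0;
    return e;
}

/* Print a one-line description of the error code. */
static void print_pf_error(unsigned long code)
{
    struct pf_error e = decode_pf_error(code);

    printf("%04lx: %s page, %s access, %s mode\n", code,
           e.protection ? "present (protection fault)" : "not-present",
           e.write ? "write" : "read",
           e.user ? "user" : "kernel");
}
```

Calling `print_pf_error(0x0002)` describes a kernel-mode write to a not-present page, which is what a deliberate NULL store looks like, so the oops by itself says nothing about the hang.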
* Re: 3.8.11-rt8 NFS triggered seizures
2013-06-07 11:17 ` Mike Galbraith
@ 2013-06-07 11:36 ` Sebastian Andrzej Siewior
2013-06-07 12:34 ` Mike Galbraith
0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-07 11:36 UTC (permalink / raw)
To: Mike Galbraith; +Cc: RT
On 06/07/2013 01:17 PM, Mike Galbraith wrote:
> On Fri, 2013-06-07 at 11:03 +0200, Sebastian Andrzej Siewior wrote:
>
>> The "panic" says "Oops" so your kernel most likely hit a NULL pointer.
>> The completion you say is missing might run into the NULL pointer
>> problem. Any way to get the backtrace from the oops?
>
> The oops is just me poking sysrq-c to crash the box.
Ach. This does not make it any easier then.
>
> -Mike
>
Sebastian
* Re: 3.8.11-rt8 NFS triggered seizures
2013-06-07 11:36 ` Sebastian Andrzej Siewior
@ 2013-06-07 12:34 ` Mike Galbraith
2013-06-07 12:46 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 8+ messages in thread
From: Mike Galbraith @ 2013-06-07 12:34 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: RT
On Fri, 2013-06-07 at 13:36 +0200, Sebastian Andrzej Siewior wrote:
> On 06/07/2013 01:17 PM, Mike Galbraith wrote:
> > On Fri, 2013-06-07 at 11:03 +0200, Sebastian Andrzej Siewior wrote:
> >
> >> The "panic" says "Oops" so your kernel most likely hit a NULL pointer.
> >> The completion you say is missing might run into the NULL pointer
> >> problem. Any way to get the backtrace from the oops?
> >
> > The oops is just me poking sysrq-c to crash the box.
>
> > Ach. This does not make it any easier then.
Yeah, it didn't repeat before DVD download finally finished (after a mere
three _weeks_). The below fired during the same boot as the last hang,
but far earlier fwiw.
[10288.045302] ------------[ cut here ]------------
[10288.045330] WARNING: at kernel/workqueue.c:1575 worker_enter_idle+0xea/0x130()
[10288.045345] Hardware name: SATELLITE T130
[10288.045357] Modules linked in: fuse nfsd lockd nfs_acl auth_rpcgss sunrpc rfcomm bnep edd ipv6 cpufreq_conservative cpufreq_ondemand cpufreq_userspace cpufreq_powersave dm_mod arc4 rtl8192se rtlwifi mac80211 snd_hda_codec_hdmi snd_hda_codec_conexant btusb bluetooth snd_hda_intel snd_hda_codec snd_hwdep cfg80211 snd_pcm_oss snd_pcm snd_seq acpi_cpufreq snd_timer mperf snd_seq_device snd_mixer_oss coretemp snd iTCO_wdt iTCO_vendor_support toshiba_acpi sparse_keymap sg microcode joydev soundcore lpc_ich rfkill toshiba_bluetooth serio_raw atl1c i2c_i801 snd_page_alloc mfd_core wmi ehci_pci ac battery ext4 mbcache jbd2 crc16 hid_generic usbhid hid sd_mod i915 crc_t10dif rtc_cmos uhci_hcd drm_kms_helper ehci_hcd drm i2c_algo_bit button usbcore usb_common video ahci libahci libata scsi_mod fan
processor
[10288.045425] thermal
[10288.045426]
[10288.045514] Pid: 5371, comm: kworker/0:0 Not tainted 3.8.11-rt8-smp #35
[10288.045528] Call Trace:
[10288.045546] [<ffffffff8103dbef>] warn_slowpath_common+0x7f/0xc0
[10288.045550] [<ffffffff8103dc4a>] warn_slowpath_null+0x1a/0x20
[10288.045552] [<ffffffff8105b2da>] worker_enter_idle+0xea/0x130
[10288.045556] [<ffffffff8105ec78>] worker_thread+0x268/0x3f0
[10288.045559] [<ffffffff8105ea10>] ? rescuer_thread+0x2b0/0x2b0
[10288.045563] [<ffffffff81064332>] kthread+0xb2/0xc0
[10288.045567] [<ffffffff81040000>] ? console_unlock.part.11+0x170/0x320
[10288.045571] [<ffffffff81064280>] ? flush_kthread_worker+0xb0/0xb0
[10288.045576] [<ffffffff8144511c>] ret_from_fork+0x7c/0xb0
[10288.045579] [<ffffffff81064280>] ? flush_kthread_worker+0xb0/0xb0
[10288.045582] ---[ end trace 0000000000000002 ]---
1567         /*
1568          * Sanity check nr_running. Because gcwq_unbind_fn() releases
1569          * gcwq->lock between setting %WORKER_UNBOUND and zapping
1570          * nr_running, the warning may trigger spuriously. Check iff
1571          * unbind is not in progress.
1572          */
1573         WARN_ON_ONCE(!(gcwq->flags & GCWQ_DISASSOCIATED) &&
1574                      pool->nr_workers == pool->nr_idle &&
1575                      atomic_read(get_pool_nr_running(pool)));
1576 }
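
Read as plain logic, the check fires when the gcwq is not disassociated, every worker in the pool is idle, and the running counter is nevertheless non-zero. A toy user-space model of just that predicate, with made-up struct and field names standing in for the gcwq/pool state (this is not the kernel's code, only the boolean condition from the excerpt):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the sanity check reads. */
struct pool_model {
    bool disassociated; /* models gcwq->flags & GCWQ_DISASSOCIATED */
    int nr_workers;     /* models pool->nr_workers */
    int nr_idle;        /* models pool->nr_idle */
    int nr_running;     /* models atomic_read(get_pool_nr_running(pool)) */
};

/* True when the modelled WARN_ON_ONCE() condition would be met. */
static bool idle_sanity_check_fails(const struct pool_model *p)
{
    return !p->disassociated &&
           p->nr_workers == p->nr_idle &&
           p->nr_running != 0;
}
```

In this model, a pool with 3 workers, 3 idle and nr_running == 1 trips the check unless the disassociated flag is set, i.e. the counter claims something is running while nobody is.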
* Re: 3.8.11-rt8 NFS triggered seizures
2013-06-07 12:34 ` Mike Galbraith
@ 2013-06-07 12:46 ` Sebastian Andrzej Siewior
2013-06-07 12:50 ` Mike Galbraith
0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-07 12:46 UTC (permalink / raw)
To: Mike Galbraith; +Cc: RT
On 06/07/2013 02:34 PM, Mike Galbraith wrote:
> Yeah, it didn't repeat before DVD download finally finished (after a mere
> three _weeks_). The below fired during the same boot as the last hang,
> but far earlier fwiw.
>
> [10288.045302] ------------[ cut here ]------------
> [10288.045330] WARNING: at kernel/workqueue.c:1575 worker_enter_idle+0xea/0x130()
This is actually something I'm looking at right now. But I can trigger
this only with CPU-hotplug. You don't play with CPU-hotplug, do you?
Sebastian
* Re: 3.8.11-rt8 NFS triggered seizures
2013-06-07 12:46 ` Sebastian Andrzej Siewior
@ 2013-06-07 12:50 ` Mike Galbraith
2013-06-07 12:55 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 8+ messages in thread
From: Mike Galbraith @ 2013-06-07 12:50 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: RT
On Fri, 2013-06-07 at 14:46 +0200, Sebastian Andrzej Siewior wrote:
> On 06/07/2013 02:34 PM, Mike Galbraith wrote:
> > Yeah, it didn't repeat before DVD download finally finished (after a mere
> > three _weeks_). The below fired during the same boot as the last hang,
> > but far earlier fwiw.
> >
> > [10288.045302] ------------[ cut here ]------------
> > [10288.045330] WARNING: at kernel/workqueue.c:1575 worker_enter_idle+0xea/0x130()
>
> This is actually something I'm looking at right now. But I can trigger
> this only with CPU-hotplug. You don't play with CPU-hotplug, do you?
No. Hm, I may have shut the lid due to bandwidth irritating me though.
-Mike
* Re: 3.8.11-rt8 NFS triggered seizures
2013-06-07 12:50 ` Mike Galbraith
@ 2013-06-07 12:55 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-07 12:55 UTC (permalink / raw)
To: Mike Galbraith; +Cc: RT
On 06/07/2013 02:50 PM, Mike Galbraith wrote:
>> This is actually something I'm looking at right now. But I can trigger
>> this only with CPU-hotplug. You don't play with CPU-hotplug, do you?
>
> No. Hm, I may have shut the lid due to bandwidth irritating me though.
Suspend & resume drives the CPUs down & up, so this could be it.
> -Mike
Sebastian
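
If suspend/resume is indeed the trigger, the same offline/online cycle can be driven by hand through sysfs, which may be a quicker way to poke at the warning than closing the lid. A rough sketch, assuming the usual `/sys/devices/system/cpu/cpuN/online` control file, root privileges, and a CPU that can be taken offline (typically not CPU 0); the helper names are invented here:

```c
#include <stdio.h>

/* Build the sysfs path of the "online" control file for one CPU.
 * Returns 0 on success, -1 if the buffer is too small. */
static int cpu_online_path(int cpu, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%d/online", cpu);

    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write "0" or "1" to the control file. Needs root.
 * Returns 0 on success, -1 on any failure. */
static int cpu_set_online(int cpu, int online)
{
    char path[64];
    FILE *f;
    int rc;

    if (cpu_online_path(cpu, path, sizeof(path)) != 0)
        return -1;

    f = fopen(path, "w");
    if (!f)
        return -1;

    rc = (fputs(online ? "1" : "0", f) >= 0) ? 0 : -1;
    if (fclose(f) != 0)
        rc = -1; /* the write may only fail at flush time */
    return rc;
}

/* Take the CPU down and bring it back up once. */
static int cpu_cycle(int cpu)
{
    if (cpu_set_online(cpu, 0) != 0)
        return -1;
    return cpu_set_online(cpu, 1);
}
```

Calling `cpu_cycle(1)` in a loop on the two-CPU laptop would approximate what repeated suspend/resume does to CPU 1, without the rest of the suspend path.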