* [BUG] netconsole broken by scheduler updates @ 2011-05-26 9:21 Laurent Riffard 2011-05-26 9:48 ` Peter Zijlstra 0 siblings, 1 reply; 7+ messages in thread From: Laurent Riffard @ 2011-05-26 9:21 UTC (permalink / raw) To: netdev, Peter Zijlstra Hi, Recently, netconsole was broken by some scheduler updates. Kernel hangs on boot near the network card initialization. I noticed that it does hang just where a "inconsistent lock state" message normally appears. I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the first bad commit. commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f Author: Peter Zijlstra <a.p.zijlstra@chello.nl> Date: Tue Apr 5 17:23:54 2011 +0200 sched: Remove rq->lock from the first half of ttwu() Before this commit, kernel was booting succesfully despite the "inconsistent lock state" message. After this commit, the kernel does hang on boot, I have to push the reset button. Here are some lines from dmesg : Initializing cgroup subsys cpu Linux version 2.6.39-rc3-00102-g8f42ced (laurent@calimero) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) ) #137 SMP PREEMPT Wed May 25 10:09:13 CEST 2011 Command line: BOOT_IMAGE=/vmlinuz-2.6.39-rc3-00102-g8f42ced root=/dev/mapper/vglinux1-lv_ubuntu_64bits ro debug netconsole=4444@192.168.0.9/eth0,6666@192.168. 0.10/6C:62:6D:48:4B:C7 splash vt.handoff=7 [....] Linux agpgart interface v0.103 forcedeth: Reverse Engineered nForce ethernet driver. Version 0.64. ACPI: PCI Interrupt Link [APCH] enabled at IRQ 23 forcedeth 0000:00:0a.0: PCI INT A -> Link[APCH] -> GSI 23 (level, low) -> IRQ 23 forcedeth 0000:00:0a.0: setting latency timer to 64 forcedeth 0000:00:0a.0: ifname eth0, PHY OUI 0x732 @ 1, addr 00:1f:d0:53:49:ae forcedeth 0000:00:0a.0: highdma pwrctl mgmt gbit lnktim msi desc-v3 netconsole: local port 4444 netconsole: local IP 192.168.0.9 netconsole: interface 'eth0' netconsole: remote port 6666 netconsole: remote IP 192.168.0.10 netconsole: remote ethernet address 6c:62:6d:48:4b:c7 netconsole: device eth0 not up yet, forcing it forcedeth 0000:00:0a.0: irq 40 for MSI/MSI-X forcedeth 0000:00:0a.0: eth0: no link during initialization forcedeth 0000:00:0a.0: eth0: link up <============== kernel does hang here ================================= [ INFO: inconsistent lock state ] 2.6.39-rc3-00102-g8f42ced #137 --------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. swapper/1 [HC0[0]:SC0[0]:HE1:SE1] takes: (&(&napi->poll_lock)->rlock){+.?...}, at: [<ffffffff812385a5>] netpoll_poll_dev+0x9e/0x4f1 {IN-SOFTIRQ-W} state was registered at: [<ffffffff81066607>] __lock_acquire+0x319/0xd62 [<ffffffff8106750e>] lock_acquire+0xd7/0x102 [<ffffffff81295a81>] _raw_spin_lock+0x36/0x45 [<ffffffff81227a13>] net_rx_action+0x8d/0x1e6 [<ffffffff810409b9>] __do_softirq+0xe0/0x1be [<ffffffff81297e1c>] call_softirq+0x1c/0x30 [<ffffffff810042de>] do_softirq+0x46/0x9c [<ffffffff81040d0a>] irq_exit+0x4d/0xa2 [<ffffffff81003f92>] do_IRQ+0x88/0x9f [<ffffffff81296613>] ret_from_intr+0x0/0x1a [<ffffffff810094b2>] default_idle+0x40/0x74 [<ffffffff810095ca>] c1e_idle+0xe4/0xe6 [<ffffffff81001270>] cpu_idle+0x5e/0x94 [<ffffffff8129018d>] start_secondary+0x18a/0x18f irq event stamp: 1010077 hardirqs last enabled at (1010077): [<ffffffff81296086>] _raw_spin_unlock_irqrestore+0x40/0x75 hardirqs last disabled at (1010076): [<ffffffff81295b54>] _raw_spin_lock_irqsave+0x1a/0x57 softirqs last enabled at (1008710): [<ffffffff8122cf91>] __dev_mc_add+0x5d/0x6b softirqs last disabled at (1008704): [<ffffffff81295bdf>] _raw_spin_lock_bh+0x11/0x4a other info that might help us debug this: 1 lock held by swapper/1: #0: (target_list_lock){+.+...}, at: [<ffffffff811e258a>] write_msg+0x34/0xd2 stack backtrace: Pid: 1, comm: swapper Not tainted 2.6.39-rc3-00102-g8f42ced #137 Call Trace: [<ffffffff81066104>] valid_state+0x17e/0x190 [<ffffffff81065a4c>] ? check_usage_forwards+0x85/0x85 [<ffffffff810661f8>] mark_lock+0xe2/0x1d8 [<ffffffff81066685>] __lock_acquire+0x397/0xd62 [<ffffffff8115e0b5>] ? pci_bus_write_config_dword+0x5e/0x6a [<ffffffff81296086>] ? _raw_spin_unlock_irqrestore+0x40/0x75 [<ffffffff8106792a>] ? trace_hardirqs_on_caller+0x10b/0x12f [<ffffffff812385a5>] ? netpoll_poll_dev+0x9e/0x4f1 [<ffffffff8106750e>] lock_acquire+0xd7/0x102 [<ffffffff812385a5>] ? netpoll_poll_dev+0x9e/0x4f1 [<ffffffff81295fbf>] _raw_spin_trylock+0x4a/0x7f [<ffffffff812385a5>] ? netpoll_poll_dev+0x9e/0x4f1 [<ffffffff812385a5>] netpoll_poll_dev+0x9e/0x4f1 [<ffffffff81238b95>] netpoll_send_skb_on_dev+0x112/0x216 [<ffffffff81238e84>] netpoll_send_udp+0x1eb/0x1fa [<ffffffff811e258a>] ? write_msg+0x34/0xd2 [<ffffffff811e25e2>] write_msg+0x8c/0xd2 [<ffffffff8103b215>] __call_console_drivers+0x72/0x84 [<ffffffff8103b280>] _call_console_drivers+0x59/0x5e [<ffffffff8103babb>] console_unlock+0xf3/0x1ac [<ffffffff8103c30f>] register_console+0x228/0x2b3 [<ffffffff816a71ec>] init_netconsole+0x15d/0x1e4 [<ffffffff8116379b>] ? __pci_register_driver+0x9b/0xcd [<ffffffff816a708f>] ? option_setup+0x1f/0x1f [<ffffffff8100030f>] do_one_initcall+0x7a/0x137 [<ffffffff81685be1>] kernel_init+0xb0/0x130 [<ffffffff811553ce>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81297d24>] kernel_thread_helper+0x4/0x10 [<ffffffff812966d8>] ? retint_restore_args+0x13/0x13 [<ffffffff81685b31>] ? start_kernel+0x379/0x379 [<ffffffff81297d20>] ? gs_change+0x13/0x13 ------------[ cut here ]------------ WARNING: at net/core/netpoll.c:346 netpoll_send_skb_on_dev+0x15f/0x216() Hardware name: M68SM-S2L netpoll_send_skb(): eth0 enabled interrupts in poll (nv_start_xmit_optimized+0x0/0x487) Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.39-rc3-00102-g8f42ced #137 Call Trace: [<ffffffff8103ae08>] warn_slowpath_common+0x80/0x98 [<ffffffff8103aeb4>] warn_slowpath_fmt+0x41/0x43 [<ffffffff811e1fa9>] ? nv_start_xmit+0x3df/0x3df [<ffffffff81238be2>] netpoll_send_skb_on_dev+0x15f/0x216 [<ffffffff81238e84>] netpoll_send_udp+0x1eb/0x1fa [<ffffffff811e258a>] ? write_msg+0x34/0xd2 [<ffffffff811e25e2>] write_msg+0x8c/0xd2 [<ffffffff8103b215>] __call_console_drivers+0x72/0x84 [<ffffffff8103b280>] _call_console_drivers+0x59/0x5e [<ffffffff8103babb>] console_unlock+0xf3/0x1ac [<ffffffff8103c30f>] register_console+0x228/0x2b3 [<ffffffff816a71ec>] init_netconsole+0x15d/0x1e4 [<ffffffff8116379b>] ? __pci_register_driver+0x9b/0xcd [<ffffffff816a708f>] ? option_setup+0x1f/0x1f [<ffffffff8100030f>] do_one_initcall+0x7a/0x137 [<ffffffff81685be1>] kernel_init+0xb0/0x130 [<ffffffff811553ce>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81297d24>] kernel_thread_helper+0x4/0x10 [<ffffffff812966d8>] ? retint_restore_args+0x13/0x13 [<ffffffff81685b31>] ? start_kernel+0x379/0x379 [<ffffffff81297d20>] ? gs_change+0x13/0x13 ---[ end trace 2b8ff4190d73b338 ]--- console [netcon0] enabled netconsole: network logging started ~~ laurent ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUG] netconsole broken by scheduler updates 2011-05-26 9:21 [BUG] netconsole broken by scheduler updates Laurent Riffard @ 2011-05-26 9:48 ` Peter Zijlstra 2011-05-26 10:35 ` Laurent Riffard 2011-05-26 20:18 ` David Miller 0 siblings, 2 replies; 7+ messages in thread From: Peter Zijlstra @ 2011-05-26 9:48 UTC (permalink / raw) To: Laurent Riffard; +Cc: netdev On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote: > Hi, > > Recently, netconsole was broken by some scheduler updates. Kernel hangs > on boot near the network card initialization. I noticed that it does > hang just where a "inconsistent lock state" message normally appears. > > I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the > first bad commit. > commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f > Author: Peter Zijlstra <a.p.zijlstra@chello.nl> > Date: Tue Apr 5 17:23:54 2011 +0200 > > sched: Remove rq->lock from the first half of ttwu() > > > Before this commit, kernel was booting succesfully despite the > "inconsistent lock state" message. After this commit, the kernel does > hang on boot, I have to push the reset button. Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat after some 10 seconds? It would be interesting to see where its stuck. Also, are you very _very_ sure that lockdep message is a false positive? ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUG] netconsole broken by scheduler updates 2011-05-26 9:48 ` Peter Zijlstra @ 2011-05-26 10:35 ` Laurent Riffard 2011-05-26 16:12 ` Laurent Riffard 2011-05-26 20:18 ` David Miller 1 sibling, 1 reply; 7+ messages in thread From: Laurent Riffard @ 2011-05-26 10:35 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Laurent Riffard, netdev On Thu, 26 May 2011, Peter Zijlstra wrote: > On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote: >> Hi, >> >> Recently, netconsole was broken by some scheduler updates. Kernel hangs >> on boot near the network card initialization. I noticed that it does >> hang just where a "inconsistent lock state" message normally appears. >> >> I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the >> first bad commit. >> commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f >> Author: Peter Zijlstra <a.p.zijlstra@chello.nl> >> Date: Tue Apr 5 17:23:54 2011 +0200 >> >> sched: Remove rq->lock from the first half of ttwu() >> >> >> Before this commit, kernel was booting succesfully despite the >> "inconsistent lock state" message. After this commit, the kernel does >> hang on boot, I have to push the reset button. > > Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat > after some 10 seconds? It would be interesting to see where its stuck. CONFIG_HARDLOCKUP_DETECTOR is not set for now. I'll give it a try. > Also, are you very _very_ sure that lockdep message is a false positive? I can't answer this question. I just know that before commit e4a52bcb9, the kernel was able to boot with netconsole despite this message. Maybe some netconsole developper knows. ~~ laurent ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUG] netconsole broken by scheduler updates 2011-05-26 10:35 ` Laurent Riffard @ 2011-05-26 16:12 ` Laurent Riffard 2011-05-26 16:17 ` Peter Zijlstra 0 siblings, 1 reply; 7+ messages in thread From: Laurent Riffard @ 2011-05-26 16:12 UTC (permalink / raw) To: Peter Zijlstra; +Cc: netdev Le 26/05/2011 12:35, Laurent Riffard a écrit : > On Thu, 26 May 2011, Peter Zijlstra wrote: >> On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote: >>> Hi, >>> >>> Recently, netconsole was broken by some scheduler updates. Kernel hangs >>> on boot near the network card initialization. I noticed that it does >>> hang just where a "inconsistent lock state" message normally appears. >>> >>> I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the >>> first bad commit. >>> commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f >>> Author: Peter Zijlstra <a.p.zijlstra@chello.nl> >>> Date: Tue Apr 5 17:23:54 2011 +0200 >>> >>> sched: Remove rq->lock from the first half of ttwu() >>> >>> >>> Before this commit, kernel was booting succesfully despite the >>> "inconsistent lock state" message. After this commit, the kernel does >>> hang on boot, I have to push the reset button. >> >> Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat >> after some 10 seconds? It would be interesting to see where its stuck. > > CONFIG_HARDLOCKUP_DETECTOR is not set for now. I'll give it a try. I activated CONFIG_HARDLOCKUP_DETECTOR, but failed to obtain more information. During the boot, the kernel hangs just after bringing up the NIC. I waited 2 minutes, without getting any more messages. ~~ laurent ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUG] netconsole broken by scheduler updates 2011-05-26 16:12 ` Laurent Riffard @ 2011-05-26 16:17 ` Peter Zijlstra 2011-06-03 21:15 ` Laurent Riffard 0 siblings, 1 reply; 7+ messages in thread From: Peter Zijlstra @ 2011-05-26 16:17 UTC (permalink / raw) To: Laurent Riffard; +Cc: netdev On Thu, 2011-05-26 at 18:12 +0200, Laurent Riffard wrote: > Le 26/05/2011 12:35, Laurent Riffard a écrit : > > On Thu, 26 May 2011, Peter Zijlstra wrote: > >> On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote: > >>> Hi, > >>> > >>> Recently, netconsole was broken by some scheduler updates. Kernel hangs > >>> on boot near the network card initialization. I noticed that it does > >>> hang just where a "inconsistent lock state" message normally appears. > >>> > >>> I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the > >>> first bad commit. > >>> commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f > >>> Author: Peter Zijlstra <a.p.zijlstra@chello.nl> > >>> Date: Tue Apr 5 17:23:54 2011 +0200 > >>> > >>> sched: Remove rq->lock from the first half of ttwu() > >>> > >>> > >>> Before this commit, kernel was booting succesfully despite the > >>> "inconsistent lock state" message. After this commit, the kernel does > >>> hang on boot, I have to push the reset button. > >> > >> Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat > >> after some 10 seconds? It would be interesting to see where its stuck. > > > > CONFIG_HARDLOCKUP_DETECTOR is not set for now. I'll give it a try. > > I activated CONFIG_HARDLOCKUP_DETECTOR, but failed to obtain more > information. During the boot, the kernel hangs just after bringing up > the NIC. I waited 2 minutes, without getting any more messages. OK, thanks for trying, I shall try and reproduce locally. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUG] netconsole broken by scheduler updates 2011-05-26 16:17 ` Peter Zijlstra @ 2011-06-03 21:15 ` Laurent Riffard 0 siblings, 0 replies; 7+ messages in thread From: Laurent Riffard @ 2011-06-03 21:15 UTC (permalink / raw) To: Peter Zijlstra; +Cc: netdev Le 26/05/2011 18:17, Peter Zijlstra a écrit : > On Thu, 2011-05-26 at 18:12 +0200, Laurent Riffard wrote: >> Le 26/05/2011 12:35, Laurent Riffard a écrit : >>> On Thu, 26 May 2011, Peter Zijlstra wrote: >>>> On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote: >>>>> Hi, >>>>> >>>>> Recently, netconsole was broken by some scheduler updates. Kernel hangs >>>>> on boot near the network card initialization. I noticed that it does >>>>> hang just where a "inconsistent lock state" message normally appears. >>>>> >>>>> I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the >>>>> first bad commit. >>>>> commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f >>>>> Author: Peter Zijlstra <a.p.zijlstra@chello.nl> >>>>> Date: Tue Apr 5 17:23:54 2011 +0200 >>>>> >>>>> sched: Remove rq->lock from the first half of ttwu() >>>>> >>>>> >>>>> Before this commit, kernel was booting succesfully despite the >>>>> "inconsistent lock state" message. After this commit, the kernel does >>>>> hang on boot, I have to push the reset button. >>>> >>>> Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat >>>> after some 10 seconds? It would be interesting to see where its stuck. >>> >>> CONFIG_HARDLOCKUP_DETECTOR is not set for now. I'll give it a try. >> >> I activated CONFIG_HARDLOCKUP_DETECTOR, but failed to obtain more >> information. During the boot, the kernel hangs just after bringing up >> the NIC. I waited 2 minutes, without getting any more messages. > > OK, thanks for trying, I shall try and reproduce locally. > Hello, I tried some newer kernel. It boots successfully now, but I've got interesting (?) logs : ==================================================== [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Linux version 3.0.0-rc1-00049-g1fa7b6a (laurent@calimero) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) ) #148 SMP PREEMPT Thu Jun 2 18:56:41 CEST 2011 [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.0.0-rc1-00049-g1fa7b6a root=/dev/mapper/vglinux1-lv_ubuntu_64bits ro debug netconsole=4444@192.168.0.9/eth0,6666@192.168.0.10/6C:62:6D:48:4B:C7 splash vt.handoff=7 [...] [ 0.685768] Console: switching to colour frame buffer device 160x64 [ 0.707611] fb0: VESA VGA frame buffer device [ 0.714029] Linux agpgart interface v0.103 [ 0.714206] forcedeth: Reverse Engineered nForce ethernet driver. Version 0.64. [ 0.714837] ACPI: PCI Interrupt Link [APCH] enabled at IRQ 23 [ 0.715054] forcedeth 0000:00:0a.0: PCI INT A -> Link[APCH] -> GSI 23 (level, low) -> IRQ 23 [ 0.715283] forcedeth 0000:00:0a.0: setting latency timer to 64 [ 1.241552] forcedeth 0000:00:0a.0: ifname eth0, PHY OUI 0x732 @ 1, addr 00:1f:d0:53:49:ae [ 1.241784] forcedeth 0000:00:0a.0: highdma pwrctl mgmt gbit lnktim msi desc-v3 [ 1.242101] netconsole: local port 4444 [ 1.242207] netconsole: local IP 192.168.0.9 [ 1.242320] netconsole: interface 'eth0' [ 1.242426] netconsole: remote port 6666 [ 1.242531] netconsole: remote IP 192.168.0.10 [ 1.242650] netconsole: remote ethernet address 6c:62:6d:48:4b:c7 [ 1.242822] netconsole: device eth0 not up yet, forcing it [ 1.244612] forcedeth 0000:00:0a.0: irq 40 for MSI/MSI-X [ 1.245057] forcedeth 0000:00:0a.0: eth0: no link during initialization [ 2.374451] forcedeth 0000:00:0a.0: eth0: link up [ 2.381178] [ 2.381179] ================================= [ 2.381182] [ INFO: inconsistent lock state ] [ 2.381184] 3.0.0-rc1-00049-g1fa7b6a #148 [ 2.381186] --------------------------------- [ 2.381188] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. [ 2.381191] swapper/1 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 2.381193] (&(&napi->poll_lock)->rlock){+.?...}, at: [<ffffffff8123cd08>] netpoll_poll_dev+0x9d/0x4ed [ 2.381202] {IN-SOFTIRQ-W} state was registered at: [ 2.381204] [<ffffffff81068020>] __lock_acquire+0x319/0xdd8 [ 2.381208] [<ffffffff81068f9f>] lock_acquire+0xd8/0x103 [ 2.381211] [<ffffffff8129c019>] _raw_spin_lock+0x36/0x45 [ 2.381215] [<ffffffff8122b886>] net_rx_action+0x8d/0x1e7 [ 2.381219] [<ffffffff81040f73>] __do_softirq+0xe1/0x1c0 [ 2.381223] [<ffffffff8129e39c>] call_softirq+0x1c/0x30 [ 2.381226] [<ffffffff81004296>] do_softirq+0x46/0x9c [ 2.381229] [<ffffffff810412c5>] irq_exit+0x4d/0xa2 [ 2.381232] [<ffffffff81003f3a>] do_IRQ+0x88/0x9f [ 2.381235] [<ffffffff8129cb93>] ret_from_intr+0x0/0x1a [ 2.381239] [<ffffffff8100972c>] default_idle+0x40/0x74 [ 2.381242] [<ffffffff81009844>] amd_e400_idle+0xe4/0xe6 [ 2.381246] [<ffffffff81001270>] cpu_idle+0x5e/0x94 [ 2.381249] [<ffffffff81296383>] start_secondary+0x18a/0x18f [ 2.381252] irq event stamp: 1000341 [ 2.381254] hardirqs last enabled at (1000341): [<ffffffff8129cc58>] restore_args+0x0/0x30 [ 2.381257] hardirqs last disabled at (1000339): [<ffffffff8104100a>] __do_softirq+0x178/0x1c0 [ 2.381261] softirqs last enabled at (1000340): [<ffffffff8104103c>] __do_softirq+0x1aa/0x1c0 [ 2.381264] softirqs last disabled at (1000251): [<ffffffff8129e39c>] call_softirq+0x1c/0x30 [ 2.381267] [ 2.381268] other info that might help us debug this: [ 2.381269] Possible unsafe locking scenario: [ 2.381270] [ 2.381271] CPU0 [ 2.381272] ---- [ 2.381273] lock(&(&napi->poll_lock)->rlock); [ 2.381275] <Interrupt> [ 2.381276] lock(&(&napi->poll_lock)->rlock); [ 2.381279] [ 2.381279] *** DEADLOCK *** [ 2.381280] [ 2.381282] 1 lock held by swapper/1: [ 2.381283] #0: (target_list_lock){+.+...}, at: [<ffffffff811e5877>] write_msg+0x34/0xcd [ 2.381288] [ 2.381289] stack backtrace: [ 2.381291] Pid: 1, comm: swapper Not tainted 3.0.0-rc1-00049-g1fa7b6a #148 [ 2.381293] Call Trace: [ 2.381297] [<ffffffff81067b1d>] valid_state+0x217/0x229 [ 2.381300] [<ffffffff81067396>] ? check_usage_forwards+0x85/0x85 [ 2.381303] [<ffffffff81067c11>] mark_lock+0xe2/0x1d8 [ 2.381306] [<ffffffff810693bb>] ? trace_hardirqs_on_caller+0x10b/0x12f [ 2.381309] [<ffffffff8106809e>] __lock_acquire+0x397/0xdd8 [ 2.381314] [<ffffffff8106e882>] ? arch_local_irq_restore+0x6/0xd [ 2.381317] [<ffffffff8123cd08>] ? netpoll_poll_dev+0x9d/0x4ed [ 2.381320] [<ffffffff81068f9f>] lock_acquire+0xd8/0x103 [ 2.381324] [<ffffffff8123cd08>] ? netpoll_poll_dev+0x9d/0x4ed [ 2.381327] [<ffffffff8129c557>] _raw_spin_trylock+0x4a/0x7f [ 2.381331] [<ffffffff8123cd08>] ? netpoll_poll_dev+0x9d/0x4ed [ 2.381334] [<ffffffff8123cd08>] netpoll_poll_dev+0x9d/0x4ed [ 2.381338] [<ffffffff8123d2ee>] netpoll_send_skb_on_dev+0x10b/0x20f [ 2.381342] [<ffffffff8123d5dd>] netpoll_send_udp+0x1eb/0x1fa [ 2.381345] [<ffffffff811e5877>] ? write_msg+0x34/0xcd [ 2.381348] [<ffffffff811e58cf>] write_msg+0x8c/0xcd [ 2.381352] [<ffffffff8103b78d>] __call_console_drivers+0x72/0x84 [ 2.381355] [<ffffffff8103b7f8>] _call_console_drivers+0x59/0x5e [ 2.381359] [<ffffffff8103c033>] console_unlock+0xf3/0x1ac [ 2.381362] [<ffffffff8103c891>] register_console+0x228/0x2b3 [ 2.381366] [<ffffffff816a76b0>] init_netconsole+0x15d/0x1e4 [ 2.381371] [<ffffffff81166a64>] ? __pci_register_driver+0x9b/0xcd [ 2.381374] [<ffffffff816a7553>] ? option_setup+0x1f/0x1f [ 2.381377] [<ffffffff8100030f>] do_one_initcall+0x7a/0x137 [ 2.381381] [<ffffffff81685bcd>] kernel_init+0xb5/0x135 [ 2.381386] [<ffffffff81157d1e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 2.381389] [<ffffffff8129e2a4>] kernel_thread_helper+0x4/0x10 [ 2.381392] [<ffffffff8129cc58>] ? retint_restore_args+0x13/0x13 [ 2.381396] [<ffffffff81685b18>] ? start_kernel+0x360/0x360 [ 2.381399] [<ffffffff8129e2a0>] ? gs_change+0x13/0x13 [ 2.381464] ------------[ cut here ]------------ [ 2.381468] WARNING: at net/core/netpoll.c:346 netpoll_send_skb_on_dev+0x158/0x20f() [ 2.381470] Hardware name: M68SM-S2L [ 2.381473] netpoll_send_skb(): eth0 enabled interrupts in poll (nv_start_xmit_optimized+0x0/0x484) [ 2.381474] Modules linked in: [ 2.381477] Pid: 1, comm: swapper Not tainted 3.0.0-rc1-00049-g1fa7b6a #148 [ 2.381479] Call Trace: [ 2.381482] [<ffffffff8103b381>] warn_slowpath_common+0x80/0x98 [ 2.381485] [<ffffffff8103b42d>] warn_slowpath_fmt+0x41/0x43 [ 2.381489] [<ffffffff811e526c>] ? nv_start_xmit+0x3df/0x3df [ 2.381492] [<ffffffff8123d33b>] netpoll_send_skb_on_dev+0x158/0x20f [ 2.381496] [<ffffffff8123d5dd>] netpoll_send_udp+0x1eb/0x1fa [ 2.381499] [<ffffffff811e5877>] ? write_msg+0x34/0xcd [ 2.381502] [<ffffffff811e58cf>] write_msg+0x8c/0xcd [ 2.381505] [<ffffffff8103b78d>] __call_console_drivers+0x72/0x84 [ 2.381508] [<ffffffff8103b7f8>] _call_console_drivers+0x59/0x5e [ 2.381512] [<ffffffff8103c033>] console_unlock+0xf3/0x1ac [ 2.381515] [<ffffffff8103c891>] register_console+0x228/0x2b3 [ 2.381518] [<ffffffff816a76b0>] init_netconsole+0x15d/0x1e4 [ 2.381522] [<ffffffff81166a64>] ? __pci_register_driver+0x9b/0xcd [ 2.381525] [<ffffffff816a7553>] ? option_setup+0x1f/0x1f [ 2.381528] [<ffffffff8100030f>] do_one_initcall+0x7a/0x137 [ 2.381531] [<ffffffff81685bcd>] kernel_init+0xb5/0x135 [ 2.381535] [<ffffffff81157d1e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 2.381538] [<ffffffff8129e2a4>] kernel_thread_helper+0x4/0x10 [ 2.381541] [<ffffffff8129cc58>] ? retint_restore_args+0x13/0x13 [ 2.381545] [<ffffffff81685b18>] ? start_kernel+0x360/0x360 [ 2.381548] [<ffffffff8129e2a0>] ? gs_change+0x13/0x13 [ 2.381555] ---[ end trace f00f2127e552114f ]--- [ 2.383979] console [netcon0] enabled [ 2.384117] netconsole: network logging started [ 2.384492] i8042: PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1 [ 2.384679] i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp [ 2.385208] serio: i8042 KBD port at 0x60,0x64 irq 1 [...] ================================================================== hope this help ~~ laurent ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUG] netconsole broken by scheduler updates 2011-05-26 9:48 ` Peter Zijlstra 2011-05-26 10:35 ` Laurent Riffard @ 2011-05-26 20:18 ` David Miller 1 sibling, 0 replies; 7+ messages in thread From: David Miller @ 2011-05-26 20:18 UTC (permalink / raw) To: a.p.zijlstra; +Cc: laurent.riffard, netdev From: Peter Zijlstra <a.p.zijlstra@chello.nl> Date: Thu, 26 May 2011 11:48:28 +0200 > Also, are you very _very_ sure that lockdep message is a false positive? I suspect these warnings are entirely specific to using netconsole with the forcedeth driver. There is some really scary stuff in there, to say the least. Just have a look at nv_do_nic_poll(). That's the first time I've seen a networking driver use the {disable,enable}_irq_lockdep() interfaces. It's also using netif_tx_unlock_bh() in the "recovery" case which in netpoll will cause a local_bh_enable() with cpu interrupts disabled. Which is not allowed. Much of the forcedeth driver's locking seems suspect to me. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2011-06-03 21:15 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-05-26 9:21 [BUG] netconsole broken by scheduler updates Laurent Riffard 2011-05-26 9:48 ` Peter Zijlstra 2011-05-26 10:35 ` Laurent Riffard 2011-05-26 16:12 ` Laurent Riffard 2011-05-26 16:17 ` Peter Zijlstra 2011-06-03 21:15 ` Laurent Riffard 2011-05-26 20:18 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).