netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] netconsole broken by scheduler updates
@ 2011-05-26  9:21 Laurent Riffard
  2011-05-26  9:48 ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Laurent Riffard @ 2011-05-26  9:21 UTC (permalink / raw)
  To: netdev, Peter Zijlstra

Hi,

Recently, netconsole was broken by some scheduler updates. Kernel hangs 
on boot near the network card initialization. I noticed that it does 
hang just where a "inconsistent lock state" message normally appears.

I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the 
first bad commit.
commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date:   Tue Apr 5 17:23:54 2011 +0200

    sched: Remove rq->lock from the first half of ttwu()


Before this commit, kernel was booting succesfully despite the 
"inconsistent lock state" message. After this commit, the kernel does 
hang on boot, I have to push the reset button.


Here are some lines from dmesg :

Initializing cgroup subsys cpu
Linux version 2.6.39-rc3-00102-g8f42ced (laurent@calimero) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) ) #137 SMP PREEMPT Wed May 25 10:09:13 CEST 2011
Command line: BOOT_IMAGE=/vmlinuz-2.6.39-rc3-00102-g8f42ced root=/dev/mapper/vglinux1-lv_ubuntu_64bits ro debug netconsole=4444@192.168.0.9/eth0,6666@192.168.
0.10/6C:62:6D:48:4B:C7 splash vt.handoff=7
[....]
Linux agpgart interface v0.103
forcedeth: Reverse Engineered nForce ethernet driver. Version 0.64.
ACPI: PCI Interrupt Link [APCH] enabled at IRQ 23
forcedeth 0000:00:0a.0: PCI INT A -> Link[APCH] -> GSI 23 (level, low) -> IRQ 23
forcedeth 0000:00:0a.0: setting latency timer to 64
forcedeth 0000:00:0a.0: ifname eth0, PHY OUI 0x732 @ 1, addr 00:1f:d0:53:49:ae
forcedeth 0000:00:0a.0: highdma pwrctl mgmt gbit lnktim msi desc-v3
netconsole: local port 4444
netconsole: local IP 192.168.0.9
netconsole: interface 'eth0'
netconsole: remote port 6666
netconsole: remote IP 192.168.0.10
netconsole: remote ethernet address 6c:62:6d:48:4b:c7
netconsole: device eth0 not up yet, forcing it
forcedeth 0000:00:0a.0: irq 40 for MSI/MSI-X
forcedeth 0000:00:0a.0: eth0: no link during initialization
forcedeth 0000:00:0a.0: eth0: link up
                                       <============== kernel does hang here
=================================
[ INFO: inconsistent lock state ]
2.6.39-rc3-00102-g8f42ced #137
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
swapper/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(&napi->poll_lock)->rlock){+.?...}, at: [<ffffffff812385a5>] netpoll_poll_dev+0x9e/0x4f1
{IN-SOFTIRQ-W} state was registered at:
  [<ffffffff81066607>] __lock_acquire+0x319/0xd62
  [<ffffffff8106750e>] lock_acquire+0xd7/0x102
  [<ffffffff81295a81>] _raw_spin_lock+0x36/0x45
  [<ffffffff81227a13>] net_rx_action+0x8d/0x1e6
  [<ffffffff810409b9>] __do_softirq+0xe0/0x1be
  [<ffffffff81297e1c>] call_softirq+0x1c/0x30
  [<ffffffff810042de>] do_softirq+0x46/0x9c
  [<ffffffff81040d0a>] irq_exit+0x4d/0xa2
  [<ffffffff81003f92>] do_IRQ+0x88/0x9f
  [<ffffffff81296613>] ret_from_intr+0x0/0x1a
  [<ffffffff810094b2>] default_idle+0x40/0x74
  [<ffffffff810095ca>] c1e_idle+0xe4/0xe6
  [<ffffffff81001270>] cpu_idle+0x5e/0x94
  [<ffffffff8129018d>] start_secondary+0x18a/0x18f
irq event stamp: 1010077
hardirqs last  enabled at (1010077): [<ffffffff81296086>] _raw_spin_unlock_irqrestore+0x40/0x75
hardirqs last disabled at (1010076): [<ffffffff81295b54>] _raw_spin_lock_irqsave+0x1a/0x57
softirqs last  enabled at (1008710): [<ffffffff8122cf91>] __dev_mc_add+0x5d/0x6b
softirqs last disabled at (1008704): [<ffffffff81295bdf>] _raw_spin_lock_bh+0x11/0x4a

other info that might help us debug this:
1 lock held by swapper/1:
 #0:  (target_list_lock){+.+...}, at: [<ffffffff811e258a>] write_msg+0x34/0xd2

stack backtrace:
Pid: 1, comm: swapper Not tainted 2.6.39-rc3-00102-g8f42ced #137
Call Trace:
 [<ffffffff81066104>] valid_state+0x17e/0x190
 [<ffffffff81065a4c>] ? check_usage_forwards+0x85/0x85
 [<ffffffff810661f8>] mark_lock+0xe2/0x1d8
 [<ffffffff81066685>] __lock_acquire+0x397/0xd62
 [<ffffffff8115e0b5>] ? pci_bus_write_config_dword+0x5e/0x6a
 [<ffffffff81296086>] ? _raw_spin_unlock_irqrestore+0x40/0x75
 [<ffffffff8106792a>] ? trace_hardirqs_on_caller+0x10b/0x12f
 [<ffffffff812385a5>] ? netpoll_poll_dev+0x9e/0x4f1
 [<ffffffff8106750e>] lock_acquire+0xd7/0x102
 [<ffffffff812385a5>] ? netpoll_poll_dev+0x9e/0x4f1
 [<ffffffff81295fbf>] _raw_spin_trylock+0x4a/0x7f
 [<ffffffff812385a5>] ? netpoll_poll_dev+0x9e/0x4f1
 [<ffffffff812385a5>] netpoll_poll_dev+0x9e/0x4f1
 [<ffffffff81238b95>] netpoll_send_skb_on_dev+0x112/0x216
 [<ffffffff81238e84>] netpoll_send_udp+0x1eb/0x1fa
 [<ffffffff811e258a>] ? write_msg+0x34/0xd2
 [<ffffffff811e25e2>] write_msg+0x8c/0xd2
 [<ffffffff8103b215>] __call_console_drivers+0x72/0x84
 [<ffffffff8103b280>] _call_console_drivers+0x59/0x5e
 [<ffffffff8103babb>] console_unlock+0xf3/0x1ac
 [<ffffffff8103c30f>] register_console+0x228/0x2b3
 [<ffffffff816a71ec>] init_netconsole+0x15d/0x1e4
 [<ffffffff8116379b>] ? __pci_register_driver+0x9b/0xcd
 [<ffffffff816a708f>] ? option_setup+0x1f/0x1f
 [<ffffffff8100030f>] do_one_initcall+0x7a/0x137
 [<ffffffff81685be1>] kernel_init+0xb0/0x130
 [<ffffffff811553ce>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81297d24>] kernel_thread_helper+0x4/0x10
 [<ffffffff812966d8>] ? retint_restore_args+0x13/0x13
 [<ffffffff81685b31>] ? start_kernel+0x379/0x379
 [<ffffffff81297d20>] ? gs_change+0x13/0x13
------------[ cut here ]------------
WARNING: at net/core/netpoll.c:346 netpoll_send_skb_on_dev+0x15f/0x216()
Hardware name: M68SM-S2L
netpoll_send_skb(): eth0 enabled interrupts in poll (nv_start_xmit_optimized+0x0/0x487)
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.39-rc3-00102-g8f42ced #137
Call Trace:
 [<ffffffff8103ae08>] warn_slowpath_common+0x80/0x98
 [<ffffffff8103aeb4>] warn_slowpath_fmt+0x41/0x43
 [<ffffffff811e1fa9>] ? nv_start_xmit+0x3df/0x3df
 [<ffffffff81238be2>] netpoll_send_skb_on_dev+0x15f/0x216
 [<ffffffff81238e84>] netpoll_send_udp+0x1eb/0x1fa
 [<ffffffff811e258a>] ? write_msg+0x34/0xd2
 [<ffffffff811e25e2>] write_msg+0x8c/0xd2
 [<ffffffff8103b215>] __call_console_drivers+0x72/0x84
 [<ffffffff8103b280>] _call_console_drivers+0x59/0x5e
 [<ffffffff8103babb>] console_unlock+0xf3/0x1ac
 [<ffffffff8103c30f>] register_console+0x228/0x2b3
 [<ffffffff816a71ec>] init_netconsole+0x15d/0x1e4
 [<ffffffff8116379b>] ? __pci_register_driver+0x9b/0xcd
 [<ffffffff816a708f>] ? option_setup+0x1f/0x1f
 [<ffffffff8100030f>] do_one_initcall+0x7a/0x137
 [<ffffffff81685be1>] kernel_init+0xb0/0x130
 [<ffffffff811553ce>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81297d24>] kernel_thread_helper+0x4/0x10
 [<ffffffff812966d8>] ? retint_restore_args+0x13/0x13
 [<ffffffff81685b31>] ? start_kernel+0x379/0x379
 [<ffffffff81297d20>] ? gs_change+0x13/0x13
---[ end trace 2b8ff4190d73b338 ]---
console [netcon0] enabled
netconsole: network logging started


~~
laurent

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] netconsole broken by scheduler updates
  2011-05-26  9:21 [BUG] netconsole broken by scheduler updates Laurent Riffard
@ 2011-05-26  9:48 ` Peter Zijlstra
  2011-05-26 10:35   ` Laurent Riffard
  2011-05-26 20:18   ` David Miller
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Zijlstra @ 2011-05-26  9:48 UTC (permalink / raw)
  To: Laurent Riffard; +Cc: netdev

On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote:
> Hi,
> 
> Recently, netconsole was broken by some scheduler updates. Kernel hangs 
> on boot near the network card initialization. I noticed that it does 
> hang just where a "inconsistent lock state" message normally appears.
> 
> I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the 
> first bad commit.
> commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date:   Tue Apr 5 17:23:54 2011 +0200
> 
>     sched: Remove rq->lock from the first half of ttwu()
> 
> 
> Before this commit, kernel was booting succesfully despite the 
> "inconsistent lock state" message. After this commit, the kernel does 
> hang on boot, I have to push the reset button.

Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat
after some 10 seconds? It would be interesting to see where its stuck.

Also, are you very _very_ sure that lockdep message is a false positive?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] netconsole broken by scheduler updates
  2011-05-26  9:48 ` Peter Zijlstra
@ 2011-05-26 10:35   ` Laurent Riffard
  2011-05-26 16:12     ` Laurent Riffard
  2011-05-26 20:18   ` David Miller
  1 sibling, 1 reply; 7+ messages in thread
From: Laurent Riffard @ 2011-05-26 10:35 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Laurent Riffard, netdev



On Thu, 26 May 2011, Peter Zijlstra wrote:
> On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote:
>> Hi,
>>
>> Recently, netconsole was broken by some scheduler updates. Kernel hangs
>> on boot near the network card initialization. I noticed that it does
>> hang just where a "inconsistent lock state" message normally appears.
>>
>> I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the
>> first bad commit.
>> commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f
>> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Date:   Tue Apr 5 17:23:54 2011 +0200
>>
>>     sched: Remove rq->lock from the first half of ttwu()
>>
>>
>> Before this commit, kernel was booting succesfully despite the
>> "inconsistent lock state" message. After this commit, the kernel does
>> hang on boot, I have to push the reset button.
>
> Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat
> after some 10 seconds? It would be interesting to see where its stuck.

CONFIG_HARDLOCKUP_DETECTOR is not set for now. I'll give it a try.

> Also, are you very _very_ sure that lockdep message is a false positive?

I can't answer this question. I just know that before commit e4a52bcb9, 
the kernel was able to boot with netconsole despite this message.

Maybe some netconsole developper knows.
~~
laurent

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] netconsole broken by scheduler updates
  2011-05-26 10:35   ` Laurent Riffard
@ 2011-05-26 16:12     ` Laurent Riffard
  2011-05-26 16:17       ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Laurent Riffard @ 2011-05-26 16:12 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: netdev

Le 26/05/2011 12:35, Laurent Riffard a écrit :
> On Thu, 26 May 2011, Peter Zijlstra wrote:
>> On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote:
>>> Hi,
>>>
>>> Recently, netconsole was broken by some scheduler updates. Kernel hangs
>>> on boot near the network card initialization. I noticed that it does
>>> hang just where a "inconsistent lock state" message normally appears.
>>>
>>> I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the
>>> first bad commit.
>>> commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f
>>> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>> Date:   Tue Apr 5 17:23:54 2011 +0200
>>>
>>>     sched: Remove rq->lock from the first half of ttwu()
>>>
>>>
>>> Before this commit, kernel was booting succesfully despite the
>>> "inconsistent lock state" message. After this commit, the kernel does
>>> hang on boot, I have to push the reset button.
>>
>> Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat
>> after some 10 seconds? It would be interesting to see where its stuck.
> 
> CONFIG_HARDLOCKUP_DETECTOR is not set for now. I'll give it a try.

I activated CONFIG_HARDLOCKUP_DETECTOR, but failed to obtain more 
information. During the boot, the kernel hangs just after bringing up 
the NIC. I waited 2 minutes, without getting any more messages.

~~
laurent


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] netconsole broken by scheduler updates
  2011-05-26 16:12     ` Laurent Riffard
@ 2011-05-26 16:17       ` Peter Zijlstra
  2011-06-03 21:15         ` Laurent Riffard
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2011-05-26 16:17 UTC (permalink / raw)
  To: Laurent Riffard; +Cc: netdev

On Thu, 2011-05-26 at 18:12 +0200, Laurent Riffard wrote:
> Le 26/05/2011 12:35, Laurent Riffard a écrit :
> > On Thu, 26 May 2011, Peter Zijlstra wrote:
> >> On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote:
> >>> Hi,
> >>>
> >>> Recently, netconsole was broken by some scheduler updates. Kernel hangs
> >>> on boot near the network card initialization. I noticed that it does
> >>> hang just where a "inconsistent lock state" message normally appears.
> >>>
> >>> I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the
> >>> first bad commit.
> >>> commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f
> >>> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> >>> Date:   Tue Apr 5 17:23:54 2011 +0200
> >>>
> >>>     sched: Remove rq->lock from the first half of ttwu()
> >>>
> >>>
> >>> Before this commit, kernel was booting succesfully despite the
> >>> "inconsistent lock state" message. After this commit, the kernel does
> >>> hang on boot, I have to push the reset button.
> >>
> >> Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat
> >> after some 10 seconds? It would be interesting to see where its stuck.
> > 
> > CONFIG_HARDLOCKUP_DETECTOR is not set for now. I'll give it a try.
> 
> I activated CONFIG_HARDLOCKUP_DETECTOR, but failed to obtain more 
> information. During the boot, the kernel hangs just after bringing up 
> the NIC. I waited 2 minutes, without getting any more messages.

OK, thanks for trying, I shall try and reproduce locally.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] netconsole broken by scheduler updates
  2011-05-26  9:48 ` Peter Zijlstra
  2011-05-26 10:35   ` Laurent Riffard
@ 2011-05-26 20:18   ` David Miller
  1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2011-05-26 20:18 UTC (permalink / raw)
  To: a.p.zijlstra; +Cc: laurent.riffard, netdev

From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Thu, 26 May 2011 11:48:28 +0200

> Also, are you very _very_ sure that lockdep message is a false positive?

I suspect these warnings are entirely specific to using netconsole
with the forcedeth driver.

There is some really scary stuff in there, to say the least.

Just have a look at nv_do_nic_poll().  That's the first time I've
seen a networking driver use the {disable,enable}_irq_lockdep()
interfaces.

It's also using netif_tx_unlock_bh() in the "recovery" case which
in netpoll will cause a local_bh_enable() with cpu interrupts
disabled.  Which is not allowed.

Much of the forcedeth driver's locking seems suspect to me.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] netconsole broken by scheduler updates
  2011-05-26 16:17       ` Peter Zijlstra
@ 2011-06-03 21:15         ` Laurent Riffard
  0 siblings, 0 replies; 7+ messages in thread
From: Laurent Riffard @ 2011-06-03 21:15 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: netdev

Le 26/05/2011 18:17, Peter Zijlstra a écrit :
> On Thu, 2011-05-26 at 18:12 +0200, Laurent Riffard wrote:
>> Le 26/05/2011 12:35, Laurent Riffard a écrit :
>>> On Thu, 26 May 2011, Peter Zijlstra wrote:
>>>> On Thu, 2011-05-26 at 11:21 +0200, Laurent Riffard wrote:
>>>>> Hi,
>>>>>
>>>>> Recently, netconsole was broken by some scheduler updates. Kernel hangs
>>>>> on boot near the network card initialization. I noticed that it does
>>>>> hang just where a "inconsistent lock state" message normally appears.
>>>>>
>>>>> I did a bisection : e4a52bcb9a18142d79e231b6733cabdbf2e67c1f is the
>>>>> first bad commit.
>>>>> commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f
>>>>> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>>>> Date:   Tue Apr 5 17:23:54 2011 +0200
>>>>>
>>>>>     sched: Remove rq->lock from the first half of ttwu()
>>>>>
>>>>>
>>>>> Before this commit, kernel was booting succesfully despite the
>>>>> "inconsistent lock state" message. After this commit, the kernel does
>>>>> hang on boot, I have to push the reset button.
>>>>
>>>> Do you have CONFIG_HARDLOCKUP_DETECTOR=y, and do you get an NMI splat
>>>> after some 10 seconds? It would be interesting to see where its stuck.
>>>
>>> CONFIG_HARDLOCKUP_DETECTOR is not set for now. I'll give it a try.
>>
>> I activated CONFIG_HARDLOCKUP_DETECTOR, but failed to obtain more 
>> information. During the boot, the kernel hangs just after bringing up 
>> the NIC. I waited 2 minutes, without getting any more messages.
> 
> OK, thanks for trying, I shall try and reproduce locally.
> 

Hello,

I tried some newer kernel. It boots successfully now, but 
I've got interesting (?) logs :

====================================================
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.0.0-rc1-00049-g1fa7b6a (laurent@calimero) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) ) #148 SMP PREEMPT Thu Jun 2 18:56:41 CEST 2011
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.0.0-rc1-00049-g1fa7b6a root=/dev/mapper/vglinux1-lv_ubuntu_64bits ro debug netconsole=4444@192.168.0.9/eth0,6666@192.168.0.10/6C:62:6D:48:4B:C7 splash vt.handoff=7
[...]
[    0.685768] Console: switching to colour frame buffer device 160x64
[    0.707611] fb0: VESA VGA frame buffer device
[    0.714029] Linux agpgart interface v0.103
[    0.714206] forcedeth: Reverse Engineered nForce ethernet driver. Version 0.64.
[    0.714837] ACPI: PCI Interrupt Link [APCH] enabled at IRQ 23
[    0.715054] forcedeth 0000:00:0a.0: PCI INT A -> Link[APCH] -> GSI 23 (level, low) -> IRQ 23
[    0.715283] forcedeth 0000:00:0a.0: setting latency timer to 64
[    1.241552] forcedeth 0000:00:0a.0: ifname eth0, PHY OUI 0x732 @ 1, addr 00:1f:d0:53:49:ae
[    1.241784] forcedeth 0000:00:0a.0: highdma pwrctl mgmt gbit lnktim msi desc-v3
[    1.242101] netconsole: local port 4444
[    1.242207] netconsole: local IP 192.168.0.9
[    1.242320] netconsole: interface 'eth0'
[    1.242426] netconsole: remote port 6666
[    1.242531] netconsole: remote IP 192.168.0.10
[    1.242650] netconsole: remote ethernet address 6c:62:6d:48:4b:c7
[    1.242822] netconsole: device eth0 not up yet, forcing it
[    1.244612] forcedeth 0000:00:0a.0: irq 40 for MSI/MSI-X
[    1.245057] forcedeth 0000:00:0a.0: eth0: no link during initialization
[    2.374451] forcedeth 0000:00:0a.0: eth0: link up
[    2.381178]
[    2.381179] =================================
[    2.381182] [ INFO: inconsistent lock state ]
[    2.381184] 3.0.0-rc1-00049-g1fa7b6a #148
[    2.381186] ---------------------------------
[    2.381188] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[    2.381191] swapper/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
[    2.381193]  (&(&napi->poll_lock)->rlock){+.?...}, at: [<ffffffff8123cd08>] netpoll_poll_dev+0x9d/0x4ed
[    2.381202] {IN-SOFTIRQ-W} state was registered at:
[    2.381204]   [<ffffffff81068020>] __lock_acquire+0x319/0xdd8
[    2.381208]   [<ffffffff81068f9f>] lock_acquire+0xd8/0x103
[    2.381211]   [<ffffffff8129c019>] _raw_spin_lock+0x36/0x45
[    2.381215]   [<ffffffff8122b886>] net_rx_action+0x8d/0x1e7
[    2.381219]   [<ffffffff81040f73>] __do_softirq+0xe1/0x1c0
[    2.381223]   [<ffffffff8129e39c>] call_softirq+0x1c/0x30
[    2.381226]   [<ffffffff81004296>] do_softirq+0x46/0x9c
[    2.381229]   [<ffffffff810412c5>] irq_exit+0x4d/0xa2
[    2.381232]   [<ffffffff81003f3a>] do_IRQ+0x88/0x9f
[    2.381235]   [<ffffffff8129cb93>] ret_from_intr+0x0/0x1a
[    2.381239]   [<ffffffff8100972c>] default_idle+0x40/0x74
[    2.381242]   [<ffffffff81009844>] amd_e400_idle+0xe4/0xe6
[    2.381246]   [<ffffffff81001270>] cpu_idle+0x5e/0x94
[    2.381249]   [<ffffffff81296383>] start_secondary+0x18a/0x18f
[    2.381252] irq event stamp: 1000341
[    2.381254] hardirqs last  enabled at (1000341): [<ffffffff8129cc58>] restore_args+0x0/0x30
[    2.381257] hardirqs last disabled at (1000339): [<ffffffff8104100a>] __do_softirq+0x178/0x1c0
[    2.381261] softirqs last  enabled at (1000340): [<ffffffff8104103c>] __do_softirq+0x1aa/0x1c0
[    2.381264] softirqs last disabled at (1000251): [<ffffffff8129e39c>] call_softirq+0x1c/0x30
[    2.381267]
[    2.381268] other info that might help us debug this:
[    2.381269]  Possible unsafe locking scenario:
[    2.381270]
[    2.381271]        CPU0
[    2.381272]        ----
[    2.381273]   lock(&(&napi->poll_lock)->rlock);
[    2.381275]   <Interrupt>
[    2.381276]     lock(&(&napi->poll_lock)->rlock);
[    2.381279]
[    2.381279]  *** DEADLOCK ***
[    2.381280]
[    2.381282] 1 lock held by swapper/1:
[    2.381283]  #0:  (target_list_lock){+.+...}, at: [<ffffffff811e5877>] write_msg+0x34/0xcd
[    2.381288]
[    2.381289] stack backtrace:
[    2.381291] Pid: 1, comm: swapper Not tainted 3.0.0-rc1-00049-g1fa7b6a #148
[    2.381293] Call Trace:
[    2.381297]  [<ffffffff81067b1d>] valid_state+0x217/0x229
[    2.381300]  [<ffffffff81067396>] ? check_usage_forwards+0x85/0x85
[    2.381303]  [<ffffffff81067c11>] mark_lock+0xe2/0x1d8
[    2.381306]  [<ffffffff810693bb>] ? trace_hardirqs_on_caller+0x10b/0x12f
[    2.381309]  [<ffffffff8106809e>] __lock_acquire+0x397/0xdd8
[    2.381314]  [<ffffffff8106e882>] ? arch_local_irq_restore+0x6/0xd
[    2.381317]  [<ffffffff8123cd08>] ? netpoll_poll_dev+0x9d/0x4ed
[    2.381320]  [<ffffffff81068f9f>] lock_acquire+0xd8/0x103
[    2.381324]  [<ffffffff8123cd08>] ? netpoll_poll_dev+0x9d/0x4ed
[    2.381327]  [<ffffffff8129c557>] _raw_spin_trylock+0x4a/0x7f
[    2.381331]  [<ffffffff8123cd08>] ? netpoll_poll_dev+0x9d/0x4ed
[    2.381334]  [<ffffffff8123cd08>] netpoll_poll_dev+0x9d/0x4ed
[    2.381338]  [<ffffffff8123d2ee>] netpoll_send_skb_on_dev+0x10b/0x20f
[    2.381342]  [<ffffffff8123d5dd>] netpoll_send_udp+0x1eb/0x1fa
[    2.381345]  [<ffffffff811e5877>] ? write_msg+0x34/0xcd
[    2.381348]  [<ffffffff811e58cf>] write_msg+0x8c/0xcd
[    2.381352]  [<ffffffff8103b78d>] __call_console_drivers+0x72/0x84
[    2.381355]  [<ffffffff8103b7f8>] _call_console_drivers+0x59/0x5e
[    2.381359]  [<ffffffff8103c033>] console_unlock+0xf3/0x1ac
[    2.381362]  [<ffffffff8103c891>] register_console+0x228/0x2b3
[    2.381366]  [<ffffffff816a76b0>] init_netconsole+0x15d/0x1e4
[    2.381371]  [<ffffffff81166a64>] ? __pci_register_driver+0x9b/0xcd
[    2.381374]  [<ffffffff816a7553>] ? option_setup+0x1f/0x1f
[    2.381377]  [<ffffffff8100030f>] do_one_initcall+0x7a/0x137
[    2.381381]  [<ffffffff81685bcd>] kernel_init+0xb5/0x135
[    2.381386]  [<ffffffff81157d1e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[    2.381389]  [<ffffffff8129e2a4>] kernel_thread_helper+0x4/0x10
[    2.381392]  [<ffffffff8129cc58>] ? retint_restore_args+0x13/0x13
[    2.381396]  [<ffffffff81685b18>] ? start_kernel+0x360/0x360
[    2.381399]  [<ffffffff8129e2a0>] ? gs_change+0x13/0x13
[    2.381464] ------------[ cut here ]------------
[    2.381468] WARNING: at net/core/netpoll.c:346 netpoll_send_skb_on_dev+0x158/0x20f()
[    2.381470] Hardware name: M68SM-S2L
[    2.381473] netpoll_send_skb(): eth0 enabled interrupts in poll (nv_start_xmit_optimized+0x0/0x484)
[    2.381474] Modules linked in:
[    2.381477] Pid: 1, comm: swapper Not tainted 3.0.0-rc1-00049-g1fa7b6a #148
[    2.381479] Call Trace:
[    2.381482]  [<ffffffff8103b381>] warn_slowpath_common+0x80/0x98
[    2.381485]  [<ffffffff8103b42d>] warn_slowpath_fmt+0x41/0x43
[    2.381489]  [<ffffffff811e526c>] ? nv_start_xmit+0x3df/0x3df
[    2.381492]  [<ffffffff8123d33b>] netpoll_send_skb_on_dev+0x158/0x20f
[    2.381496]  [<ffffffff8123d5dd>] netpoll_send_udp+0x1eb/0x1fa
[    2.381499]  [<ffffffff811e5877>] ? write_msg+0x34/0xcd
[    2.381502]  [<ffffffff811e58cf>] write_msg+0x8c/0xcd
[    2.381505]  [<ffffffff8103b78d>] __call_console_drivers+0x72/0x84
[    2.381508]  [<ffffffff8103b7f8>] _call_console_drivers+0x59/0x5e
[    2.381512]  [<ffffffff8103c033>] console_unlock+0xf3/0x1ac
[    2.381515]  [<ffffffff8103c891>] register_console+0x228/0x2b3
[    2.381518]  [<ffffffff816a76b0>] init_netconsole+0x15d/0x1e4
[    2.381522]  [<ffffffff81166a64>] ? __pci_register_driver+0x9b/0xcd
[    2.381525]  [<ffffffff816a7553>] ? option_setup+0x1f/0x1f
[    2.381528]  [<ffffffff8100030f>] do_one_initcall+0x7a/0x137
[    2.381531]  [<ffffffff81685bcd>] kernel_init+0xb5/0x135
[    2.381535]  [<ffffffff81157d1e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[    2.381538]  [<ffffffff8129e2a4>] kernel_thread_helper+0x4/0x10
[    2.381541]  [<ffffffff8129cc58>] ? retint_restore_args+0x13/0x13
[    2.381545]  [<ffffffff81685b18>] ? start_kernel+0x360/0x360
[    2.381548]  [<ffffffff8129e2a0>] ? gs_change+0x13/0x13
[    2.381555] ---[ end trace f00f2127e552114f ]---
[    2.383979] console [netcon0] enabled
[    2.384117] netconsole: network logging started
[    2.384492] i8042: PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
[    2.384679] i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
[    2.385208] serio: i8042 KBD port at 0x60,0x64 irq 1
[...]
==================================================================


hope this help
~~
laurent


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-06-03 21:15 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-05-26  9:21 [BUG] netconsole broken by scheduler updates Laurent Riffard
2011-05-26  9:48 ` Peter Zijlstra
2011-05-26 10:35   ` Laurent Riffard
2011-05-26 16:12     ` Laurent Riffard
2011-05-26 16:17       ` Peter Zijlstra
2011-06-03 21:15         ` Laurent Riffard
2011-05-26 20:18   ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).