From: Jeff Garzik <jgarzik@pobox.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Michael Buesch <mb@bu3sch.de>,
	Johannes Berg <johannes@sipsolutions.net>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Linux Kernel list <linux-kernel@vger.kernel.org>,
	David Ellingsworth <david@identd.dyndns.org>,
	linux-wireless <linux-wireless@vger.kernel.org>,
	netdev@vger.kernel.org
Subject: Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
Date: Fri, 27 Jun 2008 01:33:08 -0400
Message-ID: <48647B94.5000006@pobox.com>
In-Reply-To: <20080623084123.GA32688@elte.hu>

Ingo Molnar wrote:
> * Michael Buesch <mb@bu3sch.de> wrote:
> 
>> On Friday 20 June 2008 18:01:09 Michael Buesch wrote:
>>> On Friday 20 June 2008 17:55:41 Ingo Molnar wrote:
>>>> * Michael Buesch <mb@bu3sch.de> wrote:
>>>>
>>>>> On Friday 20 June 2008 17:27:48 Ingo Molnar wrote:
>>>>>>  [<c012b361>] local_bh_enable_ip+0xd1/0xe0
>>>>>>  [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
>>>>>>  [<c0471b92>] vortex_timer+0xe2/0x3e0
>>>>>> real bug or false positive?
>>>>> Well, a timer runs with IRQs disabled, no? So this would be a bug.
>>>> indeed - agreed :) [no time for me to fix it, but can test any rfc patch.]
>>> A quick workaround always is to convert the lock into an _irqsafe lock.
>>> Although it introduces higher overhead (interrupt-wise), it prevents the bug.
>>> A real fix would require understanding the locking in the driver, which I
>>> don't, as I have never looked at the driver. :)
>>
>> However, looking at the driver I think the fix actually is trivial:
>>
>> Index: wireless-testing/drivers/net/3c59x.c
>> ===================================================================
>> --- wireless-testing.orig/drivers/net/3c59x.c	2008-05-16 00:26:29.000000000 +0200
>> +++ wireless-testing/drivers/net/3c59x.c	2008-06-20 18:16:55.000000000 +0200
>> @@ -1768,9 +1768,10 @@ vortex_timer(unsigned long data)
>>  	case XCVR_MII: case XCVR_NWAY:
>>  		{
>>  			ok = 1;
>> -			spin_lock_bh(&vp->lock);
>> +			/* Interrupts are already disabled */
>> +			spin_lock(&vp->lock);
>>  			vortex_check_media(dev, 0);
>> -			spin_unlock_bh(&vp->lock);
>> +			spin_unlock(&vp->lock);
>>  		}
>>  		break;
>>  	  default:					/* Other media types handled by Tx timeouts. */
>>
>>
>> vp->lock is also taken in hardware IRQ context, so we _have_ to always 
>> use irqsafe locking. As we run in a timer with IRQs disabled, we can 
>> simply use spin_lock.
> 
> thanks Michael! I have applied your fix to the tip/out-of-tree 
> local/non-propagated fixlets branch for more testing. It takes up to 4 
> hours to trigger this warning on that testbox, so i should know later
> today whether it fixes the bug.
> 
> Cc:-ed Jeff, he might want to pick up this fix, find the tidied up 
> commit below. (3c59x is still present in various older laptops, so 
> -stable candidate too i guess.)
> 
> 	Ingo
> 
> ---------------------->
> commit 24a5454d9f7863b00143760197f0ec29c8234289
> Author: Michael Buesch <mb@bu3sch.de>
> Date:   Fri Jun 20 18:18:34 2008 +0200
> 
>     net, vortex: fix lockup
>     
>     Ingo Molnar reported:
>     
>     -tip testing found that Johannes Berg's "softirq: remove irqs_disabled
>     warning from local_bh_enable" enhancement to lockdep triggers a new
>     warning on an old testbox that uses 3c59x vortex and netlogging:
>     
>     ----->
>     calling  vortex_init+0x0/0xb0
>     PCI: Found IRQ 10 for device 0000:00:0b.0
>     PCI: Sharing IRQ 10 with 0000:00:0a.0
>     PCI: Sharing IRQ 10 with 0000:00:0b.1
>     3c59x: Donald Becker and others.
>     0000:00:0b.0: 3Com PCI 3c556 Laptop Tornado at e0800400.
>     PCI: Enabling bus mastering for device 0000:00:0b.0
>     initcall vortex_init+0x0/0xb0 returned 0 after 47 msecs
>     ...
>     calling  init_netconsole+0x0/0x1b0
>     netconsole: local port 4444
>     netconsole: local IP 10.0.1.9
>     netconsole: interface eth0
>     netconsole: remote port 4444
>     netconsole: remote IP 10.0.1.16
>     netconsole: remote ethernet address 00:19:xx:xx:xx:xx
>     netconsole: device eth0 not up yet, forcing it
>     eth0:  setting half-duplex.
>     eth0:  setting full-duplex.
>     ------------[ cut here ]------------
>     WARNING: at kernel/softirq.c:137 local_bh_enable_ip+0xd1/0xe0()
>     Pid: 1, comm: swapper Not tainted 2.6.26-rc6-tip #2091
>      [<c0125ecf>] warn_on_slowpath+0x4f/0x70
>      [<c0126834>] ? release_console_sem+0x1b4/0x1d0
>      [<c0126d00>] ? vprintk+0x2a0/0x450
>      [<c012fde5>] ? __mod_timer+0xa5/0xc0
>      [<c046f7fd>] ? mdio_sync+0x3d/0x50
>      [<c0160ef6>] ? marker_probe_cb+0x46/0xa0
>      [<c0126ed7>] ? printk+0x27/0x50
>      [<c046f4c3>] ? vortex_set_duplex+0x43/0xc0
>      [<c046f521>] ? vortex_set_duplex+0xa1/0xc0
>      [<c0471b92>] ? vortex_timer+0xe2/0x3e0
>      [<c012b361>] local_bh_enable_ip+0xd1/0xe0
>      [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
>      [<c0471b92>] vortex_timer+0xe2/0x3e0
>      [<c014743b>] ? trace_hardirqs_on+0xb/0x10
>      [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
>      [<c012f8b2>] run_timer_softirq+0x162/0x1c0
>      [<c0471ab0>] ? vortex_timer+0x0/0x3e0
>      [<c012b361>] local_bh_enable_ip+0xd1/0xe0
>      [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
>      [<c0471b92>] vortex_timer+0xe2/0x3e0
>      [<c014743b>] ? trace_hardirqs_on+0xb/0x10
>      [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
>      [<c012f8b2>] run_timer_softirq+0x162/0x1c0
>      [<c0471ab0>] ? vortex_timer+0x0/0x3e0
>      [<c0471ab0>] ? vortex_timer+0x0/0x3e0
>      [<c012b60a>] __do_softirq+0x9a/0x160
>      [<c012b570>] ? __do_softirq+0x0/0x160
>      [<c0106775>] call_on_stack+0x15/0x30
>      [<c012b4f5>] ? irq_exit+0x55/0x60
>      [<c0106e85>] ? do_IRQ+0x85/0xd0
>      [<c0147391>] ? trace_hardirqs_on_caller+0xc1/0x160
>      [<c0104888>] ? common_interrupt+0x28/0x30
>      [<c08d8ac8>] ? mutex_unlock+0x8/0x10
>      [<c08d8180>] ? _cond_resched+0x10/0x30
>      [<c07a3be7>] ? netpoll_setup+0x117/0x390
>      [<c0cbfcfe>] ? init_netconsole+0x14e/0x1b0
>      [<c013d539>] ? ktime_get+0x19/0x40
>      [<c0c9bab2>] ? kernel_init+0x1b2/0x2c0
>      [<c0cbfbb0>] ? init_netconsole+0x0/0x1b0
>      [<c0396aa4>] ? trace_hardirqs_on_thunk+0xc/0x10
>      [<c0103f12>] ? restore_nocheck_notrace+0x0/0xe
>      [<c0c9b900>] ? kernel_init+0x0/0x2c0
>      [<c0c9b900>] ? kernel_init+0x0/0x2c0
>      [<c0104aa7>] ? kernel_thread_helper+0x7/0x10
>      =======================
>     ---[ end trace 37f9c502aff112e0 ]---
>     console [netcon0] enabled
>     netconsole: network logging started
>     initcall init_netconsole+0x0/0x1b0 returned 0 after 2914 msecs
>     
>     looking at the driver I think the bug is real and the fix actually
>     is trivial.
>     
>     vp->lock is also taken in hardware IRQ context, so we _have_ to always
>     use irqsafe locking. As we run in a timer with IRQs disabled,
>     we can simply use spin_lock.
>     
>     Cc: <stable@kernel.org>
>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
> index 2edda8c..aabad8c 100644
> --- a/drivers/net/3c59x.c
> +++ b/drivers/net/3c59x.c
> @@ -1768,9 +1768,10 @@ vortex_timer(unsigned long data)
>  	case XCVR_MII: case XCVR_NWAY:
>  		{
>  			ok = 1;
> -			spin_lock_bh(&vp->lock);
> +			/* Interrupts are already disabled */
> +			spin_lock(&vp->lock);
>  			vortex_check_media(dev, 0);
> -			spin_unlock_bh(&vp->lock);
> +			spin_unlock(&vp->lock);

applied
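
For readers following the locking argument in the quoted patch, here is a minimal userspace sketch in C. It is not kernel code: every model_* name and both timer_body_* functions are hypothetical stand-ins invented for illustration. It assumes, as the thread does, that the timer body runs with hardware interrupts disabled, and that the warning fires when bottom halves are re-enabled in that state. Under those assumptions it only shows why swapping the _bh lock variants for plain spin_lock/spin_unlock avoids the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the execution context. Hypothetical, not a kernel structure. */
struct model_ctx {
	bool irqs_disabled;	/* stands in for irqs_disabled() */
	int bh_disable_depth;	/* stands in for the softirq-disable nesting count */
	bool lock_held;		/* stands in for vp->lock */
	int warnings;		/* number of modelled warnings raised */
};

static void model_spin_lock(struct model_ctx *c)
{
	assert(!c->lock_held);	/* taking the lock twice would deadlock */
	c->lock_held = true;
}

static void model_spin_unlock(struct model_ctx *c)
{
	assert(c->lock_held);
	c->lock_held = false;
}

static void model_local_bh_disable(struct model_ctx *c)
{
	c->bh_disable_depth++;
}

static void model_local_bh_enable(struct model_ctx *c)
{
	/* Assumed check: re-enabling bottom halves with IRQs off is flagged. */
	if (c->irqs_disabled)
		c->warnings++;
	c->bh_disable_depth--;
}

static void model_spin_lock_bh(struct model_ctx *c)
{
	model_local_bh_disable(c);
	model_spin_lock(c);
}

static void model_spin_unlock_bh(struct model_ctx *c)
{
	model_spin_unlock(c);
	model_local_bh_enable(c);
}

/* Shape of the locked region before the patch: _bh variants. */
static int timer_body_before(struct model_ctx *c)
{
	model_spin_lock_bh(c);
	/* vortex_check_media(dev, 0) would run here */
	model_spin_unlock_bh(c);
	return c->warnings;
}

/* Shape of the locked region after the patch: plain lock and unlock. */
static int timer_body_after(struct model_ctx *c)
{
	model_spin_lock(c);
	/* vortex_check_media(dev, 0) would run here */
	model_spin_unlock(c);
	return c->warnings;
}
```

Calling timer_body_before() on a context with irqs_disabled set raises one modelled warning, while timer_body_after() on the same kind of context raises none. Whether the real timer callback always runs with interrupts disabled is taken from the thread, not verified here.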


