public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
@ 2008-06-17 21:57 Johannes Berg
  2008-06-17 23:08 ` David Miller
  2008-06-17 23:55 ` Linus Torvalds
  0 siblings, 2 replies; 18+ messages in thread
From: Johannes Berg @ 2008-06-17 21:57 UTC (permalink / raw)
  To: Linux Kernel list
  Cc: Michael Buesch, David Ellingsworth, linux-wireless, Ingo Molnar,
	Linus Torvalds

This warning has started to trigger with mac80211 because it can, under
some circumstances, use spin_lock_bh() protected sections within
irq-disabled sections. Is that a bug?

Interestingly, the warning was never noticed until somebody ran the code
on UP/NO-PREEMPT because local_bh_enable_ip() does not contain such a
warning. Also, because softirq disabling is refcounted (via the preempt
counter), it ought to be safe to nest it and even use within
irqs-disabled sections, as far as I can tell.

Also, if you're going to treat IRQs being enabled as a bug, there's no
point in disabling them right afterwards, is there?

If you're going to reject this patch, I'll post one that adds the
warning to local_bh_enable_ip() to allow detecting this for everybody
and not just those poor people running UP/NO-PREEMPT :)

(and why doesn't local_bh_enable just call local_bh_enable_ip anyway? or
both "call" a common static inline?)
---
 kernel/softirq.c |    3 ---
 1 file changed, 3 deletions(-)

--- everything.orig/kernel/softirq.c	2008-06-17 23:48:25.000000000 +0200
+++ everything/kernel/softirq.c	2008-06-17 23:48:56.000000000 +0200
@@ -137,10 +137,7 @@ void local_bh_enable(void)
 	unsigned long flags;
 
 	WARN_ON_ONCE(in_irq());
-#endif
-	WARN_ON_ONCE(irqs_disabled());
 
-#ifdef CONFIG_TRACE_IRQFLAGS
 	local_irq_save(flags);
 #endif
 	/*



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-17 21:57 [PATCH/RFC] remove irqs_disabled warning from local_bh_enable Johannes Berg
@ 2008-06-17 23:08 ` David Miller
  2008-06-17 23:55 ` Linus Torvalds
  1 sibling, 0 replies; 18+ messages in thread
From: David Miller @ 2008-06-17 23:08 UTC (permalink / raw)
  To: johannes; +Cc: linux-kernel, mb, david, linux-wireless, mingo, torvalds

From: Johannes Berg <johannes@sipsolutions.net>
Date: Tue, 17 Jun 2008 23:57:14 +0200

> This warning has started to trigger with mac80211 because it can, under
> some circumstances, use spin_lock_bh() protected sections within
> irq-disabled sections. Is that a bug?

It is a bug.  The only legal nesting is base --> BH --> IRQ locking.

> Also, if you're going to treat IRQs being enabled as a bug, there's no
> point in disabling them right afterwards, is there?

It's treating IRQs being "disabled" as a bug.  It expects them to
be enabled, always.

> If you're going to reject this patch, I'll post one that adds the
> warning to local_bh_enable_ip() to allow detecting this for everybody
> and not just those poor people running UP/NO-PREEMPT :)

That might be useful :)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-17 21:57 [PATCH/RFC] remove irqs_disabled warning from local_bh_enable Johannes Berg
  2008-06-17 23:08 ` David Miller
@ 2008-06-17 23:55 ` Linus Torvalds
  2008-06-18  7:01   ` Peter Zijlstra
  2008-06-18  7:29   ` Johannes Berg
  1 sibling, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2008-06-17 23:55 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linux Kernel list, Michael Buesch, David Ellingsworth,
	linux-wireless, Ingo Molnar



On Tue, 17 Jun 2008, Johannes Berg wrote:
>
> This warning has started to trigger with mac80211 because it can, under
> some circumstances, use spin_lock_bh() protected sections within
> irq-disabled sections. Is that a bug?

Yes, it's a bug.

Why? Not because of the "spin_lock_bh()" itself, but because of the 
_unlock_, which does a "local_bh_enable_ip()", which in turn will check 
the whole "do_softirq()" if it was the last softirq_count.

And you must not do softirq's when hard-irq's were disabled!

So it should in theory be ok (but perhaps a bit odd) to do something like

	spin_lock_irq(&irq_lock);
	..do something..
	spin_lock_bh(&bh_lock);
	spin_unlock_irq(&irq_lock);
	.. do something else ..
	spin_unlock_bh(&bh_lock);

where the "spin_lock_bh()" itself is in an irq-locked context - as long as 
the "spin_unlock_bh()" is *not*.

See?

		Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-17 23:55 ` Linus Torvalds
@ 2008-06-18  7:01   ` Peter Zijlstra
  2008-06-18  7:29   ` Johannes Berg
  1 sibling, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2008-06-18  7:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Berg, Linux Kernel list, Michael Buesch,
	David Ellingsworth, linux-wireless, Ingo Molnar

On Tue, 2008-06-17 at 16:55 -0700, Linus Torvalds wrote:
> 
> On Tue, 17 Jun 2008, Johannes Berg wrote:
> >
> > This warning has started to trigger with mac80211 because it can, under
> > some circumstances, use spin_lock_bh() protected sections within
> > irq-disabled sections. Is that a bug?
> 
> Yes, it's a bug.
> 
> Why? Not because of the "spin_lock_bh()" itself, but because of the 
> _unlock_, which does a "local_bh_enable_ip()", which in turn will check 
> the whole "do_softirq()" if it was the last softirq_count.
> 
> And you must not do softirq's when hard-irq's were disabled!
> 
> So it should in theory be ok (but perhaps a bit odd) to do something like
> 
> 	spin_lock_irq(&irq_lock);
> 	..do something..
> 	spin_lock_bh(&bh_lock);
> 	spin_unlock_irq(&irq_lock);
> 	.. do something else ..
> 	spin_unlock_bh(&bh_lock);
> 
> where the "spin_lock_bh()" itself is in an irq-locked context - as long as 
> the "spin_unlock_bh()" is *not*.

I would suggest discouraging such madne^Wcreativity, its gains are
dubious at best and it doesn't make the locking any more obvious and
could be an indication of messy locking to begin with.

So I would like to see Johannes' other patch that allows all of us to
enjoy the warning he ran into ;-)

Peter


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-17 23:55 ` Linus Torvalds
  2008-06-18  7:01   ` Peter Zijlstra
@ 2008-06-18  7:29   ` Johannes Berg
  2008-06-20 13:46     ` Ingo Molnar
  1 sibling, 1 reply; 18+ messages in thread
From: Johannes Berg @ 2008-06-18  7:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel list, Michael Buesch, David Ellingsworth,
	linux-wireless, Ingo Molnar

On Tue, 2008-06-17 at 16:55 -0700, Linus Torvalds wrote:

> > This warning has started to trigger with mac80211 because it can, under
> > some circumstances, use spin_lock_bh() protected sections within
> > irq-disabled sections. Is that a bug?
> 
> Yes, it's a bug.

You know, that actually makes sense to me.

> Why? Not because of the "spin_lock_bh()" itself, but because of the 
> _unlock_, which does a "local_bh_enable_ip()", which in turn will check 
> the whole "do_softirq()" if it was the last softirq_count.

Yeah, that makes sense to me as well. In fact, I pondered a bit over the
code until posting this.

> And you must not do softirq's when hard-irq's were disabled!
> 
> So it should in theory be ok (but perhaps a bit odd) to do something like
> 
> 	spin_lock_irq(&irq_lock);
> 	..do something..
> 	spin_lock_bh(&bh_lock);
> 	spin_unlock_irq(&irq_lock);
> 	.. do something else ..
> 	spin_unlock_bh(&bh_lock);
> 
> where the "spin_lock_bh()" itself is in an irq-locked context - as long as 
> the "spin_unlock_bh()" is *not*.
> 
> See?

Well, yes, from an API POV I do see that. And it's actually what I
thought up front. The thing that started to confuse me is that
local_bh_enable_ip reads like this:

void local_bh_enable_ip(unsigned long ip)
{
#ifdef CONFIG_TRACE_IRQFLAGS
        unsigned long flags;

        WARN_ON_ONCE(in_irq());

        local_irq_save(flags);
#endif

[...]
        if (unlikely(!in_interrupt() && local_softirq_pending()))
                do_softirq();

        dec_preempt_count();
#ifdef CONFIG_TRACE_IRQFLAGS
        local_irq_restore(flags);
#endif

So doesn't it actually run do_softirq() when interrupts are still
disabled? If I look deeper into the code that appears to be a rather odd
trick to avoid locking and __do_softirq will unconditionally enable
IRQs, so I guess here's my answer...

How about the patch below?

johannes

From: Johannes Berg <johannes@sipsolutions.net>
Subject: clean up and comment local_bh_enable code

There's no need to use local_irq_save() over local_irq_disable() in the
local_bh_enable code since it is a bug to call it with irqs disabled and
do_softirq will enable irqs if there is any pending work. Consolidate
the code from local_bh_enable and ..._ip to avoid having a disconnect
between them in the warnings they trigger that is currently there. Also
always trigger the warning on in_irq(), not just in the trace-irqflags
case.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
---
I made it a static inline so that runtime behaviour isn't affected by
another function call, but that can be argued.

Maybe local_bh_enable() should just be a static inline somewhere in a
header file, that would get rid of one export_symbol as well.

 kernel/softirq.c |   48 +++++++++++-------------------------------------
 1 file changed, 11 insertions(+), 37 deletions(-)

--- everything.orig/kernel/softirq.c	2008-06-17 23:48:25.000000000 +0200
+++ everything/kernel/softirq.c	2008-06-18 09:13:21.000000000 +0200
@@ -131,23 +131,17 @@ void _local_bh_enable(void)
 
 EXPORT_SYMBOL(_local_bh_enable);
 
-void local_bh_enable(void)
+static inline void _local_bh_enable_ip(unsigned long ip)
 {
+	WARN_ON_ONCE(in_irq() || irqs_disabled());
 #ifdef CONFIG_TRACE_IRQFLAGS
-	unsigned long flags;
-
-	WARN_ON_ONCE(in_irq());
-#endif
-	WARN_ON_ONCE(irqs_disabled());
-
-#ifdef CONFIG_TRACE_IRQFLAGS
-	local_irq_save(flags);
+	local_irq_disable();
 #endif
 	/*
 	 * Are softirqs going to be turned on now:
 	 */
 	if (softirq_count() == SOFTIRQ_OFFSET)
-		trace_softirqs_on((unsigned long)__builtin_return_address(0));
+		trace_softirqs_on(ip);
 	/*
 	 * Keep preemption disabled until we are done with
 	 * softirq processing:
@@ -159,40 +153,20 @@ void local_bh_enable(void)
 
 	dec_preempt_count();
 #ifdef CONFIG_TRACE_IRQFLAGS
-	local_irq_restore(flags);
+	local_irq_enable();
 #endif
 	preempt_check_resched();
 }
+
+void local_bh_enable(void)
+{
+	_local_bh_enable_ip((unsigned long)__builtin_return_address(0));
+}
 EXPORT_SYMBOL(local_bh_enable);
 
 void local_bh_enable_ip(unsigned long ip)
 {
-#ifdef CONFIG_TRACE_IRQFLAGS
-	unsigned long flags;
-
-	WARN_ON_ONCE(in_irq());
-
-	local_irq_save(flags);
-#endif
-	/*
-	 * Are softirqs going to be turned on now:
-	 */
-	if (softirq_count() == SOFTIRQ_OFFSET)
-		trace_softirqs_on(ip);
-	/*
-	 * Keep preemption disabled until we are done with
-	 * softirq processing:
- 	 */
- 	sub_preempt_count(SOFTIRQ_OFFSET - 1);
-
-	if (unlikely(!in_interrupt() && local_softirq_pending()))
-		do_softirq();
-
-	dec_preempt_count();
-#ifdef CONFIG_TRACE_IRQFLAGS
-	local_irq_restore(flags);
-#endif
-	preempt_check_resched();
+	_local_bh_enable_ip(ip);
 }
 EXPORT_SYMBOL(local_bh_enable_ip);
 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-18  7:29   ` Johannes Berg
@ 2008-06-20 13:46     ` Ingo Molnar
  2008-06-20 15:27       ` Ingo Molnar
  0 siblings, 1 reply; 18+ messages in thread
From: Ingo Molnar @ 2008-06-20 13:46 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, Linux Kernel list, Michael Buesch,
	David Ellingsworth, linux-wireless


* Johannes Berg <johannes@sipsolutions.net> wrote:

> Subject: clean up and comment local_bh_enable code
> 
> There's no need to use local_irq_save() over local_irq_disable() in 
> the local_bh_enable code since it is a bug to call it with irqs 
> disabled and do_softirq will enable irqs if there is any pending work. 
> Consolidate the code from local_bh_enable and ..._ip to avoid having a 
> disconnect between them in the warnings they trigger that is currently 
> there. Also always trigger the warning on in_irq(), not just in the 
> trace-irqflags case.

applied to tip/core/softirq for testing, thanks Johannes.

	Ingo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-20 13:46     ` Ingo Molnar
@ 2008-06-20 15:27       ` Ingo Molnar
  2008-06-20 15:36         ` Michael Buesch
  2008-06-20 15:43         ` Johannes Berg
  0 siblings, 2 replies; 18+ messages in thread
From: Ingo Molnar @ 2008-06-20 15:27 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, Linux Kernel list, Michael Buesch,
	David Ellingsworth, linux-wireless


* Ingo Molnar <mingo@elte.hu> wrote:

> > Subject: clean up and comment local_bh_enable code
> > 
> > There's no need to use local_irq_save() over local_irq_disable() in 
> > the local_bh_enable code since it is a bug to call it with irqs 
> > disabled and do_softirq will enable irqs if there is any pending 
> > work. Consolidate the code from local_bh_enable and ..._ip to avoid 
> > having a disconnect between them in the warnings they trigger that 
> > is currently there. Also always trigger the warning on in_irq(), not 
> > just in the trace-irqflags case.
> 
> applied to tip/core/softirq for testing, thanks Johannes.

ok, -tip testing found that your patch triggers a new warning on an old 
testbox that uses 3c59x vortex and netlogging:

----->
calling  vortex_init+0x0/0xb0
PCI: Found IRQ 10 for device 0000:00:0b.0
PCI: Sharing IRQ 10 with 0000:00:0a.0
PCI: Sharing IRQ 10 with 0000:00:0b.1
3c59x: Donald Becker and others.
0000:00:0b.0: 3Com PCI 3c556 Laptop Tornado at e0800400.
PCI: Enabling bus mastering for device 0000:00:0b.0
initcall vortex_init+0x0/0xb0 returned 0 after 47 msecs
...
calling  init_netconsole+0x0/0x1b0
netconsole: local port 4444
netconsole: local IP 10.0.1.9
netconsole: interface eth0
netconsole: remote port 4444
netconsole: remote IP 10.0.1.16
netconsole: remote ethernet address 00:19:xx:xx:xx:xx
netconsole: device eth0 not up yet, forcing it
eth0:  setting half-duplex.
eth0:  setting full-duplex.
------------[ cut here ]------------
WARNING: at kernel/softirq.c:137 local_bh_enable_ip+0xd1/0xe0()
Pid: 1, comm: swapper Not tainted 2.6.26-rc6-tip #2091
 [<c0125ecf>] warn_on_slowpath+0x4f/0x70
 [<c0126834>] ? release_console_sem+0x1b4/0x1d0
 [<c0126d00>] ? vprintk+0x2a0/0x450
 [<c012fde5>] ? __mod_timer+0xa5/0xc0
 [<c046f7fd>] ? mdio_sync+0x3d/0x50
 [<c0160ef6>] ? marker_probe_cb+0x46/0xa0
 [<c0126ed7>] ? printk+0x27/0x50
 [<c046f4c3>] ? vortex_set_duplex+0x43/0xc0
 [<c046f521>] ? vortex_set_duplex+0xa1/0xc0
 [<c0471b92>] ? vortex_timer+0xe2/0x3e0
 [<c012b361>] local_bh_enable_ip+0xd1/0xe0
 [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
 [<c0471b92>] vortex_timer+0xe2/0x3e0
 [<c014743b>] ? trace_hardirqs_on+0xb/0x10
 [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
 [<c012f8b2>] run_timer_softirq+0x162/0x1c0
 [<c0471ab0>] ? vortex_timer+0x0/0x3e0
 [<c012b361>] local_bh_enable_ip+0xd1/0xe0
 [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
 [<c0471b92>] vortex_timer+0xe2/0x3e0
 [<c014743b>] ? trace_hardirqs_on+0xb/0x10
 [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
 [<c012f8b2>] run_timer_softirq+0x162/0x1c0
 [<c0471ab0>] ? vortex_timer+0x0/0x3e0
 [<c0471ab0>] ? vortex_timer+0x0/0x3e0
 [<c012b60a>] __do_softirq+0x9a/0x160
 [<c012b570>] ? __do_softirq+0x0/0x160
 [<c0106775>] call_on_stack+0x15/0x30
 [<c012b4f5>] ? irq_exit+0x55/0x60
 [<c0106e85>] ? do_IRQ+0x85/0xd0
 [<c0147391>] ? trace_hardirqs_on_caller+0xc1/0x160
 [<c0104888>] ? common_interrupt+0x28/0x30
 [<c08d8ac8>] ? mutex_unlock+0x8/0x10
 [<c08d8180>] ? _cond_resched+0x10/0x30
 [<c07a3be7>] ? netpoll_setup+0x117/0x390
 [<c0cbfcfe>] ? init_netconsole+0x14e/0x1b0
 [<c013d539>] ? ktime_get+0x19/0x40
 [<c0c9bab2>] ? kernel_init+0x1b2/0x2c0
 [<c0cbfbb0>] ? init_netconsole+0x0/0x1b0
 [<c0396aa4>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c0103f12>] ? restore_nocheck_notrace+0x0/0xe
 [<c0c9b900>] ? kernel_init+0x0/0x2c0
 [<c0c9b900>] ? kernel_init+0x0/0x2c0
 [<c0104aa7>] ? kernel_thread_helper+0x7/0x10
 =======================
---[ end trace 37f9c502aff112e0 ]---
console [netcon0] enabled
netconsole: network logging started
initcall init_netconsole+0x0/0x1b0 returned 0 after 2914 msecs

real bug or false positive?

config is at:

  http://redhat.com/~mingo/misc/config-Fri_Jun_20_16_51_07_CEST_2008.bad

	Ingo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-20 15:27       ` Ingo Molnar
@ 2008-06-20 15:36         ` Michael Buesch
  2008-06-20 15:55           ` Ingo Molnar
  2008-06-20 15:43         ` Johannes Berg
  1 sibling, 1 reply; 18+ messages in thread
From: Michael Buesch @ 2008-06-20 15:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Johannes Berg, Linus Torvalds, Linux Kernel list,
	David Ellingsworth, linux-wireless

On Friday 20 June 2008 17:27:48 Ingo Molnar wrote:
>  [<c012b361>] local_bh_enable_ip+0xd1/0xe0
>  [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
>  [<c0471b92>] vortex_timer+0xe2/0x3e0

> real bug or false positive?

Well, a timer runs with IRQs disabled, no? So this would be a bug.

-- 
Greetings Michael.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-20 15:27       ` Ingo Molnar
  2008-06-20 15:36         ` Michael Buesch
@ 2008-06-20 15:43         ` Johannes Berg
  2008-06-20 15:46           ` Johannes Berg
  1 sibling, 1 reply; 18+ messages in thread
From: Johannes Berg @ 2008-06-20 15:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Linux Kernel list, Michael Buesch,
	David Ellingsworth, linux-wireless

[-- Attachment #1: Type: text/plain, Size: 2499 bytes --]

On Fri, 2008-06-20 at 17:27 +0200, Ingo Molnar wrote:
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
> > > Subject: clean up and comment local_bh_enable code
> > > 
> > > There's no need to use local_irq_save() over local_irq_disable() in 
> > > the local_bh_enable code since it is a bug to call it with irqs 
> > > disabled and do_softirq will enable irqs if there is any pending 
> > > work. Consolidate the code from local_bh_enable and ..._ip to avoid 
> > > having a disconnect between them in the warnings they trigger that 
> > > is currently there. Also always trigger the warning on in_irq(), not 
> > > just in the trace-irqflags case.
> > 
> > applied to tip/core/softirq for testing, thanks Johannes.
> 
> ok, -tip testing found that your patch triggers a new warning on an old 
> testbox that uses 3c59x vortex and netlogging:
> 
> ----->
> calling  vortex_init+0x0/0xb0
> PCI: Found IRQ 10 for device 0000:00:0b.0
> PCI: Sharing IRQ 10 with 0000:00:0a.0
> PCI: Sharing IRQ 10 with 0000:00:0b.1
> 3c59x: Donald Becker and others.
> 0000:00:0b.0: 3Com PCI 3c556 Laptop Tornado at e0800400.
> PCI: Enabling bus mastering for device 0000:00:0b.0
> initcall vortex_init+0x0/0xb0 returned 0 after 47 msecs
> ...
> calling  init_netconsole+0x0/0x1b0
> netconsole: local port 4444
> netconsole: local IP 10.0.1.9
> netconsole: interface eth0
> netconsole: remote port 4444
> netconsole: remote IP 10.0.1.16
> netconsole: remote ethernet address 00:19:xx:xx:xx:xx
> netconsole: device eth0 not up yet, forcing it
> eth0:  setting half-duplex.
> eth0:  setting full-duplex.
> ------------[ cut here ]------------
> WARNING: at kernel/softirq.c:137 local_bh_enable_ip+0xd1/0xe0()
> Pid: 1, comm: swapper Not tainted 2.6.26-rc6-tip #2091
>  [<c0125ecf>] warn_on_slowpath+0x4f/0x70
>  [<c0126834>] ? release_console_sem+0x1b4/0x1d0
>  [<c0126d00>] ? vprintk+0x2a0/0x450

Now you can't printk in irq context any more?

vprintk:
        /*
         * Try to acquire and then immediately release the
         * console semaphore. The release will do all the 
         * actual magic (print out buffers, wake up klogd,
         * etc). 
         *
         * The acquire_console_semaphore_for_printk() function
         * will release 'logbuf_lock' regardless of whether it
         * actually gets the semaphore or not.
         */
        if (acquire_console_semaphore_for_printk(this_cpu))
                release_console_sem();


johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-20 15:43         ` Johannes Berg
@ 2008-06-20 15:46           ` Johannes Berg
  0 siblings, 0 replies; 18+ messages in thread
From: Johannes Berg @ 2008-06-20 15:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Linux Kernel list, Michael Buesch,
	David Ellingsworth, linux-wireless

[-- Attachment #1: Type: text/plain, Size: 2165 bytes --]

On Fri, 2008-06-20 at 17:43 +0200, Johannes Berg wrote:
> On Fri, 2008-06-20 at 17:27 +0200, Ingo Molnar wrote:
> > * Ingo Molnar <mingo@elte.hu> wrote:
> > 
> > > > Subject: clean up and comment local_bh_enable code
> > > > 
> > > > There's no need to use local_irq_save() over local_irq_disable() in 
> > > > the local_bh_enable code since it is a bug to call it with irqs 
> > > > disabled and do_softirq will enable irqs if there is any pending 
> > > > work. Consolidate the code from local_bh_enable and ..._ip to avoid 
> > > > having a disconnect between them in the warnings they trigger that 
> > > > is currently there. Also always trigger the warning on in_irq(), not 
> > > > just in the trace-irqflags case.
> > > 
> > > applied to tip/core/softirq for testing, thanks Johannes.
> > 
> > ok, -tip testing found that your patch triggers a new warning on an old 
> > testbox that uses 3c59x vortex and netlogging:
> > 
> > ----->
> > calling  vortex_init+0x0/0xb0
> > PCI: Found IRQ 10 for device 0000:00:0b.0
> > PCI: Sharing IRQ 10 with 0000:00:0a.0
> > PCI: Sharing IRQ 10 with 0000:00:0b.1
> > 3c59x: Donald Becker and others.
> > 0000:00:0b.0: 3Com PCI 3c556 Laptop Tornado at e0800400.
> > PCI: Enabling bus mastering for device 0000:00:0b.0
> > initcall vortex_init+0x0/0xb0 returned 0 after 47 msecs
> > ...
> > calling  init_netconsole+0x0/0x1b0
> > netconsole: local port 4444
> > netconsole: local IP 10.0.1.9
> > netconsole: interface eth0
> > netconsole: remote port 4444
> > netconsole: remote IP 10.0.1.16
> > netconsole: remote ethernet address 00:19:xx:xx:xx:xx
> > netconsole: device eth0 not up yet, forcing it
> > eth0:  setting half-duplex.
> > eth0:  setting full-duplex.
> > ------------[ cut here ]------------
> > WARNING: at kernel/softirq.c:137 local_bh_enable_ip+0xd1/0xe0()
> > Pid: 1, comm: swapper Not tainted 2.6.26-rc6-tip #2091
> >  [<c0125ecf>] warn_on_slowpath+0x4f/0x70
> >  [<c0126834>] ? release_console_sem+0x1b4/0x1d0
> >  [<c0126d00>] ? vprintk+0x2a0/0x450
> 
> Now you can't printk in irq context any more?

Or I'm just confused by x86 stack dumps.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-20 15:36         ` Michael Buesch
@ 2008-06-20 15:55           ` Ingo Molnar
  2008-06-20 16:01             ` Michael Buesch
  0 siblings, 1 reply; 18+ messages in thread
From: Ingo Molnar @ 2008-06-20 15:55 UTC (permalink / raw)
  To: Michael Buesch
  Cc: Johannes Berg, Linus Torvalds, Linux Kernel list,
	David Ellingsworth, linux-wireless


* Michael Buesch <mb@bu3sch.de> wrote:

> On Friday 20 June 2008 17:27:48 Ingo Molnar wrote:
> >  [<c012b361>] local_bh_enable_ip+0xd1/0xe0
> >  [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
> >  [<c0471b92>] vortex_timer+0xe2/0x3e0
> 
> > real bug or false positive?
> 
> Well, a timer runs with IRQs disabled, no? So this would be a bug.

indeed - agreed :) [no time for me to fix it, but can test any rfc patch.]

so score +1 for Johannes's patch. I think if it works out fine and there 
are no false positives in further testing i'll put it up for pull into 
v2.6.26 as well, as it is catching real bugs.

	Ingo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-20 15:55           ` Ingo Molnar
@ 2008-06-20 16:01             ` Michael Buesch
  2008-06-20 16:18               ` Michael Buesch
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Buesch @ 2008-06-20 16:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Johannes Berg, Linus Torvalds, Linux Kernel list,
	David Ellingsworth, linux-wireless

On Friday 20 June 2008 17:55:41 Ingo Molnar wrote:
> 
> * Michael Buesch <mb@bu3sch.de> wrote:
> 
> > On Friday 20 June 2008 17:27:48 Ingo Molnar wrote:
> > >  [<c012b361>] local_bh_enable_ip+0xd1/0xe0
> > >  [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
> > >  [<c0471b92>] vortex_timer+0xe2/0x3e0
> > 
> > > real bug or false positive?
> > 
> > Well, a timer runs with IRQs disabled, no? So this would be a bug.
> 
> indeed - agreed :) [no time for me to fix it, but can test any rfc patch.]

A quick workaround always is to convert the lock into an _irqsafe lock.
Although it introduces higher overhead (interrupt-wise), it prevents the bug.
A real fix would require to understand the locking in the driver, which I
don't, as I never looked at the driver. :)

-- 
Greetings Michael.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-20 16:01             ` Michael Buesch
@ 2008-06-20 16:18               ` Michael Buesch
  2008-06-23  8:41                 ` Ingo Molnar
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Buesch @ 2008-06-20 16:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Johannes Berg, Linus Torvalds, Linux Kernel list,
	David Ellingsworth, linux-wireless

On Friday 20 June 2008 18:01:09 Michael Buesch wrote:
> On Friday 20 June 2008 17:55:41 Ingo Molnar wrote:
> > 
> > * Michael Buesch <mb@bu3sch.de> wrote:
> > 
> > > On Friday 20 June 2008 17:27:48 Ingo Molnar wrote:
> > > >  [<c012b361>] local_bh_enable_ip+0xd1/0xe0
> > > >  [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
> > > >  [<c0471b92>] vortex_timer+0xe2/0x3e0
> > > 
> > > > real bug or false positive?
> > > 
> > > Well, a timer runs with IRQs disabled, no? So this would be a bug.
> > 
> > indeed - agreed :) [no time for me to fix it, but can test any rfc patch.]
> 
> A quick workaround always is to convert the lock into an _irqsafe lock.
> Although it introduces higher overhead (interrupt-wise), it prevents the bug.
> A real fix would require to understand the locking in the driver, which I
> don't, as I never looked at the driver. :)


However, looking at the driver I think the fix actually is trivial:

Index: wireless-testing/drivers/net/3c59x.c
===================================================================
--- wireless-testing.orig/drivers/net/3c59x.c	2008-05-16 00:26:29.000000000 +0200
+++ wireless-testing/drivers/net/3c59x.c	2008-06-20 18:16:55.000000000 +0200
@@ -1768,9 +1768,10 @@ vortex_timer(unsigned long data)
 	case XCVR_MII: case XCVR_NWAY:
 		{
 			ok = 1;
-			spin_lock_bh(&vp->lock);
+			/* Interrupts are already disabled */
+			spin_lock(&vp->lock);
 			vortex_check_media(dev, 0);
-			spin_unlock_bh(&vp->lock);
+			spin_unlock(&vp->lock);
 		}
 		break;
 	  default:					/* Other media types handled by Tx timeouts. */


vp->lock is also taken in hardware IRQ context, so we _have_ to always
use irqsafe locking. As we run in a timer with IRQs disabled,
we can simply use spin_lock.

-- 
Greetings Michael.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-20 16:18               ` Michael Buesch
@ 2008-06-23  8:41                 ` Ingo Molnar
  2008-06-23  9:22                   ` Michael Buesch
  2008-06-27  5:33                   ` Jeff Garzik
  0 siblings, 2 replies; 18+ messages in thread
From: Ingo Molnar @ 2008-06-23  8:41 UTC (permalink / raw)
  To: Michael Buesch
  Cc: Johannes Berg, Linus Torvalds, Linux Kernel list,
	David Ellingsworth, linux-wireless, Jeff Garzik, netdev


* Michael Buesch <mb@bu3sch.de> wrote:

> On Friday 20 June 2008 18:01:09 Michael Buesch wrote:
> > On Friday 20 June 2008 17:55:41 Ingo Molnar wrote:
> > > 
> > > * Michael Buesch <mb@bu3sch.de> wrote:
> > > 
> > > > On Friday 20 June 2008 17:27:48 Ingo Molnar wrote:
> > > > >  [<c012b361>] local_bh_enable_ip+0xd1/0xe0
> > > > >  [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
> > > > >  [<c0471b92>] vortex_timer+0xe2/0x3e0
> > > > 
> > > > > real bug or false positive?
> > > > 
> > > > Well, a timer runs with IRQs disabled, no? So this would be a bug.
> > > 
> > > indeed - agreed :) [no time for me to fix it, but can test any rfc patch.]
> > 
> > A quick workaround always is to convert the lock into an _irqsafe lock.
> > Although it introduces higher overhead (interrupt-wise), it prevents the bug.
> > A real fix would require to understand the locking in the driver, which I
> > don't, as I never looked at the driver. :)
> 
> 
> However, looking at the driver I think the fix actually is trivial:
> 
> Index: wireless-testing/drivers/net/3c59x.c
> ===================================================================
> --- wireless-testing.orig/drivers/net/3c59x.c	2008-05-16 00:26:29.000000000 +0200
> +++ wireless-testing/drivers/net/3c59x.c	2008-06-20 18:16:55.000000000 +0200
> @@ -1768,9 +1768,10 @@ vortex_timer(unsigned long data)
>  	case XCVR_MII: case XCVR_NWAY:
>  		{
>  			ok = 1;
> -			spin_lock_bh(&vp->lock);
> +			/* Interrupts are already disabled */
> +			spin_lock(&vp->lock);
>  			vortex_check_media(dev, 0);
> -			spin_unlock_bh(&vp->lock);
> +			spin_unlock(&vp->lock);
>  		}
>  		break;
>  	  default:					/* Other media types handled by Tx timeouts. */
> 
> 
> vp->lock is also taken in hardware IRQ context, so we _have_ to always 
> use irqsafe locking. As we run in a timer with IRQs disabled, we can 
> simply use spin_lock.

thanks Michael! I have applied your fix to the tip/out-of-tree 
local/non-propagated fixlets branch for more testing. It takes up to 4 
hours to trigger this warning on that testbox, so i should know it 
whether it fixes the bug later today.

Cc:-ed Jeff, he might want to pick up this fix, find the tidied up 
commit below. (3c59x is still present in various older laptops, so 
-stable candidate too i guess.)

	Ingo

---------------------->
commit 24a5454d9f7863b00143760197f0ec29c8234289
Author: Michael Buesch <mb@bu3sch.de>
Date:   Fri Jun 20 18:18:34 2008 +0200

    net, vortex: fix lockup
    
    Ingo Molnar reported:
    
    -tip testing found that Johannes Berg's "softirq: remove irqs_disabled
    warning from local_bh_enable" enhancement to lockdep triggers a new
    warning on an old testbox that uses 3c59x vortex and netlogging:
    
    ----->
    calling  vortex_init+0x0/0xb0
    PCI: Found IRQ 10 for device 0000:00:0b.0
    PCI: Sharing IRQ 10 with 0000:00:0a.0
    PCI: Sharing IRQ 10 with 0000:00:0b.1
    3c59x: Donald Becker and others.
    0000:00:0b.0: 3Com PCI 3c556 Laptop Tornado at e0800400.
    PCI: Enabling bus mastering for device 0000:00:0b.0
    initcall vortex_init+0x0/0xb0 returned 0 after 47 msecs
    ...
    calling  init_netconsole+0x0/0x1b0
    netconsole: local port 4444
    netconsole: local IP 10.0.1.9
    netconsole: interface eth0
    netconsole: remote port 4444
    netconsole: remote IP 10.0.1.16
    netconsole: remote ethernet address 00:19:xx:xx:xx:xx
    netconsole: device eth0 not up yet, forcing it
    eth0:  setting half-duplex.
    eth0:  setting full-duplex.
    ------------[ cut here ]------------
    WARNING: at kernel/softirq.c:137 local_bh_enable_ip+0xd1/0xe0()
    Pid: 1, comm: swapper Not tainted 2.6.26-rc6-tip #2091
     [<c0125ecf>] warn_on_slowpath+0x4f/0x70
     [<c0126834>] ? release_console_sem+0x1b4/0x1d0
     [<c0126d00>] ? vprintk+0x2a0/0x450
     [<c012fde5>] ? __mod_timer+0xa5/0xc0
     [<c046f7fd>] ? mdio_sync+0x3d/0x50
     [<c0160ef6>] ? marker_probe_cb+0x46/0xa0
     [<c0126ed7>] ? printk+0x27/0x50
     [<c046f4c3>] ? vortex_set_duplex+0x43/0xc0
     [<c046f521>] ? vortex_set_duplex+0xa1/0xc0
     [<c0471b92>] ? vortex_timer+0xe2/0x3e0
     [<c012b361>] local_bh_enable_ip+0xd1/0xe0
     [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
     [<c0471b92>] vortex_timer+0xe2/0x3e0
     [<c014743b>] ? trace_hardirqs_on+0xb/0x10
     [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
     [<c012f8b2>] run_timer_softirq+0x162/0x1c0
     [<c0471ab0>] ? vortex_timer+0x0/0x3e0
     [<c012b361>] local_bh_enable_ip+0xd1/0xe0
     [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
     [<c0471b92>] vortex_timer+0xe2/0x3e0
     [<c014743b>] ? trace_hardirqs_on+0xb/0x10
     [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
     [<c012f8b2>] run_timer_softirq+0x162/0x1c0
     [<c0471ab0>] ? vortex_timer+0x0/0x3e0
     [<c0471ab0>] ? vortex_timer+0x0/0x3e0
     [<c012b60a>] __do_softirq+0x9a/0x160
     [<c012b570>] ? __do_softirq+0x0/0x160
     [<c0106775>] call_on_stack+0x15/0x30
     [<c012b4f5>] ? irq_exit+0x55/0x60
     [<c0106e85>] ? do_IRQ+0x85/0xd0
     [<c0147391>] ? trace_hardirqs_on_caller+0xc1/0x160
     [<c0104888>] ? common_interrupt+0x28/0x30
     [<c08d8ac8>] ? mutex_unlock+0x8/0x10
     [<c08d8180>] ? _cond_resched+0x10/0x30
     [<c07a3be7>] ? netpoll_setup+0x117/0x390
     [<c0cbfcfe>] ? init_netconsole+0x14e/0x1b0
     [<c013d539>] ? ktime_get+0x19/0x40
     [<c0c9bab2>] ? kernel_init+0x1b2/0x2c0
     [<c0cbfbb0>] ? init_netconsole+0x0/0x1b0
     [<c0396aa4>] ? trace_hardirqs_on_thunk+0xc/0x10
     [<c0103f12>] ? restore_nocheck_notrace+0x0/0xe
     [<c0c9b900>] ? kernel_init+0x0/0x2c0
     [<c0c9b900>] ? kernel_init+0x0/0x2c0
     [<c0104aa7>] ? kernel_thread_helper+0x7/0x10
     =======================
    ---[ end trace 37f9c502aff112e0 ]---
    console [netcon0] enabled
    netconsole: network logging started
    initcall init_netconsole+0x0/0x1b0 returned 0 after 2914 msecs
    
    looking at the driver I think the bug is real and the fix actually
    is trivial.
    
    vp->lock is also taken in hardware IRQ context, so we _have_ to always
    use irqsafe locking. As we run in a timer with IRQs disabled,
    we can simply use spin_lock.
    
    Cc: <stable@kernel.org>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
index 2edda8c..aabad8c 100644
--- a/drivers/net/3c59x.c
+++ b/drivers/net/3c59x.c
@@ -1768,9 +1768,10 @@ vortex_timer(unsigned long data)
 	case XCVR_MII: case XCVR_NWAY:
 		{
 			ok = 1;
-			spin_lock_bh(&vp->lock);
+			/* Interrupts are already disabled */
+			spin_lock(&vp->lock);
 			vortex_check_media(dev, 0);
-			spin_unlock_bh(&vp->lock);
+			spin_unlock(&vp->lock);
 		}
 		break;
 	  default:					/* Other media types handled by Tx timeouts. */

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-23  8:41                 ` Ingo Molnar
@ 2008-06-23  9:22                   ` Michael Buesch
  2008-06-27  5:33                   ` Jeff Garzik
  1 sibling, 0 replies; 18+ messages in thread
From: Michael Buesch @ 2008-06-23  9:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Johannes Berg, Linus Torvalds, Linux Kernel list,
	David Ellingsworth, linux-wireless, Jeff Garzik, netdev

On Monday 23 June 2008 10:41:23 Ingo Molnar wrote:
> ---------------------->
> commit 24a5454d9f7863b00143760197f0ec29c8234289
> Author: Michael Buesch <mb@bu3sch.de>
> Date:   Fri Jun 20 18:18:34 2008 +0200
> 
>     net, vortex: fix lockup
>     
>     Ingo Molnar reported:
>     
>     -tip testing found that Johannes Berg's "softirq: remove irqs_disabled
>     warning from local_bh_enable" enhancement to lockdep triggers a new
>     warning on an old testbox that uses 3c59x vortex and netlogging:
>     
>     ----->
>     calling  vortex_init+0x0/0xb0
>     PCI: Found IRQ 10 for device 0000:00:0b.0
>     PCI: Sharing IRQ 10 with 0000:00:0a.0
>     PCI: Sharing IRQ 10 with 0000:00:0b.1
>     3c59x: Donald Becker and others.
>     0000:00:0b.0: 3Com PCI 3c556 Laptop Tornado at e0800400.
>     PCI: Enabling bus mastering for device 0000:00:0b.0
>     initcall vortex_init+0x0/0xb0 returned 0 after 47 msecs
>     ...
>     calling  init_netconsole+0x0/0x1b0
>     netconsole: local port 4444
>     netconsole: local IP 10.0.1.9
>     netconsole: interface eth0
>     netconsole: remote port 4444
>     netconsole: remote IP 10.0.1.16
>     netconsole: remote ethernet address 00:19:xx:xx:xx:xx
>     netconsole: device eth0 not up yet, forcing it
>     eth0:  setting half-duplex.
>     eth0:  setting full-duplex.
>     ------------[ cut here ]------------
>     WARNING: at kernel/softirq.c:137 local_bh_enable_ip+0xd1/0xe0()
>     Pid: 1, comm: swapper Not tainted 2.6.26-rc6-tip #2091
>      [<c0125ecf>] warn_on_slowpath+0x4f/0x70
>      [<c0126834>] ? release_console_sem+0x1b4/0x1d0
>      [<c0126d00>] ? vprintk+0x2a0/0x450
>      [<c012fde5>] ? __mod_timer+0xa5/0xc0
>      [<c046f7fd>] ? mdio_sync+0x3d/0x50
>      [<c0160ef6>] ? marker_probe_cb+0x46/0xa0
>      [<c0126ed7>] ? printk+0x27/0x50
>      [<c046f4c3>] ? vortex_set_duplex+0x43/0xc0
>      [<c046f521>] ? vortex_set_duplex+0xa1/0xc0
>      [<c0471b92>] ? vortex_timer+0xe2/0x3e0
>      [<c012b361>] local_bh_enable_ip+0xd1/0xe0
>      [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
>      [<c0471b92>] vortex_timer+0xe2/0x3e0
>      [<c014743b>] ? trace_hardirqs_on+0xb/0x10
>      [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
>      [<c012f8b2>] run_timer_softirq+0x162/0x1c0
>      [<c0471ab0>] ? vortex_timer+0x0/0x3e0
>      [<c012b361>] local_bh_enable_ip+0xd1/0xe0
>      [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
>      [<c0471b92>] vortex_timer+0xe2/0x3e0
>      [<c014743b>] ? trace_hardirqs_on+0xb/0x10
>      [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
>      [<c012f8b2>] run_timer_softirq+0x162/0x1c0
>      [<c0471ab0>] ? vortex_timer+0x0/0x3e0
>      [<c0471ab0>] ? vortex_timer+0x0/0x3e0
>      [<c012b60a>] __do_softirq+0x9a/0x160
>      [<c012b570>] ? __do_softirq+0x0/0x160
>      [<c0106775>] call_on_stack+0x15/0x30
>      [<c012b4f5>] ? irq_exit+0x55/0x60
>      [<c0106e85>] ? do_IRQ+0x85/0xd0
>      [<c0147391>] ? trace_hardirqs_on_caller+0xc1/0x160
>      [<c0104888>] ? common_interrupt+0x28/0x30
>      [<c08d8ac8>] ? mutex_unlock+0x8/0x10
>      [<c08d8180>] ? _cond_resched+0x10/0x30
>      [<c07a3be7>] ? netpoll_setup+0x117/0x390
>      [<c0cbfcfe>] ? init_netconsole+0x14e/0x1b0
>      [<c013d539>] ? ktime_get+0x19/0x40
>      [<c0c9bab2>] ? kernel_init+0x1b2/0x2c0
>      [<c0cbfbb0>] ? init_netconsole+0x0/0x1b0
>      [<c0396aa4>] ? trace_hardirqs_on_thunk+0xc/0x10
>      [<c0103f12>] ? restore_nocheck_notrace+0x0/0xe
>      [<c0c9b900>] ? kernel_init+0x0/0x2c0
>      [<c0c9b900>] ? kernel_init+0x0/0x2c0
>      [<c0104aa7>] ? kernel_thread_helper+0x7/0x10
>      =======================
>     ---[ end trace 37f9c502aff112e0 ]---
>     console [netcon0] enabled
>     netconsole: network logging started
>     initcall init_netconsole+0x0/0x1b0 returned 0 after 2914 msecs
>     
>     looking at the driver I think the bug is real and the fix actually
>     is trivial.
>     
>     vp->lock is also taken in hardware IRQ context, so we _have_ to always
>     use irqsafe locking. As we run in a timer with IRQs disabled,
>     we can simply use spin_lock.
>     
>     Cc: <stable@kernel.org>

Signed-off-by: Michael Buesch <mb@bu3sch.de>

>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
> index 2edda8c..aabad8c 100644
> --- a/drivers/net/3c59x.c
> +++ b/drivers/net/3c59x.c
> @@ -1768,9 +1768,10 @@ vortex_timer(unsigned long data)
>  	case XCVR_MII: case XCVR_NWAY:
>  		{
>  			ok = 1;
> -			spin_lock_bh(&vp->lock);
> +			/* Interrupts are already disabled */
> +			spin_lock(&vp->lock);
>  			vortex_check_media(dev, 0);
> -			spin_unlock_bh(&vp->lock);
> +			spin_unlock(&vp->lock);
>  		}
>  		break;
>  	  default:					/* Other media types handled by Tx timeouts. */
> 
> 



-- 
Greetings Michael.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-23  8:41                 ` Ingo Molnar
  2008-06-23  9:22                   ` Michael Buesch
@ 2008-06-27  5:33                   ` Jeff Garzik
  2008-06-27 10:53                     ` Ingo Molnar
  1 sibling, 1 reply; 18+ messages in thread
From: Jeff Garzik @ 2008-06-27  5:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Michael Buesch, Johannes Berg, Linus Torvalds, Linux Kernel list,
	David Ellingsworth, linux-wireless, netdev

Ingo Molnar wrote:
> * Michael Buesch <mb@bu3sch.de> wrote:
> 
>> On Friday 20 June 2008 18:01:09 Michael Buesch wrote:
>>> On Friday 20 June 2008 17:55:41 Ingo Molnar wrote:
>>>> * Michael Buesch <mb@bu3sch.de> wrote:
>>>>
>>>>> On Friday 20 June 2008 17:27:48 Ingo Molnar wrote:
>>>>>>  [<c012b361>] local_bh_enable_ip+0xd1/0xe0
>>>>>>  [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
>>>>>>  [<c0471b92>] vortex_timer+0xe2/0x3e0
>>>>>> real bug or false positive?
>>>>> Well, a timer runs with IRQs disabled, no? So this would be a bug.
>>>> indeed - agreed :) [no time for me to fix it, but can test any rfc patch.]
>>> A quick workaround always is to convert the lock into an _irqsafe lock.
>>> Although it introduces higher overhead (interrupt-wise), it prevents the bug.
>>> A real fix would require to understand the locking in the driver, which I
>>> don't, as I never looked at the driver. :)
>>
>> However, looking at the driver I think the fix actually is trivial:
>>
>> Index: wireless-testing/drivers/net/3c59x.c
>> ===================================================================
>> --- wireless-testing.orig/drivers/net/3c59x.c	2008-05-16 00:26:29.000000000 +0200
>> +++ wireless-testing/drivers/net/3c59x.c	2008-06-20 18:16:55.000000000 +0200
>> @@ -1768,9 +1768,10 @@ vortex_timer(unsigned long data)
>>  	case XCVR_MII: case XCVR_NWAY:
>>  		{
>>  			ok = 1;
>> -			spin_lock_bh(&vp->lock);
>> +			/* Interrupts are already disabled */
>> +			spin_lock(&vp->lock);
>>  			vortex_check_media(dev, 0);
>> -			spin_unlock_bh(&vp->lock);
>> +			spin_unlock(&vp->lock);
>>  		}
>>  		break;
>>  	  default:					/* Other media types handled by Tx timeouts. */
>>
>>
>> vp->lock is also taken in hardware IRQ context, so we _have_ to always 
>> use irqsafe locking. As we run in a timer with IRQs disabled, we can 
>> simply use spin_lock.
> 
> thanks Michael! I have applied your fix to the tip/out-of-tree 
> local/non-propagated fixlets branch for more testing. It takes up to 4 
> hours to trigger this warning on that testbox, so i should know it 
> whether it fixes the bug later today.
> 
> Cc:-ed Jeff, he might want to pick up this fix, find the tidied up 
> commit below. (3c59x is still present in various older laptops, so 
> -stable candidate too i guess.)
> 
> 	Ingo
> 
> ---------------------->
> commit 24a5454d9f7863b00143760197f0ec29c8234289
> Author: Michael Buesch <mb@bu3sch.de>
> Date:   Fri Jun 20 18:18:34 2008 +0200
> 
>     net, vortex: fix lockup
>     
>     Ingo Molnar reported:
>     
>     -tip testing found that Johannes Berg's "softirq: remove irqs_disabled
>     warning from local_bh_enable" enhancement to lockdep triggers a new
>     warning on an old testbox that uses 3c59x vortex and netlogging:
>     
>     ----->
>     calling  vortex_init+0x0/0xb0
>     PCI: Found IRQ 10 for device 0000:00:0b.0
>     PCI: Sharing IRQ 10 with 0000:00:0a.0
>     PCI: Sharing IRQ 10 with 0000:00:0b.1
>     3c59x: Donald Becker and others.
>     0000:00:0b.0: 3Com PCI 3c556 Laptop Tornado at e0800400.
>     PCI: Enabling bus mastering for device 0000:00:0b.0
>     initcall vortex_init+0x0/0xb0 returned 0 after 47 msecs
>     ...
>     calling  init_netconsole+0x0/0x1b0
>     netconsole: local port 4444
>     netconsole: local IP 10.0.1.9
>     netconsole: interface eth0
>     netconsole: remote port 4444
>     netconsole: remote IP 10.0.1.16
>     netconsole: remote ethernet address 00:19:xx:xx:xx:xx
>     netconsole: device eth0 not up yet, forcing it
>     eth0:  setting half-duplex.
>     eth0:  setting full-duplex.
>     ------------[ cut here ]------------
>     WARNING: at kernel/softirq.c:137 local_bh_enable_ip+0xd1/0xe0()
>     Pid: 1, comm: swapper Not tainted 2.6.26-rc6-tip #2091
>      [<c0125ecf>] warn_on_slowpath+0x4f/0x70
>      [<c0126834>] ? release_console_sem+0x1b4/0x1d0
>      [<c0126d00>] ? vprintk+0x2a0/0x450
>      [<c012fde5>] ? __mod_timer+0xa5/0xc0
>      [<c046f7fd>] ? mdio_sync+0x3d/0x50
>      [<c0160ef6>] ? marker_probe_cb+0x46/0xa0
>      [<c0126ed7>] ? printk+0x27/0x50
>      [<c046f4c3>] ? vortex_set_duplex+0x43/0xc0
>      [<c046f521>] ? vortex_set_duplex+0xa1/0xc0
>      [<c0471b92>] ? vortex_timer+0xe2/0x3e0
>      [<c012b361>] local_bh_enable_ip+0xd1/0xe0
>      [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
>      [<c0471b92>] vortex_timer+0xe2/0x3e0
>      [<c014743b>] ? trace_hardirqs_on+0xb/0x10
>      [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
>      [<c012f8b2>] run_timer_softirq+0x162/0x1c0
>      [<c0471ab0>] ? vortex_timer+0x0/0x3e0
>      [<c012b361>] local_bh_enable_ip+0xd1/0xe0
>      [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
>      [<c0471b92>] vortex_timer+0xe2/0x3e0
>      [<c014743b>] ? trace_hardirqs_on+0xb/0x10
>      [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
>      [<c012f8b2>] run_timer_softirq+0x162/0x1c0
>      [<c0471ab0>] ? vortex_timer+0x0/0x3e0
>      [<c0471ab0>] ? vortex_timer+0x0/0x3e0
>      [<c012b60a>] __do_softirq+0x9a/0x160
>      [<c012b570>] ? __do_softirq+0x0/0x160
>      [<c0106775>] call_on_stack+0x15/0x30
>      [<c012b4f5>] ? irq_exit+0x55/0x60
>      [<c0106e85>] ? do_IRQ+0x85/0xd0
>      [<c0147391>] ? trace_hardirqs_on_caller+0xc1/0x160
>      [<c0104888>] ? common_interrupt+0x28/0x30
>      [<c08d8ac8>] ? mutex_unlock+0x8/0x10
>      [<c08d8180>] ? _cond_resched+0x10/0x30
>      [<c07a3be7>] ? netpoll_setup+0x117/0x390
>      [<c0cbfcfe>] ? init_netconsole+0x14e/0x1b0
>      [<c013d539>] ? ktime_get+0x19/0x40
>      [<c0c9bab2>] ? kernel_init+0x1b2/0x2c0
>      [<c0cbfbb0>] ? init_netconsole+0x0/0x1b0
>      [<c0396aa4>] ? trace_hardirqs_on_thunk+0xc/0x10
>      [<c0103f12>] ? restore_nocheck_notrace+0x0/0xe
>      [<c0c9b900>] ? kernel_init+0x0/0x2c0
>      [<c0c9b900>] ? kernel_init+0x0/0x2c0
>      [<c0104aa7>] ? kernel_thread_helper+0x7/0x10
>      =======================
>     ---[ end trace 37f9c502aff112e0 ]---
>     console [netcon0] enabled
>     netconsole: network logging started
>     initcall init_netconsole+0x0/0x1b0 returned 0 after 2914 msecs
>     
>     looking at the driver I think the bug is real and the fix actually
>     is trivial.
>     
>     vp->lock is also taken in hardware IRQ context, so we _have_ to always
>     use irqsafe locking. As we run in a timer with IRQs disabled,
>     we can simply use spin_lock.
>     
>     Cc: <stable@kernel.org>
>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
> index 2edda8c..aabad8c 100644
> --- a/drivers/net/3c59x.c
> +++ b/drivers/net/3c59x.c
> @@ -1768,9 +1768,10 @@ vortex_timer(unsigned long data)
>  	case XCVR_MII: case XCVR_NWAY:
>  		{
>  			ok = 1;
> -			spin_lock_bh(&vp->lock);
> +			/* Interrupts are already disabled */
> +			spin_lock(&vp->lock);
>  			vortex_check_media(dev, 0);
> -			spin_unlock_bh(&vp->lock);
> +			spin_unlock(&vp->lock);

applied



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-27  5:33                   ` Jeff Garzik
@ 2008-06-27 10:53                     ` Ingo Molnar
  2008-06-27 10:56                       ` Johannes Berg
  0 siblings, 1 reply; 18+ messages in thread
From: Ingo Molnar @ 2008-06-27 10:53 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Michael Buesch, Johannes Berg, Linus Torvalds, Linux Kernel list,
	David Ellingsworth, linux-wireless, netdev


* Jeff Garzik <jgarzik@pobox.com> wrote:

> applied

track record of 4 days of testing: the warning indeed went away and no 
other funnies in this area happened on that box. Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH/RFC] remove irqs_disabled warning from local_bh_enable
  2008-06-27 10:53                     ` Ingo Molnar
@ 2008-06-27 10:56                       ` Johannes Berg
  0 siblings, 0 replies; 18+ messages in thread
From: Johannes Berg @ 2008-06-27 10:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeff Garzik, Michael Buesch, Linus Torvalds, Linux Kernel list,
	David Ellingsworth, linux-wireless, netdev

[-- Attachment #1: Type: text/plain, Size: 352 bytes --]

On Fri, 2008-06-27 at 12:53 +0200, Ingo Molnar wrote:
> * Jeff Garzik <jgarzik@pobox.com> wrote:
> 
> > applied
> 
> track record of 4 days of testing: the warning indeed went away and no 
> other funnies in this area happened on that box. Thanks,

I've been running with my change for a while too now and never noticed
anything.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2008-06-27 10:57 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-17 21:57 [PATCH/RFC] remove irqs_disabled warning from local_bh_enable Johannes Berg
2008-06-17 23:08 ` David Miller
2008-06-17 23:55 ` Linus Torvalds
2008-06-18  7:01   ` Peter Zijlstra
2008-06-18  7:29   ` Johannes Berg
2008-06-20 13:46     ` Ingo Molnar
2008-06-20 15:27       ` Ingo Molnar
2008-06-20 15:36         ` Michael Buesch
2008-06-20 15:55           ` Ingo Molnar
2008-06-20 16:01             ` Michael Buesch
2008-06-20 16:18               ` Michael Buesch
2008-06-23  8:41                 ` Ingo Molnar
2008-06-23  9:22                   ` Michael Buesch
2008-06-27  5:33                   ` Jeff Garzik
2008-06-27 10:53                     ` Ingo Molnar
2008-06-27 10:56                       ` Johannes Berg
2008-06-20 15:43         ` Johannes Berg
2008-06-20 15:46           ` Johannes Berg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox