[patch] revert: [NET]: Fix races in net_rx

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
@ 2007-07-16  9:12 Ingo Molnar
  2007-07-16 10:35 ` Olaf Kirch
                   ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-16  9:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, olaf.kirch, davem


current -git broke my main testbox. No TCP/IP networking to/from the box 
and e1000 would time out in xmit:

 NETDEV WATCHDOG: eth0: transmit timed out
 e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
   Tx Queue             <0>
   TDH                  <95>
   TDT                  <95>
   next_to_use          <95>
   next_to_clean        <ea>
 buffer_info[next_to_clean]
   time_stamp           <ffff8f43>
   next_to_watch        <ea>
   jiffies              <ffffb5cc>
   next_to_watch.status <1>

After a bisection session the bad commit turned out to be:

 29578624e354f56143d92510fff33a8b2aaa2c03 is first bad commit
 commit 29578624e354f56143d92510fff33a8b2aaa2c03
 Author: Olaf Kirch <olaf.kirch@oracle.com>
 Date:   Wed Jul 11 19:32:02 2007 -0700

     [NET]: Fix races in net_rx_action vs netpoll.

     Keep netpoll/poll_napi from messing with the poll_list.
     Only net_rx_action is allowed to manipulate the list.

     Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
     Signed-off-by: David S. Miller <davem@davemloft.net>

and indeed the testbox uses netconsole (it's a laptop so this is the 
only viable remote debugging option). Applying the revert patch below 
makes it work again.

	Ingo

------------------>
Subject: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
From: Ingo Molnar <mingo@elte.hu>

commit 29578624 causes netconsole failures:

  NETDEV WATCHDOG: eth0: transmit timed out
  e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
    Tx Queue             <0>
    TDH                  <95>
    TDT                  <95>
    next_to_use          <95>
    next_to_clean        <ea>
  buffer_info[next_to_clean]
    time_stamp           <ffff8f43>
    next_to_watch        <ea>
    jiffies              <ffffb5cc>
    next_to_watch.status <1>

revert it for now, to make my testsystem work.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/netdevice.h |   10 ----------
 net/core/netpoll.c        |    8 --------
 2 files changed, 18 deletions(-)

Index: linux/include/linux/netdevice.h
===================================================================
--- linux.orig/include/linux/netdevice.h
+++ linux/include/linux/netdevice.h
@@ -261,8 +261,6 @@ enum netdev_state_t
 	__LINK_STATE_LINKWATCH_PENDING,
 	__LINK_STATE_DORMANT,
 	__LINK_STATE_QDISC_RUNNING,
-	/* Set by the netpoll NAPI code */
-	__LINK_STATE_POLL_LIST_FROZEN,
 };
 
 
@@ -1016,14 +1014,6 @@ static inline void netif_rx_complete(str
 {
 	unsigned long flags;
 
-#ifdef CONFIG_NETPOLL
-	/* Prevent race with netpoll - yes, this is a kludge.
-	 * But at least it doesn't penalize the non-netpoll
-	 * code path. */
-	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state))
-		return;
-#endif
-
 	local_irq_save(flags);
 	__netif_rx_complete(dev);
 	local_irq_restore(flags);
Index: linux/net/core/netpoll.c
===================================================================
--- linux.orig/net/core/netpoll.c
+++ linux/net/core/netpoll.c
@@ -124,13 +124,6 @@ static void poll_napi(struct netpoll *np
 	if (test_bit(__LINK_STATE_RX_SCHED, &np->dev->state) &&
 	    npinfo->poll_owner != smp_processor_id() &&
 	    spin_trylock(&npinfo->poll_lock)) {
-		/* When calling dev->poll from poll_napi, we may end up in
-		 * netif_rx_complete. However, only the CPU to which the
-		 * device was queued is allowed to remove it from poll_list.
-		 * Setting POLL_LIST_FROZEN tells netif_rx_complete
-		 * to leave the NAPI state alone.
-		 */
-		set_bit(__LINK_STATE_POLL_LIST_FROZEN, &np->dev->state);
 		npinfo->rx_flags |= NETPOLL_RX_DROP;
 		atomic_inc(&trapped);
 
@@ -138,7 +131,6 @@ static void poll_napi(struct netpoll *np
 
 		atomic_dec(&trapped);
 		npinfo->rx_flags &= ~NETPOLL_RX_DROP;
-		clear_bit(__LINK_STATE_POLL_LIST_FROZEN, &np->dev->state);
 		spin_unlock(&npinfo->poll_lock);
 	}
 }

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16  9:12 [patch] revert: [NET]: Fix races in net_rx_action vs netpoll Ingo Molnar
@ 2007-07-16 10:35 ` Olaf Kirch
  2007-07-16 11:26 ` David Miller
  2007-07-17  5:46 ` Jarek Poplawski
  2 siblings, 0 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-16 10:35 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linus Torvalds, linux-kernel, davem

On Monday 16 July 2007 11:12, Ingo Molnar wrote:
> After a bisection session the bad commit turned out to be:
> 
>  29578624e354f56143d92510fff33a8b2aaa2c03 is first bad commit
>  commit 29578624e354f56143d92510fff33a8b2aaa2c03
>  Author: Olaf Kirch <olaf.kirch@oracle.com>
>  Date:   Wed Jul 11 19:32:02 2007 -0700
> 
>      [NET]: Fix races in net_rx_action vs netpoll.
> 
>      Keep netpoll/poll_napi from messing with the poll_list.
>      Only net_rx_action is allowed to manipulate the list.
> 
>      Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
>      Signed-off-by: David S. Miller <davem@davemloft.net>

Apologies - I'll look into it.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16  9:12 [patch] revert: [NET]: Fix races in net_rx_action vs netpoll Ingo Molnar
  2007-07-16 10:35 ` Olaf Kirch
@ 2007-07-16 11:26 ` David Miller
  2007-07-16 12:18   ` Olaf Kirch
                     ` (2 more replies)
  2007-07-17  5:46 ` Jarek Poplawski
  2 siblings, 3 replies; 67+ messages in thread
From: David Miller @ 2007-07-16 11:26 UTC (permalink / raw)
  To: mingo; +Cc: torvalds, linux-kernel, olaf.kirch

From: Ingo Molnar <mingo@elte.hu>
Date: Mon, 16 Jul 2007 11:12:36 +0200

> Applying the revert patch below makes it work again.

Well, let's figure out why before we revert because it
is attempting to fix a legitimate bug.

Olaf, any ideas?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 11:26 ` David Miller
@ 2007-07-16 12:18   ` Olaf Kirch
  2007-07-16 13:29     ` Ingo Molnar
  2007-07-16 21:09   ` Ingo Molnar
  2007-07-16 21:40   ` Linus Torvalds
  2 siblings, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-16 12:18 UTC (permalink / raw)
  To: David Miller; +Cc: mingo, torvalds, linux-kernel

On Monday 16 July 2007 13:26, David Miller wrote:
> Well, let's figure out why before we revert because it
> is attempting to fix a legitimate bug.
> 
> Olaf, any ideas?

It seems as if the card is stuck in NAPI mode without being
serviced by net_rx_action.

Ingo, is this a UP or SMP machine? Are you still getting output
from netconsole, or is the network down completely?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 12:18   ` Olaf Kirch
@ 2007-07-16 13:29     ` Ingo Molnar
  0 siblings, 0 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-16 13:29 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: David Miller, torvalds, linux-kernel


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> On Monday 16 July 2007 13:26, David Miller wrote:
> > Well, let's figure out why before we revert because it
> > is attempting to fix a legitimate bug.
> > 
> > Olaf, any ideas?
> 
> It seems as if the card is stuck in NAPI mode without being serviced 
> by net_rx_action.
> 
> Ingo, is this a UP or SMP machine? Are you still getting output from 
> netconsole, or is the network down completely?

there is some netconsole output but that too gets stuck after the tx 
timeout message. The e1000 options are:

CONFIG_E1000=y
CONFIG_E1000_NAPI=y
# CONFIG_E1000_DISABLE_PACKET_SPLIT is not set

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 11:26 ` David Miller
  2007-07-16 12:18   ` Olaf Kirch
@ 2007-07-16 21:09   ` Ingo Molnar
  2007-07-16 22:06     ` David Miller
  2007-07-16 21:40   ` Linus Torvalds
  2 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-16 21:09 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, linux-kernel, olaf.kirch

* David Miller <davem@davemloft.net> wrote:

> From: Ingo Molnar <mingo@elte.hu>
> Date: Mon, 16 Jul 2007 11:12:36 +0200
> 
> > Applying the revert patch below makes it work again.
> 
> Well, let's figure out why before we revert because it is attempting 
> to fix a legitimate bug.

yeah, no doubt about it, but this fix breaks my box _for sure_, while 
whatever problem the commit fixed, it was not something that ever 
triggered on my boxes =B-)

so ... i can promise to test whatever new version of the patch Olaf 
sends me (the problem is easy to reproduce and easy to test, so i can 
check it all in a heartbeat), so to get things back on track, and to 
value the time i spent on bisecting this, could we please apply the 
revert upstream (temporarily), because Linus' -git tree as of today is 
still broken for me. (and likely broken for others)

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 21:09   ` Ingo Molnar
@ 2007-07-16 22:06     ` David Miller
  0 siblings, 0 replies; 67+ messages in thread
From: David Miller @ 2007-07-16 22:06 UTC (permalink / raw)
  To: mingo; +Cc: torvalds, linux-kernel, olaf.kirch

From: Ingo Molnar <mingo@elte.hu>
Date: Mon, 16 Jul 2007 23:09:24 +0200

> so ... i can promise to test whatever new version of the patch Olaf 
> sends me (the problem is easy to reproduce and easy to test, so i can 
> check it all in a heartbeat), so to get things back on track, and to 
> value the time i spent on bisecting this, could we please apply the 
> revert upstream (temporarily), because Linus' -git tree as of today is 
> still broken for me. (and likely broken for others)

You can't revert patch from your tree locally in the mean time
because???

Ingo, please be reasonable and a little patient.  Things break
all the time, and it's not like 2.6.23-final is going out
tomorrow.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 11:26 ` David Miller
  2007-07-16 12:18   ` Olaf Kirch
  2007-07-16 21:09   ` Ingo Molnar
@ 2007-07-16 21:40   ` Linus Torvalds
  2007-07-16 21:51     ` Ingo Molnar
                       ` (2 more replies)
  2 siblings, 3 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-07-16 21:40 UTC (permalink / raw)
  To: David Miller; +Cc: mingo, linux-kernel, olaf.kirch

On Mon, 16 Jul 2007, David Miller wrote:
> 
> Well, let's figure out why before we revert because it
> is attempting to fix a legitimate bug.

I'm reverting it. I don't think there is any excuse for not reverting 
something that provably breaks somebody's machine. I don't want this to be 
on the regression list - if you figure out why it broke, you can always 
just resubmit a fixed patch.

Keeping a known broken patch around just because you want to figure out 
why it's broken seems pointless.

Just looking at the patch I see two options:

 - The change seems to always set the LIST_FROZEN bit when calling 
   ->poll(), and at least on e1000, the NAPI poll() routine ends up doing 
   that netif_rx_complete(), so we're *guaranteed* to always take the 
   early exit and not do anything.

   Looks fundamentally broken to me, and not worth saving. But maybe I'm 
   missing something (and it doesn't really seem to explain the write 
   timeout per se).

 - The early return from netif_rx_complete() ends up meaning that an 
   edge-triggered interrupt isn't handled properly, and will this never 
   happen again, since it never goes away.

   With MSI, edge-triggered interrupts are making a comeback in a big way, 
   and yeah, e1000 is one of the drivers that do MSI. Ingo might want to 
   confirm whether it's actually enabled for him, and whether turning it 
   off might hide the problem, but if that's it, then the whole patch is 
   fundamentally broken, and not worth saving.

but in either case (or, indeed, even if I didn't see any problem at all), 
I think reverting a patch that isn't needed is _always_ the right choice. 

If we don't know what caused a problem in the first place, or if the fix 
is known to be required for something else and reverting it would cause 
*another* regression, it would be another issue. But as it is, reverting 
it would seem to unquestionably get rid of a regression, and is thus a 
no-brainer.

No?

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 21:40   ` Linus Torvalds
@ 2007-07-16 21:51     ` Ingo Molnar
  2007-07-16 22:09       ` David Miller
  2007-07-16 22:08     ` David Miller
  2007-07-17  8:16     ` Olaf Kirch
  2 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-16 21:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Miller, linux-kernel, olaf.kirch


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

>    With MSI, edge-triggered interrupts are making a comeback in a big 
>    way, and yeah, e1000 is one of the drivers that do MSI. Ingo might 
>    want to confirm whether it's actually enabled for him, and whether 
>    turning it off might hide the problem, but if that's it, then the 
>    whole patch is fundamentally broken, and not worth saving.

MSI was off for the test:

  # CONFIG_PCI_MSI is not set

full config is at:

  http://redhat.com/~mingo/misc/config

the hang-log is at:

  http://redhat.com/~mingo/misc/hang.log

netconsole output went silent during the last tx-timeout message. (the 
above hang.log is from dmesg)

> but in either case (or, indeed, even if I didn't see any problem at 
> all), I think reverting a patch that isn't needed is _always_ the 
> right choice.
> 
> If we don't know what caused a problem in the first place, or if the 
> fix is known to be required for something else and reverting it would 
> cause *another* regression, it would be another issue. But as it is, 
> reverting it would seem to unquestionably get rid of a regression, and 
> is thus a no-brainer.
> 
> No?

i also offered to quickly try any test-version of the fixed patch, so 
there's a real and deterministic path towards fixing the patch. The 
regression is obvious and triggers all the time.

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 21:51     ` Ingo Molnar
@ 2007-07-16 22:09       ` David Miller
  2007-07-16 22:37         ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: David Miller @ 2007-07-16 22:09 UTC (permalink / raw)
  To: mingo; +Cc: torvalds, linux-kernel, olaf.kirch

From: Ingo Molnar <mingo@elte.hu>
Date: Mon, 16 Jul 2007 23:51:17 +0200

> i also offered to quickly try any test-version of the fixed patch, so 
> there's a real and deterministic path towards fixing the patch. The 
> regression is obvious and triggers all the time.

For you.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 22:09       ` David Miller
@ 2007-07-16 22:37         ` Ingo Molnar
  2007-07-16 22:57           ` David Miller
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-16 22:37 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, linux-kernel, olaf.kirch

* David Miller <davem@davemloft.net> wrote:

> From: Ingo Molnar <mingo@elte.hu>
> Date: Mon, 16 Jul 2007 23:51:17 +0200
> 
> > i also offered to quickly try any test-version of the fixed patch, so 
> > there's a real and deterministic path towards fixing the patch. The 
> > regression is obvious and triggers all the time.
> 
> For you.

I can certainly keep the revert around in my trees. (although it's a 
complication, i have to take care for it to never leak out into any 
external trees, etc. - but it's not a big issue)

Fundamentally, i trust Olaf to fix this quickly, and i dont want to make 
a too big fuss about this, but in general it's always better to revert 
patches causing known regressions (unless the revert is hugely complex 
and other changes depend on it - but this isnt the case here). I can 
also run whatever test-patches of Olaf, that would instrument/dump 
whatever info is needed to fix this. So Olaf's debugging effort is not 
hindered in any way as far as i can see.

I think if you leaned back and thought it through, and if you applied 
this scenario to a bad scheduler commit from me that broke your box, 
you'd readily agree with me =B-) (which scenario is purely hypothetical, 
my scheduler commits are all 100% perfect of course ;-)

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 22:37         ` Ingo Molnar
@ 2007-07-16 22:57           ` David Miller
  2007-07-17 18:09             ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: David Miller @ 2007-07-16 22:57 UTC (permalink / raw)
  To: mingo; +Cc: torvalds, linux-kernel, olaf.kirch

From: Ingo Molnar <mingo@elte.hu>
Date: Tue, 17 Jul 2007 00:37:18 +0200

> I think if you leaned back and thought it through, and if you applied 
> this scenario to a bad scheduler commit from me that broke your box, 
> you'd readily agree with me =B-) (which scenario is purely hypothetical, 
> my scheduler commits are all 100% perfect of course ;-)

Actually I'd probably send you a patch for any bug I found that
triggered on sparc64, since that's faster than asking you to fix a bug
that you are unlikely to be able to trigger on your own systems.

But that's just how I operate.

Ask Thomas Gleixner.  Every single hrtimers/nohz bug I've found on
sparc64, I sent him either a full fix or a full analysis of the bug
and a description of exactly what is going on and what needs to be
changed so he can compose the bug fix patch with minimal effort.

I don't ask people to revert, ever.  It's never necessary when the
parties involved are extremely competent and responsive.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 22:57           ` David Miller
@ 2007-07-17 18:09             ` Ingo Molnar
  0 siblings, 0 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-17 18:09 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, linux-kernel, olaf.kirch

* David Miller <davem@davemloft.net> wrote:

> From: Ingo Molnar <mingo@elte.hu>
> Date: Tue, 17 Jul 2007 00:37:18 +0200
> 
> > I think if you leaned back and thought it through, and if you 
> > applied this scenario to a bad scheduler commit from me that broke 
> > your box, you'd readily agree with me =B-) (which scenario is purely 
> > hypothetical, my scheduler commits are all 100% perfect of course 
> > ;-)
> 
> Actually I'd probably send you a patch for any bug I found that 
> triggered on sparc64, since that's faster than asking you to fix a bug 
> that you are unlikely to be able to trigger on your own systems.
> 
> But that's just how I operate.
> 
> Ask Thomas Gleixner.  Every single hrtimers/nohz bug I've found on 
> sparc64, I sent him either a full fix or a full analysis of the bug 
> and a description of exactly what is going on and what needs to be 
> changed so he can compose the bug fix patch with minimal effort.

yeah, i too usually try to fix bugs straight away - just check:

   git-log net drivers/net | grep "Author: Ingo Molnar"

but in this current case i was freshly back from a few days off, had 
quite a backlog after a rather complex set of merges and i just didnt 
have the time.

And i really applaud you for the clockevents/dynticks contributions (i 
loved your dynticks patches and the clockevents framework fixes that 
resulted out of it), but i really, really think this case is apples to 
oranges. I had some urgent hacking to do in some completely different 
area (not networking) and i ran across that networking regression and 
spent more than an hour on bisecting it.

The bad commit looked simple so i thought the person who is doing it 
(Olaf) would immediately see the bug (i certainly didnt see it). It's 
also not that easy to debug that box, the only remote debugging 
interface to it is ... netconsole. If i were into networking changes 
right now i'd probably have tried to debug and fix it straight away.

So instead i opted to bisect it, to give you the exact bad commit, give 
you full logs of the incident and an offer to run arbitrary patches 
immediately.

There's also little loss from the revert AFAICS: the commit went into 
your tree on July 11th and went upstream July 12th and i found it on 
July 16th. So (unless the commit dates are deceiving me) it does not 
seem to be an over-tested patch with lots of QA value (yet), and it's 
not some critical commit that another 100 commits grew to depend on. 
It's nevertheless a fix we want to have, and i'm willing to test 
anything.

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 21:40   ` Linus Torvalds
  2007-07-16 21:51     ` Ingo Molnar
@ 2007-07-16 22:08     ` David Miller
  2007-07-16 22:29       ` Linus Torvalds
  2007-07-17  7:37       ` Olaf Kirch
  2007-07-17  8:16     ` Olaf Kirch
  2 siblings, 2 replies; 67+ messages in thread
From: David Miller @ 2007-07-16 22:08 UTC (permalink / raw)
  To: torvalds; +Cc: mingo, linux-kernel, olaf.kirch

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 16 Jul 2007 14:40:38 -0700 (PDT)

> If we don't know what caused a problem in the first place, or if the fix 
> is known to be required for something else and reverting it would cause 
> *another* regression, it would be another issue. But as it is, reverting 
> it would seem to unquestionably get rid of a regression, and is thus a 
> no-brainer.
> 
> No?

Sure, but I thought it would be nice to give Olaf a day or two to
figure out what's going on rather than have the knee-jerk reaction to
just revert.

Ingo is the only person hitting and reporting this and last time I
checked he is competent enough to revert the thing locally in his own
trees, right? :-)

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 22:08     ` David Miller
@ 2007-07-16 22:29       ` Linus Torvalds
  2007-07-16 22:52         ` David Miller
  2007-07-16 23:17         ` Matt Mackall
  2007-07-17  7:37       ` Olaf Kirch
  1 sibling, 2 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-07-16 22:29 UTC (permalink / raw)
  To: David Miller; +Cc: mingo, linux-kernel, olaf.kirch

On Mon, 16 Jul 2007, David Miller wrote:
> 
> Ingo is the only person hitting and reporting this and last time I
> checked he is competent enough to revert the thing locally in his own
> trees, right? :-)

Umm. And your suggestion is what? Wait until -rc1, when non-developers 
(the kinds of people who are supposed to *not* be competent enough) come 
along, and report the problem?

That's explicitly what I do *not* want to happen.

If we knew something was wrong before the -rc1 release, all the better: we 
can avoid havign that bug in -rc1, and the people who test it will tell us 
about the problems we did *not* know about.

In contrast, if we leave a known bug in, that will just mean that other 
problems won't be found either, because people get hung up on the known 
bug.

Regressions should be fixed early and often. Otherwise there's no point in 
making the code available early and often in the first place.

A regression is basically MUCH WORSE than just about any possible bug that 
patch could have fixed, and I want people to really think that way.

The rule for *everybody* should be:

	"Regressions get fixed immediately"

with no ifs, buts or maybes. In fact, it would be even better if the kinds 
of patches that cause regressions wouldn't hit my tree at all, but once 
they have hit it, and once they've been identified, they get reverted.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 22:29       ` Linus Torvalds
@ 2007-07-16 22:52         ` David Miller
  2007-07-16 23:17         ` Matt Mackall
  1 sibling, 0 replies; 67+ messages in thread
From: David Miller @ 2007-07-16 22:52 UTC (permalink / raw)
  To: torvalds; +Cc: mingo, linux-kernel, olaf.kirch

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 16 Jul 2007 15:29:15 -0700 (PDT)

> If we knew something was wrong before the -rc1 release, all the better: we 
> can avoid havign that bug in -rc1, and the people who test it will tell us 
> about the problems we did *not* know about.
> 
> In contrast, if we leave a known bug in, that will just mean that other 
> problems won't be found either, because people get hung up on the known 
> bug.

Fair enough.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 22:29       ` Linus Torvalds
  2007-07-16 22:52         ` David Miller
@ 2007-07-16 23:17         ` Matt Mackall
  2007-07-16 23:34           ` Linus Torvalds
  1 sibling, 1 reply; 67+ messages in thread
From: Matt Mackall @ 2007-07-16 23:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Miller, mingo, linux-kernel, olaf.kirch

On Mon, Jul 16, 2007 at 03:29:15PM -0700, Linus Torvalds wrote:
> 
> 
> On Mon, 16 Jul 2007, David Miller wrote:
> > 
> > Ingo is the only person hitting and reporting this and last time I
> > checked he is competent enough to revert the thing locally in his own
> > trees, right? :-)
> 
> Umm. And your suggestion is what? Wait until -rc1, when non-developers 
> (the kinds of people who are supposed to *not* be competent enough) come 
> along, and report the problem?
> 
> That's explicitly what I do *not* want to happen.
> 
> If we knew something was wrong before the -rc1 release, all the better: we 
> can avoid havign that bug in -rc1, and the people who test it will tell us 
> about the problems we did *not* know about.

Unfortunately the particular patch from Olaf is presumably covering up
another bug that other people (including Olaf) had hit. So reverting
it is going to introduce a different regression.

I think this particular problem has _already_ bounced back and forth
once. But I haven't been keeping count as most of the relevant patches
weren't cc:ed to me.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 23:17         ` Matt Mackall
@ 2007-07-16 23:34           ` Linus Torvalds
  0 siblings, 0 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-07-16 23:34 UTC (permalink / raw)
  To: Matt Mackall; +Cc: David Miller, mingo, linux-kernel, olaf.kirch

On Mon, 16 Jul 2007, Matt Mackall wrote:
> 
> Unfortunately the particular patch from Olaf is presumably covering up
> another bug that other people (including Olaf) had hit. So reverting
> it is going to introduce a different regression.

It's not a regression, it's an old problem.

And the rule is: machines that _used_ to work are more important. That 
rule got hewn into stone when the ACPI layer changes constantly kept 
breaking things that used to work, while "fixing" other things.

So we don't fix bugs by introducing new problems.  That way lies madness, 
and nobody ever knows if you actually make any real progress at all. Is it 
two steps forwards, one step back, or one step forward and two steps back? 
Different people will give different answers.

That's why regressions are _so_ much more important than new bugfixes. 
Because it's much more important to make slow but _steady_ progress, and 
have people know things improve (or at least not "deprove"). We don't want 
any kind of "brownian motion development".

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 22:08     ` David Miller
  2007-07-16 22:29       ` Linus Torvalds
@ 2007-07-17  7:37       ` Olaf Kirch
  1 sibling, 0 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-17  7:37 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, mingo, linux-kernel

On Tuesday 17 July 2007 00:08, David Miller wrote:
> Sure, but I thought it would be nice to give Olaf a day or two to
> figure out what's going on rather than have the knee-jerk reaction to
> just revert.

Oh, reverting is fine with me. I'll just resubmit the patch.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16 21:40   ` Linus Torvalds
  2007-07-16 21:51     ` Ingo Molnar
  2007-07-16 22:08     ` David Miller
@ 2007-07-17  8:16     ` Olaf Kirch
  2 siblings, 0 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-17  8:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Miller, mingo, linux-kernel

On Monday 16 July 2007 23:40, Linus Torvalds wrote:
>  - The change seems to always set the LIST_FROZEN bit when calling 
>    ->poll(), and at least on e1000, the NAPI poll() routine ends up doing 
>    that netif_rx_complete(), so we're *guaranteed* to always take the 
>    early exit and not do anything.

That is only in the poll_napi case, where netpoll tries to push pending
frames. The default caller for dev->poll() is net_rx_action.

We only ever get into this particular code branch in poll_napi when
the RX_SCHED bit is already set, ie someone put the device on the NAPI
poll_list and raised the softirq. poll_napi is just doing manually what
net_rx_action would do anyway once the softirq is serviced.

>  - The early return from netif_rx_complete() ends up meaning that an 
>    edge-triggered interrupt isn't handled properly, and will this never 
>    happen again, since it never goes away.

Sorry, I may be sitting on my brain this morning, but I don't understand
how skipping netif_rx_complete would affect ACKing of interrupts.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-16  9:12 [patch] revert: [NET]: Fix races in net_rx_action vs netpoll Ingo Molnar
  2007-07-16 10:35 ` Olaf Kirch
  2007-07-16 11:26 ` David Miller
@ 2007-07-17  5:46 ` Jarek Poplawski
  2007-07-17  6:14   ` Jarek Poplawski
  2 siblings, 1 reply; 67+ messages in thread
From: Jarek Poplawski @ 2007-07-17  5:46 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linus Torvalds, linux-kernel, olaf.kirch, davem

On 16-07-2007 11:12, Ingo Molnar wrote:
> current -git broke my main testbox. No TCP/IP networking to/from the box 
> and e1000 would time out in xmit:
> 
>  NETDEV WATCHDOG: eth0: transmit timed out
>  e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
...

Olaf, I think this error can trigger in this place:

> static void net_rx_action(struct softirq_action *h)
> {
>         struct softnet_data *queue = &__get_cpu_var(softnet_data);
>         unsigned long start_time = jiffies;
>         int budget = netdev_budget;
>         void *have;
> 
>         local_irq_disable();
> 
>         while (!list_empty(&queue->poll_list)) {
>                 struct net_device *dev;
> 
>                 if (budget <= 0 || jiffies - start_time > 1)
>                         goto softnet_break;
> 
>                 local_irq_enable();
> 
>                 dev = list_entry(queue->poll_list.next,
>                                  struct net_device, poll_list);
>                 have = netpoll_poll_lock(dev);
> 
>                 if (dev->quota <= 0 || dev->poll(dev, &budget)) {

If after poll_napi dev->quota <= 0 dev->poll is not run and
__LINK_STATE_RX_SCHED bit (plus dev->poll_list) stays uncleared.

Regards,
Jarek P.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17  5:46 ` Jarek Poplawski
@ 2007-07-17  6:14   ` Jarek Poplawski
  2007-07-17  7:55     ` Olaf Kirch
  0 siblings, 1 reply; 67+ messages in thread
From: Jarek Poplawski @ 2007-07-17  6:14 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linus Torvalds, linux-kernel, olaf.kirch, davem

On Tue, Jul 17, 2007 at 07:46:39AM +0200, Jarek Poplawski wrote:
...
> > static void net_rx_action(struct softirq_action *h)
> > {
> >         struct softnet_data *queue = &__get_cpu_var(softnet_data);
> >         unsigned long start_time = jiffies;
> >         int budget = netdev_budget;
> >         void *have;
> > 
> >         local_irq_disable();
> > 
> >         while (!list_empty(&queue->poll_list)) {
> >                 struct net_device *dev;
> > 
> >                 if (budget <= 0 || jiffies - start_time > 1)
> >                         goto softnet_break;
> > 
> >                 local_irq_enable();
> > 
> >                 dev = list_entry(queue->poll_list.next,
> >                                  struct net_device, poll_list);
> >                 have = netpoll_poll_lock(dev);
> > 
> >                 if (dev->quota <= 0 || dev->poll(dev, &budget)) {
> 
> If after poll_napi dev->quota <= 0 dev->poll is not run and
> __LINK_STATE_RX_SCHED bit (plus dev->poll_list) stays uncleared.

Or, more precisely dev->poll_list will be cleared just after this,
and net_rx_action returns with __LINK_STATE_RX_SCHED bit set.

Jarek P.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17  6:14   ` Jarek Poplawski
@ 2007-07-17  7:55     ` Olaf Kirch
  2007-07-17  8:28       ` Olaf Kirch
  0 siblings, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-17  7:55 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Ingo Molnar, Linus Torvalds, linux-kernel, davem

On Tuesday 17 July 2007 08:14, Jarek Poplawski wrote:
> > If after poll_napi dev->quota <= 0 dev->poll is not run and
> > __LINK_STATE_RX_SCHED bit (plus dev->poll_list) stays uncleared.
> 
> Or, more precisely dev->poll_list will be cleared just after this,
> and net_rx_action returns with __LINK_STATE_RX_SCHED bit set.

I don't think so. dev will remain on poll_list.

What I find more problematic about this portion of code though
is that once a net_device is over quota, net_rx_action will
loop for up to one jiffy, even if there's just this one device on
the poll_list.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17  7:55     ` Olaf Kirch
@ 2007-07-17  8:28       ` Olaf Kirch
  2007-07-17  8:57         ` Ingo Molnar
  2007-07-17  9:12         ` Jarek Poplawski
  0 siblings, 2 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-17  8:28 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Ingo Molnar, Linus Torvalds, linux-kernel, davem

On Tuesday 17 July 2007 09:55, Olaf Kirch wrote:
> What I find more problematic about this portion of code though
> is that once a net_device is over quota, net_rx_action will
> loop for up to one jiffy, even if there's just this one device on
> the poll_list.

Duh, wrong. For every loop, it'll add dev->weight to dev->quota,
so it'll be fine very soon.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17  8:28       ` Olaf Kirch
@ 2007-07-17  8:57         ` Ingo Molnar
  2007-07-17  9:29           ` Jarek Poplawski
                             ` (2 more replies)
  2007-07-17  9:12         ` Jarek Poplawski
  1 sibling, 3 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-17  8:57 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem


Olaf,

i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
the problem go away. So it's somehow also related to jiffies.

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17  8:57         ` Ingo Molnar
@ 2007-07-17  9:29           ` Jarek Poplawski
  2007-07-17 14:07           ` Olaf Kirch
  2007-07-17 17:49           ` Linus Torvalds
  2 siblings, 0 replies; 67+ messages in thread
From: Jarek Poplawski @ 2007-07-17  9:29 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Olaf Kirch, Linus Torvalds, linux-kernel, davem

On Tue, Jul 17, 2007 at 10:57:48AM +0200, Ingo Molnar wrote:
> 
> Olaf,
> 
> i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
> the problem go away. So it's somehow also related to jiffies.

IMHO it could be related with __LINK_STATE_RX_SCHED beeing set
too long e.g. between two softirqs. So, it could be really
a matter of timing out - not necessarily permanent error state.

Jarek P.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17  8:57         ` Ingo Molnar
  2007-07-17  9:29           ` Jarek Poplawski
@ 2007-07-17 14:07           ` Olaf Kirch
  2007-07-17 16:57             ` Ingo Molnar
  2007-07-17 17:49           ` Linus Torvalds
  2 siblings, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-17 14:07 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem

On Tuesday 17 July 2007 10:57, Ingo Molnar wrote:
> i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
> the problem go away. So it's somehow also related to jiffies.

There are several "Tx Hang detected" messages in the log, which looks
a lot as if net_rx_action never runs, or at least never calls
dev->poll on the e1000 nic.

Can you check whether/how often it bails out of net_rx_action taking
the softnet_break path?

My suspicion right now is that dev->quota goes way negative when
pushing out netconsole output. Normally, we bump the quota in
__net_rx_schedule. But the whole point of the patch is that
netpoll has no business removing the device from the poll_list,
so it stays there, and we don't end up calling __net_rx_schedule
as often as we would otherwise.

Can you try what happens if you change netif_rx_complete to
something like this:

	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state)) {
		dev->quota = dev->weight;
		return;
	}

This is just a hack to make sure that we don't go to insanely
negative quotas while sending packets through netpoll.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17 14:07           ` Olaf Kirch
@ 2007-07-17 16:57             ` Ingo Molnar
  2007-07-17 18:06               ` Olaf Kirch
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-17 16:57 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> Can you try what happens if you change netif_rx_complete to something 
> like this:
> 
> 	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state)) {
> 		dev->quota = dev->weight;
> 		return;
> 	}
> 
> This is just a hack to make sure that we don't go to insanely negative 
> quotas while sending packets through netpoll.

i've done the patch below, but it did not change the timeouts nor did it 
solve the 'no network' problem. netconsole output hung earlier as well.

i can try other things too. (it would be best if you sent me 
test/debug-patches, instead of letting me hack in the changes - that's 
the surest way of ensuring that i test exactly what you intended.)

	Ingo

------------->
Index: linux/include/linux/netdevice.h
===================================================================
--- linux.orig/include/linux/netdevice.h
+++ linux/include/linux/netdevice.h
@@ -1007,6 +1007,10 @@ static inline int netif_rx_reschedule(st
  */
 static inline void __netif_rx_complete(struct net_device *dev)
 {
+	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state)) {
+		dev->quota = dev->weight;
+		return;
+	}
 	BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
 	list_del(&dev->poll_list);
 	smp_mb__before_clear_bit();

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17 16:57             ` Ingo Molnar
@ 2007-07-17 18:06               ` Olaf Kirch
  2007-07-17 18:18                 ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-17 18:06 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem

Hi Ingo,

On Tuesday 17 July 2007 18:57, Ingo Molnar wrote:
> i've done the patch below, but it did not change the timeouts nor did it 
> solve the 'no network' problem. netconsole output hung earlier as well.
Hm, pity.

To rule out any e1000 problem, can you try the the following please,
both with HZ=250 and HZ=1000?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
---
 drivers/net/e1000/e1000_main.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: build-2.6/drivers/net/e1000/e1000_main.c
===================================================================
--- build-2.6.orig/drivers/net/e1000/e1000_main.c
+++ build-2.6/drivers/net/e1000/e1000_main.c
@@ -3871,10 +3871,17 @@ e1000_intr(int irq, void *data)
 		adapter->total_rx_bytes = 0;
 		adapter->total_rx_packets = 0;
 		__netif_rx_schedule(netdev);
-	} else
+	} else {
 		/* this really should not happen! if it does it is basically a
 		 * bug, but not a hard error, so enable ints and continue */
+		static unsigned int been_here = 0;
+
+		been_here++;
+		if (net_ratelimit())
+			printk(KERN_NOTICE "e1000_intr and rx_sched set (%u); state=0x%lx\n",
+					been_here, netdev->state);
 		e1000_irq_enable(adapter);
+	}
 #else
 	/* Writing IMC and IMS is needed for 82547.
 	 * Due to Hub Link bus being occupied, an interrupt

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17 18:06               ` Olaf Kirch
@ 2007-07-17 18:18                 ` Ingo Molnar
  2007-07-17 18:34                   ` Olaf Kirch
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-17 18:18 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> Hi Ingo,
> 
> On Tuesday 17 July 2007 18:57, Ingo Molnar wrote:
> > i've done the patch below, but it did not change the timeouts nor did it 
> > solve the 'no network' problem. netconsole output hung earlier as well.
> Hm, pity.
> 
> To rule out any e1000 problem, can you try the the following please, 
> both with HZ=250 and HZ=1000?

done, find the logs here:

 http://redhat.com/~mingo/misc/100hz.log
 http://redhat.com/~mingo/misc/1000hz.log

(one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just 
like HZ=250.)

no 'rx_sched set' messages in either case. Network still hung for 
HZ=100, and is working for HZ=1000.

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17 18:18                 ` Ingo Molnar
@ 2007-07-17 18:34                   ` Olaf Kirch
  2007-07-17 18:56                     ` Ingo Molnar
  2007-07-18 11:48                     ` Jarek Poplawski
  0 siblings, 2 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-17 18:34 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem

On Tuesday 17 July 2007 20:18, Ingo Molnar wrote:
> (one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just 
> like HZ=250.)
> 
> no 'rx_sched set' messages in either case. Network still hung for 
> HZ=100, and is working for HZ=1000.

Is this from dmesg or the netconsole output? I don't see any Tx Unit
Hang messages from e1000 or netdev watchdog messages present in your
earlier dmesg logs. So maybe these messages are there, but never
get logged?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17 18:34                   ` Olaf Kirch
@ 2007-07-17 18:56                     ` Ingo Molnar
  2007-07-18 12:04                       ` Olaf Kirch
  2007-07-18 11:48                     ` Jarek Poplawski
  1 sibling, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-17 18:56 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> On Tuesday 17 July 2007 20:18, Ingo Molnar wrote:
> > (one is HZ=100, the other HZ=1000. HZ=100 produces a hung network just 
> > like HZ=250.)
> > 
> > no 'rx_sched set' messages in either case. Network still hung for 
> > HZ=100, and is working for HZ=1000.
> 
> Is this from dmesg or the netconsole output? I don't see any Tx Unit 
> Hang messages from e1000 or netdev watchdog messages present in your 
> earlier dmesg logs. So maybe these messages are there, but never get 
> logged?

i logged these not via netconsole but via logging on over the console 
and using dmesg, so it should include everything. in the 100hz case the 
following seems to show the anomaly:

  NETDEV WATCHDOG: eth0: transmit timed out

indeed it's not followed by the e1000 messages. (perhaps i didnt wait 
long enough to reboot the box - i zapped it after a minute of uptime)

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17 18:56                     ` Ingo Molnar
@ 2007-07-18 12:04                       ` Olaf Kirch
  2007-07-18 12:41                         ` Ingo Molnar
  2007-07-18 12:48                         ` Ingo Molnar
  0 siblings, 2 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-18 12:04 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem

On Tuesday 17 July 2007 20:56, Ingo Molnar wrote:
> i logged these not via netconsole but via logging on over the console 
> and using dmesg, so it should include everything. in the 100hz case the 
> following seems to show the anomaly:
> 
>   NETDEV WATCHDOG: eth0: transmit timed out

So, it seems as if for some reason, dev->poll isn't called frequently
enough.

Here's a debugging patch that tries to locate the problem - can you give it
a try, please?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
--------
patching file drivers/net/e1000/e1000_main.c
---
 include/linux/netdevice.h |    1 +
 net/core/dev.c            |    7 +++++++
 net/core/netpoll.c        |    2 ++
 3 files changed, 10 insertions(+)

Index: build-2.6/include/linux/netdevice.h
===================================================================
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -692,6 +692,7 @@ struct softnet_data
 #ifdef CONFIG_NET_DMA
 	struct dma_chan		*net_dma;
 #endif
+	unsigned long		scheduled;
 };
 
 DECLARE_PER_CPU(struct softnet_data,softnet_data);
Index: build-2.6/net/core/dev.c
===================================================================
--- build-2.6.orig/net/core/dev.c
+++ build-2.6/net/core/dev.c
@@ -1199,6 +1199,7 @@ void __netif_rx_schedule(struct net_devi
 		dev->quota += dev->weight;
 	else
 		dev->quota = dev->weight;
+	__get_cpu_var(softnet_data).scheduled = jiffies;
 	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 	local_irq_restore(flags);
 }
@@ -2028,6 +2029,9 @@ static void net_rx_action(struct softirq
 
 	local_irq_disable();
 
+	/* Complain if we're scheduled way too late. */
+	WARN_ON(time_after(jiffies, __get_cpu_var(softnet_data).scheduled + HZ));
+
 	while (!list_empty(&queue->poll_list)) {
 		struct net_device *dev;
 
@@ -2038,8 +2042,10 @@ static void net_rx_action(struct softirq
 
 		dev = list_entry(queue->poll_list.next,
 				 struct net_device, poll_list);
+
 		have = netpoll_poll_lock(dev);
 
+		BUG_ON(test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state));
 		if (dev->quota <= 0 || dev->poll(dev, &budget)) {
 			netpoll_poll_unlock(have);
 			local_irq_disable();
@@ -2074,6 +2080,7 @@ out:
 
 softnet_break:
 	__get_cpu_var(netdev_rx_stat).time_squeeze++;
+	__get_cpu_var(softnet_data).scheduled = jiffies;
 	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 	goto out;
 }
Index: build-2.6/net/core/netpoll.c
===================================================================
--- build-2.6.orig/net/core/netpoll.c
+++ build-2.6/net/core/netpoll.c
@@ -130,6 +130,7 @@ static void poll_napi(struct netpoll *np
 		 * Setting POLL_LIST_FROZEN tells netif_rx_complete
 		 * to leave the NAPI state alone.
 		 */
+		local_bh_disable();
 		set_bit(__LINK_STATE_POLL_LIST_FROZEN, &np->dev->state);
 		npinfo->rx_flags |= NETPOLL_RX_DROP;
 		atomic_inc(&trapped);
@@ -140,6 +141,7 @@ static void poll_napi(struct netpoll *np
 		npinfo->rx_flags &= ~NETPOLL_RX_DROP;
 		clear_bit(__LINK_STATE_POLL_LIST_FROZEN, &np->dev->state);
 		spin_unlock(&npinfo->poll_lock);
+		local_bh_enable();
 	}
 }
 


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-18 12:04                       ` Olaf Kirch
@ 2007-07-18 12:41                         ` Ingo Molnar
  2007-07-18 12:48                         ` Ingo Molnar
  1 sibling, 0 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-18 12:41 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> On Tuesday 17 July 2007 20:56, Ingo Molnar wrote:
> > i logged these not via netconsole but via logging on over the console 
> > and using dmesg, so it should include everything. in the 100hz case the 
> > following seems to show the anomaly:
> > 
> >   NETDEV WATCHDOG: eth0: transmit timed out
> 
> So, it seems as if for some reason, dev->poll isn't called frequently 
> enough.
> 
> Here's a debugging patch that tries to locate the problem - can you 
> give it a try, please?

sure, give me a minute.

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-18 12:04                       ` Olaf Kirch
  2007-07-18 12:41                         ` Ingo Molnar
@ 2007-07-18 12:48                         ` Ingo Molnar
  2007-07-18 14:41                           ` Olaf Kirch
  1 sibling, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-18 12:48 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> On Tuesday 17 July 2007 20:56, Ingo Molnar wrote:
> > i logged these not via netconsole but via logging on over the console 
> > and using dmesg, so it should include everything. in the 100hz case the 
> > following seems to show the anomaly:
> > 
> >   NETDEV WATCHDOG: eth0: transmit timed out
> 
> So, it seems as if for some reason, dev->poll isn't called frequently 
> enough.
> 
> Here's a debugging patch that tries to locate the problem - can you 
> give it a try, please?

it triggers here:

 netconsole: device eth0 not up yet, forcing it
 netconsole: timeout waiting for carrier
 console [netcon0] enabled
 WARNING: at kernel/softirq.c:138 local_bh_enable()
  [<c0105e4a>] show_trace_log_lvl+0x19/0x2e
  [<c0105f43>] show_trace+0x12/0x14
  [<c0105f59>] dump_stack+0x14/0x16
  [<c0130f8d>] local_bh_enable+0x95/0x15d
  [<c03cf35b>] netpoll_poll+0xaf/0x361
  [<c03cf248>] netpoll_send_skb+0xe8/0x14c
  [<c03cf8d4>] netpoll_send_udp+0x258/0x260
  [<c02f4016>] write_msg+0x53/0x8d
  [<c012c87e>] __call_console_drivers+0x4e/0x5a
  [<c012c8e7>] _call_console_drivers+0x5d/0x61
  [<c012cf06>] release_console_sem+0x120/0x1c1
  [<c012d70a>] register_console+0x22e/0x236
  [<c02f3f98>] init_netconsole+0x55/0x67
  [<c05e28e5>] kernel_init+0x154/0x2d9
  [<c0105c5f>] kernel_thread_helper+0x7/0x10

I've uploaded the full log to:

 http://redhat.com/~mingo/misc/100hz.2.log

something i noticed: netconsole output seems to trickle through though, 
but very, very slowly (a packet once every 4 seconds or so). TCP/IP is 
not functional.

also, i'm using netconsole via the command line (both the network driver 
and netconsole is built into the bzImage), maybe that makes a 
difference?

(if there's any other data you'd like to see, let me know.)

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-18 12:48                         ` Ingo Molnar
@ 2007-07-18 14:41                           ` Olaf Kirch
  2007-07-18 16:43                             ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-18 14:41 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem

On Wednesday 18 July 2007 14:48, Ingo Molnar wrote:
> something i noticed: netconsole output seems to trickle through though, 
> but very, very slowly (a packet once every 4 seconds or so). TCP/IP is 
> not functional.
> 
> also, i'm using netconsole via the command line (both the network driver 
> and netconsole is built into the bzImage), maybe that makes a 
> difference?

Possibly - but so far there's nothing in the code that jumped at me.

Can you try the following please? I'm still pretty much in the dark;
so I'm adding a bunch of printks. Let's hope this doesn't change the
timing in a way that makes the bug disappear.

Thanks
Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-----

---
 include/linux/netdevice.h |    4 ++++
 net/core/dev.c            |   14 ++++++++++++++
 2 files changed, 18 insertions(+)

Index: build-2.6/include/linux/netdevice.h
===================================================================
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -692,6 +692,7 @@ struct softnet_data
 #ifdef CONFIG_NET_DMA
 	struct dma_chan		*net_dma;
 #endif
+	unsigned long		scheduled;
 };
 
 DECLARE_PER_CPU(struct softnet_data,softnet_data);
@@ -1026,6 +1027,9 @@ static inline void netif_rx_complete(str
 	/* Prevent race with netpoll - yes, this is a kludge.
 	 * But at least it doesn't penalize the non-netpoll
 	 * code path. */
+	printk(KERN_DEBUG "netif_rx_complete(%s)%s\n", dev->name,
+			test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state)?
+			" - frozen" : "");
 	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state))
 		return;
 #endif
Index: build-2.6/net/core/dev.c
===================================================================
--- build-2.6.orig/net/core/dev.c
+++ build-2.6/net/core/dev.c
@@ -1192,6 +1192,7 @@ void __netif_rx_schedule(struct net_devi
 {
 	unsigned long flags;
 
+	printk(KERN_DEBUG "__netif_rx_schedule(%s)\n", dev->name);
 	local_irq_save(flags);
 	dev_hold(dev);
 	list_add_tail(&dev->poll_list, &__get_cpu_var(softnet_data).poll_list);
@@ -1199,6 +1200,7 @@ void __netif_rx_schedule(struct net_devi
 		dev->quota += dev->weight;
 	else
 		dev->quota = dev->weight;
+	__get_cpu_var(softnet_data).scheduled = jiffies;
 	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 	local_irq_restore(flags);
 }
@@ -2028,6 +2030,9 @@ static void net_rx_action(struct softirq
 
 	local_irq_disable();
 
+	/* Complain if we're scheduled way too late. */
+	WARN_ON(time_after(jiffies, __get_cpu_var(softnet_data).scheduled + HZ));
+
 	while (!list_empty(&queue->poll_list)) {
 		struct net_device *dev;
 
@@ -2038,9 +2043,13 @@ static void net_rx_action(struct softirq
 
 		dev = list_entry(queue->poll_list.next,
 				 struct net_device, poll_list);
+
 		have = netpoll_poll_lock(dev);
 
+		BUG_ON(test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state));
 		if (dev->quota <= 0 || dev->poll(dev, &budget)) {
+			printk(KERN_DEBUG "netif_rx_action(%s) - keep on polling\n",
+					dev->name);
 			netpoll_poll_unlock(have);
 			local_irq_disable();
 			list_move_tail(&dev->poll_list, &queue->poll_list);
@@ -2049,6 +2058,10 @@ static void net_rx_action(struct softirq
 			else
 				dev->quota = dev->weight;
 		} else {
+			printk(KERN_DEBUG "netif_rx_action(%s) - done%s\n",
+					dev->name,
+					test_bit(__LINK_STATE_RX_SCHED, &dev->state)?
+					 "; rx_sched set" : "");
 			netpoll_poll_unlock(have);
 			dev_put(dev);
 			local_irq_disable();
@@ -2074,6 +2087,7 @@ out:
 
 softnet_break:
 	__get_cpu_var(netdev_rx_stat).time_squeeze++;
+	__get_cpu_var(softnet_data).scheduled = jiffies;
 	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 	goto out;
 }


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-18 14:41                           ` Olaf Kirch
@ 2007-07-18 16:43                             ` Ingo Molnar
  2007-07-19  9:09                               ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-18 16:43 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@ORACLE.COM> wrote:

> > also, i'm using netconsole via the command line (both the network 
> > driver and netconsole is built into the bzImage), maybe that makes a 
> > difference?
> 
> Possibly - but so far there's nothing in the code that jumped at me.
> 
> Can you try the following please? I'm still pretty much in the dark; 
> so I'm adding a bunch of printks. Let's hope this doesn't change the 
> timing in a way that makes the bug disappear.

hm ... the box wouldnt even boot up because the 'keep on polling' and 
'done' messages are scrolling across so fast. Perhaps every printk 
triggers a packet which triggers another printk ... etc?

here's the 15MB log of it:

  http://redhat.com/~mingo/misc/100hz.3.log.bz2

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-18 16:43                             ` Ingo Molnar
@ 2007-07-19  9:09                               ` Ingo Molnar
  2007-07-19  9:44                                 ` Olaf Kirch
  2007-07-19 10:17                                 ` Ingo Molnar
  0 siblings, 2 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19  9:09 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem

i have your original patch applied to my working tree to be able to 
observe this bug's behavior, and here's another observation: the problem 
seems to go away if i turn on CONFIG_NO_HZ. So it looks timing related 
indeed ...

but when the bug happens, it happens all the time, reboot after reboot. 
When it doesnt happen, networking and netlogging is robust for hours, 
reboot after reboot. That seems atypical for timing problems. I'm 
puzzled.

the e1000 in this laptop is historically pretty robust. The only problem 
i ever had with it were some rx/tx hw-engine latency problems [pings 
from the outside took up to 1 second to propagate] that were quickly 
fixed by the e1000 driver guys. Maybe that's related. (although it never 
caused total inavailability of networking - it was only latency 
problems)

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19  9:09                               ` Ingo Molnar
@ 2007-07-19  9:44                                 ` Olaf Kirch
  2007-07-19 10:01                                   ` Ingo Molnar
  2007-07-19 10:17                                 ` Ingo Molnar
  1 sibling, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19  9:44 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem

On Thursday 19 July 2007 11:09, Ingo Molnar wrote:
> the e1000 in this laptop is historically pretty robust. The only problem 
> i ever had with it were some rx/tx hw-engine latency problems [pings 
> from the outside took up to 1 second to propagate] that were quickly 
> fixed by the e1000 driver guys. Maybe that's related. (although it never 
> caused total inavailability of networking - it was only latency 
> problems)

I've been poring over this code for 3 days now, and I'm facing a blank
wall, mind-wise :-)

 -	it is pretty clear that net_rx_action is invoked every once
	in a while only. netdev watchdog timeouts are a pretty
	unmistakable sign for that.

 -	You say that netconsole output continues to trickle after
	the network gets wedged. This could be caused by the
	e1000 watchdog, which triggers a NIC interrupt "to ensure
	rx ring is cleaned". I assume that this triggers the
	regular e1000_intr, which succeeds in putting the NIC on
	the poll_list, and net_rx_action call dev->poll once.

	If this assumption is true, this means that
	 -	once an interrupt gets through, NAPI is working
		as designed
	 -	no other interrupts are arriving (Rx, Tx-completion)

So, can you verify whether there are any interrupts arriving on the
NIC after the network got wedged? You could also try
ethtool -s eth0 msglevel 65535 - would be interesting to see what
dmesg contains. If there's little to no debug output from the
driver, let it run for 10 seconds or so, in order to catch the
e1000 watchdog timer a few times.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19  9:44                                 ` Olaf Kirch
@ 2007-07-19 10:01                                   ` Ingo Molnar
  2007-07-19 10:37                                     ` Olaf Kirch
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 10:01 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

>  -	You say that netconsole output continues to trickle after
> 	the network gets wedged. This could be caused by the
> 	e1000 watchdog, which triggers a NIC interrupt "to ensure
> 	rx ring is cleaned". I assume that this triggers the
> 	regular e1000_intr, which succeeds in putting the NIC on
> 	the poll_list, and net_rx_action call dev->poll once.

no - it appears that 'trickle' only happened with one of your patches 
(to which i replied with that 'trickle' mail). With what i have booted 
now (only your original patch and nothing else, 100 Hz and !dynticks), 
netconsole output stopped here:

 Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
 initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
 initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
 Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
 NET: Registered protocol family 16

and no output ever since - and the box has been up for a few minutes.

> So, can you verify whether there are any interrupts arriving on the 
> NIC after the network got wedged? You could also try ethtool -s eth0 
> msglevel 65535 - would be interesting to see what dmesg contains. If 
> there's little to no debug output from the driver, let it run for 10 
> seconds or so, in order to catch the e1000 watchdog timer a few times.

eth0's irq count is stuck at 5 interrupts - and has not changed for 
minutes.

i tried ethtool -s eth0 msglvl 65535, but (sa expected) there's no 
output. I've attached below ifconfig output and ethtool -S output - 
maybe that tells you something new about the state of eth0. (to me it 
only tells what we already know: tx timed out once and eth0 is stuck 
ever since.)

Btw., i definitely need your help with this bug as it's now hopelessly 
out of my league :-/

	Ingo

------------------>
eth0      Link encap:Ethernet  HWaddr 00:16:41:17:49:D2
          inet addr:10.0.1.15  Bcast:10.255.255.255  Mask:255.0.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:873 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:87076 (85.0 KiB)
          Base address:0x2000 Memory:ee000000-ee020000 

NIC statistics:
     rx_packets: 0
     tx_packets: 873
     rx_bytes: 0
     tx_bytes: 87076
     rx_broadcast: 0
     tx_broadcast: 0
     rx_multicast: 0
     tx_multicast: 0
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 0
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 1
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 0
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 0
     rx_csum_offload_good: 0
     rx_csum_offload_errors: 0
     rx_header_split: 0
     alloc_rx_buff_failed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 10:01                                   ` Ingo Molnar
@ 2007-07-19 10:37                                     ` Olaf Kirch
  2007-07-19 10:47                                       ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19 10:37 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem

On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
>  Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
>  initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
>  initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
>  Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
>  NET: Registered protocol family 16
> 
> and no output ever since - and the box has been up for a few minutes.

Okay, I need to ask a stupid question - did you verify that it's not
spinning on a spinlock?

Specifically, I'm wondering whether the net_rx_action softirq may
be scheduled while we're in poll_napi holding the poll_lock.
net_rx_action would try to take the poll_lock as well, and we'd
be hung for good. The patch with local_bh_disable/enable was
supposed to test that idea (this is the "trickle" patch)

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 10:37                                     ` Olaf Kirch
@ 2007-07-19 10:47                                       ` Ingo Molnar
  2007-07-19 10:58                                         ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 10:47 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
> >  Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
> >  initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
> >  initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
> >  Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
> >  NET: Registered protocol family 16
> > 
> > and no output ever since - and the box has been up for a few minutes.
> 
> Okay, I need to ask a stupid question - did you verify that it's not 
> spinning on a spinlock?

the box is fully usable after it has booted up and (as you can see it in 
the config i've uploaded) i've got every kernel debug option enabled, 
including lockdep.

> Specifically, I'm wondering whether the net_rx_action softirq may be 
> scheduled while we're in poll_napi holding the poll_lock. 
> net_rx_action would try to take the poll_lock as well, and we'd be 
> hung for good. The patch with local_bh_disable/enable was supposed to 
> test that idea (this is the "trickle" patch)

the box isnt hung - it just has no networking. (i couldnt get the logs 
out if it were hung - netconsole is the only remote debugging option and 
with that broken i can only get info out if it boots up fine)

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 10:47                                       ` Ingo Molnar
@ 2007-07-19 10:58                                         ` Ingo Molnar
  2007-07-19 12:52                                           ` Olaf Kirch
  2007-07-19 15:07                                           ` Ingo Molnar
  0 siblings, 2 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 10:58 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok

* Ingo Molnar <mingo@elte.hu> wrote:

> * Olaf Kirch <olaf.kirch@oracle.com> wrote:
> 
> > On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
> > >  Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
> > >  initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
> > >  initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
> > >  Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
> > >  NET: Registered protocol family 16
> > > 
> > > and no output ever since - and the box has been up for a few minutes.
> > 
> > Okay, I need to ask a stupid question - did you verify that it's not 
> > spinning on a spinlock?

ok, i just to make my description clearer: 'netconsole output hung' 
means that on the netconsole-receiving box (which is not the laptop with 
the problem) i dont get any netconsole output printed. (i.e. UDP packets 
are not being sent by the laptop.) The above snippet is the last i get. 
Otherwise the laptop boots up fine, but has that early tx timeout and 
subsequently no networking. I can still do things on the laptop as 
normal, but eth0 irqs do not advance (are stuck at 5) and networking is 
stuck.

i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
hickup symptoms, with no other bad symptoms such as lockups or crashes.

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 10:58                                         ` Ingo Molnar
@ 2007-07-19 12:52                                           ` Olaf Kirch
  2007-07-19 12:54                                             ` Olaf Kirch
  2007-07-19 15:42                                             ` Kok, Auke
  2007-07-19 15:07                                           ` Ingo Molnar
  1 sibling, 2 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19 12:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok

On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
> i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
> hickup symptoms, with no other bad symptoms such as lockups or crashes.

Duh, I found it.

The e1000 poll routine does this to leave polling mode.

	netif_rx_complete(poll_dev);
        e1000_irq_enable(adapter);
        return 0;

Which looks innocent enough, except that e1000_irq_enable has
this little irq_sem counter:

	if (likely(atomic_dec_and_test(&adapter->irq_sem))) {
                E1000_WRITE_REG(&adapter->hw, IMS, IMS_ENABLE_MASK);
                E1000_WRITE_FLUSH(&adapter->hw);
        }

So as poll_napi calls the poll() routine repeatedly, the irq_sem
counter is decremented by one each time. During the first call,
it re-enables the interrupt. During the next calls, irq_sem goes
negative.

Then an interrupt comes in, e1000_intr disables the interrupt,
increments irq_sem by one, and schedules the device for rx_action.
rx_action calls dev->poll(), which finishes cleaning rx/rx rings,
and when it finds there's no more work, it calls rx_complete and
irq_enable. Except irq_enable doesn't enable anything now, since
irq_sem is <= 0, and dec_and_test returns false.

The whole irq_sem accounting in the e1000 does not rhyme well with
netpoll's way of exercising dev->poll(). The reason my patch triggers
the problem reliably for you is that now, we always get at least
two invocations of dev->poll: once from poll_napi - where we do not
remove the device from the poll list any longer - and another one
from net_rx_action.

I don't have a fix ready yet - I hope I'll have something later
this afternoon.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 12:52                                           ` Olaf Kirch
@ 2007-07-19 12:54                                             ` Olaf Kirch
  2007-07-19 15:42                                             ` Kok, Auke
  1 sibling, 0 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19 12:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok

On Thursday 19 July 2007 14:52, Olaf Kirch wrote:
> On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
> > i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
> > hickup symptoms, with no other bad symptoms such as lockups or crashes.
> 
> Duh, I found it.

The following patch should confirm my theory. Can you give it a try,
please?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---------
Index: build-2.6/drivers/net/e1000/e1000_main.c
===================================================================
--- build-2.6.orig/drivers/net/e1000/e1000_main.c
+++ build-2.6/drivers/net/e1000/e1000_main.c
@@ -358,6 +358,7 @@ e1000_irq_enable(struct e1000_adapter *a
 		E1000_WRITE_REG(&adapter->hw, IMS, IMS_ENABLE_MASK);
 		E1000_WRITE_FLUSH(&adapter->hw);
 	}
+	BUG_ON(atomic_read(&adapter->irq_sem) < 0);
 }
 
 static void


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 12:52                                           ` Olaf Kirch
  2007-07-19 12:54                                             ` Olaf Kirch
@ 2007-07-19 15:42                                             ` Kok, Auke
  2007-07-19 16:07                                               ` Ingo Molnar
  1 sibling, 1 reply; 67+ messages in thread
From: Kok, Auke @ 2007-07-19 15:42 UTC (permalink / raw)
  To: Olaf Kirch
  Cc: Ingo Molnar, Jarek Poplawski, Linus Torvalds, linux-kernel, davem,
	Auke Kok

Olaf Kirch wrote:
> On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
>> i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine 
>> hickup symptoms, with no other bad symptoms such as lockups or crashes.
> 
> Duh, I found it.
> 
> The e1000 poll routine does this to leave polling mode.
> 
> 	netif_rx_complete(poll_dev);
>         e1000_irq_enable(adapter);
>         return 0;
> 
> Which looks innocent enough, except that e1000_irq_enable has
> this little irq_sem counter:
> 
> 	if (likely(atomic_dec_and_test(&adapter->irq_sem))) {
>                 E1000_WRITE_REG(&adapter->hw, IMS, IMS_ENABLE_MASK);
>                 E1000_WRITE_FLUSH(&adapter->hw);
>         }
> 
> So as poll_napi calls the poll() routine repeatedly, the irq_sem
> counter is decremented by one each time. During the first call,
> it re-enables the interrupt. During the next calls, irq_sem goes
> negative.
> 
> Then an interrupt comes in, e1000_intr disables the interrupt,
> increments irq_sem by one, and schedules the device for rx_action.
> rx_action calls dev->poll(), which finishes cleaning rx/rx rings,
> and when it finds there's no more work, it calls rx_complete and
> irq_enable. Except irq_enable doesn't enable anything now, since
> irq_sem is <= 0, and dec_and_test returns false.
> 
> The whole irq_sem accounting in the e1000 does not rhyme well with
> netpoll's way of exercising dev->poll().

it's been accused of worse things ;)

> The reason my patch triggers
> the problem reliably for you is that now, we always get at least
> two invocations of dev->poll: once from poll_napi - where we do not
> remove the device from the poll list any longer - and another one
> from net_rx_action.
> 
> I don't have a fix ready yet - I hope I'll have something later
> this afternoon.

interesting, you seem to found the cause allright. I can't confirm the problem 
but I know that netpoll and NAPI has historically been an issue. I look forward 
to your suggestions...

Cheers,

Auke

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 15:42                                             ` Kok, Auke
@ 2007-07-19 16:07                                               ` Ingo Molnar
  2007-07-19 19:13                                                 ` Olaf Kirch
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 16:07 UTC (permalink / raw)
  To: Kok, Auke
  Cc: Olaf Kirch, Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Kok, Auke <auke-jan.h.kok@intel.com> wrote:

> > I don't have a fix ready yet - I hope I'll have something later this 
> > afternoon.
> 
> interesting, you seem to found the cause allright. I can't confirm the 
> problem but I know that netpoll and NAPI has historically been an 
> issue. I look forward to your suggestions...

because i dont seem to be able to trigger Olaf's WARN_ON(), can you see 
anything in the ethtool output that i sent in the previous mail(s)?

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 16:07                                               ` Ingo Molnar
@ 2007-07-19 19:13                                                 ` Olaf Kirch
  2007-07-19 19:22                                                   ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19 19:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kok, Auke, Jarek Poplawski, Linus Torvalds, linux-kernel, davem

On Thursday 19 July 2007 18:07, Ingo Molnar wrote:
> because i dont seem to be able to trigger Olaf's WARN_ON(), can you see 
> anything in the ethtool output that i sent in the previous mail(s)?

If the WARN_ON doesn't trigger, I cannot see how my patch would affect
your system.

 -	IF we enter the if() branch in poll_napi, then the following
	must hold:
	 -	an interrupt from the e1000 arrived
	 -	e1000_intr disables interrupts, increments irq_sem
		(which is now 1) and schedules rx_action

 -	Now, inside poll_napi, the following happens:
	 -	poll_napi is called, finds the device is marked for
		polling, invokes dev->poll
	 -	dev->poll calls netif_rx_complete (which does *not*
		remove the device from the poll list), and re-enables
		interrupts. irq_sem is now 0.

 -	Finally, the rx_action softirq is run, which calls dev->poll
	again, which ends up invoking netif_rx_complete once more,
	and tries to re-enable interrupts. The latter doesn't do
	anything except decrementing irq_sem once more, which now
	goes negative.

	Which would trigger the WARN_ON.

Now, as you say the WARN_ON is never triggered, it follows that
we never end up in the if() branch of poll_napi. But that is
where the only substantial modification of my patch is.

Here's a somewhat drastic modification that should not change any
timing, but just verifies whether my patch is to blame at all. Can
you give it a try?

Thanks,
Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
Test patch
---
 include/linux/netdevice.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: build-2.6/include/linux/netdevice.h
===================================================================
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
 	 * But at least it doesn't penalize the non-netpoll
 	 * code path. */
 	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state))
-		return;
+		BUG();
 #endif
 
 	local_irq_save(flags);

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 19:13                                                 ` Olaf Kirch
@ 2007-07-19 19:22                                                   ` Ingo Molnar
  2007-07-19 19:35                                                     ` Olaf Kirch
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 19:22 UTC (permalink / raw)
  To: Olaf Kirch
  Cc: Kok, Auke, Jarek Poplawski, Linus Torvalds, linux-kernel, davem

* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> Here's a somewhat drastic modification that should not change any 
> timing, but just verifies whether my patch is to blame at all. Can you 
> give it a try?

> @@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
>  	 * But at least it doesn't penalize the non-netpoll
>  	 * code path. */
>  	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state))
> -		return;
> +		BUG();

ok, i tried the patch below, and it gave this (single) warning:

 Calling initcall 0xc02f5c17: init_netconsole+0x0/0x67()
 netconsole: device eth0 not up yet, forcing it
 netconsole: timeout waiting for carrier
 console [netcon0] enabled
 WARNING: at include/linux/netdevice.h:1030 netif_rx_complete()
  [<c0105e3e>] show_trace_log_lvl+0x19/0x2e
  [<c0105f37>] show_trace+0x12/0x14
  [<c0105f4d>] dump_stack+0x14/0x16
  [<c02c4fef>] e1000_clean+0x1f4/0x26f
  [<c03d0f6f>] netpoll_poll+0x8b/0x357
  [<c03d0e80>] netpoll_send_skb+0xe8/0x14c
  [<c03d1502>] netpoll_send_udp+0x258/0x260
  [<c02f5cea>] write_msg+0x53/0x8d
  [<c012c5ba>] __call_console_drivers+0x4e/0x5a
  [<c012c623>] _call_console_drivers+0x5d/0x61
  [<c012cc42>] release_console_sem+0x120/0x1c1
  [<c012d446>] register_console+0x22e/0x236
  [<c02f5c6c>] init_netconsole+0x55/0x67
  [<c05e48e5>] kernel_init+0x154/0x2d9
  [<c0105c53>] kernel_thread_helper+0x7/0x10
  =======================
 e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
 netconsole: network logging started
 initcall 0xc02f5c17: init_netconsole+0x0/0x67() returned 0.
 initcall 0xc02f5c17 ran for 4012 msecs: init_netconsole+0x0/0x67()
 Calling initcall 0xc0600ff9: spi_transport_init+0x0/0x27()
 initcall 0xc0600ff9: spi_transport_init+0x0/0x27() returned 0.

	Ingo

---
 include/linux/netdevice.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-cfs-2.6.23-git.q.prev3/include/linux/netdevice.h
===================================================================
--- linux-cfs-2.6.23-git.q.prev3.orig/include/linux/netdevice.h
+++ linux-cfs-2.6.23-git.q.prev3/include/linux/netdevice.h
@@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
 	 * But at least it doesn't penalize the non-netpoll
 	 * code path. */
 	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state))
-		return;
+		WARN_ON(1);
 #endif

 	local_irq_save(flags);

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 19:22                                                   ` Ingo Molnar
@ 2007-07-19 19:35                                                     ` Olaf Kirch
  2007-07-19 19:56                                                       ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19 19:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kok, Auke, Jarek Poplawski, Linus Torvalds, linux-kernel, davem


Does the following help?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
Test patch
---
Index: build-2.6/drivers/net/netconsole.c
===================================================================
--- build-2.6.orig/drivers/net/netconsole.c
+++ build-2.6/drivers/net/netconsole.c
@@ -70,7 +70,7 @@ static void write_msg(struct console *co
 	int frag, left;
 	unsigned long flags;
 
-	if (!np.dev)
+	if (!np.dev || !netif_running(np.dev))
 		return;
 
 	local_irq_save(flags);


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 19:35                                                     ` Olaf Kirch
@ 2007-07-19 19:56                                                       ` Ingo Molnar
  2007-07-19 20:02                                                         ` Olaf Kirch
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 19:56 UTC (permalink / raw)
  To: Olaf Kirch
  Cc: Kok, Auke, Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> Does the following help?

> --- build-2.6.orig/drivers/net/netconsole.c
> +++ build-2.6/drivers/net/netconsole.c
> @@ -70,7 +70,7 @@ static void write_msg(struct console *co
>  	int frag, left;
>  	unsigned long flags;
>  
> -	if (!np.dev)
> +	if (!np.dev || !netif_running(np.dev))
>  		return;

nope - with this patch applied the box still has no network, symptoms 
are similar. (should i apply the WARN_ON() patch too?)

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 19:56                                                       ` Ingo Molnar
@ 2007-07-19 20:02                                                         ` Olaf Kirch
  2007-07-20  9:45                                                           ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19 20:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kok, Auke, Jarek Poplawski, Linus Torvalds, linux-kernel, davem

On Thursday 19 July 2007 21:56, Ingo Molnar wrote:
> nope - with this patch applied the box still has no network, symptoms 
> are similar. (should i apply the WARN_ON() patch too?)

Yes, that would be nice. If that doesn't help, you can also throw in
the one below.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
Index: build-2.6/net/core/netpoll.c
===================================================================
--- build-2.6.orig/net/core/netpoll.c
+++ build-2.6/net/core/netpoll.c
@@ -650,7 +650,6 @@ int netpoll_setup(struct netpoll *np)
 		return -ENODEV;
 	}
 
-	np->dev = ndev;
 	if (!ndev->npinfo) {
 		npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
 		if (!npinfo) {
@@ -722,6 +721,8 @@ int netpoll_setup(struct netpoll *np)
 		}
 	}
 
+	np->dev = ndev;
+
 	if (is_zero_ether_addr(np->local_mac) && ndev->dev_addr)
 		memcpy(np->local_mac, ndev->dev_addr, 6);
 


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 20:02                                                         ` Olaf Kirch
@ 2007-07-20  9:45                                                           ` Ingo Molnar
  0 siblings, 0 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-20  9:45 UTC (permalink / raw)
  To: Olaf Kirch
  Cc: Kok, Auke, Jarek Poplawski, Linus Torvalds, linux-kernel, davem


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> On Thursday 19 July 2007 21:56, Ingo Molnar wrote:
> > nope - with this patch applied the box still has no network, symptoms 
> > are similar. (should i apply the WARN_ON() patch too?)
> 
> Yes, that would be nice. If that doesn't help, you can also throw in
> the one below.

> -	np->dev = ndev;
>  	if (!ndev->npinfo) {
>  		npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
>  		if (!npinfo) {
> @@ -722,6 +721,8 @@ int netpoll_setup(struct netpoll *np)
>  		}
>  	}
>  
> +	np->dev = ndev;

no change in behavior from this patch. If i add the WARN_ON() it 
triggers once and it makes the network work as well (probably due to the 
side-effect of the packets generated by the WARN_ON() printks).

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 10:58                                         ` Ingo Molnar
  2007-07-19 12:52                                           ` Olaf Kirch
@ 2007-07-19 15:07                                           ` Ingo Molnar
  2007-07-19 15:27                                             ` Olaf Kirch
  2007-07-19 15:32                                             ` Ingo Molnar
  1 sibling, 2 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 15:07 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok

ugh. Something really weird happened with this e1000 problem.

i crashed the laptop in a weird way and had to power-cycle it in an 
unusual fashion. After that i wanted to try your latest BUG_ON() theory 
but the network hang went away!

For 3 hours i tried to reproduce the hang (i went back to the original 
git tree and the .config under which i found it, i power cycled it 
again, i unplugged the power cord to make it go off battery, unplugged 
the ethernet, recreated a completely new tree, etc. etc.) but with no 
success! Total Heisenbug - and the really annoying thing is that you 
just had a good theory about what might be happening. I'm now at kernel 
build iteration #75 in this tree ...

maybe it's not the power-cycling that somehow brought the e1000 out of 
its weird state but the ethtool ops (and the related firmware readouts, 
etc.)?

(i have your BUG_ON() applied (as a WARN_ON()), but it doesnt trigger. 
Not that this means anything under these circumstances ...)

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 15:07                                           ` Ingo Molnar
@ 2007-07-19 15:27                                             ` Olaf Kirch
  2007-07-19 15:32                                             ` Ingo Molnar
  1 sibling, 0 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19 15:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok

On Thursday 19 July 2007 17:07, Ingo Molnar wrote:
> i crashed the laptop in a weird way and had to power-cycle it in an 
> unusual fashion. After that i wanted to try your latest BUG_ON() theory 
> but the network hang went away!

Should I rejoice, or regret? :-)

> maybe it's not the power-cycling that somehow brought the e1000 out of 
> its weird state but the ethtool ops (and the related firmware readouts, 
> etc.)?

Oh, don't tell me there's a --scrub-bad-karma switch in ethtool.

> (i have your BUG_ON() applied (as a WARN_ON()), but it doesnt trigger. 
> Not that this means anything under these circumstances ...)

Too bad. Now where do we take it from here? I'm currently thinking of
ways to do this patch differently. But that is kind of relying on a
good testbed to verify whether a different patch works better for you
or not...

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 15:07                                           ` Ingo Molnar
  2007-07-19 15:27                                             ` Olaf Kirch
@ 2007-07-19 15:32                                             ` Ingo Molnar
  2007-07-19 15:52                                               ` Ingo Molnar
  1 sibling, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 15:32 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok


* Ingo Molnar <mingo@elte.hu> wrote:

> ugh. Something really weird happened with this e1000 problem.
> 
> i crashed the laptop in a weird way and had to power-cycle it in an 
> unusual fashion. After that i wanted to try your latest BUG_ON() 
> theory but the network hang went away!
> 
> For 3 hours i tried to reproduce the hang (i went back to the original 
> git tree and the .config under which i found it, i power cycled it 
> again, i unplugged the power cord to make it go off battery, unplugged 
> the ethernet, recreated a completely new tree, etc. etc.) but with no 
> success! Total Heisenbug - and the really annoying thing is that you 
> just had a good theory about what might be happening. I'm now at 
> kernel build iteration #75 in this tree ...

ah! Just found the reason: the bug apparently depends on the precise 
kernel command-line contents. I accidentally dropped ignore_loglevel 
(found this while comparing with the older logs i sent to you), adding 
it back in produces hung networking too. So it appears that a netconsole 
printout while e1000 is initializing (or while some other networking 
component is initializing) might be the culprit?

	Ingo


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 15:32                                             ` Ingo Molnar
@ 2007-07-19 15:52                                               ` Ingo Molnar
  2007-07-19 16:05                                                 ` Ingo Molnar
  0 siblings, 1 reply; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 15:52 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok


* Ingo Molnar <mingo@elte.hu> wrote:

> ah! Just found the reason: the bug apparently depends on the precise 
> kernel command-line contents. I accidentally dropped ignore_loglevel 
> (found this while comparing with the older logs i sent to you), adding 
> it back in produces hung networking too. So it appears that a 
> netconsole printout while e1000 is initializing (or while some other 
> networking component is initializing) might be the culprit?

and the WARN_ON() below does not seem to trigger.

i'll now check whether removing ignore_on_loglevel (no other changes) 
makes the hang go away. Maybe ignore_on_loglevel is buggy - or it 
produces an immediate printk (going out to the interface) during a 
particularly sensitive period of network initialization.

	Ingo

----------------->
Index: linux/drivers/net/e1000/e1000_main.c
===================================================================
--- linux.orig/drivers/net/e1000/e1000_main.c
+++ linux/drivers/net/e1000/e1000_main.c
@@ -358,6 +358,7 @@ e1000_irq_enable(struct e1000_adapter *a
 		E1000_WRITE_REG(&adapter->hw, IMS, IMS_ENABLE_MASK);
 		E1000_WRITE_FLUSH(&adapter->hw);
 	}
+	WARN_ON_ONCE(atomic_read(&adapter->irq_sem) < 0);
 }
 
 static void

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 15:52                                               ` Ingo Molnar
@ 2007-07-19 16:05                                                 ` Ingo Molnar
  2007-07-19 16:13                                                   ` Ingo Molnar
  2007-07-19 17:36                                                   ` Olaf Kirch
  0 siblings, 2 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 16:05 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok

* Ingo Molnar <mingo@elte.hu> wrote:

> i'll now check whether removing ignore_on_loglevel (no other changes) 
> makes the hang go away. Maybe ignore_on_loglevel is buggy - or it 
> produces an immediate printk (going out to the interface) during a 
> particularly sensitive period of network initialization.

nope, that didnt have any effect. I have another theory: i had a 
network-intense stress-test going on in the past few hours on the same 
network (not involving the laptop) and i stopped it recently. Perhaps 
that network-intense test also produced periodic broadcast packets that 
got the e1000 out of its weird state before the tx timeout could hit. 
Now that i've stopped the test, the network is quiescent again and the 
e1000 hangs.

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 16:05                                                 ` Ingo Molnar
@ 2007-07-19 16:13                                                   ` Ingo Molnar
  2007-07-19 17:36                                                   ` Olaf Kirch
  1 sibling, 0 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 16:13 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok


* Ingo Molnar <mingo@elte.hu> wrote:

> > i'll now check whether removing ignore_on_loglevel (no other 
> > changes) makes the hang go away. Maybe ignore_on_loglevel is buggy - 
> > or it produces an immediate printk (going out to the interface) 
> > during a particularly sensitive period of network initialization.
> 
> nope, that didnt have any effect. I have another theory: i had a 
> network-intense stress-test going on in the past few hours on the same 
> network (not involving the laptop) and i stopped it recently. Perhaps 
> that network-intense test also produced periodic broadcast packets 
> that got the e1000 out of its weird state before the tx timeout could 
> hit. Now that i've stopped the test, the network is quiescent again 
> and the e1000 hangs.

yep - i can confirm this theory now: if i start a broadcast ping on 
another box while the laptop is booting up, the network hang does not 
happen. If i stop the ping and do another bootup with the very same 
kernel, it hangs. So for the hang to happen, the network has to be very 
quiet. At least one mystery solved :-/

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 16:05                                                 ` Ingo Molnar
  2007-07-19 16:13                                                   ` Ingo Molnar
@ 2007-07-19 17:36                                                   ` Olaf Kirch
  2007-07-19 17:41                                                     ` Ingo Molnar
  2007-07-19 17:51                                                     ` Olaf Kirch
  1 sibling, 2 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19 17:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok

On Thursday 19 July 2007 18:05, Ingo Molnar wrote:
> that network-intense test also produced periodic broadcast packets that 
> got the e1000 out of its weird state before the tx timeout could hit. 
> Now that i've stopped the test, the network is quiescent again and the 
> e1000 hangs.

Can you confirm this by spraying the laptop with arp packets
or broadcast pings while it's booting?

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 17:36                                                   ` Olaf Kirch
@ 2007-07-19 17:41                                                     ` Ingo Molnar
  2007-07-19 17:51                                                     ` Olaf Kirch
  1 sibling, 0 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 17:41 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok


* Olaf Kirch <olaf.kirch@oracle.com> wrote:

> On Thursday 19 July 2007 18:05, Ingo Molnar wrote:
> > that network-intense test also produced periodic broadcast packets that 
> > got the e1000 out of its weird state before the tx timeout could hit. 
> > Now that i've stopped the test, the network is quiescent again and the 
> > e1000 hangs.
> 
> Can you confirm this by spraying the laptop with arp packets or 
> broadcast pings while it's booting?

yes, i did exactly that, and it made the box boot.

	Ingo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19 17:36                                                   ` Olaf Kirch
  2007-07-19 17:41                                                     ` Ingo Molnar
@ 2007-07-19 17:51                                                     ` Olaf Kirch
  1 sibling, 0 replies; 67+ messages in thread
From: Olaf Kirch @ 2007-07-19 17:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok

On Thursday 19 July 2007 19:36, Olaf Kirch wrote:
> Can you confirm this by spraying the laptop with arp packets
> or broadcast pings while it's booting?

Sorry for the noise - didn't see your other message where you
described just that.

This sounds more like a hardware issue - Rx interrupt seems to
go through (triggering all the usual NAPI cleanup), but the Tx
descriptor writeback interrupt doesn't.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-19  9:09                               ` Ingo Molnar
  2007-07-19  9:44                                 ` Olaf Kirch
@ 2007-07-19 10:17                                 ` Ingo Molnar
  1 sibling, 0 replies; 67+ messages in thread
From: Ingo Molnar @ 2007-07-19 10:17 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Jarek Poplawski, Linus Torvalds, linux-kernel, davem, Auke Kok


* Ingo Molnar <mingo@elte.hu> wrote:

> the e1000 in this laptop is historically pretty robust. The only 
> problem i ever had with it were some rx/tx hw-engine latency problems 
> [pings from the outside took up to 1 second to propagate] that were 
> quickly fixed by the e1000 driver guys. Maybe that's related. 
> (although it never caused total inavailability of networking - it was 
> only latency problems)

i've Cc:-ed Auke - maybe he has some ideas? I've also attached all the 
other possibly relevant ethtool dumps of eth0 below, done when the hung 
interface happens. (Auke, see more details about this problem in this 
thread on lkml.)

another thing: how can i make the tx timeout much longer than it is 
right now? Maybe an interrupt is delayed for long and we just dont 
tolerate things long enough - and once the device tx timeout has been 
noticed we kill the interface from subsequent use in essence?

there's one oddity i noticed:

 0x02810: RDH   (Receive desc head)       0x00000000
 0x02818: RDT   (Receive desc tail)       0x000000FE

 0x03810: TDH   (Transmit desc head)      0x00000000
 0x03818: TDT   (Transmit desc tail)      0x00000000

doesnt this suggest that we have a receive packet, just the irq didnt 
come and we didnt process it?

although .. the tx timeout disabled the transmitter and the receiver of 
the device, correct? That in itself could have destroyed any evidence of 
what happened earlier on. But ... i'm really only guessing here.

	Ingo

------------------------------------>

MAC Registers
-------------
0x00000: CTRL (Device control register)  0x00140248
      Endian mode (buffers):             little
      Link reset:                        reset
      Set link up:                       1
      Invert Loss-Of-Signal:             no
      Receive flow control:              disabled
      Transmit flow control:             disabled
      VLAN mode:                         disabled
      Auto speed detect:                 disabled
      Speed select:                      1000Mb/s
      Force speed:                       no
      Force duplex:                      no
0x00008: STATUS (Device status register) 0x80080783
      Duplex:                            full
      Link up:                           link config
      TBI mode:                          disabled
      Link speed:                        1000Mb/s
      Bus type:                          PCI
      Bus speed:                         33MHz
      Bus width:                         32-bit
0x00100: RCTL (Receive control register) 0x00008002
      Receiver:                          enabled
      Store bad packets:                 disabled
      Unicast promiscuous:               disabled
      Multicast promiscuous:             disabled
      Long packet:                       disabled
      Descriptor minimum threshold size: 1/2
      Broadcast accept mode:             accept
      VLAN filter:                       disabled
      Cononical form indicator:          disabled
      Discard pause frames:              filtered
      Pass MAC control frames:           don't pass
      Receive buffer size:               2048
0x02808: RDLEN (Receive desc length)     0x00001000
0x02810: RDH   (Receive desc head)       0x00000000
0x02818: RDT   (Receive desc tail)       0x000000FE
0x02820: RDTR  (Receive delay timer)     0x00000000
0x00400: TCTL (Transmit ctrl register)   0x310000F8
      Transmitter:                       disabled
      Pad short packets:                 enabled
      Software XOFF Transmission:        disabled
      Re-transmit on late collision:     enabled
0x03808: TDLEN (Transmit desc length)    0x00001000
0x03810: TDH   (Transmit desc head)      0x00000000
0x03818: TDT   (Transmit desc tail)      0x00000000
0x03820: TIDV  (Transmit delay timer)    0x00000008
PHY type:                                M88
Offset		Values
------		------
0x0000		00 16 41 17 49 d2 30 0b b2 ff 51 00 ff ff ff ff 
0x0010		53 00 03 01 6b 02 01 20 aa 17 9a 10 86 80 df 80 
0x0020		00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27 
0x0030		c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f 
0x0040		08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff 
0x0050		14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00 
0x0060		00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff 
0x0070		ff ff ff ff ff ff ff ff ff ff ff ff ff ff ef 9f 
Ring parameters for eth0:
Pre-set maximums:
RX:		4096
RX Mini:	0
RX Jumbo:	0
TX:		4096
Current hardware settings:
RX:		256
RX Mini:	0
RX Jumbo:	0
TX:		256

driver: e1000
version: 7.3.20-k2-NAPI
firmware-version: 0.5-1
bus-info: 0000:02:00.0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
NIC statistics:
     rx_packets: 0
     tx_packets: 591
     rx_bytes: 0
     tx_bytes: 58116
     rx_broadcast: 0
     tx_broadcast: 0
     rx_multicast: 0
     tx_multicast: 0
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 0
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 1
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 0
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 0
     rx_csum_offload_good: 0
     rx_csum_offload_errors: 0
     rx_header_split: 0
     alloc_rx_buff_failed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17 18:34                   ` Olaf Kirch
  2007-07-17 18:56                     ` Ingo Molnar
@ 2007-07-18 11:48                     ` Jarek Poplawski
  2007-07-19  5:58                       ` Jarek Poplawski
  1 sibling, 1 reply; 67+ messages in thread
From: Jarek Poplawski @ 2007-07-18 11:48 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Ingo Molnar, Linus Torvalds, linux-kernel, davem

Hi,

Here is my proposal of a solution based on dev->state flag,
but intended mainly to prevent poll_napi from disturbing
while net_rx_action is running and polling the device.

It doesn't look very nice or clean but I hope it could
guard net_rx_action enough with some room for netpoll too.

I'd be very glad if it could be verified and/or tested.

Regards,
Jarek P.

PS: not tested!

---

diff -Nurp 2.6.22-git9-/include/linux/netdevice.h 2.6.22-git9/include/linux/netdevice.h
--- 2.6.22-git9-/include/linux/netdevice.h	2007-07-18 07:50:56.000000000 +0200
+++ 2.6.22-git9/include/linux/netdevice.h	2007-07-18 13:21:12.000000000 +0200
@@ -262,6 +262,8 @@ enum netdev_state_t
 	__LINK_STATE_LINKWATCH_PENDING,
 	__LINK_STATE_DORMANT,
 	__LINK_STATE_QDISC_RUNNING,
+	/* Set by net_rx_action vs poll_napi */
+	__LINK_STATE_POLL_LIST_FROZEN,
 };
 
 
diff -Nurp 2.6.22-git9-/net/core/dev.c 2.6.22-git9/net/core/dev.c
--- 2.6.22-git9-/net/core/dev.c	2007-07-18 07:50:58.000000000 +0200
+++ 2.6.22-git9/net/core/dev.c	2007-07-18 13:18:23.000000000 +0200
@@ -2025,6 +2025,7 @@ static void net_rx_action(struct softirq
 	unsigned long start_time = jiffies;
 	int budget = netdev_budget;
 	void *have;
+	struct net_device *dev_frozen = NULL;
 
 	local_irq_disable();
 
@@ -2040,15 +2041,35 @@ static void net_rx_action(struct softirq
 				 struct net_device, poll_list);
 		have = netpoll_poll_lock(dev);
 
+#ifdef CONFIG_NETPOLL
+		/* prevent a race with poll_napi() */
+		if (dev_frozen && dev_frozen != dev) {
+                	clear_bit(__LINK_STATE_POLL_LIST_FROZEN,
+						&dev_frozen->state);
+                	set_bit(__LINK_STATE_POLL_LIST_FROZEN,
+							 &dev->state);
+			dev_frozen = dev;
+		} else if (!dev_frozen) {
+                	set_bit(__LINK_STATE_POLL_LIST_FROZEN,
+							 &dev->state);
+			dev_frozen = dev;
+		}
+#endif
 		if (dev->quota <= 0 || dev->poll(dev, &budget)) {
 			netpoll_poll_unlock(have);
 			local_irq_disable();
-			list_move_tail(&dev->poll_list, &queue->poll_list);
+			list_move_tail(&dev->poll_list,
+						 &queue->poll_list);
 			if (dev->quota < 0)
 				dev->quota += dev->weight;
 			else
 				dev->quota = dev->weight;
 		} else {
+#ifdef CONFIG_NETPOLL
+                	clear_bit(__LINK_STATE_POLL_LIST_FROZEN,
+							&dev->state);
+			dev_frozen = NULL;
+#endif
 			netpoll_poll_unlock(have);
 			dev_put(dev);
 			local_irq_disable();
@@ -2056,6 +2077,14 @@ static void net_rx_action(struct softirq
 	}
 out:
 	local_irq_enable();
+#ifdef CONFIG_NETPOLL
+	if (dev_frozen) {
+		have = netpoll_poll_lock(dev_frozen);
+		clear_bit(__LINK_STATE_POLL_LIST_FROZEN,
+						&dev_frozen->state);
+		netpoll_poll_unlock(have);
+	}
+#endif
 #ifdef CONFIG_NET_DMA
 	/*
 	 * There may not be any more sk_buffs coming right now, so push
diff -Nurp 2.6.22-git9-/net/core/netpoll.c 2.6.22-git9/net/core/netpoll.c
--- 2.6.22-git9-/net/core/netpoll.c	2007-07-18 07:50:58.000000000 +0200
+++ 2.6.22-git9/net/core/netpoll.c	2007-07-18 12:09:43.000000000 +0200
@@ -124,13 +124,20 @@ static void poll_napi(struct netpoll *np
 	if (test_bit(__LINK_STATE_RX_SCHED, &np->dev->state) &&
 	    npinfo->poll_owner != smp_processor_id() &&
 	    spin_trylock(&npinfo->poll_lock)) {
-		npinfo->rx_flags |= NETPOLL_RX_DROP;
-		atomic_inc(&trapped);
+		/*
+		 * If POLL_LIST_FROZEN is set net_rx_action
+		 * is working with this device.
+		 */
+		if (!test_bit(__LINK_STATE_POLL_LIST_FROZEN,
+						 &np->dev->state)) {
+			npinfo->rx_flags |= NETPOLL_RX_DROP;
+			atomic_inc(&trapped);
 
-		np->dev->poll(np->dev, &budget);
+			np->dev->poll(np->dev, &budget);
 
-		atomic_dec(&trapped);
-		npinfo->rx_flags &= ~NETPOLL_RX_DROP;
+			atomic_dec(&trapped);
+			npinfo->rx_flags &= ~NETPOLL_RX_DROP;
+		}
 		spin_unlock(&npinfo->poll_lock);
 	}
 }

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-18 11:48                     ` Jarek Poplawski
@ 2007-07-19  5:58                       ` Jarek Poplawski
  0 siblings, 0 replies; 67+ messages in thread
From: Jarek Poplawski @ 2007-07-19  5:58 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Ingo Molnar, Linus Torvalds, linux-kernel, davem

On Wed, Jul 18, 2007 at 01:48:20PM +0200, Jarek Poplawski wrote:
...
> I'd be very glad if it could be verified and/or tested.

Jarek,

This patch is verified crap!

Regards,
Jarek P.

PS: Olaf,

You've written earlier that one of the main reasons for poll_napi is
to work when the kernel "doesn't even service softirqs anymore". But
in your patch poll_napi leaves netif_rx_complete for softirqs, so
even if it starts to work for Ingo in normal conditions, probably
something else is needed, anyway.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17  8:57         ` Ingo Molnar
  2007-07-17  9:29           ` Jarek Poplawski
  2007-07-17 14:07           ` Olaf Kirch
@ 2007-07-17 17:49           ` Linus Torvalds
  2 siblings, 0 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-07-17 17:49 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Olaf Kirch, Jarek Poplawski, linux-kernel, davem

On Tue, 17 Jul 2007, Ingo Molnar wrote:
> 
> i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes 
> the problem go away. So it's somehow also related to jiffies.

No, I suspect it's just related to timing: you need to hit that window 
when the LIST_FROZEN bit is set, and since it was so reliable for you 
before (and others didn't see it), your timing probably happened to hit it 
exactly. And changing HZ probably just changed some timeout or softirq 
getting called, and you didn't hit the bug any more.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll
  2007-07-17  8:28       ` Olaf Kirch
  2007-07-17  8:57         ` Ingo Molnar
@ 2007-07-17  9:12         ` Jarek Poplawski
  1 sibling, 0 replies; 67+ messages in thread
From: Jarek Poplawski @ 2007-07-17  9:12 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Ingo Molnar, Linus Torvalds, linux-kernel, davem

On Tue, Jul 17, 2007 at 10:28:34AM +0200, Olaf Kirch wrote:
> On Tuesday 17 July 2007 09:55, Olaf Kirch wrote:
> > What I find more problematic about this portion of code though
> > is that once a net_device is over quota, net_rx_action will
> > loop for up to one jiffy, even if there's just this one device on
> > the poll_list.
> 
> Duh, wrong. For every loop, it'll add dev->weight to dev->quota,
> so it'll be fine very soon.

After your patch only two things can be different here: dev->poll_list
and this flag. Since Ingo offered testing it should be easy to check
after moving the check of "your" flag to __netif_rx_complete, and
skipping only list_del on true.

Jarek P.

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2007-07-20  9:46 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-16  9:12 [patch] revert: [NET]: Fix races in net_rx_action vs netpoll Ingo Molnar
2007-07-16 10:35 ` Olaf Kirch
2007-07-16 11:26 ` David Miller
2007-07-16 12:18   ` Olaf Kirch
2007-07-16 13:29     ` Ingo Molnar
2007-07-16 21:09   ` Ingo Molnar
2007-07-16 22:06     ` David Miller
2007-07-16 21:40   ` Linus Torvalds
2007-07-16 21:51     ` Ingo Molnar
2007-07-16 22:09       ` David Miller
2007-07-16 22:37         ` Ingo Molnar
2007-07-16 22:57           ` David Miller
2007-07-17 18:09             ` Ingo Molnar
2007-07-16 22:08     ` David Miller
2007-07-16 22:29       ` Linus Torvalds
2007-07-16 22:52         ` David Miller
2007-07-16 23:17         ` Matt Mackall
2007-07-16 23:34           ` Linus Torvalds
2007-07-17  7:37       ` Olaf Kirch
2007-07-17  8:16     ` Olaf Kirch
2007-07-17  5:46 ` Jarek Poplawski
2007-07-17  6:14   ` Jarek Poplawski
2007-07-17  7:55     ` Olaf Kirch
2007-07-17  8:28       ` Olaf Kirch
2007-07-17  8:57         ` Ingo Molnar
2007-07-17  9:29           ` Jarek Poplawski
2007-07-17 14:07           ` Olaf Kirch
2007-07-17 16:57             ` Ingo Molnar
2007-07-17 18:06               ` Olaf Kirch
2007-07-17 18:18                 ` Ingo Molnar
2007-07-17 18:34                   ` Olaf Kirch
2007-07-17 18:56                     ` Ingo Molnar
2007-07-18 12:04                       ` Olaf Kirch
2007-07-18 12:41                         ` Ingo Molnar
2007-07-18 12:48                         ` Ingo Molnar
2007-07-18 14:41                           ` Olaf Kirch
2007-07-18 16:43                             ` Ingo Molnar
2007-07-19  9:09                               ` Ingo Molnar
2007-07-19  9:44                                 ` Olaf Kirch
2007-07-19 10:01                                   ` Ingo Molnar
2007-07-19 10:37                                     ` Olaf Kirch
2007-07-19 10:47                                       ` Ingo Molnar
2007-07-19 10:58                                         ` Ingo Molnar
2007-07-19 12:52                                           ` Olaf Kirch
2007-07-19 12:54                                             ` Olaf Kirch
2007-07-19 15:42                                             ` Kok, Auke
2007-07-19 16:07                                               ` Ingo Molnar
2007-07-19 19:13                                                 ` Olaf Kirch
2007-07-19 19:22                                                   ` Ingo Molnar
2007-07-19 19:35                                                     ` Olaf Kirch
2007-07-19 19:56                                                       ` Ingo Molnar
2007-07-19 20:02                                                         ` Olaf Kirch
2007-07-20  9:45                                                           ` Ingo Molnar
2007-07-19 15:07                                           ` Ingo Molnar
2007-07-19 15:27                                             ` Olaf Kirch
2007-07-19 15:32                                             ` Ingo Molnar
2007-07-19 15:52                                               ` Ingo Molnar
2007-07-19 16:05                                                 ` Ingo Molnar
2007-07-19 16:13                                                   ` Ingo Molnar
2007-07-19 17:36                                                   ` Olaf Kirch
2007-07-19 17:41                                                     ` Ingo Molnar
2007-07-19 17:51                                                     ` Olaf Kirch
2007-07-19 10:17                                 ` Ingo Molnar
2007-07-18 11:48                     ` Jarek Poplawski
2007-07-19  5:58                       ` Jarek Poplawski
2007-07-17 17:49           ` Linus Torvalds
2007-07-17  9:12         ` Jarek Poplawski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox