All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kok, Auke" <auke-jan.h.kok@intel.com>
To: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Robert Olsson <Robert.Olsson@data.slu.se>,
	David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org
Subject: Re: net_rx_action/NAPI oops [PATCH]
Date: Tue, 27 Nov 2007 16:24:42 -0800	[thread overview]
Message-ID: <474CB54A.2060804@intel.com> (raw)
In-Reply-To: <20071127145537.6bf1af68@freepuppy.rosehill>

Stephen Hemminger wrote:
> On Tue, 27 Nov 2007 14:34:44 -0800
> "Kok, Auke" <auke-jan.h.kok@intel.com> wrote:
> 
>> Stephen Hemminger wrote:
>>> On Tue, 27 Nov 2007 19:52:24 +0100
>>> Robert Olsson <Robert.Olsson@data.slu.se> wrote:
>>>
>>>> Hello!
>>>>
>>>> I've discovered a bug while testing the new multiQ NAPI code. In hi-load 
>>>> situations when we take down an interface we get a kernel panic. The
>>>> oops is below.
>>>>
>>>> From what I see this happens when driver does napi_disable() and clears
>>>> NAPI_STATE_SCHED. In net_rx_action there is a check for work == weight 
>>>> a sort indirect test but that's now not enough to cover the load situation. 
>>>> where we have NAPI_STATE_SCHED cleared by e1000_down in my case and still 
>>>> full quota. Latest git but I'll guess the is the same in all later kernels.
>>>> There might be different solutions... one variant is below:
>>> It is considered a driver bug in 2.6.24 to call netif_rx_complete (clear NAPI_STATE_SCHED)
>>> and do a full quota. That bug already had to be fixed in other drivers,
>>> look like e1000 has same problem.
>> Stephen,
>>
>> please enlighten me, can you e.g. show me a commit of other drivers where you
>> fixed this up?
>>
>> Thanks,
>>
>> Auke
> 
> Author: David S. Miller <davem@sunset.davemloft.net>  2007-10-11 18:08:29
> Committer: David S. Miller <davem@sunset.davemloft.net>  2007-10-11 18:08:29
> Parent: b9f2c0440d806e01968c3ed4def930a43be248ad ([netdrvr] Stop using legacy hooks ->self_test_count, ->get_stats_count)
> Child:  266918303226cceac7eca38ced30f15f277bd89c ([SKY2]: status polling loop (post merge))
> Branches: master, origin
> Follows: v2.6.23
> Precedes: v2.6.24-rc1
> 
>     [NET]: Fix NAPI completion handling in some drivers.
>     
>     In order for the list handling in net_rx_action() to be
>     correct, drivers must follow certain rules as stated by
>     this comment in net_rx_action():
>     
>     		/* Drivers must not modify the NAPI state if they
>     		 * consume the entire weight.  In such cases this code
>     		 * still "owns" the NAPI instance and therefore can
>     		 * move the instance around on the list at-will.
>     		 */
>     
>     A few drivers do not do this because they mix the budget checks
>     with reading hardware state, resulting in crashes like the one
>     reported by takano@axe-inc.co.jp.
>     
>     BNX2 and TG3 are taken care of here, SKY2 fix is from Stephen
>     Hemminger.
>     
>     Signed-off-by: David S. Miller <davem@davemloft.net>

OK, I'm not sure what went wrong there with e1000, but I'll send a patch in a second.

Robert, please give that patch a try (it fixes a crash that I had here as well)
and let us know if it works for you.

Auke

  reply	other threads:[~2007-11-28  0:29 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-27 18:52 net_rx_action/NAPI oops [PATCH] Robert Olsson
2007-11-27 22:09 ` Stephen Hemminger
2007-11-27 22:34   ` Kok, Auke
2007-11-27 22:55     ` Stephen Hemminger
2007-11-28  0:24       ` Kok, Auke [this message]
2007-11-28 12:36         ` Robert Olsson
2007-11-28 16:38           ` Stephen Hemminger
2007-11-28 17:22             ` Robert Olsson
2007-11-30 16:56         ` Robert Olsson
2007-11-30 17:13           ` Kok, Auke
2007-11-28 12:27   ` Robert Olsson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=474CB54A.2060804@intel.com \
    --to=auke-jan.h.kok@intel.com \
    --cc=Robert.Olsson@data.slu.se \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=shemminger@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.