public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Matt Mackall <mpm@selenic.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Andi Kleen <ak@suse.de>, Andrew Morton <akpm@osdl.org>,
	Ingo Molnar <mingo@elte.hu>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	John B?ckstrand <sandos@home.se>
Subject: Re: [PATCH] netpoll can lock up on low memory.
Date: Fri, 5 Aug 2005 14:28:08 -0700	[thread overview]
Message-ID: <20050805212808.GV8074@waste.org> (raw)
In-Reply-To: <1123275420.18332.81.camel@localhost.localdomain>

On Fri, Aug 05, 2005 at 04:57:00PM -0400, Steven Rostedt wrote:
> On Fri, 2005-08-05 at 13:01 -0700, Matt Mackall wrote:
> > On Fri, Aug 05, 2005 at 10:36:31AM -0400, Steven Rostedt wrote:
> > > Looking at the netpoll routines, I noticed that the find_skb could
> > > lockup if the memory is low. This is because the allocations are
> > > called with GFP_ATOMIC (since this is in interrupt context) and if
> > > it fails, it will continue to fail. This is just by observing the
> > > code, I didn't have this actually happen. So if this is not the
> > > case, please let me know how it can get out. Otherwise, please
> > > accept this patch.
> > 
> > By netpoll_poll() tickling the driver enough to free the currently
> > queued outgoing SKBs.
> 
> I believe that the e1000 wont free up any outgoing packets since the
> netpoll call doesn't seem to get to the e1000_clean_tx part of the
> e1000_intr, otherwise the system wouldn't lock under the
> netpoll_send_skb when one disconnects the wire and puts it back in.  The
> disconnect would lock it up anyway (with Andi's patch it now doesn't)
> but since it won't come back up after the link is back up, there seems
> to be something wrong with the e1000 netpoll driver.  This is because
> the e1000_netpoll doesn't seem to be cleaning up the tx buffer and start
> the queue back up.

That does seem like a driver problem.

> > Also note that by the time we're in this loop, we're ready to take
> > desperate measures. We've already exhausted our private queue of SKBs
> > so we have no alternative but to keep kicking the driver until
> > something happens.
> 
> OK, the system is under heavy memory load and starts eating up the
> netpoll packets.  When the last packet is gone, and you have something
> like the e1000 that doesn't clean up its packets with netpoll, then you
> just locked up the system.
> 
> The scary part of this loop is that if the netpoll doesn't come up with
> the goods, its game over.  Say we are at desperate measures but it could
> be a case where we need to output more information and lockup here
> before we can go out and free some memory. 

Realistically, we were probably going to crash anyway at this point as
we're apparently failing to recycle SKBs.

Netpoll generally must assume it won't get a second chance, as it's
being called by things like oops() and panic() and used by things like
kgdb. If netpoll fails, the box is dead anyway.

> > The netpoll philosophy is to assume that its traffic is an absolute
> > priority - it is better to potentially hang trying to deliver a panic
> > message than to give up and crash silently.
> 
> So even a long timeout would not do?  So you don't even get a message to
> the console?

In general, there's no way to measure time here. And if we're
using netconsole, what makes you think there's any other console?

> > > Also, as Andi told me, the printk here would probably not show up
> > > anyway if this happens with netconsole.
> > 
> > That's fine. But in fact, it does show up occassionally - I've seen
> > it.
> 
> Then maybe what Andi told me is not true ;-)
> 
> Oh, and did your machine crash when you saw it?  Have you seen it with
> the e1000 driver?

No and no. Most of my own testing is done with tg3.

-- 
Mathematics is the supreme nostalgia of our time.

  reply	other threads:[~2005-08-05 21:30 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <42F347D2.7000207@home.se.suse.lists.linux.kernel>
2005-08-05 11:45 ` lockups with netconsole on e1000 on media insertion Andi Kleen
2005-08-05 12:44   ` John Bäckstrand
2005-08-05 13:49   ` Steven Rostedt
2005-08-05 13:55     ` Andi Kleen
2005-08-05 14:10       ` Steven Rostedt
2005-08-05 14:14         ` Andi Kleen
2005-08-05 14:27           ` Steven Rostedt
2005-08-05 14:36             ` David S. Miller
2005-08-05 15:02               ` Steven Rostedt
2005-08-05 14:36           ` [PATCH] netpoll can lock up on low memory Steven Rostedt
2005-08-05 20:01             ` Matt Mackall
2005-08-05 20:57               ` Steven Rostedt
2005-08-05 21:28                 ` Matt Mackall [this message]
2005-08-06  0:23                   ` Steven Rostedt
2005-08-06  1:53                     ` Matt Mackall
2005-08-06  2:32                       ` Steven Rostedt
2005-08-06  7:30                         ` Daniel Phillips
2005-08-06  7:58                         ` Ingo Molnar
2005-08-06 23:10                           ` Matt Mackall
2005-08-06  9:46                         ` David S. Miller
2005-08-06  9:57                           ` Steven Rostedt
2005-08-06 12:09                             ` John Bäckstrand
2005-08-07  5:40                             ` Matt Mackall
2005-08-05 21:26               ` Andi Kleen
2005-08-05 21:42                 ` Matt Mackall
2005-08-05 21:51                   ` Andi Kleen
2005-08-06  1:16                     ` Matt Mackall
2005-08-06  0:30                 ` Steven Rostedt
2005-08-06  7:45                 ` Ingo Molnar
2005-08-06 11:29                   ` Andi Kleen
2005-08-07 21:12     ` lockups with netconsole on e1000 on media insertion John Bäckstrand
2005-08-08  2:29       ` Steven Rostedt
2005-08-05 20:12   ` Matt Mackall
2005-08-05 21:56     ` Andi Kleen
2005-08-05 23:20       ` Matt Mackall
2005-08-05 23:51         ` Andi Kleen
2005-08-06  1:22           ` Matt Mackall
2005-08-06  1:37             ` Daniel Phillips

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050805212808.GV8074@waste.org \
    --to=mpm@selenic.com \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=netdev@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=sandos@home.se \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox