From: Just Marc <marc@corky.net>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Crash soon after an alloc_skb failure in 2.6.16 and previous, swap disabled
Date: Fri, 31 Mar 2006 09:16:50 +0100
Message-ID: <442CE572.3060408@corky.net>
In-Reply-To: <20060330145422.6c7e2517.akpm@osdl.org>

Hello Andrew,
> Just Marc <marc@corky.net> wrote:
>   
>> I'm running a few machines with swap turned off and am experiencing 
>> crashes when the system is extremely low on kernel memory.   So far the 
>> crashes observed are always inside the recv function of the Ethernet 
>> module, below is the trace for the tg3 module but a similar result is 
>> also seen with the e1000 module.   The crash is not necessarily related 
>> to the Ethernet modules but may happen at a later stage deeper in the 
>> networking code.
>>
>> I don't have console access to the machine so I can't know what the 
>> final oops/crash message is (if any) but this can be reproduced on any 
>> machine quite easily by consuming all of the available memory.  I guess 
>> that if done from userspace the OOM killer will prevent this from 
>> happening but a simple LKM can allocate all this memory and this issue 
>> should surface quickly.
>>     
>
> We'd really need to see that final oops trace, please.
>
>   
I will get that for you once I have physical access to the machines.
I'm hoping there will be an oops rather than a hard lockup.

> It's not unusual for a hard-working gigabit NIC to exhaust the page
> allocator reserves and perhaps we're a bit too noisy in the logs when it
> happens.  But it's sufficiently rare and sufficiently associated with other
> problems (like this one) that nobody has yet gone and stuck the
> __GFP_NOWARN into the relevant drivers to suppress the messages.
>
> If we really have broken something in there then someone else will hit this
> soon enough.  But nobody has, as far as I know.
>
> A digital photo of the screen would suit.
>
> Or perhaps netconsole.  If the crash is really associated with the NIC
> running out of txbufs then netconsole might not be useful.  But perhaps the
> crash is something else altogether.
>
>   
I suspect netconsole is going to have a hard time transmitting anything 
at that moment.   I'll try it anyway.

Still, the problem is highly reproducible, at least in my setup: it
happens to the machine once every few days.

I'll get back to you with more details as they become available.

Thanks
