public inbox for linux-kernel@vger.kernel.org
From: Eric Dumazet <dada1@cosmosbay.com>
To: Jesper Dangaard Brouer <hawk@diku.dk>
Cc: Robert Olsson <Robert.Olsson@data.slu.se>,
	jens.laas@data.slu.se, hans.liss@its.uu.se,
	linux-net@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Kernel panic: Route cache, RCU, possibly FIB trie.
Date: Tue, 21 Mar 2006 16:37:19 +0100
Message-ID: <44201DAF.7090707@cosmosbay.com>
In-Reply-To: <Pine.LNX.4.61.0603211552590.28173@ask.diku.dk>

Jesper Dangaard Brouer wrote:
> 
> On Tue, 21 Mar 2006, Robert Olsson wrote:
> 
>> Jesper Dangaard Brouer writes:
>>
>> > I have tried to track down the problem, and I think I have narrowed it
>> > down a bit.  My theory is that it is related to the route cache
>> > (ip_dst_cache) or FIB, which cannot deallocate route cache slab
>> > elements (maybe RCU related).  (I have seen my route cache increase to
>> > around 520k entries using rtstat, before dying).
>> >
>> > I'm using the FIB trie system/algorithm (CONFIG_IP_FIB_TRIE). I think
>> > that the error might be caused by the "fib_trie" code.  See the syslog
>> > output below.
>>
>> > Syslog#1 (indicating a problem with the fib trie)
>> > --------
>> > Mar 20 18:00:04 hostname kernel: Debug: sleeping function called from invalid context at mm/slab.c:2472
>> > Mar 20 18:00:04 hostname kernel: in_atomic():1, irqs_disabled():0
>> > Mar 20 18:00:04 hostname kernel:  [<c0103d9f>] dump_stack+0x1e/0x22
>> > Mar 20 18:00:04 hostname kernel:  [<c011cbe1>] __might_sleep+0xa6/0xae
>> > Mar 20 18:00:04 hostname kernel:  [<c014f3e9>] __kmalloc+0xd9/0xf3
>> > Mar 20 18:00:04 hostname kernel:  [<c014f5a4>] kzalloc+0x23/0x50
>> > Mar 20 18:00:04 hostname kernel:  [<c030ecd1>] tnode_alloc+0x3c/0x82
>> > Mar 20 18:00:04 hostname kernel:  [<c030edf6>] tnode_new+0x26/0x91
>> > Mar 20 18:00:04 hostname kernel:  [<c030f757>] halve+0x43/0x31d
>> > Mar 20 18:00:04 hostname kernel:  [<c030f090>] resize+0x118/0x27e
>>
>> Hello!
>>
>> Out of memory?
> One of the crashes was caused by running out of memory, but all the memory
> was allocated through slab, more specifically to ip_dst_cache.
> 
>> Running BGP with full routing?
> No, running OSPF with around 760 subnets.
> 
>> And large number of flows.
> Yes, very large number of flows.
> 
>> What's your normal number of entries in the route cache?
> On this machine, right now, between 14000 and 60000 entries in the route
> cache.  On other machines, right now, I have a max of 151560 entries.
> 
>> And how much memory do you have?
> This machine (and 4 others) has 1GB of memory; most of the machines have 2GB.
> 
> 
>> From your report, the problem seems to be related to flushing, either
>> rt_cache_flush or fib_flush (before there was dev_close()?), so all
>> associated entries should be freed. All the entries are freed via RCU,
>> which due to the deferred delete can give very high transient memory
>> pressure. If we believe it's a memory problem we can try something out...
> 
> There is definitely high memory pressure on this machine!
> Slab memory usage ranges from 39MB to 205MB (at the moment, on the
> production servers).
> 

Did you try 2.6.16?

It contains changes in kernel/rcupdate.c so that not too many RCU elements are
queued (see force_quiescent_state()). So when an rt_cache_flush is done, you
have the guarantee that not all entries are pushed into RCU at once.
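
To illustrate the idea only: below is a rough userspace model of a callback
queue with a high watermark. It is not the kernel/rcupdate.c code. All names
(cb_queue, cbq_enqueue, simulate_flush, the qhimark field) are made up for the
example, and the "drain" step is instantaneous here, whereas in the kernel the
callbacks can only run after a grace period has elapsed. The point is just
that capping the queue bounds how many deferred frees are outstanding during
a large flush.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a deferred-free callback queue. */
struct cb_queue {
	size_t qlen;     /* callbacks currently queued */
	size_t qhimark;  /* assumed high watermark that forces a drain */
	size_t max_seen; /* largest queue length observed so far */
	size_t forced;   /* number of forced drains */
};

static void cbq_init(struct cb_queue *q, size_t qhimark)
{
	assert(qhimark > 0);
	q->qlen = 0;
	q->qhimark = qhimark;
	q->max_seen = 0;
	q->forced = 0;
}

/* Stand-in for "wait for a grace period and run the callbacks". */
static void cbq_drain(struct cb_queue *q)
{
	q->qlen = 0;
}

/* Queue one deferred free; drain as soon as the watermark is reached. */
static void cbq_enqueue(struct cb_queue *q)
{
	q->qlen++;
	if (q->qlen > q->max_seen)
		q->max_seen = q->qlen;
	if (q->qlen >= q->qhimark) {
		q->forced++;
		cbq_drain(q);
	}
}

/* Model a flush of 'entries' cache entries; return the peak queue length. */
size_t simulate_flush(size_t entries, size_t qhimark)
{
	struct cb_queue q;
	size_t i;

	cbq_init(&q, qhimark);
	for (i = 0; i < entries; i++)
		cbq_enqueue(&q);
	return q.max_seen;
}
```

In this model, flushing 520000 entries with a watermark of 10000 never has
more than 10000 frees pending, while a watermark larger than the flush leaves
all 520000 queued at once.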

Eric
