From: Eric Dumazet <dada1@cosmosbay.com>
To: Jesper Dangaard Brouer <hawk@diku.dk>
Cc: Robert Olsson <Robert.Olsson@data.slu.se>,
jens.laas@data.slu.se, hans.liss@its.uu.se,
linux-net@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Kernel panic: Route cache, RCU, possibly FIB trie.
Date: Tue, 21 Mar 2006 16:37:19 +0100
Message-ID: <44201DAF.7090707@cosmosbay.com>
In-Reply-To: <Pine.LNX.4.61.0603211552590.28173@ask.diku.dk>
Jesper Dangaard Brouer wrote:
>
> On Tue, 21 Mar 2006, Robert Olsson wrote:
>
>> Jesper Dangaard Brouer writes:
>>
>> > I have tried to track down the problem, and I think I have narrowed it
>> > down a bit. My theory is that it is related to the route cache
>> > (ip_dst_cache) or FIB, which cannot deallocate route cache slab
>> > elements (maybe RCU related). (I have seen my route cache increase to
>> > around 520k entries using rtstat, before dying).
>> >
>> > I'm using the FIB trie system/algorithm (CONFIG_IP_FIB_TRIE). I think
>> > that the error might be caused by the "fib_trie" code. See the syslog
>> > output below.
>>
>> > Syslog#1 (indicating a problem with the fib trie)
>> > --------
>> > Mar 20 18:00:04 hostname kernel: Debug: sleeping function called from invalid context at mm/slab.c:2472
>> > Mar 20 18:00:04 hostname kernel: in_atomic():1, irqs_disabled():0
>> > Mar 20 18:00:04 hostname kernel: [<c0103d9f>] dump_stack+0x1e/0x22
>> > Mar 20 18:00:04 hostname kernel: [<c011cbe1>] __might_sleep+0xa6/0xae
>> > Mar 20 18:00:04 hostname kernel: [<c014f3e9>] __kmalloc+0xd9/0xf3
>> > Mar 20 18:00:04 hostname kernel: [<c014f5a4>] kzalloc+0x23/0x50
>> > Mar 20 18:00:04 hostname kernel: [<c030ecd1>] tnode_alloc+0x3c/0x82
>> > Mar 20 18:00:04 hostname kernel: [<c030edf6>] tnode_new+0x26/0x91
>> > Mar 20 18:00:04 hostname kernel: [<c030f757>] halve+0x43/0x31d
>> > Mar 20 18:00:04 hostname kernel: [<c030f090>] resize+0x118/0x27e
>>
>> Hello!
>>
>> Out of memory?
> One of the crashes was caused by running out of memory, but all the memory
> was allocated through slab, more specifically to ip_dst_cache.
>
>> Running BGP with full routing?
> No, running OSPF with around 760 subnets.
>
>> And large number of flows.
> Yes, very large number of flows.
>
>> What's your normal number of entries in the route cache?
> On this machine, right now, between 14000 and 60000 entries in the route
> cache. On other machines, right now, I have a max of 151560 entries.
>
>> And how much memory do you have?
> On this machine 1GB of memory (and 4 others), most of the machines have 2GB.
>
>
>> From your report, the problem seems to be related to flushing, either
>> rt_cache_flush or fib_flush (was there a dev_close() before?), so all
>> associated entries should be freed. All the entries are freed via RCU,
>> which due to the deferred delete can cause very high transient memory
>> pressure. If we believe it's a memory problem we can try something out...
>
> There is definitely high memory pressure on this machine!
> Slab memory usage ranges from 39MB to 205MB (at the moment, on the
> production servers).
>
Have you tried 2.6.16?
It contains changes in kernel/rcupdate.c that keep too many RCU elements from
being queued (force_quiescent_state()). So when an rt_cache_flush is done, you
have the guarantee that the entries are not all pushed into RCU at once.
Eric
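
For illustration only, the effect of such a limit can be sketched as a toy
userspace model in C. This is not the code in kernel/rcupdate.c: the names
deferred_queue, defer_free, run_callbacks, flush_entries and the himark
parameter are all hypothetical. The model also assumes that reaching the
threshold drains the queue immediately, whereas the real
force_quiescent_state() only prompts the other CPUs to pass through a
quiescent state sooner.

```c
#include <assert.h>

/* Toy userspace model of deferred frees. NOT kernel code; every name
 * below is hypothetical and chosen for this sketch only. */
struct deferred_queue {
    long pending;   /* frees queued but not yet executed */
    long peak;      /* highest value 'pending' has reached */
    long himark;    /* queue-length threshold; 0 means "no limit" */
};

/* Stand-in for the end of a grace period: all queued frees run. */
static void run_callbacks(struct deferred_queue *q)
{
    q->pending = 0;
}

/* Queue one deferred free. If a threshold is set and reached, drain the
 * queue right away (assumption: the real kernel only speeds up the
 * grace period instead of draining synchronously). */
static void defer_free(struct deferred_queue *q)
{
    q->pending++;
    if (q->pending > q->peak)
        q->peak = q->pending;
    if (q->himark > 0 && q->pending >= q->himark)
        run_callbacks(q);
}

/* Simulate flushing n cache entries in one go and return the peak
 * number of entries that were waiting to be freed at the same time. */
static long flush_entries(long n, long himark)
{
    struct deferred_queue q = { 0, 0, himark };
    long i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        defer_free(&q);
    run_callbacks(&q);  /* the grace period eventually ends */
    return q.peak;
}
```

With the 520k route cache entries reported above, flush_entries(520000, 0)
returns a peak of 520000 entries waiting to be freed, while
flush_entries(520000, 1000) returns 1000. The threshold of 1000 is an
arbitrary value picked for the example.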