From: Stephen Hemminger <shemminger@vyatta.com>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>,
Robert Olsson <Robert.Olsson@data.slu.se>,
David Miller <davem@davemloft.net>,
netdev@vger.kernel.org
Subject: Re: [RFC] fib_trie: flush improvement
Date: Wed, 2 Apr 2008 11:03:35 -0700
Message-ID: <20080402110335.66b04181@extreme>
In-Reply-To: <47F39998.8040605@cosmosbay.com>
On Wed, 02 Apr 2008 16:35:04 +0200
Eric Dumazet <dada1@cosmosbay.com> wrote:
> Eric Dumazet wrote:
> > Stephen Hemminger wrote:
> >> This is an attempt to fix the problem described in:
> >> http://bugzilla.kernel.org/show_bug.cgi?id=6648
> >> I can reproduce this by loading lots and lots of routes and then taking
> >> the interface down. This causes all entries in the trie to be flushed, but
> >> each leaf removal causes a rebalance of the trie. And since the removal
> >> is depth first, it creates lots of needless work.
> >>
> >> Instead, on flush, just walk the trie and prune as we go.
> >> The implementation is for description only; it probably doesn't work
> >> yet.
> >>
> >>
> >
> > I don't get it, since the bug reporter mentions this message with
> > recent kernels:
> >
> > Fix inflate_threshold_root. Now=15 size=11 bits
> >
> > Is that what you get with your tests?
> >
> > Pawel reports:
> >
> > cat /proc/net/fib_triestat
> > Main: Aver depth: 2.26 Max depth: 6 Leaves: 235924
> > Internal nodes: 57854 1: 31632 2: 11422 3: 8475 4: 3755 5: 1676 6: 893
> > 18: 1
> >
> > Pointers: 609760 Null ptrs: 315983 Total size: 16240 kB
> >
> > The warning messages come from the root node, which cannot be expanded
> > because it hits MAX_ORDER (on 32-bit x86).
> >
> >
> >
> > sizeof(struct tnode) + (sizeof(struct node *) << bits) is rounded up
> > to 4 << (bits + 1), i.e. 2 << 20 bytes (for bits = 18).
> >
> > For larger allocations Pawel has two choices:
> >
> > Change MAX_ORDER from 11 to 13 or 14.
> > If this machine is a pure router, this change won't have a performance
> > impact.
> >
> > Or (more difficult, but more appropriate for mainline) change
> > fib_trie.c to use vmalloc() for very big allocations (for the root
> > only), and vfree().
> >
> > Since vfree() cannot be called from an RCU callback, one has to set up a
> > struct work_struct helper.
> >
> Here is a patch (untested, unfortunately) to implement this.
>
> [IPV4] fib_trie: root_tnode can benefit of vmalloc()
>
> The FIB_TRIE root node can be very large and currently hits the MAX_ORDER
> limit. It also wastes about 50% of the allocated size, because of the
> power-of-two rounding of the tnode.
>
> A switch to vmalloc() can improve FIB_TRIE performance by allowing the root
> node to grow past the alloc_pages() limit, while preserving memory.
>
> Special care must be taken to free such a zone: since an RCU handler is not
> allowed to call vfree(), we use a worker instead.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>
>
Rather than switching between three allocation strategies, I would prefer
to have just kmalloc and vmalloc.
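
To make the size arithmetic and the two-strategy idea concrete, here is a
minimal userspace sketch. It is not the patch under discussion. All names
(SKETCH_*, tnode_bytes, round_pow2, pick_alloc) are hypothetical, the 16-byte
header size is an illustrative guess, and the one-page cutoff between kmalloc
and vmalloc is an assumption rather than something settled in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; values assume 32-bit x86 with 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PTR_SIZE  4UL   /* sizeof(struct node *) on 32-bit */
#define SKETCH_TNODE_HDR 16UL  /* assumed sizeof(struct tnode); illustrative only */

enum alloc_kind { ALLOC_KMALLOC, ALLOC_VMALLOC };

/* Bytes requested for a tnode holding 2^bits child pointers. */
static unsigned long tnode_bytes(unsigned int bits)
{
	assert(bits < 28); /* keep the shift well inside 32 bits */
	return SKETCH_TNODE_HDR + (SKETCH_PTR_SIZE << bits);
}

/* Round up to the next power of two, as a page-order allocation would. */
static unsigned long round_pow2(unsigned long n)
{
	unsigned long r = 1;

	while (r < n)
		r <<= 1;
	return r;
}

/* Two strategies only: up to one page from kmalloc, anything larger from
 * vmalloc. The one-page cutoff is an assumption made for this sketch. */
static enum alloc_kind pick_alloc(unsigned int bits)
{
	return tnode_bytes(bits) <= SKETCH_PAGE_SIZE ? ALLOC_KMALLOC
						     : ALLOC_VMALLOC;
}
```

With bits = 18 the request is just over 1 MiB, so power-of-two rounding
doubles it to 2 << 20 bytes; that is the roughly 50% waste mentioned in the
quoted patch description, and such a node would take the vmalloc path here.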