* Re: [PATCH 9/9] fix sparse warnings
[not found] ` <20080112064646.659443238@linux-foundation.org>
@ 2008-01-12 11:16 ` Eric Dumazet
2008-01-12 11:28 ` David Miller
` (3 more replies)
2008-01-13 5:25 ` [PATCH 9/9] fix sparse warnings David Miller
1 sibling, 4 replies; 52+ messages in thread
From: Eric Dumazet @ 2008-01-12 11:16 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, Robert Olsson, netdev
[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]
Stephen Hemminger a écrit :
> Make FIB TRIE go through sparse checker without warnings.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Hi Stephen
While reviewing your patches (and fib code) I had some questions :
1) I was wondering isn't trie_collect_stats() a potential cpu hog
(big latency) ?
2) struct tnode layout
If tnode->bits is large enough, we allocate a big area
of memory but roughly use only first half of it.
We could use a better scheme with an extra indirection. For small
nodes, we use space right after tnode, but for big nodes, we allocate
a power of two set of pages, to exactly match the memory need.
3) 'pos' and 'bits' fields of 'struct tnode' might be converted to
plain uchar, instead of 5-bits fields, to reduce complexity for
generated code.
4) full_children & empty_children being 'unsigned short',
we probably are limited to 2^15 elements, but I could not
find this limit enforced somewhere.
[FIB]: Reduce text size of net/ipv4/fib_trie.o
In struct tnode, we use two fields of 5 bits for 'pos' and 'bits'.
Switching to plain 'unsigned char' (8 bits) take the same space
because of compiler alignments, and reduce text size by 435 bytes
on i386.
On i386 :
$ size net/ipv4/fib_trie.o.before_patch net/ipv4/fib_trie.o
text data bss dec hex filename
13714 4 64 13782 35d6 net/ipv4/fib_trie.o.before
13279 4 64 13347 3423 net/ipv4/fib_trie.o
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: fib_trie.patch --]
[-- Type: text/plain, Size: 605 bytes --]
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 2832610..4e91532 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -119,8 +119,8 @@ struct leaf_info {
struct tnode {
t_key key;
unsigned long parent;
- unsigned short pos:5; /* 2log(KEYLENGTH) bits needed */
- unsigned short bits:5; /* 2log(KEYLENGTH) bits needed */
+ unsigned char pos; /* 2log(KEYLENGTH) bits needed */
+ unsigned char bits; /* 2log(KEYLENGTH) bits needed */
unsigned short full_children; /* KEYLENGTH bits needed */
unsigned short empty_children; /* KEYLENGTH bits needed */
struct rcu_head rcu;
^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH 9/9] fix sparse warnings
2008-01-12 11:16 ` [PATCH 9/9] fix sparse warnings Eric Dumazet
@ 2008-01-12 11:28 ` David Miller
2008-01-12 21:08 ` Stephen Hemminger
` (2 subsequent siblings)
3 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-12 11:28 UTC (permalink / raw)
To: dada1; +Cc: stephen.hemminger, robert.olsson, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Sat, 12 Jan 2008 12:16:13 +0100
> We could use a better scheme with an extra indirection.
Unfortunately, indirection will likely have a negative
impact upon performance. We go only as fast as the
number of memory references made by this code.
> 3) 'pos' and 'bits' fields of 'struct tnode' might be converted to
> plain uchar, instead of 5-bits fields, to reduce complexity for
> generated code.
This seems reasonable, I'll likely apply this.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 9/9] fix sparse warnings
2008-01-12 11:16 ` [PATCH 9/9] fix sparse warnings Eric Dumazet
2008-01-12 11:28 ` David Miller
@ 2008-01-12 21:08 ` Stephen Hemminger
2008-01-14 11:07 ` Robert Olsson
2008-01-12 21:09 ` [PATCH 9/9] fix sparse warnings Stephen Hemminger
2008-01-13 18:30 ` [FIB]: full_children & empty_children should be uint, not ushort Eric Dumazet
3 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-12 21:08 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, Robert Olsson, netdev
On Sat, 12 Jan 2008 12:16:13 +0100
Eric Dumazet <dada1@cosmosbay.com> wrote:
> Stephen Hemminger a écrit :
> > Make FIB TRIE go through sparse checker without warnings.
> >
> > Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
>
> Hi Stephen
>
> While reviewing your patches (and fib code) I had some questions :
>
> 1) I was wondering isn't trie_collect_stats() a potential cpu hog
> (big latency) ?
>
> 2) struct tnode layout
> If tnode->bits is large enough, we allocate a big area
> of memory but roughly use only first half of it.
> We could use a better scheme with an extra indirection. For small
> nodes, we use space right after tnode, but for big nodes, we allocate
> a power of two set of pages, to exactly match the memory need.
>
> 3) 'pos' and 'bits' fields of 'struct tnode' might be converted to
> plain uchar, instead of 5-bits fields, to reduce complexity for
> generated code.
>
> 4) full_children & empty_children being 'unsigned short',
> we probably are limited to 2^15 elements, but I could not
> find this limit enforced somewhere.
Remember that the code should be optimized for lookup, not management
operations. We ran into this during testing (the test suite was looking
for number of routes), thats why I put in the size field.
The existing dump code is really slow:
1) FIB_TRIE Under KVM:
load 164393 routes 12.436 sec
ip route | wc -l 12.569 sec
grep /proc/net/route 25.357 sec
99% of the cpu time is spent in nextleaf() during these dump operations.
2) FIB_HASH Under KVM:
load 164393 routes 10.833 sec
ip route | wc -l 1.981 sec
grep /proc/net/route 0.204 sec
--
Stephen Hemminger <stephen.hemminger@vyatta.com>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 9/9] fix sparse warnings
2008-01-12 11:16 ` [PATCH 9/9] fix sparse warnings Eric Dumazet
2008-01-12 11:28 ` David Miller
2008-01-12 21:08 ` Stephen Hemminger
@ 2008-01-12 21:09 ` Stephen Hemminger
2008-01-13 5:28 ` David Miller
2008-01-13 18:30 ` [FIB]: full_children & empty_children should be uint, not ushort Eric Dumazet
3 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-12 21:09 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, Robert Olsson, netdev
On Sat, 12 Jan 2008 12:16:13 +0100
Eric Dumazet <dada1@cosmosbay.com> wrote:
> Stephen Hemminger a écrit :
> > Make FIB TRIE go through sparse checker without warnings.
> >
> > Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
>
> Hi Stephen
>
> While reviewing your patches (and fib code) I had some questions :
>
> 1) I was wondering isn't trie_collect_stats() a potential cpu hog
> (big latency) ?
>
> 2) struct tnode layout
> If tnode->bits is large enough, we allocate a big area
> of memory but roughly use only first half of it.
> We could use a better scheme with an extra indirection. For small
> nodes, we use space right after tnode, but for big nodes, we allocate
> a power of two set of pages, to exactly match the memory need.
>
> 3) 'pos' and 'bits' fields of 'struct tnode' might be converted to
> plain uchar, instead of 5-bits fields, to reduce complexity for
> generated code.
>
> 4) full_children & empty_children being 'unsigned short',
> we probably are limited to 2^15 elements, but I could not
> find this limit enforced somewhere.
>
> [FIB]: Reduce text size of net/ipv4/fib_trie.o
>
> In struct tnode, we use two fields of 5 bits for 'pos' and 'bits'.
> Switching to plain 'unsigned char' (8 bits) take the same space
> because of compiler alignments, and reduce text size by 435 bytes
> on i386.
>
> On i386 :
> $ size net/ipv4/fib_trie.o.before_patch net/ipv4/fib_trie.o
> text data bss dec hex filename
> 13714 4 64 13782 35d6 net/ipv4/fib_trie.o.before
> 13279 4 64 13347 3423 net/ipv4/fib_trie.o
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>
I agree they should not have been bitfields in the first place.
--
Stephen Hemminger <stephen.hemminger@vyatta.com>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 1/9] get rid of trie_init
[not found] ` <20080112064646.056241123@linux-foundation.org>
@ 2008-01-13 4:49 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-13 4:49 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Fri, 11 Jan 2008 22:45:14 -0800
> trie_init is worthless it is just zeroing stuff that is already
> zero! Move the memset() down to make it obvious.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Applied.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 2/9] get rid of unused revision element
[not found] ` <20080112064646.132747871@linux-foundation.org>
@ 2008-01-13 4:50 ` David Miller
2008-01-14 11:44 ` Robert Olsson
0 siblings, 1 reply; 52+ messages in thread
From: David Miller @ 2008-01-13 4:50 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Fri, 11 Jan 2008 22:45:15 -0800
> The revision element must of been part of an earlier design,
> because currently it is set but never used.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Applied.
I suspect Robert wanted to play around with some generation
ID optimizations but never got around to it.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 3/9] move size information to pr_debug()
[not found] ` <20080112064646.207183428@linux-foundation.org>
@ 2008-01-13 4:53 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-13 4:53 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Fri, 11 Jan 2008 22:45:16 -0800
> The size of structures is a debug thing, not something that needs to
> be part of a /proc api.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
I don't necessarily agree with this one, the size of the
structures is an important inherent attribute contributing
directly to memory usage and performance of the algorithms.
So knowing those values are necessary to diagnose problems.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 4/9] statistics improvements
[not found] ` <20080112064646.282104074@linux-foundation.org>
@ 2008-01-13 4:55 ` David Miller
2008-01-13 5:33 ` Stephen Hemminger
0 siblings, 1 reply; 52+ messages in thread
From: David Miller @ 2008-01-13 4:55 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Fri, 11 Jan 2008 22:45:17 -0800
> Turn the unused size field into a useful counter for the number
> of routes.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
It's not useful if nothing reports it's value. I'm dropping
this.
Unless you add code to provide this information via procfs
or similar, better to just remove it.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 5/9] use %u for unsigned printfs
[not found] ` <20080112064646.356466158@linux-foundation.org>
@ 2008-01-13 4:56 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-13 4:56 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Fri, 11 Jan 2008 22:45:18 -0800
> Use %u instead of %d when printing unsigned values.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Applied, thanks.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 6/9] : fib_insert_node cleanup
[not found] ` <20080112064646.432200237@linux-foundation.org>
@ 2008-01-13 4:57 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-13 4:57 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Fri, 11 Jan 2008 22:45:19 -0800
> The only error from fib_insert_node is if memory allocation fails,
> so instead of passing by reference, just use the convention of returning
> NULL.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
I'd be a hypocrite if I didn't apply this one :-)
Applied, thanks.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 7/9] printk related cleanups
[not found] ` <20080112064646.507015655@linux-foundation.org>
@ 2008-01-13 4:58 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-13 4:58 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Fri, 11 Jan 2008 22:45:20 -0800
> printk related cleanups:
> * Get rid of unused printk wrappers.
> * Make bug checks into KERN_WARNING because KERN_DEBUG gets ignored
> * Turn one cryptic old message into something real
> * Make sure all messages have KERN_XXX
Applied, thanks.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 8/9] add statistics
[not found] ` <20080112064646.583836190@linux-foundation.org>
@ 2008-01-13 5:23 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-13 5:23 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Fri, 11 Jan 2008 22:45:21 -0800
> The FIB TRIE code has a bunch of statistics, but the code is hidden
> behind an ifdef that was never implemented. Since it was dead code,
> it was broken as well.
>
> This patch fixes that by making it a config option.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Applied, thanks.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 9/9] fix sparse warnings
[not found] ` <20080112064646.659443238@linux-foundation.org>
2008-01-12 11:16 ` [PATCH 9/9] fix sparse warnings Eric Dumazet
@ 2008-01-13 5:25 ` David Miller
1 sibling, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-13 5:25 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Fri, 11 Jan 2008 22:45:22 -0800
> Make FIB TRIE go through sparse checker without warnings.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Also applied, thanks.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 9/9] fix sparse warnings
2008-01-12 21:09 ` [PATCH 9/9] fix sparse warnings Stephen Hemminger
@ 2008-01-13 5:28 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-13 5:28 UTC (permalink / raw)
To: stephen.hemminger; +Cc: dada1, robert.olsson, netdev
From: Stephen Hemminger <stephen.hemminger@vyatta.com>
Date: Sat, 12 Jan 2008 13:09:46 -0800
> On Sat, 12 Jan 2008 12:16:13 +0100
> Eric Dumazet <dada1@cosmosbay.com> wrote:
>
> > [FIB]: Reduce text size of net/ipv4/fib_trie.o
> >
> > In struct tnode, we use two fields of 5 bits for 'pos' and 'bits'.
> > Switching to plain 'unsigned char' (8 bits) take the same space
> > because of compiler alignments, and reduce text size by 435 bytes
> > on i386.
> >
> > On i386 :
> > $ size net/ipv4/fib_trie.o.before_patch net/ipv4/fib_trie.o
> > text data bss dec hex filename
> > 13714 4 64 13782 35d6 net/ipv4/fib_trie.o.before
> > 13279 4 64 13347 3423 net/ipv4/fib_trie.o
> >
> > Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> >
>
> I agree they should not have been bitfields in the first place.
Applied.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 4/9] statistics improvements
2008-01-13 4:55 ` [PATCH 4/9] statistics improvements David Miller
@ 2008-01-13 5:33 ` Stephen Hemminger
2008-01-13 5:44 ` David Miller
0 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-13 5:33 UTC (permalink / raw)
To: David Miller; +Cc: robert.olsson, netdev, stephen.hemminger
David Miller wrote:
> From: Stephen Hemminger <shemminger@linux-foundation.org>
> Date: Fri, 11 Jan 2008 22:45:17 -0800
>
>
>> Turn the unused size field into a useful counter for the number
>> of routes.
>>
>> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
>>
>
> It's not useful if nothing reports it's value. I'm dropping
> this.
>
> Unless you add code to provide this information via procfs
> or similar, better to just remove it.
>
The size field is added to /proc/net/fib_triestat that was the point.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 4/9] statistics improvements
2008-01-13 5:33 ` Stephen Hemminger
@ 2008-01-13 5:44 ` David Miller
2008-01-14 20:57 ` [PATCH] [IPV4] fib_trie: size and statistics Stephen Hemminger
0 siblings, 1 reply; 52+ messages in thread
From: David Miller @ 2008-01-13 5:44 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Sat, 12 Jan 2008 21:33:16 -0800
> The size field is added to /proc/net/fib_triestat that was the point.
Not from what I can see.
davem@sunset:~/src/GIT/net-2.6.25$ git grep -e "->size" net/ipv4/fib_trie.c
net/ipv4/fib_trie.c: t->size++;
net/ipv4/fib_trie.c: t->size--;
davem@sunset:~/src/GIT/net-2.6.25$ git grep -e "\.size" net/ipv4/fib_trie.c
davem@sunset:~/src/GIT/net-2.6.25$
Nothing uses the field, it is merely incremented and decremented.
trie_show_stats() and trie_show_usage() do not access this field.
^ permalink raw reply [flat|nested] 52+ messages in thread
* [FIB]: full_children & empty_children should be uint, not ushort
2008-01-12 11:16 ` [PATCH 9/9] fix sparse warnings Eric Dumazet
` (2 preceding siblings ...)
2008-01-12 21:09 ` [PATCH 9/9] fix sparse warnings Stephen Hemminger
@ 2008-01-13 18:30 ` Eric Dumazet
2008-01-13 22:02 ` Robert Olsson
3 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2008-01-13 18:30 UTC (permalink / raw)
To: David Miller; +Cc: Stephen Hemminger, Robert Olsson, netdev
[-- Attachment #1: Type: text/plain, Size: 1328 bytes --]
Eric Dumazet a écrit :
> 4) full_children & empty_children being 'unsigned short',
> we probably are limited to 2^15 elements, but I could not
> find this limit enforced somewhere.
Hi David
In my testings, I found that once a tnode is built with 2^16 slots (or more),
it cannot be freed.
Extract of /proc/net/fib_triestat
Main:
Aver depth: 1.50
Max depth: 2
Leaves: 2
Internal nodes: 3
1: 1 2: 1 17: 1
Pointers: 131078
Null ptrs: 131074
Total size: 513 kB
# ip route
192.168.11.0/24 dev eth0 proto kernel scope link src 192.168.11.129
default via 192.168.11.2 dev eth0
Two fixes are possible : Enlarge full_children & empty_children to 32bits, or
force a limit in code to never exceed 2^15 children in a tnode. I chose the
first solution since it can be done with 0 memory cost on 64bit arches.
Thank you
[FIB]: full_children & empty_children should be uint, not ushort
If declared as unsigned short, these fields can overflow, and whole trie logic
is broken. I could not make the machine crash, but some tnode can never
be freed.
Note for 64 bit arches : By reordering t_key and parent in [node, leaf, tnode]
structures, we can use 32 bits hole after t_key so that sizeof(struct tnode)
doesnt change after this patch.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: fib_trie.patch --]
[-- Type: text/plain, Size: 2515 bytes --]
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index f26ba31..9696722 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -97,13 +97,13 @@ typedef unsigned int t_key;
#define IS_LEAF(n) (n->parent & T_LEAF)
struct node {
- t_key key;
unsigned long parent;
+ t_key key;
};
struct leaf {
- t_key key;
unsigned long parent;
+ t_key key;
struct hlist_head list;
struct rcu_head rcu;
};
@@ -116,12 +116,12 @@ struct leaf_info {
};
struct tnode {
- t_key key;
unsigned long parent;
+ t_key key;
unsigned char pos; /* 2log(KEYLENGTH) bits needed */
unsigned char bits; /* 2log(KEYLENGTH) bits needed */
- unsigned short full_children; /* KEYLENGTH bits needed */
- unsigned short empty_children; /* KEYLENGTH bits needed */
+ unsigned int full_children; /* KEYLENGTH bits needed */
+ unsigned int empty_children; /* KEYLENGTH bits needed */
struct rcu_head rcu;
struct node *child[0];
};
@@ -329,12 +329,12 @@ static inline void free_leaf_info(struct leaf_info *leaf)
call_rcu(&leaf->rcu, __leaf_info_free_rcu);
}
-static struct tnode *tnode_alloc(unsigned int size)
+static struct tnode *tnode_alloc(size_t size)
{
struct page *pages;
if (size <= PAGE_SIZE)
- return kcalloc(size, 1, GFP_KERNEL);
+ return kzalloc(size, GFP_KERNEL);
pages = alloc_pages(GFP_KERNEL|__GFP_ZERO, get_order(size));
if (!pages)
@@ -346,8 +346,8 @@ static struct tnode *tnode_alloc(unsigned int size)
static void __tnode_free_rcu(struct rcu_head *head)
{
struct tnode *tn = container_of(head, struct tnode, rcu);
- unsigned int size = sizeof(struct tnode) +
- (1 << tn->bits) * sizeof(struct node *);
+ size_t size = sizeof(struct tnode) +
+ (sizeof(struct node *) << tn->bits);
if (size <= PAGE_SIZE)
kfree(tn);
@@ -386,8 +386,7 @@ static struct leaf_info *leaf_info_new(int plen)
static struct tnode* tnode_new(t_key key, int pos, int bits)
{
- int nchildren = 1<<bits;
- int sz = sizeof(struct tnode) + nchildren * sizeof(struct node *);
+ size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
struct tnode *tn = tnode_alloc(sz);
if (tn) {
@@ -399,8 +398,8 @@ static struct tnode* tnode_new(t_key key, int pos, int bits)
tn->empty_children = 1<<bits;
}
- pr_debug("AT %p s=%u %u\n", tn, (unsigned int) sizeof(struct tnode),
- (unsigned int) (sizeof(struct node) * 1<<bits));
+ pr_debug("AT %p s=%u %lu\n", tn, (unsigned int) sizeof(struct tnode),
+ (unsigned long) (sizeof(struct node) << bits));
return tn;
}
^ permalink raw reply related [flat|nested] 52+ messages in thread
* [FIB]: full_children & empty_children should be uint, not ushort
2008-01-13 18:30 ` [FIB]: full_children & empty_children should be uint, not ushort Eric Dumazet
@ 2008-01-13 22:02 ` Robert Olsson
2008-01-14 6:32 ` David Miller
0 siblings, 1 reply; 52+ messages in thread
From: Robert Olsson @ 2008-01-13 22:02 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, Stephen Hemminger, Robert Olsson, netdev
Eric Dumazet writes:
> Eric Dumazet a écrit :
> > 4) full_children & empty_children being 'unsigned short',
> > we probably are limited to 2^15 elements, but I could not
> > find this limit enforced somewhere.
> Two fixes are possible : Enlarge full_children & empty_children to 32bits, or
> force a limit in code to never exceed 2^15 children in a tnode. I chose the
> first solution since it can be done with 0 memory cost on 64bit arches.
Hello,
Thanks for spotting this. No we don't want put limits on the (root) node size.
You see the comment in code is correct so unsigned short are some leftover from
old testing which could have hit us hard as the routing table slowly grows.
Cheers
--ro
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
> [FIB]: full_children & empty_children should be uint, not ushort
>
> If declared as unsigned short, these fields can overflow, and whole trie logic
> is broken. I could not make the machine crash, but some tnode can never
> be freed.
>
> Note for 64 bit arches : By reordering t_key and parent in [node, leaf, tnode]
> structures, we can use 32 bits hole after t_key so that sizeof(struct tnode)
> doesnt change after this patch.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index f26ba31..9696722 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -97,13 +97,13 @@ typedef unsigned int t_key;
> #define IS_LEAF(n) (n->parent & T_LEAF)
>
> struct node {
> - t_key key;
> unsigned long parent;
> + t_key key;
> };
>
> struct leaf {
> - t_key key;
> unsigned long parent;
> + t_key key;
> struct hlist_head list;
> struct rcu_head rcu;
> };
> @@ -116,12 +116,12 @@ struct leaf_info {
> };
>
> struct tnode {
> - t_key key;
> unsigned long parent;
> + t_key key;
> unsigned char pos; /* 2log(KEYLENGTH) bits needed */
> unsigned char bits; /* 2log(KEYLENGTH) bits needed */
> - unsigned short full_children; /* KEYLENGTH bits needed */
> - unsigned short empty_children; /* KEYLENGTH bits needed */
> + unsigned int full_children; /* KEYLENGTH bits needed */
> + unsigned int empty_children; /* KEYLENGTH bits needed */
> struct rcu_head rcu;
> struct node *child[0];
> };
> @@ -329,12 +329,12 @@ static inline void free_leaf_info(struct leaf_info *leaf)
> call_rcu(&leaf->rcu, __leaf_info_free_rcu);
> }
>
> -static struct tnode *tnode_alloc(unsigned int size)
> +static struct tnode *tnode_alloc(size_t size)
> {
> struct page *pages;
>
> if (size <= PAGE_SIZE)
> - return kcalloc(size, 1, GFP_KERNEL);
> + return kzalloc(size, GFP_KERNEL);
>
> pages = alloc_pages(GFP_KERNEL|__GFP_ZERO, get_order(size));
> if (!pages)
> @@ -346,8 +346,8 @@ static struct tnode *tnode_alloc(unsigned int size)
> static void __tnode_free_rcu(struct rcu_head *head)
> {
> struct tnode *tn = container_of(head, struct tnode, rcu);
> - unsigned int size = sizeof(struct tnode) +
> - (1 << tn->bits) * sizeof(struct node *);
> + size_t size = sizeof(struct tnode) +
> + (sizeof(struct node *) << tn->bits);
>
> if (size <= PAGE_SIZE)
> kfree(tn);
> @@ -386,8 +386,7 @@ static struct leaf_info *leaf_info_new(int plen)
>
> static struct tnode* tnode_new(t_key key, int pos, int bits)
> {
> - int nchildren = 1<<bits;
> - int sz = sizeof(struct tnode) + nchildren * sizeof(struct node *);
> + size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
> struct tnode *tn = tnode_alloc(sz);
>
> if (tn) {
> @@ -399,8 +398,8 @@ static struct tnode* tnode_new(t_key key, int pos, int bits)
> tn->empty_children = 1<<bits;
> }
>
> - pr_debug("AT %p s=%u %u\n", tn, (unsigned int) sizeof(struct tnode),
> - (unsigned int) (sizeof(struct node) * 1<<bits));
> + pr_debug("AT %p s=%u %lu\n", tn, (unsigned int) sizeof(struct tnode),
> + (unsigned long) (sizeof(struct node) << bits));
> return tn;
> }
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [FIB]: full_children & empty_children should be uint, not ushort
2008-01-13 22:02 ` Robert Olsson
@ 2008-01-14 6:32 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-14 6:32 UTC (permalink / raw)
To: Robert.Olsson; +Cc: dada1, stephen.hemminger, robert.olsson, netdev
From: Robert Olsson <Robert.Olsson@data.slu.se>
Date: Sun, 13 Jan 2008 23:02:11 +0100
>
> Eric Dumazet writes:
> > Eric Dumazet a écrit :
> > > 4) full_children & empty_children being 'unsigned short',
> > > we probably are limited to 2^15 elements, but I could not
> > > find this limit enforced somewhere.
>
> > Two fixes are possible : Enlarge full_children & empty_children to 32bits, or
> > force a limit in code to never exceed 2^15 children in a tnode. I chose the
> > first solution since it can be done with 0 memory cost on 64bit arches.
...
> Thanks for spotting this. No we don't want put limits on the (root) node size.
> You see the comment in code is correct so unsigned short are some leftover from
> old testing which could have hit us hard as the routing table slowly grows.
...
> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Applied, thanks everyone.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 9/9] fix sparse warnings
2008-01-12 21:08 ` Stephen Hemminger
@ 2008-01-14 11:07 ` Robert Olsson
2008-01-14 17:34 ` Eric Dumazet
0 siblings, 1 reply; 52+ messages in thread
From: Robert Olsson @ 2008-01-14 11:07 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Eric Dumazet, David Miller, Robert Olsson, netdev
Thanks for hacking and improving and the trie... another idea that could
be also tested. If we look into routing table we see that most leafs
only has one prefix
Main:
Aver depth: 2.57
Max depth: 7
Leaves: 231173
ip route | wc -l
241649
Thats 231173/241649 = 96% with the current Internet routing.
How about if would have a fastpath and store one entry direct in the
leaf struct this to avoid loading the leaf_info list in most cases?
One could believe that both lookup and dump could improve.
Cheers.
--ro
Stephen Hemminger writes:
> Remember that the code should be optimized for lookup, not management
> operations. We ran into this during testing (the test suite was looking
> for number of routes), thats why I put in the size field.
>
> The existing dump code is really slow:
>
> 1) FIB_TRIE Under KVM:
> load 164393 routes 12.436 sec
> ip route | wc -l 12.569 sec
> grep /proc/net/route 25.357 sec
>
> 99% of the cpu time is spent in nextleaf() during these dump operations.
>
> 2) FIB_HASH Under KVM:
> load 164393 routes 10.833 sec
> ip route | wc -l 1.981 sec
> grep /proc/net/route 0.204 sec
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 2/9] get rid of unused revision element
2008-01-13 4:50 ` [PATCH 2/9] get rid of unused revision element David Miller
@ 2008-01-14 11:44 ` Robert Olsson
2008-01-14 12:06 ` David Miller
0 siblings, 1 reply; 52+ messages in thread
From: Robert Olsson @ 2008-01-14 11:44 UTC (permalink / raw)
To: David Miller; +Cc: shemminger, robert.olsson, netdev, stephen.hemminger
David Miller writes:
> > The revision element must of been part of an earlier design,
> > because currently it is set but never used.
> >
> > Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
>
> I suspect Robert wanted to play around with some generation
> ID optimizations but never got around to it.
The idea was to have a selective flush of route cache entries when
a fib insert/delete happened. From what I remember you added another/
better solution. Just a list with route cache entries pointing to parent
route. So yes this was obsoleted by your/our effort to avoid total
flushing of the route cache. Unfinished work.
According to http://bgpupdates.potaroo.net/instability/bgpupd.html
(last in page) we currently flush the route cache 2.80 times per second.
when using full Internet routing with Linux. Maybe we're forced to pick
up this thread again someday.
Cheers.
--ro
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 2/9] get rid of unused revision element
2008-01-14 11:44 ` Robert Olsson
@ 2008-01-14 12:06 ` David Miller
2008-01-14 16:35 ` Stephen Hemminger
0 siblings, 1 reply; 52+ messages in thread
From: David Miller @ 2008-01-14 12:06 UTC (permalink / raw)
To: Robert.Olsson; +Cc: shemminger, robert.olsson, netdev, stephen.hemminger
From: Robert Olsson <Robert.Olsson@data.slu.se>
Date: Mon, 14 Jan 2008 12:44:32 +0100
> The idea was to have a selective flush of route cache entries when
> a fib insert/delete happened. From what I remember you added another/
> better solution. Just a list with route cache entries pointing to parent
> route. So yes this was obsoleted by your/our effort to avoid total
> flushing of the route cache. Unfinished work.
Yes, that's right. The synchronization was very hard.
But there is another issue, see below....
> According to http://bgpupdates.potaroo.net/instability/bgpupd.html
> (last in page) we currently flush the route cache 2.80 times per second.
> when using full Internet routing with Linux. Maybe we're forced to pick
> up this thread again someday.
This proves we need to solve this problem.
The reason I've never gone back to that work is that I didn't
want to do it while we still had multiple FIB data structure
implementations.
Someone needs to go over whatever deficiencies exist in fib_trie
vs. fib_hash so that we can delete fib_hash and move over to using
fib_trie always. It makes no sense to implement everything
interfacing into that code twice.
There was a full consensus that this was the way to move forward,
we just need the dirty work to be done.
If someone wants to show their gratitude for my getting rid of
the multipath cached routing code, the above work would be a
great way to do so (hint hint) :-)
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 2/9] get rid of unused revision element
2008-01-14 12:06 ` David Miller
@ 2008-01-14 16:35 ` Stephen Hemminger
2008-01-15 7:07 ` David Miller
0 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-14 16:35 UTC (permalink / raw)
To: David Miller; +Cc: Robert.Olsson, robert.olsson, netdev, stephen.hemminger
On Mon, 14 Jan 2008 04:06:57 -0800 (PST)
David Miller <davem@davemloft.net> wrote:
> From: Robert Olsson <Robert.Olsson@data.slu.se>
> Date: Mon, 14 Jan 2008 12:44:32 +0100
>
> > The idea was to have a selective flush of route cache entries when
> > a fib insert/delete happened. From what I remember you added another/
> > better solution. Just a list with route cache entries pointing to parent
> > route. So yes this was obsoleted by your/our effort to avoid total
> > flushing of the route cache. Unfinished work.
>
> Yes, that's right. The synchronization was very hard.
>
> But there is another issue, see below....
>
> > According to http://bgpupdates.potaroo.net/instability/bgpupd.html
> > (last in page) we currently flush the route cache 2.80 times per second.
> > when using full Internet routing with Linux. Maybe we're forced to pick
> > up this thread again someday.
>
> This proves we need to solve this problem.
>
> The reason I've never gone back to that work is that I didn't
> want to do it while we still had multiple FIB data structure
> implementations.
>
> Someone needs to go over whatever deficiencies exist in fib_trie
> vs. fib_hash so that we can delete fib_hash and move over to using
> fib_trie always. It makes no sense to implement everything
> interfacing into that code twice.
>
> There was a full consensus that this was the way to move forward,
> we just need the dirty work to be done.
>
> If someone wants to show their gratitude for my getting rid of
> the multipath cached routing code, the above work would be a
> great way to do so (hint hint) :-)
I will be glad to get this working. Is there any point in doing the a
small systems version as well?
--
Stephen Hemminger <stephen.hemminger@vyatta.com>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 9/9] fix sparse warnings
2008-01-14 11:07 ` Robert Olsson
@ 2008-01-14 17:34 ` Eric Dumazet
2008-01-14 17:59 ` Robert Olsson
0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2008-01-14 17:34 UTC (permalink / raw)
To: Robert Olsson; +Cc: Stephen Hemminger, David Miller, Robert Olsson, netdev
Robert Olsson a écrit :
> Thanks for hacking and improving and the trie... another idea that could
> be also tested. If we look into routing table we see that most leafs
> only has one prefix
>
> Main:
> Aver depth: 2.57
> Max depth: 7
> Leaves: 231173
>
> ip route | wc -l
> 241649
>
> Thats 231173/241649 = 96% with the current Internet routing.
>
> How about if would have a fastpath and store one entry direct in the
> leaf struct this to avoid loading the leaf_info list in most cases?
>
> One could believe that both lookup and dump could improve.
>
>
You mean to include one "leaf_info" inside leaf structure, so that we
can access it without cache line miss ?
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 9/9] fix sparse warnings
2008-01-14 17:34 ` Eric Dumazet
@ 2008-01-14 17:59 ` Robert Olsson
2008-01-14 19:27 ` [FIB]: Avoid using static variables without proper locking Eric Dumazet
0 siblings, 1 reply; 52+ messages in thread
From: Robert Olsson @ 2008-01-14 17:59 UTC (permalink / raw)
To: Eric Dumazet
Cc: Robert Olsson, Stephen Hemminger, David Miller, Robert Olsson,
netdev
Eric Dumazet writes:
> > Thats 231173/241649 = 96% with the current Internet routing.
> >
> > How about if would have a fastpath and store one entry direct in the
> > leaf struct this to avoid loading the leaf_info list in most cases?
> >
> > One could believe that both lookup and dump could improve.
> >
> You mean to include one "leaf_info" inside leaf structure, so that we
> can access it without cache line miss ?
Yes.
Cheers
--ro
^ permalink raw reply [flat|nested] 52+ messages in thread
* [FIB]: Avoid using static variables without proper locking
2008-01-14 17:59 ` Robert Olsson
@ 2008-01-14 19:27 ` Eric Dumazet
2008-01-15 7:10 ` David Miller
0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2008-01-14 19:27 UTC (permalink / raw)
To: David Miller; +Cc: Robert Olsson, netdev
[-- Attachment #1: Type: text/plain, Size: 287 bytes --]
fib_trie_seq_show() uses two helper functions, rtn_scope() and
rtn_type() that can
write to static storage without locking.
Just pass to them a temporary buffer to avoid potential corruption
(probably not triggerable but still...)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: fib_trie.patch --]
[-- Type: text/plain, Size: 1936 bytes --]
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 8d8c291..15a555a 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -2276,10 +2276,8 @@ static void seq_indent(struct seq_file *seq, int n)
while (n-- > 0) seq_puts(seq, " ");
}
-static inline const char *rtn_scope(enum rt_scope_t s)
+static inline const char *rtn_scope(char *buf, size_t len, enum rt_scope_t s)
{
- static char buf[32];
-
switch (s) {
case RT_SCOPE_UNIVERSE: return "universe";
case RT_SCOPE_SITE: return "site";
@@ -2287,7 +2285,7 @@ static inline const char *rtn_scope(enum rt_scope_t s)
case RT_SCOPE_HOST: return "host";
case RT_SCOPE_NOWHERE: return "nowhere";
default:
- snprintf(buf, sizeof(buf), "scope=%d", s);
+ snprintf(buf, len, "scope=%d", s);
return buf;
}
}
@@ -2307,13 +2305,11 @@ static const char *rtn_type_names[__RTN_MAX] = {
[RTN_XRESOLVE] = "XRESOLVE",
};
-static inline const char *rtn_type(unsigned t)
+static inline const char *rtn_type(char *buf, size_t len, unsigned t)
{
- static char buf[32];
-
if (t < __RTN_MAX && rtn_type_names[t])
return rtn_type_names[t];
- snprintf(buf, sizeof(buf), "type %d", t);
+ snprintf(buf, len, "type %d", t);
return buf;
}
@@ -2351,13 +2347,19 @@ static int fib_trie_seq_show(struct seq_file *seq, void *v)
seq_printf(seq, " |-- %d.%d.%d.%d\n", NIPQUAD(val));
for (i = 32; i >= 0; i--) {
struct leaf_info *li = find_leaf_info(l, i);
+
if (li) {
struct fib_alias *fa;
+
list_for_each_entry_rcu(fa, &li->falh, fa_list) {
+ char buf1[32], buf2[32];
+
seq_indent(seq, iter->depth+1);
seq_printf(seq, " /%d %s %s", i,
- rtn_scope(fa->fa_scope),
- rtn_type(fa->fa_type));
+ rtn_scope(buf1, sizeof(buf1),
+ fa->fa_scope),
+ rtn_type(buf2, sizeof(buf2),
+ fa->fa_type));
if (fa->fa_tos)
seq_printf(seq, "tos =%d\n",
fa->fa_tos);
^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH] [IPV4] fib_trie: size and statistics
2008-01-13 5:44 ` David Miller
@ 2008-01-14 20:57 ` Stephen Hemminger
[not found] ` <20080114164450.55f8c9b2@deepthought>
` (3 more replies)
0 siblings, 4 replies; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-14 20:57 UTC (permalink / raw)
To: David Miller; +Cc: robert.olsson, netdev, stephen.hemminger
Show number of entries in trie, the size field was being set but never used,
but it only counted leaves, not all entries. Refactor the two cases in
fib_triestat_seq_show into a single routine.
Note: the stat structure was being malloc'd but the stack usage isn't so
high (288 bytes) that it is worth the additional complexity.
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
---
Patch against current net-2.6.25
--- a/net/ipv4/fib_trie.c 2008-01-14 10:16:06.000000000 -0800
+++ b/net/ipv4/fib_trie.c 2008-01-14 10:30:11.000000000 -0800
@@ -148,10 +148,10 @@ struct trie_stat {
struct trie {
struct node *trie;
+ unsigned int size;
#ifdef CONFIG_IP_FIB_TRIE_STATS
struct trie_use_stats stats;
#endif
- int size;
};
static void put_child(struct trie *t, struct tnode *tn, int i, struct node *n);
@@ -1045,7 +1045,6 @@ static struct list_head *fib_insert_node
insert_leaf_info(&l->list, li);
goto done;
}
- t->size++;
l = leaf_new();
if (!l)
@@ -1258,6 +1257,8 @@ static int fn_trie_insert(struct fib_tab
list_add_tail_rcu(&new_fa->fa_list,
(fa ? &fa->fa_list : fa_head));
+ t->size++;
+
rt_cache_flush(-1);
rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen, tb->tb_id,
&cfg->fc_nlinfo, 0);
@@ -2128,50 +2129,34 @@ static void trie_show_usage(struct seq_f
}
#endif /* CONFIG_IP_FIB_TRIE_STATS */
+static void fib_trie_show(struct seq_file *seq, const char *name, struct trie *trie)
+{
+ struct trie_stat stat;
+
+ seq_printf(seq, "%s: %d\n", name, trie->size);
+ trie_collect_stats(trie, &stat);
+ trie_show_stats(seq, &stat);
+#ifdef CONFIG_IP_FIB_TRIE_STATS
+ trie_show_usage(seq, &trie->stats);
+#endif
+}
static int fib_triestat_seq_show(struct seq_file *seq, void *v)
{
struct net *net = (struct net *)seq->private;
- struct trie *trie_local, *trie_main;
- struct trie_stat *stat;
struct fib_table *tb;
- trie_local = NULL;
+ seq_printf(seq,
+ "Basic info: size of leaf: %Zd bytes, size of tnode: %Zd bytes.\n",
+ sizeof(struct leaf), sizeof(struct tnode));
+
tb = fib_get_table(net, RT_TABLE_LOCAL);
if (tb)
- trie_local = (struct trie *) tb->tb_data;
-
- trie_main = NULL;
+ fib_trie_show(seq, "Local", (struct trie *) tb->tb_data);
+
tb = fib_get_table(net, RT_TABLE_MAIN);
if (tb)
- trie_main = (struct trie *) tb->tb_data;
-
-
- stat = kmalloc(sizeof(*stat), GFP_KERNEL);
- if (!stat)
- return -ENOMEM;
-
- seq_printf(seq, "Basic info: size of leaf: %Zd bytes, size of tnode: %Zd bytes.\n",
- sizeof(struct leaf), sizeof(struct tnode));
-
- if (trie_local) {
- seq_printf(seq, "Local:\n");
- trie_collect_stats(trie_local, stat);
- trie_show_stats(seq, stat);
-#ifdef CONFIG_IP_FIB_TRIE_STATS
- trie_show_usage(seq, &trie_local->stats);
-#endif
- }
-
- if (trie_main) {
- seq_printf(seq, "Main:\n");
- trie_collect_stats(trie_main, stat);
- trie_show_stats(seq, stat);
-#ifdef CONFIG_IP_FIB_TRIE_STATS
- trie_show_usage(seq, &trie_main->stats);
-#endif
- }
- kfree(stat);
+ fib_trie_show(seq, "Main", (struct trie *) tb->tb_data);
return 0;
}
^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache
[not found] ` <20080114164450.55f8c9b2@deepthought>
@ 2008-01-15 0:46 ` Stephen Hemminger
2008-01-15 0:47 ` [PATCH 4/6] [IPV4] fib_trie style cleanup Stephen Hemminger
` (2 more replies)
0 siblings, 3 replies; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-15 0:46 UTC (permalink / raw)
To: David Miller; +Cc: robert.olsson, netdev
This improves locality for operations that touch all the leaves.
Later patch will grow the size of the leaf so it becomes more
important.
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
--- a/net/ipv4/fib_trie.c 2008-01-14 12:26:51.000000000 -0800
+++ b/net/ipv4/fib_trie.c 2008-01-14 13:41:00.000000000 -0800
@@ -162,6 +162,7 @@ static struct tnode *halve(struct trie *
static void tnode_free(struct tnode *tn);
static struct kmem_cache *fn_alias_kmem __read_mostly;
+static struct kmem_cache *trie_leaf_kmem __read_mostly;
static inline struct tnode *node_parent(struct node *node)
{
@@ -316,7 +317,8 @@ static inline void alias_free_mem_rcu(st
static void __leaf_free_rcu(struct rcu_head *head)
{
- kfree(container_of(head, struct leaf, rcu));
+ struct leaf *leaf = container_of(head, struct leaf, rcu);
+ kmem_cache_free(trie_leaf_kmem, leaf);
}
static void __leaf_info_free_rcu(struct rcu_head *head)
@@ -366,7 +368,7 @@ static inline void tnode_free(struct tno
static struct leaf *leaf_new(void)
{
- struct leaf *l = kmalloc(sizeof(struct leaf), GFP_KERNEL);
+ struct leaf *l = kmem_cache_alloc(trie_leaf_kmem, GFP_KERNEL);
if (l) {
l->parent = T_LEAF;
INIT_HLIST_HEAD(&l->list);
@@ -1927,6 +1929,9 @@ void __init fib_hash_init(void)
{
fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias),
0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+
+ trie_leaf_kmem = kmem_cache_create("ip_fib_trie", sizeof(struct leaf),
+ 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
}
^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH 4/6] [IPV4] fib_trie style cleanup
2008-01-15 0:46 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Stephen Hemminger
@ 2008-01-15 0:47 ` Stephen Hemminger
2008-01-15 2:58 ` [PATCH 5/6] [IPV4] fib_trie: checkleaf calling convention Stephen Hemminger
2008-01-15 6:49 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Eric Dumazet
2008-01-15 7:29 ` David Miller
2 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-15 0:47 UTC (permalink / raw)
To: David Miller; +Cc: robert.olsson, netdev
Get rid of #ifdef, and split out embedded function calls in if statements.
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
--- a/net/ipv4/fib_trie.c 2008-01-14 14:24:58.000000000 -0800
+++ b/net/ipv4/fib_trie.c 2008-01-14 14:26:05.000000000 -0800
@@ -1293,7 +1293,9 @@ static inline int check_leaf(struct trie
if (l->key != (key & ntohl(mask)))
continue;
- if ((err = fib_semantic_match(&li->falh, flp, res, htonl(l->key), mask, i)) <= 0) {
+ err = fib_semantic_match(&li->falh, flp, res,
+ htonl(l->key), mask, i);
+ if (err <= 0) {
*plen = i;
#ifdef CONFIG_IP_FIB_TRIE_STATS
t->stats.semantic_match_passed++;
@@ -1335,7 +1337,8 @@ fn_trie_lookup(struct fib_table *tb, con
/* Just a leaf? */
if (IS_LEAF(n)) {
- if ((ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res)) <= 0)
+ ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res);
+ if (ret <= 0)
goto found;
goto failed;
}
@@ -1360,14 +1363,13 @@ fn_trie_lookup(struct fib_table *tb, con
}
if (IS_LEAF(n)) {
- if ((ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res)) <= 0)
+ ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res);
+ if (ret <= 0)
goto found;
else
goto backtrace;
}
-#define HL_OPTIMIZE
-#ifdef HL_OPTIMIZE
cn = (struct tnode *)n;
/*
@@ -1455,7 +1457,7 @@ fn_trie_lookup(struct fib_table *tb, con
if (current_prefix_length >= cn->pos)
current_prefix_length = mp;
}
-#endif
+
pn = (struct tnode *)n; /* Descend */
chopped_off = 0;
continue;
^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH 5/6] [IPV4] fib_trie: checkleaf calling convention
2008-01-15 0:47 ` [PATCH 4/6] [IPV4] fib_trie style cleanup Stephen Hemminger
@ 2008-01-15 2:58 ` Stephen Hemminger
2008-01-15 5:07 ` [RFC 6/6] fib_trie: combine leaf and info Stephen Hemminger
0 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-15 2:58 UTC (permalink / raw)
To: David Miller; +Cc: robert.olsson, netdev
Another case where just returning a negative value gets rid of
call by reference. In this case the return value for checkleaf is
now:
-1 no match
0..32 prefix length
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
--- a/net/ipv4/fib_trie.c 2008-01-14 18:02:18.000000000 -0800
+++ b/net/ipv4/fib_trie.c 2008-01-14 18:37:51.000000000 -0800
@@ -1277,36 +1277,36 @@ err:
/* should be called with rcu_read_lock */
-static inline int check_leaf(struct trie *t, struct leaf *l,
- t_key key, int *plen, const struct flowi *flp,
- struct fib_result *res)
+static int check_leaf(struct trie *t, struct leaf *l,
+ t_key key, const struct flowi *flp,
+ struct fib_result *res)
{
- int err, i;
- __be32 mask;
struct leaf_info *li;
struct hlist_head *hhead = &l->list;
struct hlist_node *node;
hlist_for_each_entry_rcu(li, node, hhead, hlist) {
- i = li->plen;
- mask = inet_make_mask(i);
+ int err;
+ int plen = li->plen;
+ __be32 mask = inet_make_mask(plen);
+
if (l->key != (key & ntohl(mask)))
continue;
err = fib_semantic_match(&li->falh, flp, res,
- htonl(l->key), mask, i);
- if (err <= 0) {
- *plen = i;
+ htonl(l->key), mask, plen);
+
#ifdef CONFIG_IP_FIB_TRIE_STATS
+ if (err <= 0)
t->stats.semantic_match_passed++;
+ else
+ t->stats.semantic_match_miss++;
#endif
- return err;
- }
-#ifdef CONFIG_IP_FIB_TRIE_STATS
- t->stats.semantic_match_miss++;
-#endif
+ if (err <= 0)
+ return plen;
}
- return 1;
+
+ return -1;
}
static int
@@ -1337,11 +1337,13 @@ fn_trie_lookup(struct fib_table *tb, con
/* Just a leaf? */
if (IS_LEAF(n)) {
- ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res);
- if (ret <= 0)
- goto found;
- goto failed;
+ plen = check_leaf(t, (struct leaf *)n, key, flp, res);
+ if (plen < 0)
+ goto failed;
+ ret = 0;
+ goto found;
}
+
pn = (struct tnode *) n;
chopped_off = 0;
@@ -1363,11 +1365,12 @@ fn_trie_lookup(struct fib_table *tb, con
}
if (IS_LEAF(n)) {
- ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res);
- if (ret <= 0)
- goto found;
- else
+ plen = check_leaf(t, (struct leaf *)n, key, flp, res);
+ if (plen < 0)
goto backtrace;
+
+ ret = 0;
+ goto found;
}
cn = (struct tnode *)n;
^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH 2/6] [IPV4] fib hash|trie initialization
2008-01-14 20:57 ` [PATCH] [IPV4] fib_trie: size and statistics Stephen Hemminger
[not found] ` <20080114164450.55f8c9b2@deepthought>
@ 2008-01-15 5:00 ` Stephen Hemminger
2008-01-15 7:14 ` David Miller
2008-01-15 6:55 ` [PATCH] [IPV4] fib_trie: size and statistics Eric Dumazet
2008-01-15 7:12 ` David Miller
3 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-15 5:00 UTC (permalink / raw)
Cc: David Miller, robert.olsson, netdev
Initialization of the slab cache's should be done when IP is
initialized to make sure of available memory, and that code
can be marked __init.
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
---
Applies after statistics (#1) patch for net-2.6.25
include/net/ip_fib.h | 5 +++--
net/ipv4/fib_frontend.c | 9 ++++++---
net/ipv4/fib_hash.c | 24 +++++++++++-------------
net/ipv4/fib_trie.c | 16 ++++++++--------
4 files changed, 28 insertions(+), 26 deletions(-)
--- a/include/net/ip_fib.h 2008-01-14 14:59:32.000000000 -0800
+++ b/include/net/ip_fib.h 2008-01-14 15:01:12.000000000 -0800
@@ -231,8 +231,9 @@ extern int fib_sync_down(__be32 local, s
extern int fib_sync_up(struct net_device *dev);
extern __be32 __fib_res_prefsrc(struct fib_result *res);
-/* Exported by fib_hash.c */
-extern struct fib_table *fib_hash_init(u32 id);
+/* Exported by fib_{hash|trie}.c */
+extern void fib_hash_init(void);
+extern struct fib_table *fib_hash_table(u32 id);
static inline void fib_combine_itag(u32 *itag, struct fib_result *res)
{
--- a/net/ipv4/fib_frontend.c 2008-01-14 14:59:32.000000000 -0800
+++ b/net/ipv4/fib_frontend.c 2008-01-14 15:01:12.000000000 -0800
@@ -53,11 +53,11 @@ static int __net_init fib4_rules_init(st
{
struct fib_table *local_table, *main_table;
- local_table = fib_hash_init(RT_TABLE_LOCAL);
+ local_table = fib_hash_table(RT_TABLE_LOCAL);
if (local_table == NULL)
return -ENOMEM;
- main_table = fib_hash_init(RT_TABLE_MAIN);
+ main_table = fib_hash_table(RT_TABLE_MAIN);
if (main_table == NULL)
goto fail;
@@ -83,7 +83,8 @@ struct fib_table *fib_new_table(struct n
tb = fib_get_table(net, id);
if (tb)
return tb;
- tb = fib_hash_init(id);
+
+ tb = fib_hash_table(id);
if (!tb)
return NULL;
h = id & (FIB_TABLE_HASHSZ - 1);
@@ -1042,6 +1043,8 @@ void __init ip_fib_init(void)
register_pernet_subsys(&fib_net_ops);
register_netdevice_notifier(&fib_netdev_notifier);
register_inetaddr_notifier(&fib_inetaddr_notifier);
+
+ fib_hash_init();
}
EXPORT_SYMBOL(inet_addr_type);
--- a/net/ipv4/fib_hash.c 2008-01-14 14:59:32.000000000 -0800
+++ b/net/ipv4/fib_hash.c 2008-01-14 15:01:12.000000000 -0800
@@ -746,21 +746,19 @@ static int fn_hash_dump(struct fib_table
return skb->len;
}
-struct fib_table *fib_hash_init(u32 id)
+void __init fib_hash_init(void)
{
- struct fib_table *tb;
+ fn_hash_kmem = kmem_cache_create("ip_fib_hash", sizeof(struct fib_node),
+ 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+
+ fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias),
+ 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+
+}
- if (fn_hash_kmem == NULL)
- fn_hash_kmem = kmem_cache_create("ip_fib_hash",
- sizeof(struct fib_node),
- 0, SLAB_HWCACHE_ALIGN,
- NULL);
-
- if (fn_alias_kmem == NULL)
- fn_alias_kmem = kmem_cache_create("ip_fib_alias",
- sizeof(struct fib_alias),
- 0, SLAB_HWCACHE_ALIGN,
- NULL);
+struct fib_table *fib_hash_table(u32 id)
+{
+ struct fib_table *tb;
tb = kmalloc(sizeof(struct fib_table) + sizeof(struct fn_hash),
GFP_KERNEL);
--- a/net/ipv4/fib_trie.c 2008-01-14 15:01:12.000000000 -0800
+++ b/net/ipv4/fib_trie.c 2008-01-14 20:57:41.000000000 -0800
@@ -1923,19 +1923,19 @@ out:
return -1;
}
-/* Fix more generic FIB names for init later */
+void __init fib_hash_init(void)
+{
+ fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias),
+ 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+}
-struct fib_table *fib_hash_init(u32 id)
+
+/* Fix more generic FIB names for init later */
+struct fib_table *fib_hash_table(u32 id)
{
struct fib_table *tb;
struct trie *t;
- if (fn_alias_kmem == NULL)
- fn_alias_kmem = kmem_cache_create("ip_fib_alias",
- sizeof(struct fib_alias),
- 0, SLAB_HWCACHE_ALIGN,
- NULL);
-
tb = kmalloc(sizeof(struct fib_table) + sizeof(struct trie),
GFP_KERNEL);
if (tb == NULL)
^ permalink raw reply [flat|nested] 52+ messages in thread
* [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 2:58 ` [PATCH 5/6] [IPV4] fib_trie: checkleaf calling convention Stephen Hemminger
@ 2008-01-15 5:07 ` Stephen Hemminger
2008-01-15 6:12 ` Eric Dumazet
0 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-15 5:07 UTC (permalink / raw)
To: David Miller; +Cc: robert.olsson, netdev
Combine the prefix information and the leaf together into one
allocation. This is furthur simplified by converting the hlist
into a simple bitfield.
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
This patch is experimental (ie it boots and loads routes), but
is slower for the 163,000 route dump test.
---
net/ipv4/fib_trie.c | 165 ++++++++++++++++++++--------------------------------
1 file changed, 65 insertions(+), 100 deletions(-)
--- a/net/ipv4/fib_trie.c 2008-01-14 18:37:51.000000000 -0800
+++ b/net/ipv4/fib_trie.c 2008-01-14 20:57:12.000000000 -0800
@@ -50,10 +50,9 @@
* Patrick McHardy <kaber@trash.net>
*/
-#define VERSION "0.408"
+#define VERSION "0.409"
+
-#include <asm/uaccess.h>
-#include <asm/system.h>
#include <linux/bitops.h>
#include <linux/types.h>
#include <linux/kernel.h>
@@ -80,6 +79,8 @@
#include <net/tcp.h>
#include <net/sock.h>
#include <net/ip_fib.h>
+#include <asm/bitops.h>
+
#include "fib_lookup.h"
#define MAX_STAT_DEPTH 32
@@ -101,20 +102,24 @@ struct node {
t_key key;
};
+struct leaf_info {
+ struct list_head falh;
+};
+
+#define INFO_SIZE 33
+/* NB: bitmap is in reverse order, so that find_first returns largest prefix */
+#define PLEN2BIT(x) (INFO_SIZE - (x))
+
struct leaf {
unsigned long parent;
t_key key;
- struct hlist_head list;
struct rcu_head rcu;
-};
-struct leaf_info {
- struct hlist_node hlist;
- struct rcu_head rcu;
- int plen;
- struct list_head falh;
+ unsigned long bitmap[INFO_SIZE / BITS_PER_LONG + 1];
+ struct leaf_info info[INFO_SIZE];
};
+
struct tnode {
unsigned long parent;
t_key key;
@@ -321,16 +326,6 @@ static void __leaf_free_rcu(struct rcu_h
kmem_cache_free(trie_leaf_kmem, leaf);
}
-static void __leaf_info_free_rcu(struct rcu_head *head)
-{
- kfree(container_of(head, struct leaf_info, rcu));
-}
-
-static inline void free_leaf_info(struct leaf_info *leaf)
-{
- call_rcu(&leaf->rcu, __leaf_info_free_rcu);
-}
-
static struct tnode *tnode_alloc(size_t size)
{
struct page *pages;
@@ -371,21 +366,37 @@ static struct leaf *leaf_new(void)
struct leaf *l = kmem_cache_alloc(trie_leaf_kmem, GFP_KERNEL);
if (l) {
l->parent = T_LEAF;
- INIT_HLIST_HEAD(&l->list);
+ bitmap_zero(l->bitmap, INFO_SIZE);
}
return l;
}
-static struct leaf_info *leaf_info_new(int plen)
+static inline struct leaf_info *find_leaf_info(struct leaf *l, int plen)
{
- struct leaf_info *li = kmalloc(sizeof(struct leaf_info), GFP_KERNEL);
- if (li) {
- li->plen = plen;
- INIT_LIST_HEAD(&li->falh);
- }
+ return test_bit(PLEN2BIT(plen), l->bitmap) ? &l->info[plen] : NULL;
+}
+
+static struct leaf_info *set_leaf_info(struct leaf *l, int plen)
+{
+ struct leaf_info *li = &l->info[plen];
+
+ INIT_LIST_HEAD(&li->falh);
+ set_bit(PLEN2BIT(plen), l->bitmap);
+
return li;
}
+static inline void clear_leaf_info(struct leaf *l, int plen)
+{
+ smp_mb__before_clear_bit();
+ clear_bit(PLEN2BIT(plen), &l->bitmap);
+}
+
+static inline int leaf_info_empty(const struct leaf *l)
+{
+ return find_first_bit(l->bitmap, INFO_SIZE) >= INFO_SIZE;
+}
+
static struct tnode* tnode_new(t_key key, int pos, int bits)
{
size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
@@ -876,20 +887,6 @@ nomem:
/* readside must use rcu_read_lock currently dump routines
via get_fa_head and dump */
-
-static struct leaf_info *find_leaf_info(struct leaf *l, int plen)
-{
- struct hlist_head *head = &l->list;
- struct hlist_node *node;
- struct leaf_info *li;
-
- hlist_for_each_entry_rcu(li, node, head, hlist)
- if (li->plen == plen)
- return li;
-
- return NULL;
-}
-
static inline struct list_head * get_fa_head(struct leaf *l, int plen)
{
struct leaf_info *li = find_leaf_info(l, plen);
@@ -900,26 +897,6 @@ static inline struct list_head * get_fa_
return &li->falh;
}
-static void insert_leaf_info(struct hlist_head *head, struct leaf_info *new)
-{
- struct leaf_info *li = NULL, *last = NULL;
- struct hlist_node *node;
-
- if (hlist_empty(head)) {
- hlist_add_head_rcu(&new->hlist, head);
- } else {
- hlist_for_each_entry(li, node, head, hlist) {
- if (new->plen > li->plen)
- break;
-
- last = li;
- }
- if (last)
- hlist_add_after_rcu(&last->hlist, &new->hlist);
- else
- hlist_add_before_rcu(&new->hlist, &li->hlist);
- }
-}
/* rcu_read_lock needs to be hold by caller from readside */
@@ -993,6 +970,8 @@ static struct list_head *fib_insert_node
pos = 0;
n = t->trie;
+ pr_debug("fib_insert_node: %x/%d\n", key, plen);
+
/* If we point to NULL, stop. Either the tree is empty and we should
* just put a new leaf in if, or we have reached an empty child slot,
* and we should just put our new leaf in that.
@@ -1038,13 +1017,9 @@ static struct list_head *fib_insert_node
if (n != NULL && IS_LEAF(n) && tkey_equals(key, n->key)) {
l = (struct leaf *) n;
- li = leaf_info_new(plen);
-
- if (!li)
- return NULL;
-
+ li = set_leaf_info(l, plen);
fa_head = &li->falh;
- insert_leaf_info(&l->list, li);
+
goto done;
}
l = leaf_new();
@@ -1053,15 +1028,8 @@ static struct list_head *fib_insert_node
return NULL;
l->key = key;
- li = leaf_info_new(plen);
-
- if (!li) {
- tnode_free((struct tnode *) l);
- return NULL;
- }
-
+ li = set_leaf_info(l, plen);
fa_head = &li->falh;
- insert_leaf_info(&l->list, li);
if (t->trie && n == NULL) {
/* Case 2: n is NULL, and will just insert a new leaf */
@@ -1091,7 +1059,6 @@ static struct list_head *fib_insert_node
}
if (!tn) {
- free_leaf_info(li);
tnode_free((struct tnode *) l);
return NULL;
}
@@ -1119,7 +1086,7 @@ static struct list_head *fib_insert_node
rcu_assign_pointer(t->trie, trie_rebalance(t, tp));
done:
- return fa_head;
+ return &li->falh;
}
/*
@@ -1281,14 +1248,14 @@ static int check_leaf(struct trie *t, st
t_key key, const struct flowi *flp,
struct fib_result *res)
{
- struct leaf_info *li;
- struct hlist_head *hhead = &l->list;
- struct hlist_node *node;
+ int bit;
- hlist_for_each_entry_rcu(li, node, hhead, hlist) {
- int err;
- int plen = li->plen;
+ /* need find highest prefix */
+ for_each_bit(bit, l->bitmap, INFO_SIZE) {
+ int plen = PLEN2BIT(bit);
+ struct leaf_info *li = &l->info[plen];
__be32 mask = inet_make_mask(plen);
+ int err;
if (l->key != (key & ntohl(mask)))
continue;
@@ -1622,12 +1589,10 @@ static int fn_trie_delete(struct fib_tab
list_del_rcu(&fa->fa_list);
- if (list_empty(fa_head)) {
- hlist_del_rcu(&li->hlist);
- free_leaf_info(li);
- }
+ if (list_empty(fa_head))
+ clear_leaf_info(l, plen);
- if (hlist_empty(&l->list))
+ if (leaf_info_empty(l))
trie_leaf_remove(t, key);
if (fa->fa_state & FA_S_ACCESSED)
@@ -1659,18 +1624,17 @@ static int trie_flush_list(struct trie *
static int trie_flush_leaf(struct trie *t, struct leaf *l)
{
int found = 0;
- struct hlist_head *lih = &l->list;
- struct hlist_node *node, *tmp;
- struct leaf_info *li = NULL;
+ int bit;
- hlist_for_each_entry_safe(li, node, tmp, lih, hlist) {
+ for_each_bit(bit, l->bitmap, INFO_SIZE) {
+ int plen = PLEN2BIT(bit);
+ struct leaf_info *li = &l->info[plen];
found += trie_flush_list(t, &li->falh);
- if (list_empty(&li->falh)) {
- hlist_del_rcu(&li->hlist);
- free_leaf_info(li);
- }
+ if (list_empty(&li->falh))
+ clear_leaf_info(l, plen);
}
+
return found;
}
@@ -1746,12 +1710,12 @@ static int fn_trie_flush(struct fib_tabl
for (h = 0; (l = nextleaf(t, l)) != NULL; h++) {
found += trie_flush_leaf(t, l);
- if (ll && hlist_empty(&ll->list))
+ if (ll && leaf_info_empty(ll))
trie_leaf_remove(t, ll->key);
ll = l;
}
- if (ll && hlist_empty(&ll->list))
+ if (ll && leaf_info_empty(ll))
trie_leaf_remove(t, ll->key);
pr_debug("trie_flush found=%d\n", found);
@@ -2428,10 +2392,11 @@ static int fib_route_seq_show(struct seq
if (iter->trie == iter->trie_local)
return 0;
+
if (IS_TNODE(l))
return 0;
- for (i=32; i>=0; i--) {
+ for (i = 32; i >= 0; i--) {
struct leaf_info *li = find_leaf_info(l, i);
struct fib_alias *fa;
__be32 mask, prefix;
@@ -2439,7 +2404,7 @@ static int fib_route_seq_show(struct seq
if (!li)
continue;
- mask = inet_make_mask(li->plen);
+ mask = inet_make_mask(i);
prefix = htonl(l->key);
list_for_each_entry_rcu(fa, &li->falh, fa_list) {
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 5:07 ` [RFC 6/6] fib_trie: combine leaf and info Stephen Hemminger
@ 2008-01-15 6:12 ` Eric Dumazet
2008-01-15 6:16 ` Eric Dumazet
0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2008-01-15 6:12 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, robert.olsson, netdev
[-- Attachment #1: Type: text/plain, Size: 9629 bytes --]
Stephen Hemminger a écrit :
> Combine the prefix information and the leaf together into one
> allocation. This is furthur simplified by converting the hlist
> into a simple bitfield.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
>
> This patch is experimental (ie it boots and loads routes), but
> is slower for the 163,000 route dump test.
>
Its also very memory expensive. That was not Robert suggestion I believe.
Its suggestion is to embedd one info into each leaves.
Please find attached to this mail a preliminary and incomplete patch I wrote
this morning before coffee :), to get the idea.
Before patching this file, maybe we should first do a cleanup (checkpatch.pl
and obvious modern style formating :) )
> ---
> net/ipv4/fib_trie.c | 165 ++++++++++++++++++++--------------------------------
> 1 file changed, 65 insertions(+), 100 deletions(-)
>
> --- a/net/ipv4/fib_trie.c 2008-01-14 18:37:51.000000000 -0800
> +++ b/net/ipv4/fib_trie.c 2008-01-14 20:57:12.000000000 -0800
> @@ -50,10 +50,9 @@
> * Patrick McHardy <kaber@trash.net>
> */
>
> -#define VERSION "0.408"
> +#define VERSION "0.409"
> +
>
> -#include <asm/uaccess.h>
> -#include <asm/system.h>
> #include <linux/bitops.h>
> #include <linux/types.h>
> #include <linux/kernel.h>
> @@ -80,6 +79,8 @@
> #include <net/tcp.h>
> #include <net/sock.h>
> #include <net/ip_fib.h>
> +#include <asm/bitops.h>
> +
> #include "fib_lookup.h"
>
> #define MAX_STAT_DEPTH 32
> @@ -101,20 +102,24 @@ struct node {
> t_key key;
> };
>
> +struct leaf_info {
> + struct list_head falh;
> +};
> +
> +#define INFO_SIZE 33
> +/* NB: bitmap is in reverse order, so that find_first returns largest prefix */
> +#define PLEN2BIT(x) (INFO_SIZE - (x))
> +
> struct leaf {
> unsigned long parent;
> t_key key;
> - struct hlist_head list;
> struct rcu_head rcu;
> -};
>
> -struct leaf_info {
> - struct hlist_node hlist;
> - struct rcu_head rcu;
> - int plen;
> - struct list_head falh;
> + unsigned long bitmap[INFO_SIZE / BITS_PER_LONG + 1];
> + struct leaf_info info[INFO_SIZE];
> };
>
> +
> struct tnode {
> unsigned long parent;
> t_key key;
> @@ -321,16 +326,6 @@ static void __leaf_free_rcu(struct rcu_h
> kmem_cache_free(trie_leaf_kmem, leaf);
> }
>
> -static void __leaf_info_free_rcu(struct rcu_head *head)
> -{
> - kfree(container_of(head, struct leaf_info, rcu));
> -}
> -
> -static inline void free_leaf_info(struct leaf_info *leaf)
> -{
> - call_rcu(&leaf->rcu, __leaf_info_free_rcu);
> -}
> -
> static struct tnode *tnode_alloc(size_t size)
> {
> struct page *pages;
> @@ -371,21 +366,37 @@ static struct leaf *leaf_new(void)
> struct leaf *l = kmem_cache_alloc(trie_leaf_kmem, GFP_KERNEL);
> if (l) {
> l->parent = T_LEAF;
> - INIT_HLIST_HEAD(&l->list);
> + bitmap_zero(l->bitmap, INFO_SIZE);
> }
> return l;
> }
>
> -static struct leaf_info *leaf_info_new(int plen)
> +static inline struct leaf_info *find_leaf_info(struct leaf *l, int plen)
> {
> - struct leaf_info *li = kmalloc(sizeof(struct leaf_info), GFP_KERNEL);
> - if (li) {
> - li->plen = plen;
> - INIT_LIST_HEAD(&li->falh);
> - }
> + return test_bit(PLEN2BIT(plen), l->bitmap) ? &l->info[plen] : NULL;
> +}
> +
> +static struct leaf_info *set_leaf_info(struct leaf *l, int plen)
> +{
> + struct leaf_info *li = &l->info[plen];
> +
> + INIT_LIST_HEAD(&li->falh);
> + set_bit(PLEN2BIT(plen), l->bitmap);
> +
> return li;
> }
>
> +static inline void clear_leaf_info(struct leaf *l, int plen)
> +{
> + smp_mb__before_clear_bit();
> + clear_bit(PLEN2BIT(plen), &l->bitmap);
> +}
> +
> +static inline int leaf_info_empty(const struct leaf *l)
> +{
> + return find_first_bit(l->bitmap, INFO_SIZE) >= INFO_SIZE;
> +}
> +
> static struct tnode* tnode_new(t_key key, int pos, int bits)
> {
> size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
> @@ -876,20 +887,6 @@ nomem:
>
> /* readside must use rcu_read_lock currently dump routines
> via get_fa_head and dump */
> -
> -static struct leaf_info *find_leaf_info(struct leaf *l, int plen)
> -{
> - struct hlist_head *head = &l->list;
> - struct hlist_node *node;
> - struct leaf_info *li;
> -
> - hlist_for_each_entry_rcu(li, node, head, hlist)
> - if (li->plen == plen)
> - return li;
> -
> - return NULL;
> -}
> -
> static inline struct list_head * get_fa_head(struct leaf *l, int plen)
> {
> struct leaf_info *li = find_leaf_info(l, plen);
> @@ -900,26 +897,6 @@ static inline struct list_head * get_fa_
> return &li->falh;
> }
>
> -static void insert_leaf_info(struct hlist_head *head, struct leaf_info *new)
> -{
> - struct leaf_info *li = NULL, *last = NULL;
> - struct hlist_node *node;
> -
> - if (hlist_empty(head)) {
> - hlist_add_head_rcu(&new->hlist, head);
> - } else {
> - hlist_for_each_entry(li, node, head, hlist) {
> - if (new->plen > li->plen)
> - break;
> -
> - last = li;
> - }
> - if (last)
> - hlist_add_after_rcu(&last->hlist, &new->hlist);
> - else
> - hlist_add_before_rcu(&new->hlist, &li->hlist);
> - }
> -}
>
> /* rcu_read_lock needs to be hold by caller from readside */
>
> @@ -993,6 +970,8 @@ static struct list_head *fib_insert_node
> pos = 0;
> n = t->trie;
>
> + pr_debug("fib_insert_node: %x/%d\n", key, plen);
> +
> /* If we point to NULL, stop. Either the tree is empty and we should
> * just put a new leaf in if, or we have reached an empty child slot,
> * and we should just put our new leaf in that.
> @@ -1038,13 +1017,9 @@ static struct list_head *fib_insert_node
>
> if (n != NULL && IS_LEAF(n) && tkey_equals(key, n->key)) {
> l = (struct leaf *) n;
> - li = leaf_info_new(plen);
> -
> - if (!li)
> - return NULL;
> -
> + li = set_leaf_info(l, plen);
> fa_head = &li->falh;
> - insert_leaf_info(&l->list, li);
> +
> goto done;
> }
> l = leaf_new();
> @@ -1053,15 +1028,8 @@ static struct list_head *fib_insert_node
> return NULL;
>
> l->key = key;
> - li = leaf_info_new(plen);
> -
> - if (!li) {
> - tnode_free((struct tnode *) l);
> - return NULL;
> - }
> -
> + li = set_leaf_info(l, plen);
> fa_head = &li->falh;
> - insert_leaf_info(&l->list, li);
>
> if (t->trie && n == NULL) {
> /* Case 2: n is NULL, and will just insert a new leaf */
> @@ -1091,7 +1059,6 @@ static struct list_head *fib_insert_node
> }
>
> if (!tn) {
> - free_leaf_info(li);
> tnode_free((struct tnode *) l);
> return NULL;
> }
> @@ -1119,7 +1086,7 @@ static struct list_head *fib_insert_node
>
> rcu_assign_pointer(t->trie, trie_rebalance(t, tp));
> done:
> - return fa_head;
> + return &li->falh;
> }
>
> /*
> @@ -1281,14 +1248,14 @@ static int check_leaf(struct trie *t, st
> t_key key, const struct flowi *flp,
> struct fib_result *res)
> {
> - struct leaf_info *li;
> - struct hlist_head *hhead = &l->list;
> - struct hlist_node *node;
> + int bit;
>
> - hlist_for_each_entry_rcu(li, node, hhead, hlist) {
> - int err;
> - int plen = li->plen;
> + /* need find highest prefix */
> + for_each_bit(bit, l->bitmap, INFO_SIZE) {
> + int plen = PLEN2BIT(bit);
> + struct leaf_info *li = &l->info[plen];
> __be32 mask = inet_make_mask(plen);
> + int err;
>
> if (l->key != (key & ntohl(mask)))
> continue;
> @@ -1622,12 +1589,10 @@ static int fn_trie_delete(struct fib_tab
>
> list_del_rcu(&fa->fa_list);
>
> - if (list_empty(fa_head)) {
> - hlist_del_rcu(&li->hlist);
> - free_leaf_info(li);
> - }
> + if (list_empty(fa_head))
> + clear_leaf_info(l, plen);
>
> - if (hlist_empty(&l->list))
> + if (leaf_info_empty(l))
> trie_leaf_remove(t, key);
>
> if (fa->fa_state & FA_S_ACCESSED)
> @@ -1659,18 +1624,17 @@ static int trie_flush_list(struct trie *
> static int trie_flush_leaf(struct trie *t, struct leaf *l)
> {
> int found = 0;
> - struct hlist_head *lih = &l->list;
> - struct hlist_node *node, *tmp;
> - struct leaf_info *li = NULL;
> + int bit;
>
> - hlist_for_each_entry_safe(li, node, tmp, lih, hlist) {
> + for_each_bit(bit, l->bitmap, INFO_SIZE) {
> + int plen = PLEN2BIT(bit);
> + struct leaf_info *li = &l->info[plen];
> found += trie_flush_list(t, &li->falh);
>
> - if (list_empty(&li->falh)) {
> - hlist_del_rcu(&li->hlist);
> - free_leaf_info(li);
> - }
> + if (list_empty(&li->falh))
> + clear_leaf_info(l, plen);
> }
> +
> return found;
> }
>
> @@ -1746,12 +1710,12 @@ static int fn_trie_flush(struct fib_tabl
> for (h = 0; (l = nextleaf(t, l)) != NULL; h++) {
> found += trie_flush_leaf(t, l);
>
> - if (ll && hlist_empty(&ll->list))
> + if (ll && leaf_info_empty(ll))
> trie_leaf_remove(t, ll->key);
> ll = l;
> }
>
> - if (ll && hlist_empty(&ll->list))
> + if (ll && leaf_info_empty(ll))
> trie_leaf_remove(t, ll->key);
>
> pr_debug("trie_flush found=%d\n", found);
> @@ -2428,10 +2392,11 @@ static int fib_route_seq_show(struct seq
>
> if (iter->trie == iter->trie_local)
> return 0;
> +
> if (IS_TNODE(l))
> return 0;
>
> - for (i=32; i>=0; i--) {
> + for (i = 32; i >= 0; i--) {
> struct leaf_info *li = find_leaf_info(l, i);
> struct fib_alias *fa;
> __be32 mask, prefix;
> @@ -2439,7 +2404,7 @@ static int fib_route_seq_show(struct seq
> if (!li)
> continue;
>
> - mask = inet_make_mask(li->plen);
> + mask = inet_make_mask(i);
> prefix = htonl(l->key);
>
> list_for_each_entry_rcu(fa, &li->falh, fa_list) {
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
[-- Attachment #2: info_embedded.patch --]
[-- Type: text/plain, Size: 3670 bytes --]
--- net/ipv4/fib_trie.c
+++ net/ipv4/fib_trie.c.rcu
@@ -97,30 +97,22 @@
#define IS_LEAF(n) (n->parent & T_LEAF)
struct node {
- unsigned long parent;
- t_key key;
+ unsigned long parent;
+ t_key key;
};
struct leaf {
- unsigned long parent;
- t_key key;
- /*
- * Because we have at least one info per leaf, we embedd one here
- * to save some space and speedup lookups (sharing cache line)
- * Note : not inside a structure so that we dont have holes on 64bit
- */
- int plen_iinfo;
- struct list_head falh_iinfo;
-
- struct hlist_head list;
- struct rcu_head rcu;
+ unsigned long parent;
+ t_key key;
+ struct hlist_head list;
+ struct rcu_head rcu;
};
struct leaf_info {
- struct hlist_node hlist;
- int plen;
- struct list_head falh;
- struct rcu_head rcu;
+ struct hlist_node hlist;
+ struct rcu_head rcu;
+ int plen;
+ struct list_head falh;
};
struct tnode {
@@ -381,13 +373,11 @@
call_rcu(&tn->rcu, __tnode_free_rcu);
}
-static struct leaf *leaf_new(int plen)
+static struct leaf *leaf_new(void)
{
struct leaf *l = kmalloc(sizeof(struct leaf), GFP_KERNEL);
if (l) {
l->parent = T_LEAF;
- l->plen_iinfo = plen;
- INIT_LIST_HEAD(&l->falh_iinfo);
INIT_HLIST_HEAD(&l->list);
}
return l;
@@ -909,12 +899,7 @@
static inline struct list_head * get_fa_head(struct leaf *l, int plen)
{
- struct leaf_info *li;
-
- if (plen == l->plen_iinfo)
- return &l->falh_iinfo;
-
- li = find_leaf_info(l, plen);
+ struct leaf_info *li = find_leaf_info(l, plen);
if (!li)
return NULL;
@@ -1071,22 +1056,21 @@
goto done;
}
t->size++;
- l = leaf_new(plen);
+ l = leaf_new();
if (!l)
return NULL;
l->key = key;
-// li = leaf_info_new(plen);
+ li = leaf_info_new(plen);
+
+ if (!li) {
+ tnode_free((struct tnode *) l);
+ return NULL;
+ }
-// if (!li) {
-// tnode_free((struct tnode *) l);
-// return NULL;
-// }
-
-// fa_head = &li->falh;
-// insert_leaf_info(&l->list, li);
- fa_head = &l->falh_iinfo;
+ fa_head = &li->falh;
+ insert_leaf_info(&l->list, li);
if (t->trie && n == NULL) {
/* Case 2: n is NULL, and will just insert a new leaf */
@@ -1116,7 +1100,7 @@
}
if (!tn) {
-// free_leaf_info(li);
+ free_leaf_info(li);
tnode_free((struct tnode *) l);
return NULL;
}
@@ -1640,7 +1624,7 @@
li = find_leaf_info(l, plen);
list_del_rcu(&fa->fa_list);
-// FIXME
+
if (list_empty(fa_head)) {
hlist_del_rcu(&li->hlist);
free_leaf_info(li);
@@ -2382,11 +2366,10 @@
seq_indent(seq, iter->depth);
seq_printf(seq, " |-- %d.%d.%d.%d\n", NIPQUAD(val));
for (i = 32; i >= 0; i--) {
-// struct leaf_info *li = get_fa_head(l, i); //find_leaf_info(l, i);
- struct list_head *fa_head = get_fa_head(l, i);
- if (fa_head) {
+ struct leaf_info *li = find_leaf_info(l, i);
+ if (li) {
struct fib_alias *fa;
- list_for_each_entry_rcu(fa, fa_head, fa_list) {
+ list_for_each_entry_rcu(fa, &li->falh, fa_list) {
seq_indent(seq, iter->depth+1);
seq_printf(seq, " /%d %s %s", i,
rtn_scope(fa->fa_scope),
@@ -2465,18 +2448,17 @@
return 0;
for (i=32; i>=0; i--) {
- //struct leaf_info *li = find_leaf_info(l, i);
- struct list_head *fa_head = get_fa_head(l, i);
+ struct leaf_info *li = find_leaf_info(l, i);
struct fib_alias *fa;
__be32 mask, prefix;
- if (!fa_head)
+ if (!li)
continue;
- mask = inet_make_mask(i);
+ mask = inet_make_mask(li->plen);
prefix = htonl(l->key);
- list_for_each_entry_rcu(fa, fa_head, fa_list) {
+ list_for_each_entry_rcu(fa, &li->falh, fa_list) {
const struct fib_info *fi = fa->fa_info;
unsigned flags = fib_flag_trans(fa->fa_type, mask, fi);
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 6:12 ` Eric Dumazet
@ 2008-01-15 6:16 ` Eric Dumazet
2008-01-15 16:19 ` Stephen Hemminger
0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2008-01-15 6:16 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, robert.olsson, netdev
[-- Attachment #1: Type: text/plain, Size: 770 bytes --]
Eric Dumazet a écrit :
> Stephen Hemminger a écrit :
>> Combine the prefix information and the leaf together into one
>> allocation. This is furthur simplified by converting the hlist
>> into a simple bitfield.
>>
>> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
>>
>> This patch is experimental (ie it boots and loads routes), but
>> is slower for the 163,000 route dump test.
>>
>
> Its also very memory expensive. That was not Robert suggestion I believe.
>
> Its suggestion is to embedd one info into each leaves.
>
> Please find attached to this mail a preliminary and incomplete patch I
> wrote this morning before coffee :), to get the idea.
Oops, patch reversed, sorry, and against another work pending in my tree.
Now time for cofee :)
[-- Attachment #2: info_embedded.patch --]
[-- Type: text/plain, Size: 3670 bytes --]
--- net/ipv4/fib_trie.c.old
+++ net/ipv4/fib_trie.c
@@ -97,22 +97,30 @@
#define IS_LEAF(n) (n->parent & T_LEAF)
struct node {
- unsigned long parent;
- t_key key;
+ unsigned long parent;
+ t_key key;
};
struct leaf {
- unsigned long parent;
- t_key key;
- struct hlist_head list;
- struct rcu_head rcu;
+ unsigned long parent;
+ t_key key;
+ /*
+ * Because we have at least one info per leaf, we embedd one here
+ * to save some space and speedup lookups (sharing cache line)
+ * Note : not inside a structure so that we dont have holes on 64bit
+ */
+ int plen_iinfo;
+ struct list_head falh_iinfo;
+
+ struct hlist_head list;
+ struct rcu_head rcu;
};
struct leaf_info {
- struct hlist_node hlist;
- struct rcu_head rcu;
- int plen;
- struct list_head falh;
+ struct hlist_node hlist;
+ int plen;
+ struct list_head falh;
+ struct rcu_head rcu;
};
struct tnode {
@@ -373,11 +381,13 @@
call_rcu(&tn->rcu, __tnode_free_rcu);
}
-static struct leaf *leaf_new(void)
+static struct leaf *leaf_new(int plen)
{
struct leaf *l = kmalloc(sizeof(struct leaf), GFP_KERNEL);
if (l) {
l->parent = T_LEAF;
+ l->plen_iinfo = plen;
+ INIT_LIST_HEAD(&l->falh_iinfo);
INIT_HLIST_HEAD(&l->list);
}
return l;
@@ -899,7 +909,12 @@
static inline struct list_head * get_fa_head(struct leaf *l, int plen)
{
- struct leaf_info *li = find_leaf_info(l, plen);
+ struct leaf_info *li;
+
+ if (plen == l->plen_iinfo)
+ return &l->falh_iinfo;
+
+ li = find_leaf_info(l, plen);
if (!li)
return NULL;
@@ -1056,21 +1071,22 @@
goto done;
}
t->size++;
- l = leaf_new();
+ l = leaf_new(plen);
if (!l)
return NULL;
l->key = key;
- li = leaf_info_new(plen);
-
- if (!li) {
- tnode_free((struct tnode *) l);
- return NULL;
- }
+// li = leaf_info_new(plen);
- fa_head = &li->falh;
- insert_leaf_info(&l->list, li);
+// if (!li) {
+// tnode_free((struct tnode *) l);
+// return NULL;
+// }
+
+// fa_head = &li->falh;
+// insert_leaf_info(&l->list, li);
+ fa_head = &l->falh_iinfo;
if (t->trie && n == NULL) {
/* Case 2: n is NULL, and will just insert a new leaf */
@@ -1100,7 +1116,7 @@
}
if (!tn) {
- free_leaf_info(li);
+// free_leaf_info(li);
tnode_free((struct tnode *) l);
return NULL;
}
@@ -1624,7 +1640,7 @@
li = find_leaf_info(l, plen);
list_del_rcu(&fa->fa_list);
-
+// FIXME
if (list_empty(fa_head)) {
hlist_del_rcu(&li->hlist);
free_leaf_info(li);
@@ -2366,10 +2382,11 @@
seq_indent(seq, iter->depth);
seq_printf(seq, " |-- %d.%d.%d.%d\n", NIPQUAD(val));
for (i = 32; i >= 0; i--) {
- struct leaf_info *li = find_leaf_info(l, i);
- if (li) {
+// struct leaf_info *li = get_fa_head(l, i); //find_leaf_info(l, i);
+ struct list_head *fa_head = get_fa_head(l, i);
+ if (fa_head) {
struct fib_alias *fa;
- list_for_each_entry_rcu(fa, &li->falh, fa_list) {
+ list_for_each_entry_rcu(fa, fa_head, fa_list) {
seq_indent(seq, iter->depth+1);
seq_printf(seq, " /%d %s %s", i,
rtn_scope(fa->fa_scope),
@@ -2448,17 +2465,18 @@
return 0;
for (i=32; i>=0; i--) {
- struct leaf_info *li = find_leaf_info(l, i);
+ //struct leaf_info *li = find_leaf_info(l, i);
+ struct list_head *fa_head = get_fa_head(l, i);
struct fib_alias *fa;
__be32 mask, prefix;
- if (!li)
+ if (!fa_head)
continue;
- mask = inet_make_mask(li->plen);
+ mask = inet_make_mask(i);
prefix = htonl(l->key);
- list_for_each_entry_rcu(fa, &li->falh, fa_list) {
+ list_for_each_entry_rcu(fa, fa_head, fa_list) {
const struct fib_info *fi = fa->fa_info;
unsigned flags = fib_flag_trans(fa->fa_type, mask, fi);
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache
2008-01-15 0:46 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Stephen Hemminger
2008-01-15 0:47 ` [PATCH 4/6] [IPV4] fib_trie style cleanup Stephen Hemminger
@ 2008-01-15 6:49 ` Eric Dumazet
2008-01-15 7:29 ` David Miller
2 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2008-01-15 6:49 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, robert.olsson, netdev
Stephen Hemminger a écrit :
> This improves locality for operations that touch all the leaves.
> Later patch will grow the size of the leaf so it becomes more
> important.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
>
>
> --- a/net/ipv4/fib_trie.c 2008-01-14 12:26:51.000000000 -0800
> +++ b/net/ipv4/fib_trie.c 2008-01-14 13:41:00.000000000 -0800
> @@ -162,6 +162,7 @@ static struct tnode *halve(struct trie *
> static void tnode_free(struct tnode *tn);
>
> static struct kmem_cache *fn_alias_kmem __read_mostly;
> +static struct kmem_cache *trie_leaf_kmem __read_mostly;
>
> static inline struct tnode *node_parent(struct node *node)
> {
> @@ -316,7 +317,8 @@ static inline void alias_free_mem_rcu(st
>
> static void __leaf_free_rcu(struct rcu_head *head)
> {
> - kfree(container_of(head, struct leaf, rcu));
> + struct leaf *leaf = container_of(head, struct leaf, rcu);
> + kmem_cache_free(trie_leaf_kmem, leaf);
> }
>
> static void __leaf_info_free_rcu(struct rcu_head *head)
> @@ -366,7 +368,7 @@ static inline void tnode_free(struct tno
>
> static struct leaf *leaf_new(void)
> {
> - struct leaf *l = kmalloc(sizeof(struct leaf), GFP_KERNEL);
> + struct leaf *l = kmem_cache_alloc(trie_leaf_kmem, GFP_KERNEL);
> if (l) {
> l->parent = T_LEAF;
> INIT_HLIST_HEAD(&l->list);
> @@ -1927,6 +1929,9 @@ void __init fib_hash_init(void)
> {
> fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias),
> 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
> +
> + trie_leaf_kmem = kmem_cache_create("ip_fib_trie", sizeof(struct leaf),
> + 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
> }
>
Do we really need HWCACHE_ALIGN ? so many wasted space ...
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] [IPV4] fib_trie: size and statistics
2008-01-14 20:57 ` [PATCH] [IPV4] fib_trie: size and statistics Stephen Hemminger
[not found] ` <20080114164450.55f8c9b2@deepthought>
2008-01-15 5:00 ` [PATCH 2/6] [IPV4] fib hash|trie initialization Stephen Hemminger
@ 2008-01-15 6:55 ` Eric Dumazet
2008-01-15 7:28 ` David Miller
2008-01-15 7:12 ` David Miller
3 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2008-01-15 6:55 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, robert.olsson, netdev, stephen.hemminger
Stephen Hemminger a écrit :
> Show number of entries in trie, the size field was being set but never used,
> but it only counted leaves, not all entries. Refactor the two cases in
> fib_triestat_seq_show into a single routine.
>
> Note: the stat structure was being malloc'd but the stack usage isn't so
> high (288 bytes) that it is worth the additional complexity.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
> ---
> Patch against current net-2.6.25
>
> --- a/net/ipv4/fib_trie.c 2008-01-14 10:16:06.000000000 -0800
> +++ b/net/ipv4/fib_trie.c 2008-01-14 10:30:11.000000000 -0800
> @@ -148,10 +148,10 @@ struct trie_stat {
>
> struct trie {
> struct node *trie;
> + unsigned int size;
> #ifdef CONFIG_IP_FIB_TRIE_STATS
> struct trie_use_stats stats;
> #endif
> - int size;
> };
>
> static void put_child(struct trie *t, struct tnode *tn, int i, struct node *n);
> @@ -1045,7 +1045,6 @@ static struct list_head *fib_insert_node
> insert_leaf_info(&l->list, li);
> goto done;
> }
> - t->size++;
> l = leaf_new();
>
> if (!l)
> @@ -1258,6 +1257,8 @@ static int fn_trie_insert(struct fib_tab
> list_add_tail_rcu(&new_fa->fa_list,
> (fa ? &fa->fa_list : fa_head));
>
> + t->size++;
> +
> rt_cache_flush(-1);
> rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen, tb->tb_id,
> &cfg->fc_nlinfo, 0);
> @@ -2128,50 +2129,34 @@ static void trie_show_usage(struct seq_f
> }
> #endif /* CONFIG_IP_FIB_TRIE_STATS */
>
> +static void fib_trie_show(struct seq_file *seq, const char *name, struct trie *trie)
> +{
> + struct trie_stat stat;
> +
> + seq_printf(seq, "%s: %d\n", name, trie->size);
> + trie_collect_stats(trie, &stat);
> + trie_show_stats(seq, &stat);
> +#ifdef CONFIG_IP_FIB_TRIE_STATS
> + trie_show_usage(seq, &trie->stats);
> +#endif
> +}
>
> static int fib_triestat_seq_show(struct seq_file *seq, void *v)
> {
> struct net *net = (struct net *)seq->private;
> - struct trie *trie_local, *trie_main;
> - struct trie_stat *stat;
> struct fib_table *tb;
>
> - trie_local = NULL;
> + seq_printf(seq,
> + "Basic info: size of leaf: %Zd bytes, size of tnode: %Zd bytes.\n",
> + sizeof(struct leaf), sizeof(struct tnode));
> +
> tb = fib_get_table(net, RT_TABLE_LOCAL);
> if (tb)
> - trie_local = (struct trie *) tb->tb_data;
> -
> - trie_main = NULL;
> + fib_trie_show(seq, "Local", (struct trie *) tb->tb_data);
> +
> tb = fib_get_table(net, RT_TABLE_MAIN);
> if (tb)
> - trie_main = (struct trie *) tb->tb_data;
> -
> -
> - stat = kmalloc(sizeof(*stat), GFP_KERNEL);
> - if (!stat)
> - return -ENOMEM;
> -
> - seq_printf(seq, "Basic info: size of leaf: %Zd bytes, size of tnode: %Zd bytes.\n",
> - sizeof(struct leaf), sizeof(struct tnode));
> -
> - if (trie_local) {
> - seq_printf(seq, "Local:\n");
> - trie_collect_stats(trie_local, stat);
> - trie_show_stats(seq, stat);
> -#ifdef CONFIG_IP_FIB_TRIE_STATS
> - trie_show_usage(seq, &trie_local->stats);
> -#endif
> - }
> -
> - if (trie_main) {
> - seq_printf(seq, "Main:\n");
> - trie_collect_stats(trie_main, stat);
> - trie_show_stats(seq, stat);
> -#ifdef CONFIG_IP_FIB_TRIE_STATS
> - trie_show_usage(seq, &trie_main->stats);
> -#endif
> - }
> - kfree(stat);
> + fib_trie_show(seq, "Main", (struct trie *) tb->tb_data);
>
> return 0;
> }
Keeping a 'size' counter is not necessary, since trie_collect_stats() must go
through all the tree and get this information for free.
This 'size' field only slows down inserts/deletes and dirty a cacheline that
readers will hit.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 2/9] get rid of unused revision element
2008-01-14 16:35 ` Stephen Hemminger
@ 2008-01-15 7:07 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-15 7:07 UTC (permalink / raw)
To: shemminger; +Cc: Robert.Olsson, robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Mon, 14 Jan 2008 08:35:55 -0800
> I will be glad to get this working. Is there any point in doing the a
> small systems version as well?
Not at this time.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [FIB]: Avoid using static variables without proper locking
2008-01-14 19:27 ` [FIB]: Avoid using static variables without proper locking Eric Dumazet
@ 2008-01-15 7:10 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-15 7:10 UTC (permalink / raw)
To: dada1; +Cc: Robert.Olsson, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Mon, 14 Jan 2008 20:27:11 +0100
> fib_trie_seq_show() uses two helper functions, rtn_scope() and
> rtn_type() that can
> write to static storage without locking.
>
> Just pass to them a temporary buffer to avoid potential corruption
> (probably not triggerable but still...)
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Applied to net-2.6.25, but I had to tweak it to apply
cleanly since the %d in rtn_type() is now a %u
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] [IPV4] fib_trie: size and statistics
2008-01-14 20:57 ` [PATCH] [IPV4] fib_trie: size and statistics Stephen Hemminger
` (2 preceding siblings ...)
2008-01-15 6:55 ` [PATCH] [IPV4] fib_trie: size and statistics Eric Dumazet
@ 2008-01-15 7:12 ` David Miller
3 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-15 7:12 UTC (permalink / raw)
To: shemminger; +Cc: robert.olsson, netdev, stephen.hemminger
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Mon, 14 Jan 2008 12:57:55 -0800
> Show number of entries in trie, the size field was being set but never used,
> but it only counted leaves, not all entries. Refactor the two cases in
> fib_triestat_seq_show into a single routine.
>
> Note: the stat structure was being malloc'd but the stack usage isn't so
> high (288 bytes) that it is worth the additional complexity.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Applied, thanks.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 2/6] [IPV4] fib hash|trie initialization
2008-01-15 5:00 ` [PATCH 2/6] [IPV4] fib hash|trie initialization Stephen Hemminger
@ 2008-01-15 7:14 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-15 7:14 UTC (permalink / raw)
To: stephen.hemminger; +Cc: robert.olsson, netdev
From: Stephen Hemminger <stephen.hemminger@vyatta.com>
Date: Mon, 14 Jan 2008 21:00:08 -0800
> Initialization of the slab cache's should be done when IP is
> initialized to make sure of available memory, and that code
> can be marked __init.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Applied.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] [IPV4] fib_trie: size and statistics
2008-01-15 6:55 ` [PATCH] [IPV4] fib_trie: size and statistics Eric Dumazet
@ 2008-01-15 7:28 ` David Miller
0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-15 7:28 UTC (permalink / raw)
To: dada1; +Cc: shemminger, robert.olsson, netdev, stephen.hemminger
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 15 Jan 2008 07:55:57 +0100
> Keeping a 'size' counter is not necessary, since trie_collect_stats() must go
> through all the tree and get this information for free.
>
> This 'size' field only slows down inserts/deletes and dirty a cacheline that
> readers will hit.
Good point, we can do this better in a follow-on patch.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache
2008-01-15 0:46 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Stephen Hemminger
2008-01-15 0:47 ` [PATCH 4/6] [IPV4] fib_trie style cleanup Stephen Hemminger
2008-01-15 6:49 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Eric Dumazet
@ 2008-01-15 7:29 ` David Miller
2 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2008-01-15 7:29 UTC (permalink / raw)
To: stephen.hemminger; +Cc: robert.olsson, netdev
From: Stephen Hemminger <stephen.hemminger@vyatta.com>
Date: Mon, 14 Jan 2008 16:46:21 -0800
> This improves locality for operations that touch all the leaves.
> Later patch will grow the size of the leaf so it becomes more
> important.
>
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
I'll let 3, 4, 5, and 6 sit while you work out the issues
brought up by Eric Dumazet.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 6:16 ` Eric Dumazet
@ 2008-01-15 16:19 ` Stephen Hemminger
2008-01-15 16:44 ` Robert Olsson
0 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-15 16:19 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, robert.olsson, netdev
On Tue, 15 Jan 2008 07:16:30 +0100
Eric Dumazet <dada1@cosmosbay.com> wrote:
> Eric Dumazet a écrit :
> > Stephen Hemminger a écrit :
> >> Combine the prefix information and the leaf together into one
> >> allocation. This is furthur simplified by converting the hlist
> >> into a simple bitfield.
> >>
> >> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
> >>
> >> This patch is experimental (ie it boots and loads routes), but
> >> is slower for the 163,000 route dump test.
> >>
> >
> > Its also very memory expensive. That was not Robert suggestion I believe.
> >
> > Its suggestion is to embedd one info into each leaves.
> >
> > Please find attached to this mail a preliminary and incomplete patch I
> > wrote this morning before coffee :), to get the idea.
>
> Oops, patch reversed, sorry, and against another work pending in my tree.
>
> Now time for cofee :)
>
Okay, I would rather see the leaf_info explicit inside the leaf, also
your scheme probably breaks if I add two prefixes and then delete the first.
Let me have a go at it.
--
Stephen Hemminger <stephen.hemminger@vyatta.com>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 16:19 ` Stephen Hemminger
@ 2008-01-15 16:44 ` Robert Olsson
2008-01-15 17:25 ` Eric Dumazet
0 siblings, 1 reply; 52+ messages in thread
From: Robert Olsson @ 2008-01-15 16:44 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Eric Dumazet, David Miller, robert.olsson, netdev
Stephen Hemminger writes:
> Okay, I would rather see the leaf_info explicit inside the leaf, also
> your scheme probably breaks if I add two prefixes and then delete the first.
> Let me have a go at it.
I took Eric's patch a bit further...
Support for delete and dump is needed before any testing at all
and maybe some macro for checking and setting FP-leaf's
Cheers.
--ro
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 6dab753..f5b276c 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -97,22 +97,32 @@ typedef unsigned int t_key;
#define IS_LEAF(n) (n->parent & T_LEAF)
struct node {
- unsigned long parent;
- t_key key;
+ unsigned long parent;
+ t_key key;
};
struct leaf {
- unsigned long parent;
- t_key key;
- struct hlist_head list;
- struct rcu_head rcu;
+ unsigned long parent;
+ t_key key;
+ /*
+ * Because we often have only one info per leaf, we embedd one here
+ * to save some space and speedup lookups (sharing cache line)
+ * a sort of fastpatch leaf (FP-leaf) this indicated a negative of
+ * plen_iinfo.
+ * Note : not inside a structure so that we dont have holes on 64bit
+ */
+ int plen_iinfo;
+ struct list_head falh_iinfo;
+
+ struct hlist_head list;
+ struct rcu_head rcu;
};
struct leaf_info {
- struct hlist_node hlist;
- struct rcu_head rcu;
- int plen;
- struct list_head falh;
+ struct hlist_node hlist;
+ int plen;
+ struct list_head falh;
+ struct rcu_head rcu;
};
struct tnode {
@@ -364,11 +374,13 @@ static inline void tnode_free(struct tnode *tn)
call_rcu(&tn->rcu, __tnode_free_rcu);
}
-static struct leaf *leaf_new(void)
+static struct leaf *leaf_new(int plen)
{
struct leaf *l = kmalloc(sizeof(struct leaf), GFP_KERNEL);
if (l) {
l->parent = T_LEAF;
+ l->plen_iinfo = plen;
+ INIT_LIST_HEAD(&l->falh_iinfo);
INIT_HLIST_HEAD(&l->list);
}
return l;
@@ -890,7 +902,16 @@ static struct leaf_info *find_leaf_info(struct leaf *l, int plen)
static inline struct list_head * get_fa_head(struct leaf *l, int plen)
{
- struct leaf_info *li = find_leaf_info(l, plen);
+ struct leaf_info *li;
+
+ if (l->plen_iinfo >= 0) {
+ if(l->plen_iinfo == plen)
+ return &l->falh_iinfo;
+ else
+ return NULL;
+ }
+
+ li = find_leaf_info(l, plen);
if (!li)
return NULL;
@@ -1036,6 +1057,21 @@ static struct list_head *fib_insert_node(struct trie *t, u32 key, int plen)
if (n != NULL && IS_LEAF(n) && tkey_equals(key, n->key)) {
l = (struct leaf *) n;
+
+ /*
+ * Check if it is a FP-leaf if so we have to convert
+ * it to an ordinary leaf with leaf_infos
+ */
+
+ if( l->plen_iinfo >= 0 ) {
+ li = leaf_info_new(l->plen_iinfo);
+ if (!li)
+ return NULL;
+
+ insert_leaf_info(&l->list, li);
+ l->plen_iinfo = -1;
+ }
+
li = leaf_info_new(plen);
if (!li)
@@ -1045,21 +1081,16 @@ static struct list_head *fib_insert_node(struct trie *t, u32 key, int plen)
insert_leaf_info(&l->list, li);
goto done;
}
- l = leaf_new();
+
+ /* Insert a FP-leaf */
+ l = leaf_new(plen);
if (!l)
return NULL;
l->key = key;
- li = leaf_info_new(plen);
-
- if (!li) {
- tnode_free((struct tnode *) l);
- return NULL;
- }
+ fa_head = &l->falh_iinfo;
- fa_head = &li->falh;
- insert_leaf_info(&l->list, li);
if (t->trie && n == NULL) {
/* Case 2: n is NULL, and will just insert a new leaf */
@@ -1089,7 +1120,6 @@ static struct list_head *fib_insert_node(struct trie *t, u32 key, int plen)
}
if (!tn) {
- free_leaf_info(li);
tnode_free((struct tnode *) l);
return NULL;
}
@@ -1285,6 +1315,20 @@ static inline int check_leaf(struct trie *t, struct leaf *l,
struct hlist_head *hhead = &l->list;
struct hlist_node *node;
+ i = l->plen_iinfo;
+ if( i >= 0) { /* FP-leaf */
+ mask = inet_make_mask(i);
+ err = fib_semantic_match(&l->falh_iinfo, flp, res, htonl(l->key), mask, i);
+ if(err <= 0 ) {
+ *plen = i;
+#ifdef CONFIG_IP_FIB_TRIE_STATS
+ t->stats.semantic_match_passed++;
+#endif
+ return err;
+ }
+ }
+
+
hlist_for_each_entry_rcu(li, node, hhead, hlist) {
i = li->plen;
mask = inet_make_mask(i);
@@ -1614,7 +1658,7 @@ static int fn_trie_delete(struct fib_table *tb, struct fib_config *cfg)
li = find_leaf_info(l, plen);
list_del_rcu(&fa->fa_list);
-
+// FIXME
if (list_empty(fa_head)) {
hlist_del_rcu(&li->hlist);
free_leaf_info(li);
@@ -2336,12 +2380,11 @@ static int fib_trie_seq_show(struct seq_file *seq, void *v)
seq_indent(seq, iter->depth);
seq_printf(seq, " |-- %d.%d.%d.%d\n", NIPQUAD(val));
for (i = 32; i >= 0; i--) {
- struct leaf_info *li = find_leaf_info(l, i);
+ struct list_head *fa_head = get_fa_head(l, i);
+ if (fa_head) {
- if (li) {
struct fib_alias *fa;
-
- list_for_each_entry_rcu(fa, &li->falh, fa_list) {
+ list_for_each_entry_rcu(fa, fa_head, fa_list) {
char buf1[32], buf2[32];
seq_indent(seq, iter->depth+1);
@@ -2424,17 +2467,18 @@ static int fib_route_seq_show(struct seq_file *seq, void *v)
return 0;
for (i=32; i>=0; i--) {
- struct leaf_info *li = find_leaf_info(l, i);
+ //struct leaf_info *li = find_leaf_info(l, i);
+ struct list_head *fa_head = get_fa_head(l, i);
struct fib_alias *fa;
__be32 mask, prefix;
- if (!li)
+ if (!fa_head)
continue;
- mask = inet_make_mask(li->plen);
+ mask = inet_make_mask(i);
prefix = htonl(l->key);
- list_for_each_entry_rcu(fa, &li->falh, fa_list) {
+ list_for_each_entry_rcu(fa, fa_head, fa_list) {
const struct fib_info *fi = fa->fa_info;
unsigned flags = fib_flag_trans(fa->fa_type, mask, fi);
^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 16:44 ` Robert Olsson
@ 2008-01-15 17:25 ` Eric Dumazet
2008-01-15 17:47 ` Stephen Hemminger
2008-01-15 17:59 ` Robert Olsson
0 siblings, 2 replies; 52+ messages in thread
From: Eric Dumazet @ 2008-01-15 17:25 UTC (permalink / raw)
To: Robert Olsson; +Cc: Stephen Hemminger, David Miller, robert.olsson, netdev
On Tue, 15 Jan 2008 17:44:47 +0100
Robert Olsson <Robert.Olsson@data.slu.se> wrote:
>
> Stephen Hemminger writes:
>
> > Okay, I would rather see the leaf_info explicit inside the leaf, also
> > your scheme probably breaks if I add two prefixes and then delete the first.
> > Let me have a go at it.
>
> I took Eric's patch a bit further...
>
> Support for delete and dump is needed before any testing at all
> and maybe some macro for checking and setting FP-leaf's
>
> Cheers.
> --ro
>
>
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index 6dab753..f5b276c 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -97,22 +97,32 @@ typedef unsigned int t_key;
> #define IS_LEAF(n) (n->parent & T_LEAF)
>
> struct node {
> - unsigned long parent;
> - t_key key;
> + unsigned long parent;
> + t_key key;
> };
>
> struct leaf {
> - unsigned long parent;
> - t_key key;
> - struct hlist_head list;
> - struct rcu_head rcu;
> + unsigned long parent;
> + t_key key;
> + /*
> + * Because we often have only one info per leaf, we embedd one here
> + * to save some space and speedup lookups (sharing cache line)
> + * a sort of fastpatch leaf (FP-leaf) this indicated a negative of
> + * plen_iinfo.
> + * Note : not inside a structure so that we dont have holes on 64bit
> + */
> + int plen_iinfo;
> + struct list_head falh_iinfo;
yep, I renamed them to einfo_plen & einfo_falh
> +
> + struct hlist_head list;
> + struct rcu_head rcu;
> };
>
> struct leaf_info {
> - struct hlist_node hlist;
> - struct rcu_head rcu;
> - int plen;
> - struct list_head falh;
> + struct hlist_node hlist;
> + int plen;
> + struct list_head falh;
> + struct rcu_head rcu;
> };
>
> struct tnode {
> @@ -364,11 +374,13 @@ static inline void tnode_free(struct tnode *tn)
> call_rcu(&tn->rcu, __tnode_free_rcu);
> }
>
> -static struct leaf *leaf_new(void)
> +static struct leaf *leaf_new(int plen)
> {
> struct leaf *l = kmalloc(sizeof(struct leaf), GFP_KERNEL);
> if (l) {
> l->parent = T_LEAF;
> + l->plen_iinfo = plen;
> + INIT_LIST_HEAD(&l->falh_iinfo);
> INIT_HLIST_HEAD(&l->list);
> }
> return l;
> @@ -890,7 +902,16 @@ static struct leaf_info *find_leaf_info(struct leaf *l, int plen)
>
> static inline struct list_head * get_fa_head(struct leaf *l, int plen)
> {
> - struct leaf_info *li = find_leaf_info(l, plen);
> + struct leaf_info *li;
> +
> + if (l->plen_iinfo >= 0) {
> + if(l->plen_iinfo == plen)
> + return &l->falh_iinfo;
> + else
> + return NULL;
> + }
So you think that a leaf cannot have 2 infos, one 'embeded' and one in the list ?
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 17:25 ` Eric Dumazet
@ 2008-01-15 17:47 ` Stephen Hemminger
2008-01-15 18:10 ` Eric Dumazet
2008-01-15 20:18 ` Robert Olsson
2008-01-15 17:59 ` Robert Olsson
1 sibling, 2 replies; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-15 17:47 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Robert Olsson, David Miller, robert.olsson, netdev
This is how I did it:
--- a/net/ipv4/fib_trie.c 2008-01-15 09:14:53.000000000 -0800
+++ b/net/ipv4/fib_trie.c 2008-01-15 09:21:48.000000000 -0800
@@ -101,13 +101,6 @@ struct node {
t_key key;
};
-struct leaf {
- unsigned long parent;
- t_key key;
- struct hlist_head list;
- struct rcu_head rcu;
-};
-
struct leaf_info {
struct hlist_node hlist;
struct rcu_head rcu;
@@ -115,6 +108,13 @@ struct leaf_info {
struct list_head falh;
};
+struct leaf {
+ unsigned long parent;
+ t_key key;
+ struct hlist_head list;
+ struct rcu_head rcu;
+};
+
struct tnode {
unsigned long parent;
t_key key;
@@ -321,16 +321,6 @@ static void __leaf_free_rcu(struct rcu_h
kmem_cache_free(trie_leaf_kmem, leaf);
}
-static void __leaf_info_free_rcu(struct rcu_head *head)
-{
- kfree(container_of(head, struct leaf_info, rcu));
-}
-
-static inline void free_leaf_info(struct leaf_info *leaf)
-{
- call_rcu(&leaf->rcu, __leaf_info_free_rcu);
-}
-
static struct tnode *tnode_alloc(size_t size)
{
struct page *pages;
@@ -357,7 +347,7 @@ static void __tnode_free_rcu(struct rcu_
free_pages((unsigned long)tn, get_order(size));
}
-static inline void tnode_free(struct tnode *tn)
+static void tnode_free(struct tnode *tn)
{
if (IS_LEAF(tn)) {
struct leaf *l = (struct leaf *) tn;
@@ -376,16 +366,41 @@ static struct leaf *leaf_new(void)
return l;
}
+static void leaf_info_init(struct leaf_info *li, int plen)
+{
+ li->plen = plen;
+ INIT_LIST_HEAD(&li->falh);
+}
+
+static struct leaf_info *leaf_info_first(struct leaf *l, int plen)
+{
+ struct leaf_info *li = (struct leaf_info *) (l + 1);
+ leaf_info_init(li, plen);
+ return li;
+}
+
static struct leaf_info *leaf_info_new(int plen)
{
struct leaf_info *li = kmalloc(sizeof(struct leaf_info), GFP_KERNEL);
- if (li) {
- li->plen = plen;
- INIT_LIST_HEAD(&li->falh);
- }
+ if (li)
+ leaf_info_init(li, plen);
+
return li;
}
+static void __leaf_info_free_rcu(struct rcu_head *head)
+{
+ kfree(container_of(head, struct leaf_info, rcu));
+}
+
+static inline void free_leaf_info(struct leaf *l, struct leaf_info *leaf)
+{
+ if (leaf == (struct leaf_info *)(l + 1))
+ return;
+
+ call_rcu(&leaf->rcu, __leaf_info_free_rcu);
+}
+
static struct tnode* tnode_new(t_key key, int pos, int bits)
{
size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
@@ -1047,18 +1062,13 @@ static struct list_head *fib_insert_node
insert_leaf_info(&l->list, li);
goto done;
}
- l = leaf_new();
+ l = leaf_new();
if (!l)
return NULL;
l->key = key;
- li = leaf_info_new(plen);
-
- if (!li) {
- tnode_free((struct tnode *) l);
- return NULL;
- }
+ li = leaf_info_first(l, plen);
fa_head = &li->falh;
insert_leaf_info(&l->list, li);
@@ -1091,7 +1101,7 @@ static struct list_head *fib_insert_node
}
if (!tn) {
- free_leaf_info(li);
+ free_leaf_info(l, li);
tnode_free((struct tnode *) l);
return NULL;
}
@@ -1624,7 +1634,7 @@ static int fn_trie_delete(struct fib_tab
if (list_empty(fa_head)) {
hlist_del_rcu(&li->hlist);
- free_leaf_info(li);
+ free_leaf_info(l, li);
}
if (hlist_empty(&l->list))
@@ -1668,7 +1678,7 @@ static int trie_flush_leaf(struct trie *
if (list_empty(&li->falh)) {
hlist_del_rcu(&li->hlist);
- free_leaf_info(li);
+ free_leaf_info(l, li);
}
}
return found;
@@ -1935,7 +1945,8 @@ void __init fib_hash_init(void)
fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias),
0, SLAB_PANIC, NULL);
- trie_leaf_kmem = kmem_cache_create("ip_fib_trie", sizeof(struct leaf),
+ trie_leaf_kmem = kmem_cache_create("ip_fib_trie",
+ sizeof(struct leaf) + sizeof(struct leaf_info),
0, SLAB_PANIC, NULL);
}
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 17:25 ` Eric Dumazet
2008-01-15 17:47 ` Stephen Hemminger
@ 2008-01-15 17:59 ` Robert Olsson
1 sibling, 0 replies; 52+ messages in thread
From: Robert Olsson @ 2008-01-15 17:59 UTC (permalink / raw)
To: Eric Dumazet
Cc: Robert Olsson, Stephen Hemminger, David Miller, robert.olsson,
netdev
Eric Dumazet writes:
>
> So you think that a leaf cannot have 2 infos, one 'embeded' and one in the list ?
Hello,
The model I thought of is to have either:
1) One leaf_info embedded in leaf. A fast-path leaf. FP-leaf
Or
2) The intct old leaf_info list with arbitrary number leaf_infos
If plen_iinfo is >=0 It's a FP-leaf
Cheers.
--ro
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 17:47 ` Stephen Hemminger
@ 2008-01-15 18:10 ` Eric Dumazet
2008-01-15 18:15 ` Stephen Hemminger
2008-01-15 20:18 ` Robert Olsson
1 sibling, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2008-01-15 18:10 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Robert Olsson, David Miller, robert.olsson, netdev
On Tue, 15 Jan 2008 09:47:53 -0800
Stephen Hemminger <stephen.hemminger@vyatta.com> wrote:
> This is how I did it:
>
> --- a/net/ipv4/fib_trie.c 2008-01-15 09:14:53.000000000 -0800
> +++ b/net/ipv4/fib_trie.c 2008-01-15 09:21:48.000000000 -0800
> @@ -101,13 +101,6 @@ struct node {
> t_key key;
> };
>
> -struct leaf {
> - unsigned long parent;
> - t_key key;
> - struct hlist_head list;
> - struct rcu_head rcu;
> -};
> -
> struct leaf_info {
> struct hlist_node hlist;
> struct rcu_head rcu;
> @@ -115,6 +108,13 @@ struct leaf_info {
> struct list_head falh;
> };
>
> +struct leaf {
> + unsigned long parent;
> + t_key key;
> + struct hlist_head list;
> + struct rcu_head rcu;
> +};
I like this :)
Your design is clean, but we waste some space (rcu in leaf_info "included"), we probably can do a litle bit better
(moving rcu at the end of leaf_info, and kmem_cache_create("ip_fib_trie", sizeof(struct leaf) + sizeof(struct_leaf_info) - sizeof(struct rcu_head))
> - trie_leaf_kmem = kmem_cache_create("ip_fib_trie", sizeof(struct leaf),
> + trie_leaf_kmem = kmem_cache_create("ip_fib_trie",
> + sizeof(struct leaf) + sizeof(struct leaf_info),
> 0, SLAB_PANIC, NULL);
> }
>
>
Thank you
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 18:10 ` Eric Dumazet
@ 2008-01-15 18:15 ` Stephen Hemminger
2008-01-15 18:32 ` Eric Dumazet
0 siblings, 1 reply; 52+ messages in thread
From: Stephen Hemminger @ 2008-01-15 18:15 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Robert Olsson, David Miller, robert.olsson, netdev
On Tue, 15 Jan 2008 19:10:31 +0100
Eric Dumazet <dada1@cosmosbay.com> wrote:
> On Tue, 15 Jan 2008 09:47:53 -0800
> Stephen Hemminger <stephen.hemminger@vyatta.com> wrote:
>
> > This is how I did it:
> >
> > --- a/net/ipv4/fib_trie.c 2008-01-15 09:14:53.000000000 -0800
> > +++ b/net/ipv4/fib_trie.c 2008-01-15 09:21:48.000000000 -0800
> > @@ -101,13 +101,6 @@ struct node {
> > t_key key;
> > };
> >
> > -struct leaf {
> > - unsigned long parent;
> > - t_key key;
> > - struct hlist_head list;
> > - struct rcu_head rcu;
> > -};
> > -
> > struct leaf_info {
> > struct hlist_node hlist;
> > struct rcu_head rcu;
> > @@ -115,6 +108,13 @@ struct leaf_info {
> > struct list_head falh;
> > };
> >
> > +struct leaf {
> > + unsigned long parent;
> > + t_key key;
> > + struct hlist_head list;
> > + struct rcu_head rcu;
> > +};
>
> I like this :)
>
> Your design is clean, but we waste some space (rcu in leaf_info "included"), we probably can do a litle bit better
> (moving rcu at the end of leaf_info, and kmem_cache_create("ip_fib_trie", sizeof(struct leaf) + sizeof(struct_leaf_info) - sizeof(struct rcu_head))
>
>
> > - trie_leaf_kmem = kmem_cache_create("ip_fib_trie", sizeof(struct leaf),
> > + trie_leaf_kmem = kmem_cache_create("ip_fib_trie",
> > + sizeof(struct leaf) + sizeof(struct leaf_info),
> > 0, SLAB_PANIC, NULL);
> > }
> >
> >
>
> Thank you
Having multiple RCU links is a waste. I started on code that just splice's the
leaf_info's off to a free_list and then do a mass free after and RCU barrier.
For the normal case of just freeing a leaf, it could just walk the chain
in the RCU free of the leaf.
--
Stephen Hemminger <stephen.hemminger@vyatta.com>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 18:15 ` Stephen Hemminger
@ 2008-01-15 18:32 ` Eric Dumazet
0 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2008-01-15 18:32 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Robert Olsson, David Miller, robert.olsson, netdev
On Tue, 15 Jan 2008 10:15:43 -0800
Stephen Hemminger <stephen.hemminger@vyatta.com> wrote:
> On Tue, 15 Jan 2008 19:10:31 +0100
> Eric Dumazet <dada1@cosmosbay.com> wrote:
>
> > On Tue, 15 Jan 2008 09:47:53 -0800
> > Stephen Hemminger <stephen.hemminger@vyatta.com> wrote:
> >
> > > This is how I did it:
> > >
> > > --- a/net/ipv4/fib_trie.c 2008-01-15 09:14:53.000000000 -0800
> > > +++ b/net/ipv4/fib_trie.c 2008-01-15 09:21:48.000000000 -0800
> > > @@ -101,13 +101,6 @@ struct node {
> > > t_key key;
> > > };
> > >
> > > -struct leaf {
> > > - unsigned long parent;
> > > - t_key key;
> > > - struct hlist_head list;
> > > - struct rcu_head rcu;
> > > -};
> > > -
> > > struct leaf_info {
> > > struct hlist_node hlist;
> > > struct rcu_head rcu;
> > > @@ -115,6 +108,13 @@ struct leaf_info {
> > > struct list_head falh;
> > > };
> > >
> > > +struct leaf {
> > > + unsigned long parent;
> > > + t_key key;
> > > + struct hlist_head list;
> > > + struct rcu_head rcu;
> > > +};
> >
> > I like this :)
> >
> > Your design is clean, but we waste some space (rcu in leaf_info "included"), we probably can do a litle bit better
> > (moving rcu at the end of leaf_info, and kmem_cache_create("ip_fib_trie", sizeof(struct leaf) + sizeof(struct_leaf_info) - sizeof(struct rcu_head))
> >
> >
> > > - trie_leaf_kmem = kmem_cache_create("ip_fib_trie", sizeof(struct leaf),
> > > + trie_leaf_kmem = kmem_cache_create("ip_fib_trie",
> > > + sizeof(struct leaf) + sizeof(struct leaf_info),
> > > 0, SLAB_PANIC, NULL);
> > > }
> > >
> > >
> >
> > Thank you
>
> Having multiple RCU links is a waste. I started on code that just splice's the
> leaf_info's off to a free_list and then do a mass free after and RCU barrier.
>
> For the normal case of just freeing a leaf, it could just walk the chain
> in the RCU free of the leaf.
Well, If you take this path, we can also copy the leaf itself, and
just use a variable size array of infos[], no more list, and maximal locality
kmalloc(sizeof(leaf) + nb_infos * sizeof(info))
(Still we can use a kmem cache for nb_infos=1 leaves)
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 17:47 ` Stephen Hemminger
2008-01-15 18:10 ` Eric Dumazet
@ 2008-01-15 20:18 ` Robert Olsson
2008-01-15 21:16 ` Eric Dumazet
1 sibling, 1 reply; 52+ messages in thread
From: Robert Olsson @ 2008-01-15 20:18 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Eric Dumazet, David Miller, robert.olsson, netdev
Stephen Hemminger writes:
> This is how I did it:
Yes looks like an elegant solution. Did you even test it?
Maybe we see some effects in just dumping a full table?
Anyway lookup should be tested in some way. We can a lot
of analyzing before getting to right entry, local_table
backtracking, main lookup w. ev. backtracking etc. So
hopefully we get paid for this work.
Also it might be idea to do some analysis of the fib_aliases
list. Maybe the trick can be done again? ;)
Cheers
--ro
> --- a/net/ipv4/fib_trie.c 2008-01-15 09:14:53.000000000 -0800
> +++ b/net/ipv4/fib_trie.c 2008-01-15 09:21:48.000000000 -0800
> @@ -101,13 +101,6 @@ struct node {
> t_key key;
> };
>
> -struct leaf {
> - unsigned long parent;
> - t_key key;
> - struct hlist_head list;
> - struct rcu_head rcu;
> -};
> -
> struct leaf_info {
> struct hlist_node hlist;
> struct rcu_head rcu;
> @@ -115,6 +108,13 @@ struct leaf_info {
> struct list_head falh;
> };
>
> +struct leaf {
> + unsigned long parent;
> + t_key key;
> + struct hlist_head list;
> + struct rcu_head rcu;
> +};
> +
> struct tnode {
> unsigned long parent;
> t_key key;
> @@ -321,16 +321,6 @@ static void __leaf_free_rcu(struct rcu_h
> kmem_cache_free(trie_leaf_kmem, leaf);
> }
>
> -static void __leaf_info_free_rcu(struct rcu_head *head)
> -{
> - kfree(container_of(head, struct leaf_info, rcu));
> -}
> -
> -static inline void free_leaf_info(struct leaf_info *leaf)
> -{
> - call_rcu(&leaf->rcu, __leaf_info_free_rcu);
> -}
> -
> static struct tnode *tnode_alloc(size_t size)
> {
> struct page *pages;
> @@ -357,7 +347,7 @@ static void __tnode_free_rcu(struct rcu_
> free_pages((unsigned long)tn, get_order(size));
> }
>
> -static inline void tnode_free(struct tnode *tn)
> +static void tnode_free(struct tnode *tn)
> {
> if (IS_LEAF(tn)) {
> struct leaf *l = (struct leaf *) tn;
> @@ -376,16 +366,41 @@ static struct leaf *leaf_new(void)
> return l;
> }
>
> +static void leaf_info_init(struct leaf_info *li, int plen)
> +{
> + li->plen = plen;
> + INIT_LIST_HEAD(&li->falh);
> +}
> +
> +static struct leaf_info *leaf_info_first(struct leaf *l, int plen)
> +{
> + struct leaf_info *li = (struct leaf_info *) (l + 1);
> + leaf_info_init(li, plen);
> + return li;
> +}
> +
> static struct leaf_info *leaf_info_new(int plen)
> {
> struct leaf_info *li = kmalloc(sizeof(struct leaf_info), GFP_KERNEL);
> - if (li) {
> - li->plen = plen;
> - INIT_LIST_HEAD(&li->falh);
> - }
> + if (li)
> + leaf_info_init(li, plen);
> +
> return li;
> }
>
> +static void __leaf_info_free_rcu(struct rcu_head *head)
> +{
> + kfree(container_of(head, struct leaf_info, rcu));
> +}
> +
> +static inline void free_leaf_info(struct leaf *l, struct leaf_info *leaf)
> +{
> + if (leaf == (struct leaf_info *)(l + 1))
> + return;
> +
> + call_rcu(&leaf->rcu, __leaf_info_free_rcu);
> +}
> +
> static struct tnode* tnode_new(t_key key, int pos, int bits)
> {
> size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
> @@ -1047,18 +1062,13 @@ static struct list_head *fib_insert_node
> insert_leaf_info(&l->list, li);
> goto done;
> }
> - l = leaf_new();
>
> + l = leaf_new();
> if (!l)
> return NULL;
>
> l->key = key;
> - li = leaf_info_new(plen);
> -
> - if (!li) {
> - tnode_free((struct tnode *) l);
> - return NULL;
> - }
> + li = leaf_info_first(l, plen);
>
> fa_head = &li->falh;
> insert_leaf_info(&l->list, li);
> @@ -1091,7 +1101,7 @@ static struct list_head *fib_insert_node
> }
>
> if (!tn) {
> - free_leaf_info(li);
> + free_leaf_info(l, li);
> tnode_free((struct tnode *) l);
> return NULL;
> }
> @@ -1624,7 +1634,7 @@ static int fn_trie_delete(struct fib_tab
>
> if (list_empty(fa_head)) {
> hlist_del_rcu(&li->hlist);
> - free_leaf_info(li);
> + free_leaf_info(l, li);
> }
>
> if (hlist_empty(&l->list))
> @@ -1668,7 +1678,7 @@ static int trie_flush_leaf(struct trie *
>
> if (list_empty(&li->falh)) {
> hlist_del_rcu(&li->hlist);
> - free_leaf_info(li);
> + free_leaf_info(l, li);
> }
> }
> return found;
> @@ -1935,7 +1945,8 @@ void __init fib_hash_init(void)
> fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias),
> 0, SLAB_PANIC, NULL);
>
> - trie_leaf_kmem = kmem_cache_create("ip_fib_trie", sizeof(struct leaf),
> + trie_leaf_kmem = kmem_cache_create("ip_fib_trie",
> + sizeof(struct leaf) + sizeof(struct leaf_info),
> 0, SLAB_PANIC, NULL);
> }
>
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC 6/6] fib_trie: combine leaf and info
2008-01-15 20:18 ` Robert Olsson
@ 2008-01-15 21:16 ` Eric Dumazet
0 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2008-01-15 21:16 UTC (permalink / raw)
To: Robert Olsson; +Cc: Stephen Hemminger, David Miller, robert.olsson, netdev
Robert Olsson a écrit :
>
> Stephen Hemminger writes:
> > This is how I did it:
>
> Yes looks like an elegant solution. Did you even test it?
> Maybe we see some effects in just dumping a full table?
>
> Anyway lookup should be tested in some way. We can a lot
> of analyzing before getting to right entry, local_table
> backtracking, main lookup w. ev. backtracking etc. So
> hopefully we get paid for this work.
>
> Also it might be idea to do some analysis of the fib_aliases
> list. Maybe the trick can be done again? ;)
>
Back in 2.6.9 times, sizeof(bi_alias) was 16 bytes on i386
Nowadays, 64/128 bytes are the norm :(
SLAB_HWCACHE_ALIGN is not our friend.
^ permalink raw reply [flat|nested] 52+ messages in thread
end of thread, other threads:[~2008-01-15 21:16 UTC | newest]
Thread overview: 52+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20080112064513.803976049@linux-foundation.org>
[not found] ` <20080112064646.659443238@linux-foundation.org>
2008-01-12 11:16 ` [PATCH 9/9] fix sparse warnings Eric Dumazet
2008-01-12 11:28 ` David Miller
2008-01-12 21:08 ` Stephen Hemminger
2008-01-14 11:07 ` Robert Olsson
2008-01-14 17:34 ` Eric Dumazet
2008-01-14 17:59 ` Robert Olsson
2008-01-14 19:27 ` [FIB]: Avoid using static variables without proper locking Eric Dumazet
2008-01-15 7:10 ` David Miller
2008-01-12 21:09 ` [PATCH 9/9] fix sparse warnings Stephen Hemminger
2008-01-13 5:28 ` David Miller
2008-01-13 18:30 ` [FIB]: full_children & empty_children should be uint, not ushort Eric Dumazet
2008-01-13 22:02 ` Robert Olsson
2008-01-14 6:32 ` David Miller
2008-01-13 5:25 ` [PATCH 9/9] fix sparse warnings David Miller
[not found] ` <20080112064646.056241123@linux-foundation.org>
2008-01-13 4:49 ` [PATCH 1/9] get rid of trie_init David Miller
[not found] ` <20080112064646.132747871@linux-foundation.org>
2008-01-13 4:50 ` [PATCH 2/9] get rid of unused revision element David Miller
2008-01-14 11:44 ` Robert Olsson
2008-01-14 12:06 ` David Miller
2008-01-14 16:35 ` Stephen Hemminger
2008-01-15 7:07 ` David Miller
[not found] ` <20080112064646.207183428@linux-foundation.org>
2008-01-13 4:53 ` [PATCH 3/9] move size information to pr_debug() David Miller
[not found] ` <20080112064646.282104074@linux-foundation.org>
2008-01-13 4:55 ` [PATCH 4/9] statistics improvements David Miller
2008-01-13 5:33 ` Stephen Hemminger
2008-01-13 5:44 ` David Miller
2008-01-14 20:57 ` [PATCH] [IPV4] fib_trie: size and statistics Stephen Hemminger
[not found] ` <20080114164450.55f8c9b2@deepthought>
2008-01-15 0:46 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Stephen Hemminger
2008-01-15 0:47 ` [PATCH 4/6] [IPV4] fib_trie style cleanup Stephen Hemminger
2008-01-15 2:58 ` [PATCH 5/6] [IPV4] fib_trie: checkleaf calling convention Stephen Hemminger
2008-01-15 5:07 ` [RFC 6/6] fib_trie: combine leaf and info Stephen Hemminger
2008-01-15 6:12 ` Eric Dumazet
2008-01-15 6:16 ` Eric Dumazet
2008-01-15 16:19 ` Stephen Hemminger
2008-01-15 16:44 ` Robert Olsson
2008-01-15 17:25 ` Eric Dumazet
2008-01-15 17:47 ` Stephen Hemminger
2008-01-15 18:10 ` Eric Dumazet
2008-01-15 18:15 ` Stephen Hemminger
2008-01-15 18:32 ` Eric Dumazet
2008-01-15 20:18 ` Robert Olsson
2008-01-15 21:16 ` Eric Dumazet
2008-01-15 17:59 ` Robert Olsson
2008-01-15 6:49 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Eric Dumazet
2008-01-15 7:29 ` David Miller
2008-01-15 5:00 ` [PATCH 2/6] [IPV4] fib hash|trie initialization Stephen Hemminger
2008-01-15 7:14 ` David Miller
2008-01-15 6:55 ` [PATCH] [IPV4] fib_trie: size and statistics Eric Dumazet
2008-01-15 7:28 ` David Miller
2008-01-15 7:12 ` David Miller
[not found] ` <20080112064646.356466158@linux-foundation.org>
2008-01-13 4:56 ` [PATCH 5/9] use %u for unsigned printfs David Miller
[not found] ` <20080112064646.432200237@linux-foundation.org>
2008-01-13 4:57 ` [PATCH 6/9] : fib_insert_node cleanup David Miller
[not found] ` <20080112064646.507015655@linux-foundation.org>
2008-01-13 4:58 ` [PATCH 7/9] printk related cleanups David Miller
[not found] ` <20080112064646.583836190@linux-foundation.org>
2008-01-13 5:23 ` [PATCH 8/9] add statistics David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).