From: Eric Dumazet <dada1@cosmosbay.com>
To: David Miller <davem@davemloft.net>
Cc: Stephen Hemminger <stephen.hemminger@vyatta.com>,
Robert Olsson <robert.olsson@its.uu.se>,
netdev@vger.kernel.org
Subject: [FIB]: full_children & empty_children should be uint, not ushort
Date: Sun, 13 Jan 2008 19:30:21 +0100 [thread overview]
Message-ID: <478A58BD.8010304@cosmosbay.com> (raw)
In-Reply-To: <4788A17D.5070903@cosmosbay.com>
[-- Attachment #1: Type: text/plain, Size: 1328 bytes --]
Eric Dumazet a écrit :
> 4) full_children & empty_children being 'unsigned short',
> we probably are limited to 2^15 elements, but I could not
> find this limit enforced somewhere.
Hi David
In my testings, I found that once a tnode is built with 2^16 slots (or more),
it cannot be freed.
Extract of /proc/net/fib_triestat
Main:
Aver depth: 1.50
Max depth: 2
Leaves: 2
Internal nodes: 3
1: 1 2: 1 17: 1
Pointers: 131078
Null ptrs: 131074
Total size: 513 kB
# ip route
192.168.11.0/24 dev eth0 proto kernel scope link src 192.168.11.129
default via 192.168.11.2 dev eth0
Two fixes are possible : Enlarge full_children & empty_children to 32bits, or
force a limit in code to never exceed 2^15 children in a tnode. I chose the
first solution since it can be done with 0 memory cost on 64bit arches.
Thank you
[FIB]: full_children & empty_children should be uint, not ushort
If declared as unsigned short, these fields can overflow, and whole trie logic
is broken. I could not make the machine crash, but some tnode can never
be freed.
Note for 64 bit arches : By reordering t_key and parent in [node, leaf, tnode]
structures, we can use 32 bits hole after t_key so that sizeof(struct tnode)
doesnt change after this patch.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: fib_trie.patch --]
[-- Type: text/plain, Size: 2515 bytes --]
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index f26ba31..9696722 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -97,13 +97,13 @@ typedef unsigned int t_key;
#define IS_LEAF(n) (n->parent & T_LEAF)
struct node {
- t_key key;
unsigned long parent;
+ t_key key;
};
struct leaf {
- t_key key;
unsigned long parent;
+ t_key key;
struct hlist_head list;
struct rcu_head rcu;
};
@@ -116,12 +116,12 @@ struct leaf_info {
};
struct tnode {
- t_key key;
unsigned long parent;
+ t_key key;
unsigned char pos; /* 2log(KEYLENGTH) bits needed */
unsigned char bits; /* 2log(KEYLENGTH) bits needed */
- unsigned short full_children; /* KEYLENGTH bits needed */
- unsigned short empty_children; /* KEYLENGTH bits needed */
+ unsigned int full_children; /* KEYLENGTH bits needed */
+ unsigned int empty_children; /* KEYLENGTH bits needed */
struct rcu_head rcu;
struct node *child[0];
};
@@ -329,12 +329,12 @@ static inline void free_leaf_info(struct leaf_info *leaf)
call_rcu(&leaf->rcu, __leaf_info_free_rcu);
}
-static struct tnode *tnode_alloc(unsigned int size)
+static struct tnode *tnode_alloc(size_t size)
{
struct page *pages;
if (size <= PAGE_SIZE)
- return kcalloc(size, 1, GFP_KERNEL);
+ return kzalloc(size, GFP_KERNEL);
pages = alloc_pages(GFP_KERNEL|__GFP_ZERO, get_order(size));
if (!pages)
@@ -346,8 +346,8 @@ static struct tnode *tnode_alloc(unsigned int size)
static void __tnode_free_rcu(struct rcu_head *head)
{
struct tnode *tn = container_of(head, struct tnode, rcu);
- unsigned int size = sizeof(struct tnode) +
- (1 << tn->bits) * sizeof(struct node *);
+ size_t size = sizeof(struct tnode) +
+ (sizeof(struct node *) << tn->bits);
if (size <= PAGE_SIZE)
kfree(tn);
@@ -386,8 +386,7 @@ static struct leaf_info *leaf_info_new(int plen)
static struct tnode* tnode_new(t_key key, int pos, int bits)
{
- int nchildren = 1<<bits;
- int sz = sizeof(struct tnode) + nchildren * sizeof(struct node *);
+ size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
struct tnode *tn = tnode_alloc(sz);
if (tn) {
@@ -399,8 +398,8 @@ static struct tnode* tnode_new(t_key key, int pos, int bits)
tn->empty_children = 1<<bits;
}
- pr_debug("AT %p s=%u %u\n", tn, (unsigned int) sizeof(struct tnode),
- (unsigned int) (sizeof(struct node) * 1<<bits));
+ pr_debug("AT %p s=%u %lu\n", tn, (unsigned int) sizeof(struct tnode),
+ (unsigned long) (sizeof(struct node) << bits));
return tn;
}
next prev parent reply other threads:[~2008-01-13 18:30 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20080112064513.803976049@linux-foundation.org>
[not found] ` <20080112064646.056241123@linux-foundation.org>
2008-01-13 4:49 ` [PATCH 1/9] get rid of trie_init David Miller
[not found] ` <20080112064646.132747871@linux-foundation.org>
2008-01-13 4:50 ` [PATCH 2/9] get rid of unused revision element David Miller
2008-01-14 11:44 ` Robert Olsson
2008-01-14 12:06 ` David Miller
2008-01-14 16:35 ` Stephen Hemminger
2008-01-15 7:07 ` David Miller
[not found] ` <20080112064646.207183428@linux-foundation.org>
2008-01-13 4:53 ` [PATCH 3/9] move size information to pr_debug() David Miller
[not found] ` <20080112064646.282104074@linux-foundation.org>
2008-01-13 4:55 ` [PATCH 4/9] statistics improvements David Miller
2008-01-13 5:33 ` Stephen Hemminger
2008-01-13 5:44 ` David Miller
2008-01-14 20:57 ` [PATCH] [IPV4] fib_trie: size and statistics Stephen Hemminger
[not found] ` <20080114164450.55f8c9b2@deepthought>
2008-01-15 0:46 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Stephen Hemminger
2008-01-15 0:47 ` [PATCH 4/6] [IPV4] fib_trie style cleanup Stephen Hemminger
2008-01-15 2:58 ` [PATCH 5/6] [IPV4] fib_trie: checkleaf calling convention Stephen Hemminger
2008-01-15 5:07 ` [RFC 6/6] fib_trie: combine leaf and info Stephen Hemminger
2008-01-15 6:12 ` Eric Dumazet
2008-01-15 6:16 ` Eric Dumazet
2008-01-15 16:19 ` Stephen Hemminger
2008-01-15 16:44 ` Robert Olsson
2008-01-15 17:25 ` Eric Dumazet
2008-01-15 17:47 ` Stephen Hemminger
2008-01-15 18:10 ` Eric Dumazet
2008-01-15 18:15 ` Stephen Hemminger
2008-01-15 18:32 ` Eric Dumazet
2008-01-15 20:18 ` Robert Olsson
2008-01-15 21:16 ` Eric Dumazet
2008-01-15 17:59 ` Robert Olsson
2008-01-15 6:49 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Eric Dumazet
2008-01-15 7:29 ` David Miller
2008-01-15 5:00 ` [PATCH 2/6] [IPV4] fib hash|trie initialization Stephen Hemminger
2008-01-15 7:14 ` David Miller
2008-01-15 6:55 ` [PATCH] [IPV4] fib_trie: size and statistics Eric Dumazet
2008-01-15 7:28 ` David Miller
2008-01-15 7:12 ` David Miller
[not found] ` <20080112064646.356466158@linux-foundation.org>
2008-01-13 4:56 ` [PATCH 5/9] use %u for unsigned printfs David Miller
[not found] ` <20080112064646.432200237@linux-foundation.org>
2008-01-13 4:57 ` [PATCH 6/9] : fib_insert_node cleanup David Miller
[not found] ` <20080112064646.507015655@linux-foundation.org>
2008-01-13 4:58 ` [PATCH 7/9] printk related cleanups David Miller
[not found] ` <20080112064646.583836190@linux-foundation.org>
2008-01-13 5:23 ` [PATCH 8/9] add statistics David Miller
[not found] ` <20080112064646.659443238@linux-foundation.org>
2008-01-12 11:16 ` [PATCH 9/9] fix sparse warnings Eric Dumazet
2008-01-12 11:28 ` David Miller
2008-01-12 21:08 ` Stephen Hemminger
2008-01-14 11:07 ` Robert Olsson
2008-01-14 17:34 ` Eric Dumazet
2008-01-14 17:59 ` Robert Olsson
2008-01-14 19:27 ` [FIB]: Avoid using static variables without proper locking Eric Dumazet
2008-01-15 7:10 ` David Miller
2008-01-12 21:09 ` [PATCH 9/9] fix sparse warnings Stephen Hemminger
2008-01-13 5:28 ` David Miller
2008-01-13 18:30 ` Eric Dumazet [this message]
2008-01-13 22:02 ` [FIB]: full_children & empty_children should be uint, not ushort Robert Olsson
2008-01-14 6:32 ` David Miller
2008-01-13 5:25 ` [PATCH 9/9] fix sparse warnings David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=478A58BD.8010304@cosmosbay.com \
--to=dada1@cosmosbay.com \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
--cc=robert.olsson@its.uu.se \
--cc=stephen.hemminger@vyatta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.