From: Eric Dumazet <dada1@cosmosbay.com>
To: David Miller <davem@davemloft.net>
Cc: Stephen Hemminger <stephen.hemminger@vyatta.com>,
Robert Olsson <robert.olsson@its.uu.se>,
netdev@vger.kernel.org
Subject: [FIB]: full_children & empty_children should be uint, not ushort
Date: Sun, 13 Jan 2008 19:30:21 +0100 [thread overview]
Message-ID: <478A58BD.8010304@cosmosbay.com> (raw)
In-Reply-To: <4788A17D.5070903@cosmosbay.com>
[-- Attachment #1: Type: text/plain, Size: 1328 bytes --]
Eric Dumazet a écrit :
> 4) full_children & empty_children being 'unsigned short',
> we probably are limited to 2^15 elements, but I could not
> find this limit enforced somewhere.
Hi David
In my testings, I found that once a tnode is built with 2^16 slots (or more),
it cannot be freed.
Extract of /proc/net/fib_triestat
Main:
Aver depth: 1.50
Max depth: 2
Leaves: 2
Internal nodes: 3
1: 1 2: 1 17: 1
Pointers: 131078
Null ptrs: 131074
Total size: 513 kB
# ip route
192.168.11.0/24 dev eth0 proto kernel scope link src 192.168.11.129
default via 192.168.11.2 dev eth0
Two fixes are possible : Enlarge full_children & empty_children to 32bits, or
force a limit in code to never exceed 2^15 children in a tnode. I chose the
first solution since it can be done with 0 memory cost on 64bit arches.
Thank you
[FIB]: full_children & empty_children should be uint, not ushort
If declared as unsigned short, these fields can overflow, and whole trie logic
is broken. I could not make the machine crash, but some tnode can never
be freed.
Note for 64 bit arches : By reordering t_key and parent in [node, leaf, tnode]
structures, we can use 32 bits hole after t_key so that sizeof(struct tnode)
doesnt change after this patch.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: fib_trie.patch --]
[-- Type: text/plain, Size: 2515 bytes --]
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index f26ba31..9696722 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -97,13 +97,13 @@ typedef unsigned int t_key;
#define IS_LEAF(n) (n->parent & T_LEAF)
struct node {
- t_key key;
unsigned long parent;
+ t_key key;
};
struct leaf {
- t_key key;
unsigned long parent;
+ t_key key;
struct hlist_head list;
struct rcu_head rcu;
};
@@ -116,12 +116,12 @@ struct leaf_info {
};
struct tnode {
- t_key key;
unsigned long parent;
+ t_key key;
unsigned char pos; /* 2log(KEYLENGTH) bits needed */
unsigned char bits; /* 2log(KEYLENGTH) bits needed */
- unsigned short full_children; /* KEYLENGTH bits needed */
- unsigned short empty_children; /* KEYLENGTH bits needed */
+ unsigned int full_children; /* KEYLENGTH bits needed */
+ unsigned int empty_children; /* KEYLENGTH bits needed */
struct rcu_head rcu;
struct node *child[0];
};
@@ -329,12 +329,12 @@ static inline void free_leaf_info(struct leaf_info *leaf)
call_rcu(&leaf->rcu, __leaf_info_free_rcu);
}
-static struct tnode *tnode_alloc(unsigned int size)
+static struct tnode *tnode_alloc(size_t size)
{
struct page *pages;
if (size <= PAGE_SIZE)
- return kcalloc(size, 1, GFP_KERNEL);
+ return kzalloc(size, GFP_KERNEL);
pages = alloc_pages(GFP_KERNEL|__GFP_ZERO, get_order(size));
if (!pages)
@@ -346,8 +346,8 @@ static struct tnode *tnode_alloc(unsigned int size)
static void __tnode_free_rcu(struct rcu_head *head)
{
struct tnode *tn = container_of(head, struct tnode, rcu);
- unsigned int size = sizeof(struct tnode) +
- (1 << tn->bits) * sizeof(struct node *);
+ size_t size = sizeof(struct tnode) +
+ (sizeof(struct node *) << tn->bits);
if (size <= PAGE_SIZE)
kfree(tn);
@@ -386,8 +386,7 @@ static struct leaf_info *leaf_info_new(int plen)
static struct tnode* tnode_new(t_key key, int pos, int bits)
{
- int nchildren = 1<<bits;
- int sz = sizeof(struct tnode) + nchildren * sizeof(struct node *);
+ size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
struct tnode *tn = tnode_alloc(sz);
if (tn) {
@@ -399,8 +398,8 @@ static struct tnode* tnode_new(t_key key, int pos, int bits)
tn->empty_children = 1<<bits;
}
- pr_debug("AT %p s=%u %u\n", tn, (unsigned int) sizeof(struct tnode),
- (unsigned int) (sizeof(struct node) * 1<<bits));
+ pr_debug("AT %p s=%u %lu\n", tn, (unsigned int) sizeof(struct tnode),
+ (unsigned long) (sizeof(struct node) << bits));
return tn;
}
next prev parent reply other threads:[~2008-01-13 18:30 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20080112064513.803976049@linux-foundation.org>
[not found] ` <20080112064646.056241123@linux-foundation.org>
2008-01-13 4:49 ` [PATCH 1/9] get rid of trie_init David Miller
[not found] ` <20080112064646.132747871@linux-foundation.org>
2008-01-13 4:50 ` [PATCH 2/9] get rid of unused revision element David Miller
2008-01-14 11:44 ` Robert Olsson
2008-01-14 12:06 ` David Miller
2008-01-14 16:35 ` Stephen Hemminger
2008-01-15 7:07 ` David Miller
[not found] ` <20080112064646.207183428@linux-foundation.org>
2008-01-13 4:53 ` [PATCH 3/9] move size information to pr_debug() David Miller
[not found] ` <20080112064646.282104074@linux-foundation.org>
2008-01-13 4:55 ` [PATCH 4/9] statistics improvements David Miller
2008-01-13 5:33 ` Stephen Hemminger
2008-01-13 5:44 ` David Miller
2008-01-14 20:57 ` [PATCH] [IPV4] fib_trie: size and statistics Stephen Hemminger
[not found] ` <20080114164450.55f8c9b2@deepthought>
2008-01-15 0:46 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Stephen Hemminger
2008-01-15 0:47 ` [PATCH 4/6] [IPV4] fib_trie style cleanup Stephen Hemminger
2008-01-15 2:58 ` [PATCH 5/6] [IPV4] fib_trie: checkleaf calling convention Stephen Hemminger
2008-01-15 5:07 ` [RFC 6/6] fib_trie: combine leaf and info Stephen Hemminger
2008-01-15 6:12 ` Eric Dumazet
2008-01-15 6:16 ` Eric Dumazet
2008-01-15 16:19 ` Stephen Hemminger
2008-01-15 16:44 ` Robert Olsson
2008-01-15 17:25 ` Eric Dumazet
2008-01-15 17:47 ` Stephen Hemminger
2008-01-15 18:10 ` Eric Dumazet
2008-01-15 18:15 ` Stephen Hemminger
2008-01-15 18:32 ` Eric Dumazet
2008-01-15 20:18 ` Robert Olsson
2008-01-15 21:16 ` Eric Dumazet
2008-01-15 17:59 ` Robert Olsson
2008-01-15 6:49 ` [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache Eric Dumazet
2008-01-15 7:29 ` David Miller
2008-01-15 5:00 ` [PATCH 2/6] [IPV4] fib hash|trie initialization Stephen Hemminger
2008-01-15 7:14 ` David Miller
2008-01-15 6:55 ` [PATCH] [IPV4] fib_trie: size and statistics Eric Dumazet
2008-01-15 7:28 ` David Miller
2008-01-15 7:12 ` David Miller
[not found] ` <20080112064646.356466158@linux-foundation.org>
2008-01-13 4:56 ` [PATCH 5/9] use %u for unsigned printfs David Miller
[not found] ` <20080112064646.432200237@linux-foundation.org>
2008-01-13 4:57 ` [PATCH 6/9] : fib_insert_node cleanup David Miller
[not found] ` <20080112064646.507015655@linux-foundation.org>
2008-01-13 4:58 ` [PATCH 7/9] printk related cleanups David Miller
[not found] ` <20080112064646.583836190@linux-foundation.org>
2008-01-13 5:23 ` [PATCH 8/9] add statistics David Miller
[not found] ` <20080112064646.659443238@linux-foundation.org>
2008-01-12 11:16 ` [PATCH 9/9] fix sparse warnings Eric Dumazet
2008-01-12 11:28 ` David Miller
2008-01-12 21:08 ` Stephen Hemminger
2008-01-14 11:07 ` Robert Olsson
2008-01-14 17:34 ` Eric Dumazet
2008-01-14 17:59 ` Robert Olsson
2008-01-14 19:27 ` [FIB]: Avoid using static variables without proper locking Eric Dumazet
2008-01-15 7:10 ` David Miller
2008-01-12 21:09 ` [PATCH 9/9] fix sparse warnings Stephen Hemminger
2008-01-13 5:28 ` David Miller
2008-01-13 18:30 ` Eric Dumazet [this message]
2008-01-13 22:02 ` [FIB]: full_children & empty_children should be uint, not ushort Robert Olsson
2008-01-14 6:32 ` David Miller
2008-01-13 5:25 ` [PATCH 9/9] fix sparse warnings David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=478A58BD.8010304@cosmosbay.com \
--to=dada1@cosmosbay.com \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
--cc=robert.olsson@its.uu.se \
--cc=stephen.hemminger@vyatta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).