* [PATCH 4/4] first conntrack ID must be 1 not 2
@ 2006-02-13 2:41 Pablo Neira Ayuso
2006-02-13 11:20 ` Harald Welte
0 siblings, 1 reply; 40+ messages in thread
From: Pablo Neira Ayuso @ 2006-02-13 2:41 UTC (permalink / raw)
To: Netfilter Development Mailinglist
Cc: Harald Welte, Patrick McHardy, Yasuyuki Kozakai
[-- Attachment #1: Type: text/plain, Size: 474 bytes --]
[NF_CONNTRACK] first conntrack ID must be 1 not 2
The first conntrack ID must be 1. If a new conntrack is created, the
general ID counter must be post-incremented instead pre-incremented
since [ip|nf]_conntrack_next_id is initialized to 1. Same applies for
expectations.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
--
The dawn of the fourth age of Linux firewalling is coming; a time of
great struggle and heroic deeds -- J.Kadlecsik got inspired by J.Morris
[-- Attachment #2: id.patch --]
[-- Type: text/plain, Size: 2225 bytes --]
[NETFILTER] first conntrack ID must be 1 not 2
The first conntrack ID must be 1. If a new conntrack is created, the
general ID counter must be post-incremented instead pre-incremented since
[ip|nf]_conntrack_next_id is initialized to 1. Same applies for expectations.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Index: net-2.6.git/net/netfilter/nf_conntrack_core.c
===================================================================
--- net-2.6.git.orig/net/netfilter/nf_conntrack_core.c 2006-02-13 01:05:04.000000000 +0100
+++ net-2.6.git/net/netfilter/nf_conntrack_core.c 2006-02-13 01:47:04.000000000 +0100
@@ -682,7 +682,7 @@ static void __nf_conntrack_hash_insert(s
unsigned int hash,
unsigned int repl_hash)
{
- ct->id = ++nf_conntrack_next_id;
+ ct->id = nf_conntrack_next_id++;
list_prepend(&nf_conntrack_hash[hash],
&ct->tuplehash[IP_CT_DIR_ORIGINAL].list);
list_prepend(&nf_conntrack_hash[repl_hash],
@@ -1247,7 +1247,7 @@ static void nf_conntrack_expect_insert(s
exp->timeout.expires = jiffies + exp->master->helper->timeout * HZ;
add_timer(&exp->timeout);
- exp->id = ++nf_conntrack_expect_next_id;
+ exp->id = nf_conntrack_expect_next_id++;
atomic_inc(&exp->use);
NF_CT_STAT_INC(expect_create);
}
Index: net-2.6.git/net/ipv4/netfilter/ip_conntrack_core.c
===================================================================
--- net-2.6.git.orig/net/ipv4/netfilter/ip_conntrack_core.c 2006-02-04 14:34:40.000000000 +0100
+++ net-2.6.git/net/ipv4/netfilter/ip_conntrack_core.c 2006-02-13 01:46:42.000000000 +0100
@@ -417,7 +417,7 @@ static void __ip_conntrack_hash_insert(s
unsigned int hash,
unsigned int repl_hash)
{
- ct->id = ++ip_conntrack_next_id;
+ ct->id = ip_conntrack_next_id++;
list_prepend(&ip_conntrack_hash[hash],
&ct->tuplehash[IP_CT_DIR_ORIGINAL].list);
list_prepend(&ip_conntrack_hash[repl_hash],
@@ -971,7 +971,7 @@ static void ip_conntrack_expect_insert(s
exp->timeout.expires = jiffies + exp->master->helper->timeout * HZ;
add_timer(&exp->timeout);
- exp->id = ++ip_conntrack_expect_next_id;
+ exp->id = ip_conntrack_expect_next_id++;
atomic_inc(&exp->use);
CONNTRACK_STAT_INC(expect_create);
}
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-13 2:41 [PATCH 4/4] first conntrack ID must be 1 not 2 Pablo Neira Ayuso @ 2006-02-13 11:20 ` Harald Welte 2006-02-16 8:33 ` Patrick McHardy 0 siblings, 1 reply; 40+ messages in thread From: Harald Welte @ 2006-02-13 11:20 UTC (permalink / raw) To: Pablo Neira Ayuso Cc: Netfilter Development Mailinglist, Patrick McHardy, Yasuyuki Kozakai [-- Attachment #1: Type: text/plain, Size: 854 bytes --] On Mon, Feb 13, 2006 at 03:41:52AM +0100, Pablo Neira Ayuso wrote: > [NF_CONNTRACK] first conntrack ID must be 1 not 2 > > The first conntrack ID must be 1. If a new conntrack is created, the > general ID counter must be post-incremented instead pre-incremented > since [ip|nf]_conntrack_next_id is initialized to 1. Same applies for > expectations. looks clean to me. Interesting question though: Does it matter whether we start with 1 or 2? Am I missing something? -- - Harald Welte <laforge@netfilter.org> http://netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-13 11:20 ` Harald Welte @ 2006-02-16 8:33 ` Patrick McHardy 2006-02-16 8:47 ` Jozsef Kadlecsik 0 siblings, 1 reply; 40+ messages in thread From: Patrick McHardy @ 2006-02-16 8:33 UTC (permalink / raw) To: Harald Welte Cc: Netfilter Development Mailinglist, Yasuyuki Kozakai, Pablo Neira Ayuso Harald Welte wrote: > On Mon, Feb 13, 2006 at 03:41:52AM +0100, Pablo Neira Ayuso wrote: > >>[NF_CONNTRACK] first conntrack ID must be 1 not 2 >> >>The first conntrack ID must be 1. If a new conntrack is created, the >>general ID counter must be post-incremented instead pre-incremented >>since [ip|nf]_conntrack_next_id is initialized to 1. Same applies for >>expectations. > > > looks clean to me. Interesting question though: Does it matter whether > we start with 1 or 2? Am I missing something? Its irrelevant since the number doesn't have any meaning beyond beeing an identifier. Anyway, I'd prefer a patch that just removes the initialization to 1. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-16 8:33 ` Patrick McHardy @ 2006-02-16 8:47 ` Jozsef Kadlecsik 2006-02-16 9:02 ` Patrick McHardy 0 siblings, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-02-16 8:47 UTC (permalink / raw) To: Patrick McHardy Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira Ayuso, Yasuyuki Kozakai On Thu, 16 Feb 2006, Patrick McHardy wrote: > Harald Welte wrote: > > On Mon, Feb 13, 2006 at 03:41:52AM +0100, Pablo Neira Ayuso wrote: > > > >>[NF_CONNTRACK] first conntrack ID must be 1 not 2 > >> > >>The first conntrack ID must be 1. If a new conntrack is created, the > >>general ID counter must be post-incremented instead pre-incremented > >>since [ip|nf]_conntrack_next_id is initialized to 1. Same applies for > >>expectations. > > > > > > looks clean to me. Interesting question though: Does it matter whether > > we start with 1 or 2? Am I missing something? > > Its irrelevant since the number doesn't have any meaning beyond beeing > an identifier. Anyway, I'd prefer a patch that just removes the > initialization to 1. Couldn't the id be simply equal to the jiffies when creating the conntrack entry? (And adding time_after/before when comparing ids, of course.) Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-16 8:47 ` Jozsef Kadlecsik @ 2006-02-16 9:02 ` Patrick McHardy 2006-02-16 9:11 ` Jozsef Kadlecsik 0 siblings, 1 reply; 40+ messages in thread From: Patrick McHardy @ 2006-02-16 9:02 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira Ayuso, Yasuyuki Kozakai Jozsef Kadlecsik wrote: > Couldn't the id be simply equal to the jiffies when creating the conntrack > entry? (And adding time_after/before when comparing ids, of course.) We can create more than one entry per jiffy, so that wouldn't be unique anymore. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-16 9:02 ` Patrick McHardy @ 2006-02-16 9:11 ` Jozsef Kadlecsik 2006-02-16 9:14 ` Patrick McHardy 0 siblings, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-02-16 9:11 UTC (permalink / raw) To: Patrick McHardy Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira Ayuso, Yasuyuki Kozakai On Thu, 16 Feb 2006, Patrick McHardy wrote: > Jozsef Kadlecsik wrote: > > Couldn't the id be simply equal to the jiffies when creating the conntrack > > entry? (And adding time_after/before when comparing ids, of course.) > > We can create more than one entry per jiffy, so that wouldn't be unique > anymore. (jiffies, tuples) would be unique even in that case. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-16 9:11 ` Jozsef Kadlecsik @ 2006-02-16 9:14 ` Patrick McHardy 2006-02-16 9:36 ` Jozsef Kadlecsik 0 siblings, 1 reply; 40+ messages in thread From: Patrick McHardy @ 2006-02-16 9:14 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira Ayuso, Yasuyuki Kozakai Jozsef Kadlecsik wrote: > On Thu, 16 Feb 2006, Patrick McHardy wrote: > > >>Jozsef Kadlecsik wrote: >> >>>Couldn't the id be simply equal to the jiffies when creating the conntrack >>>entry? (And adding time_after/before when comparing ids, of course.) >> >>We can create more than one entry per jiffy, so that wouldn't be unique >>anymore. > > > (jiffies, tuples) would be unique even in that case. Thats true. But what is the advantage over using the counter? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-16 9:14 ` Patrick McHardy @ 2006-02-16 9:36 ` Jozsef Kadlecsik 2006-02-16 20:09 ` Patrick McHardy 0 siblings, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-02-16 9:36 UTC (permalink / raw) To: Patrick McHardy Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira Ayuso, Yasuyuki Kozakai On Thu, 16 Feb 2006, Patrick McHardy wrote: > Jozsef Kadlecsik wrote: > > On Thu, 16 Feb 2006, Patrick McHardy wrote: > > > >>Jozsef Kadlecsik wrote: > >> > >>>Couldn't the id be simply equal to the jiffies when creating the conntrack > >>>entry? (And adding time_after/before when comparing ids, of course.) > >> > >>We can create more than one entry per jiffy, so that wouldn't be unique > >>anymore. > > > > (jiffies, tuples) would be unique even in that case. > > Thats true. But what is the advantage over using the counter? Currently nothing :-). However, the counter poses an internal bottleneck in the overall system when one attempts to replace the hashtable by hashtrie. (What I really like about hashtrie is that it naturally implements the 'region locking' suggested by Patrick Schaaf, compared to the 'brute force' per bucket locking.) As I see, we have got three bottlenecks in conntrack in SMP systems: the hashtable approach, the id and the expectation list. The first two could be eliminated by hashtrie and jiffies as ids. Some clever solution should only be found to spread the expectations over multiple, separatedly locked buckets... Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-16 9:36 ` Jozsef Kadlecsik @ 2006-02-16 20:09 ` Patrick McHardy 2006-02-17 8:18 ` Jozsef Kadlecsik 2006-03-30 8:31 ` Amin Azez 0 siblings, 2 replies; 40+ messages in thread From: Patrick McHardy @ 2006-02-16 20:09 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira Ayuso, Yasuyuki Kozakai Jozsef Kadlecsik wrote: > On Thu, 16 Feb 2006, Patrick McHardy wrote: > >>>(jiffies, tuples) would be unique even in that case. >> >>Thats true. But what is the advantage over using the counter? Actually it would still not be unique if connections live shorter than a jiffy and are resurrected. > Currently nothing :-). > > However, the counter poses an internal bottleneck in the overall system > when one attempts to replace the hashtable by hashtrie. (What I really > like about hashtrie is that it naturally implements the 'region locking' > suggested by Patrick Schaaf, compared to the 'brute force' per bucket > locking.) > > As I see, we have got three bottlenecks in conntrack in SMP systems: > the hashtable approach, the id and the expectation list. The first two > could be eliminated by hashtrie and jiffies as ids. > > Some clever solution should only be found to spread the expectations > over multiple, separatedly locked buckets... I've talked to Harald about this and as a start we can start by allowing masks only for the source part of the tuple. That means we can hash by destinations. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-16 20:09 ` Patrick McHardy @ 2006-02-17 8:18 ` Jozsef Kadlecsik 2006-02-17 8:45 ` Martin Josefsson 2006-02-17 8:50 ` [PATCH 4/4] first conntrack ID must be 1 not 2 Patrick McHardy 2006-03-30 8:31 ` Amin Azez 1 sibling, 2 replies; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-02-17 8:18 UTC (permalink / raw) To: Patrick McHardy Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira Ayuso, Yasuyuki Kozakai On Thu, 16 Feb 2006, Patrick McHardy wrote: > Jozsef Kadlecsik wrote: > > On Thu, 16 Feb 2006, Patrick McHardy wrote: > > > >>>(jiffies, tuples) would be unique even in that case. > >> > >>Thats true. But what is the advantage over using the counter? > > Actually it would still not be unique if connections live > shorter than a jiffy and are resurrected. I can imagine such a situation only when the conntrack table is full and we are forced to drop unassured connections - when we are in trouble anyway. > > Some clever solution should only be found to spread the expectations > > over multiple, separatedly locked buckets... > > I've talked to Harald about this and as a start we can start by > allowing masks only for the source part of the tuple. That > means we can hash by destinations. That's actually fine! Now I should complete that hashtrie/hashtable testing I have been planning... Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-17 8:18 ` Jozsef Kadlecsik @ 2006-02-17 8:45 ` Martin Josefsson 2006-02-17 9:30 ` Jozsef Kadlecsik 2006-02-17 8:50 ` [PATCH 4/4] first conntrack ID must be 1 not 2 Patrick McHardy 1 sibling, 1 reply; 40+ messages in thread From: Martin Josefsson @ 2006-02-17 8:45 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Harald Welte, Netfilter Development Mailinglist, Patrick McHardy, Pablo Neira Ayuso, Yasuyuki Kozakai On Fri, 17 Feb 2006, Jozsef Kadlecsik wrote: > > > Some clever solution should only be found to spread the expectations > > > over multiple, separatedly locked buckets... > > > > I've talked to Harald about this and as a start we can start by > > allowing masks only for the source part of the tuple. That > > means we can hash by destinations. > > That's actually fine! Now I should complete that hashtrie/hashtable > testing I have been planning... Great that you are testing hashtrie, it works fine in all my simple tests but there might still be a lot of bugs in it, especially in the forced eviction, it might not be optimal, me and rusty tried to come up with a way to make it fair wrt the depth in the tree we are evicting at. If you have any suggestions and/or patches, please let me know :) Are you using the 060104 version, that's the latest version I have of the userspace code. I have an old untested patch against nf_conntrack as well but it needs some rewriting of the conntrack locking in order to avoid an SMP deadlock. /Martin ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-17 8:45 ` Martin Josefsson @ 2006-02-17 9:30 ` Jozsef Kadlecsik 2006-02-17 18:41 ` Jozsef Kadlecsik 2006-03-04 20:11 ` Hashtrie testing2 " Martin Josefsson 0 siblings, 2 replies; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-02-17 9:30 UTC (permalink / raw) To: Martin Josefsson Cc: Harald Welte, Netfilter Development Mailinglist, Patrick McHardy, Pablo Neira Ayuso, Yasuyuki Kozakai On Fri, 17 Feb 2006, Martin Josefsson wrote: > > That's actually fine! Now I should complete that hashtrie/hashtable > > testing I have been planning... > > Great that you are testing hashtrie, it works fine in all my simple tests > but there might still be a lot of bugs in it, especially in the forced > eviction, it might not be optimal, me and rusty tried to come up with a > way to make it fair wrt the depth in the tree we are evicting at. > > If you have any suggestions and/or patches, please let me know :) Here are the tests I'd like to see against hashtrie: - larger hashentry, i.e. when HASHSIFT is equal to 6, 7 or 8 - other HASHALIGN, hashbit_t values: (32, u8), (64, u16) and (64, u32). The current (32, u8) doesn't look optimal on 64bit CPUs, (64, u16) seems to be the best, but without testing it's hard to choose. - DoS against hashtrie by non-random tuples: single fixed destination IP, port and successive source IP, port numbers. (I don't think the current max 7 levels (childs) can survive such an attack.) Somehow I don't really like the eviction algorithm. What about some lazy auto-eviction instead: say, if there are more than 90% of the max elements, then drop a (any) unassured connection which can be found on the path when inserting a new one. Thus the current fixed stack could be eliminated and there were no builtin limit in the depth. I'm collecting the ideas, so can't submit patches yet ;-). > I have an old untested patch against nf_conntrack as well but it needs > some rewriting of the conntrack locking in order to avoid an SMP deadlock. Rusty's lock-ordering is a sure solution, but penalizes the process. I have been thinking on wether we could use simply, separated, unordered locking: 1. lock bucket according to tuple1 add element unlock bucket 2. lock bucket according to tuple2, add element unlock bucket if (add element failed) undo 1. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-17 9:30 ` Jozsef Kadlecsik @ 2006-02-17 18:41 ` Jozsef Kadlecsik 2006-03-04 16:23 ` Hashtrie testing (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) Martin Josefsson 2006-03-04 20:11 ` Hashtrie testing2 " Martin Josefsson 1 sibling, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-02-17 18:41 UTC (permalink / raw) To: Martin Josefsson Cc: Harald Welte, Netfilter Development Mailinglist, Patrick McHardy, Pablo Neira Ayuso, Yasuyuki Kozakai Hi Martin, On Fri, 17 Feb 2006, Jozsef Kadlecsik wrote: > I'm collecting the ideas, so can't submit patches yet ;-). Two more things: - the Jenkins hash internally produces 96 bits, which we could use in the hashtrie - there should be some (whatever artifical) tests to study the dynamical behaviour of the hashtrie: say all the entries are added, then half of them deleted and added back again, but in reverse order. Will the maxdepth grow? Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Hashtrie testing (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-02-17 18:41 ` Jozsef Kadlecsik @ 2006-03-04 16:23 ` Martin Josefsson 2006-03-05 9:49 ` Jozsef Kadlecsik 0 siblings, 1 reply; 40+ messages in thread From: Martin Josefsson @ 2006-03-04 16:23 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 3499 bytes --] On Fri, 2006-02-17 at 19:41 +0100, Jozsef Kadlecsik wrote: (Changing subject and removing some cc's) > Hi Martin, Hi Jozsef > On Fri, 17 Feb 2006, Jozsef Kadlecsik wrote: > > > I'm collecting the ideas, so can't submit patches yet ;-). :) I'll move my svn tree to the netfilter svn sometime soon. > Two more things: > > - the Jenkins hash internally produces 96 bits, which we could > use in the hashtrie That's a great idea, will look into it. > - there should be some (whatever artifical) tests to study the > dynamical behaviour of the hashtrie: say all the entries are > added, then half of them deleted and added back again, but > in reverse order. Will the maxdepth grow? I've performed some simple tests with this and here are the results: First I tried removing the upper half of the conntrack entries (it's the conntrack entries index in the test array that's printed out in the "Removing" line below) Number of entries in hashtrie: 819200 Number of children in hashtrie: 27345 Maxdepth of hashtrie: 3 (0 == root) Removing entries between 204800 and 409599. Adding entries between 409599 and 204800. insert (half reverse): time: 1484 cyc, 929 ns (1076184/s) Number of entries in hashtrie: 819200 Number of children in hashtrie: 27345 Maxdepth of hashtrie: 3 (0 == root) No diffrence at all which is a bit weird, I had to doublecheck the code to see that I really readded the entries in reverse order and I did. Then I tried it with the lower half of the conntrack entries, those lower in the tree and got this: Number of entries in hashtrie: 819200 Number of children in hashtrie: 27273 Maxdepth of hashtrie: 3 (0 == root) Removing entries between 0 and 204799. Adding entries between 204799 and 0. insert (half reverse): time: 1627 cyc, 1018 ns (981869/s) Number of entries in hashtrie: 819200 Number of children in hashtrie: 29189 Maxdepth of hashtrie: 3 (0 == root) Here wee see that the number of child-nodes has increased from 27273 to 29189, that's a 7.0% increase, but the max depth hasn't increased in this particular test. Given the right sitation the maxdepth will probably increase, it's likely to occur sometimes since the number of child-nodes has increased. Then I tried to remove every other entry and then readding them in reverse order (this test is performed after the previous readd of the lower half, not the same testrun as above so the absolute numbers doesn't match those above): Number of entries in hashtrie: 819200 Number of children in hashtrie: 27329 Maxdepth of hashtrie: 3 (0 == root) Removing entries between 0 and 204799. Adding entries between 204799 and 0. insert (half reverse): time: 1590 cyc, 996 ns (1004079/s) Number of entries in hashtrie: 819200 Number of children in hashtrie: 29253 Maxdepth of hashtrie: 3 (0 == root) Removing every other entry from 0 to 409599 Adding every other entry from 409599 to 0 insert (other reverse): time: 1637 cyc, 1025 ns (975239/s) Number of entries in hashtrie: 819200 Number of children in hashtrie: 29376 Maxdepth of hashtrie: 3 (0 == root) Here we see that the number of child-nodes increased in the remove_lower_half_and_readd_in_reverse_order test just as it did before, and then the number of child-nodes increased again after removing every other entry and adding them back in reverse order. The total increase after both tests is 7.5% More testing is clearly needed. -- /Martin [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-04 16:23 ` Hashtrie testing (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) Martin Josefsson @ 2006-03-05 9:49 ` Jozsef Kadlecsik 2006-03-05 13:24 ` Martin Josefsson 0 siblings, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-03-05 9:49 UTC (permalink / raw) To: Martin Josefsson; +Cc: Netfilter Development Mailinglist Hi Martin, On Sat, 4 Mar 2006, Martin Josefsson wrote: > I'll move my svn tree to the netfilter svn sometime soon. That'll be great! > > - there should be some (whatever artifical) tests to study the > > dynamical behaviour of the hashtrie: say all the entries are > > added, then half of them deleted and added back again, but > > in reverse order. Will the maxdepth grow? > > I've performed some simple tests with this and here are the results: > > First I tried removing the upper half of the conntrack entries (it's the > conntrack entries index in the test array that's printed out in the > "Removing" line below) > > Number of entries in hashtrie: 819200 > Number of children in hashtrie: 27345 > Maxdepth of hashtrie: 3 (0 == root) > Removing entries between 204800 and 409599. > Adding entries between 409599 and 204800. > insert (half reverse): time: 1484 cyc, 929 ns (1076184/s) > Number of entries in hashtrie: 819200 > Number of children in hashtrie: 27345 > Maxdepth of hashtrie: 3 (0 == root) > > No diffrence at all which is a bit weird, I had to doublecheck the code > to see that I really readded the entries in reverse order and I did. That's really weird: same maxdepth and exactly the same number of children! > Then I tried it with the lower half of the conntrack entries, those > lower in the tree and got this: > > Number of entries in hashtrie: 819200 > Number of children in hashtrie: 27273 > Maxdepth of hashtrie: 3 (0 == root) > Removing entries between 0 and 204799. > Adding entries between 204799 and 0. > insert (half reverse): time: 1627 cyc, 1018 ns (981869/s) > Number of entries in hashtrie: 819200 > Number of children in hashtrie: 29189 > Maxdepth of hashtrie: 3 (0 == root) > > Here wee see that the number of child-nodes has increased from 27273 to > 29189, that's a 7.0% increase, but the max depth hasn't increased in > this particular test. Given the right sitation the maxdepth will > probably increase, it's likely to occur sometimes since the number of > child-nodes has increased. Half of the entries were removed then re-added and it produced 7% increase in the number of the child nodes. That might be good or bad as well. [...] > More testing is clearly needed. Yes, one is somehow uneasy. Hm. What about filling up the hashtrie and then some long loop of deleting 10% at random and adding *new* random entries? We could measure the change in maxdepth/childnodes after every delete/add cycle. What were the peak numbers? At what numbers would it stabilize? Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-05 9:49 ` Jozsef Kadlecsik @ 2006-03-05 13:24 ` Martin Josefsson 0 siblings, 0 replies; 40+ messages in thread From: Martin Josefsson @ 2006-03-05 13:24 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 3144 bytes --] On Sun, 2006-03-05 at 10:49 +0100, Jozsef Kadlecsik wrote: > Hi Martin, Hi Jozsef > > I'll move my svn tree to the netfilter svn sometime soon. > > That'll be great! After a small struggle with 'svnadmin dump', 'svndumpfilter' and 'sed' it's finally there with all history. trunk/hashtrie Enjoy and don't laugh too much at all the early experimentation :) > > First I tried removing the upper half of the conntrack entries (it's the > > conntrack entries index in the test array that's printed out in the > > "Removing" line below) > > > > Number of entries in hashtrie: 819200 > > Number of children in hashtrie: 27345 > > Maxdepth of hashtrie: 3 (0 == root) > > Removing entries between 204800 and 409599. > > Adding entries between 409599 and 204800. > > insert (half reverse): time: 1484 cyc, 929 ns (1076184/s) > > Number of entries in hashtrie: 819200 > > Number of children in hashtrie: 27345 > > Maxdepth of hashtrie: 3 (0 == root) > > > > No diffrence at all which is a bit weird, I had to doublecheck the code > > to see that I really readded the entries in reverse order and I did. > > That's really weird: same maxdepth and exactly the same number of > children! Yes I thought so too, I even added a bunch of debug printf's to the code to verify that it actually readded them in reverse order, and it did. > > Then I tried it with the lower half of the conntrack entries, those > > lower in the tree and got this: > > > > Number of entries in hashtrie: 819200 > > Number of children in hashtrie: 27273 > > Maxdepth of hashtrie: 3 (0 == root) > > Removing entries between 0 and 204799. > > Adding entries between 204799 and 0. > > insert (half reverse): time: 1627 cyc, 1018 ns (981869/s) > > Number of entries in hashtrie: 819200 > > Number of children in hashtrie: 29189 > > Maxdepth of hashtrie: 3 (0 == root) > > > > Here wee see that the number of child-nodes has increased from 27273 to > > 29189, that's a 7.0% increase, but the max depth hasn't increased in > > this particular test. Given the right sitation the maxdepth will > > probably increase, it's likely to occur sometimes since the number of > > child-nodes has increased. > > Half of the entries were removed then re-added and it produced 7% increase > in the number of the child nodes. That might be good or bad as well. Yes I don't know either :) In these tests both source and destination ipaddresses are random, that's usually not the case in most installations. With fixed dst ip, port and 1024 sourceipaddresses and random src ports it increases by 7.2% > [...] > > More testing is clearly needed. > > Yes, one is somehow uneasy. Hm. What about filling up the hashtrie and > then some long loop of deleting 10% at random and adding *new* random > entries? We could measure the change in maxdepth/childnodes after every > delete/add cycle. What were the peak numbers? At what numbers would it > stabilize? I'll test this as well during the evening. Removing 10% of the entries at random, and then give those entries new random values and readd them. -- /Martin [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-02-17 9:30 ` Jozsef Kadlecsik 2006-02-17 18:41 ` Jozsef Kadlecsik @ 2006-03-04 20:11 ` Martin Josefsson 2006-03-05 11:24 ` Jozsef Kadlecsik 1 sibling, 1 reply; 40+ messages in thread From: Martin Josefsson @ 2006-03-04 20:11 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 4717 bytes --] On Fri, 2006-02-17 at 10:30 +0100, Jozsef Kadlecsik wrote: > Here are the tests I'd like to see against hashtrie: > > - larger hashentry, i.e. when HASHSIFT is equal to 6, 7 or 8 I have seen good results when sizeof(struct hashentry) = 64 on my laptop. But on the other hand my laptop likes the padding of struct hashentry at the beginning with gives unaligned pointers. It's an Pentium M processor in my laptop. > - other HASHALIGN, hashbit_t values: (32, u8), (64, u16) and (64, u32). Yes this needs more testing as well. > The current (32, u8) doesn't look optimal on 64bit CPUs, (64, u16) seems > to be the best, but without testing it's hard to choose. Currently I don't have any 64bit cpus to test things on. > - DoS against hashtrie by non-random tuples: single fixed destination IP, > port and successive source IP, port numbers. (I don't think the current > max 7 levels (childs) can survive such an attack.) I just ran some tests on this with this code: dstip = 192 << 24 | 168 << 16 | 1 << 8 | 1; srcip = 200 << 24 | 100 << 16 | 50 << 8 | 0; conntrack[i]->tuple[IP_CT_DIR_ORIGINAL].src.ip = srcip + rand() % 1024; conntrack[i]->tuple[IP_CT_DIR_ORIGINAL].src.u.tcp.port = (u16)rand(); conntrack[i]->tuple[IP_CT_DIR_ORIGINAL].dst.ip = dstip; conntrack[i]->tuple[IP_CT_DIR_ORIGINAL].dst.u.tcp.port = 80; And this is the result I got: sizeof struct hashentry: 32 sizeof struct ip_conntrack: 48 number of conntrack entries: 1228800 number of slots per hashtrie bucket: 5 number of pad bytes: 3 number of bytes per child: 2048 insert: time: 11299 cyc, 18854 ns (53040/s) Number of failed inserts: 22232 Number of entries in hashtrie: 2435368 Number of children in hashtrie: 238756 Maxdepth of hashtrie: 3 (0 == root) Maximum memory usage: 488972288 2.4 million entries in the hashtrie with fixed dstip, port and 1024 srcip's with random ports and a maxdepth of 3. I'd say the jenkins hash is doing its job quite nicely, wasn't expecting such good results. In the results above you see my main worry regarding hashtrie, the memory usage, 200 bytes per entry in hashtrie (400 bytes per conntrack) in this scenario, which means that the child-nodes aren't very populated in the leaf nodes. 2435368 entries / 238756 child-nodes = 10.2 entries/child-node And there's 2048 / 32 = 64 struct hashentry per child-node. That's 64 * 5 = 320 entries/child-node 10.2 / 320 = 3.2% usage of child-nodes in average which is simply horrible. I'm almost starting to think there's a major bug in there somewhere. I need to write some code to walk the tree and calculate the usage of each level to see that it's actually 100% for all buckets that has a child. > Somehow I don't really like the eviction algorithm. What about some lazy > auto-eviction instead: say, if there are more than 90% of the max > elements, then drop a (any) unassured connection which can be found on the > path when inserting a new one. Thus the current fixed stack could be > eliminated and there were no builtin limit in the depth. Couldn't this lead to the situation where we evict an entry early on the path, and then that slot gets reused for another entry that's also unassured, and it repeats... The problem with eviction in the hashtrie is that the depth of the entry has no correlation to the age of the entry like in the current hashtable (that's only true if you have long linked lists which leads to poor performance, so no real-world installation has properly working evication anyway today so maybe it doesn't matter too much) > > I have an old untested patch against nf_conntrack as well but it needs > > some rewriting of the conntrack locking in order to avoid an SMP deadlock. > > Rusty's lock-ordering is a sure solution, but penalizes the process. > I have been thinking on wether we could use simply, separated, unordered > locking: > > 1. lock bucket according to tuple1 > add element > unlock bucket > 2. lock bucket according to tuple2, > add element > unlock bucket > if (add element failed) > undo 1. Yes this is the way I've been thinking about. If we get into the case where we have conntrack A and B that have the exact same tuples but in reverse it doesn't really matter if one gets dropped or both, this is strictly best effort, no need to bend over backwards in order to minimize racewindows if it isn't anything serious. The only case I can think about where it might matter is the case of simultaneous open from both sides with the same source/destination ports, dns (some clients issue requests from port 53), games and ipsec isakmp come to mind. -- /Martin [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-04 20:11 ` Hashtrie testing2 " Martin Josefsson @ 2006-03-05 11:24 ` Jozsef Kadlecsik 2006-03-05 17:48 ` Martin Josefsson 0 siblings, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-03-05 11:24 UTC (permalink / raw) To: Martin Josefsson; +Cc: Netfilter Development Mailinglist Hi Martin, On Sat, 4 Mar 2006, Martin Josefsson wrote: > On Fri, 2006-02-17 at 10:30 +0100, Jozsef Kadlecsik wrote: > > > Here are the tests I'd like to see against hashtrie: > > > > - larger hashentry, i.e. when HASHSIFT is equal to 6, 7 or 8 > > I have seen good results when sizeof(struct hashentry) = 64 on my > laptop. But on the other hand my laptop likes the padding of struct > hashentry at the beginning with gives unaligned pointers. It's an > Pentium M processor in my laptop. HASHSIFT (i.e. HASHNUM) defines the hash size of the hashtrie. But the hashtrie - according to your tests - are much "wider" than "deep". So there might be no point in widening the hash. > > - other HASHALIGN, hashbit_t values: (32, u8), (64, u16) and (64, u32). > > Yes this needs more testing as well. > > > The current (32, u8) doesn't look optimal on 64bit CPUs, (64, u16) seems > > to be the best, but without testing it's hard to choose. > > Currently I don't have any 64bit cpus to test things on. With the default (32, u8) settings NUMENTRY looks too small on 64bit CPUs: HASHALIGN = 32 hashbits_t = u8 32bit CPU: NUMENTRY = (32 - 4)/(1+4) = 5 PADNUM = 32 - 4 - 5*(1+4) = 3 64bit CPU: NUMENTRY = (32 - 8)/(1+8) = 2 PADNUM = 32 - 8 - 2*(1+8) = 6 HASHALIGN = 64 hashbits_t = u16 32bit CPU: NUMENTRY = (64 - 4)/(2+4) = 10 PADNUM = 64 - 4 - 10*(2+4) = 0 64bit CPU: NUMENTRY = (64 - 8)/(2+8) = 5 PADNUM = 64 - 8 - 5*(2+8) = 6 HASHALIGN = 64 hashbits_t = u32 32bit CPU: NUMENTRY = (64 - 4)/(4+4) = 7 PADNUM = 64 - 4 - 7*(4+4) = 4 64bit CPU: NUMENTRY = (64 - 8)/(4+8) = 4 PADNUM = 64 - 8 - 4*(4+8) = 8 > > - DoS against hashtrie by non-random tuples: single fixed destination IP, > > port and successive source IP, port numbers. (I don't think the current > > max 7 levels (childs) can survive such an attack.) > > I just ran some tests on this with this code: [...] > And this is the result I got: > > sizeof struct hashentry: 32 > sizeof struct ip_conntrack: 48 > number of conntrack entries: 1228800 > number of slots per hashtrie bucket: 5 > number of pad bytes: 3 > number of bytes per child: 2048 > insert: time: 11299 cyc, 18854 ns (53040/s) > Number of failed inserts: 22232 > Number of entries in hashtrie: 2435368 > Number of children in hashtrie: 238756 > Maxdepth of hashtrie: 3 (0 == root) > Maximum memory usage: 488972288 > > 2.4 million entries in the hashtrie with fixed dstip, port and 1024 > srcip's with random ports and a maxdepth of 3. I'd say the jenkins hash > is doing its job quite nicely, wasn't expecting such good results. The hashtrie tends to grow wider and not deeper: level: childs (max) entries (max) 0 64 320 1 4160 20800 2 266304 1331520 3 17043520 85217600 > In the results above you see my main worry regarding hashtrie, the > memory usage, 200 bytes per entry in hashtrie (400 bytes per conntrack) > in this scenario, which means that the child-nodes aren't very populated > in the leaf nodes. > > 2435368 entries / 238756 child-nodes = 10.2 entries/child-node > > And there's 2048 / 32 = 64 struct hashentry per child-node. > That's 64 * 5 = 320 entries/child-node > > 10.2 / 320 = 3.2% usage of child-nodes in average which is simply > horrible. > > I'm almost starting to think there's a major bug in there somewhere. > I need to write some code to walk the tree and calculate the usage of > each level to see that it's actually 100% for all buckets that has a > child. Yes, that could help to spot the problem. But if there are clashes (due to using to small parts of the hash key, i.e hashbits_t = u8), then the node can be fairly low utilized and still there is a need to expand by new childs. > > Somehow I don't really like the eviction algorithm. What about some lazy > > auto-eviction instead: say, if there are more than 90% of the max > > elements, then drop a (any) unassured connection which can be found on the > > path when inserting a new one. Thus the current fixed stack could be > > eliminated and there were no builtin limit in the depth. > > Couldn't this lead to the situation where we evict an entry early on the > path, and then that slot gets reused for another entry that's also > unassured, and it repeats... You mean, we could always evict the newest unassured entries instead of the oldest ones? The algorithm could be extended so that we'd walk the whole tree and register the oldest unassured entry in the path. I'm more worried that we'd check too litle number of entries as the tree is shorter than I thought. > > I have been thinking on wether we could use simply, separated, unordered > > locking: [...] > The only case I can think about where it might matter is the case of > simultaneous open from both sides with the same source/destination > ports, dns (some clients issue requests from port 53), games and ipsec > isakmp come to mind. That can be fixed by checking that we undo the insertion for the same conntrack entry we were unable to add at the second step. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-05 11:24 ` Jozsef Kadlecsik @ 2006-03-05 17:48 ` Martin Josefsson 2006-03-06 13:15 ` Jozsef Kadlecsik 0 siblings, 1 reply; 40+ messages in thread From: Martin Josefsson @ 2006-03-05 17:48 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 7276 bytes --] On Sun, 2006-03-05 at 12:24 +0100, Jozsef Kadlecsik wrote: > Hi Martin, Hi Jozsef > HASHSIFT (i.e. HASHNUM) defines the hash size of the hashtrie. > But the hashtrie - according to your tests - are much "wider" than > "deep". So there might be no point in widening the hash. HASHSHIFT and HASHNUM controls the number of struct hashentry per child. The original idea was to allocate 1 page (4kB) for each child but I got better results with 2kB, which means it isn't as wide as it was from the start. > > > - other HASHALIGN, hashbit_t values: (32, u8), (64, u16) and (64, u32). > > > > Yes this needs more testing as well. > > > > > The current (32, u8) doesn't look optimal on 64bit CPUs, (64, u16) seems > > > to be the best, but without testing it's hard to choose. > > > > Currently I don't have any 64bit cpus to test things on. HASHALIGN is the size of struct hashentry, the idea is that HASHALIGN should be equal to the cacheline size of the cpu it's running on. That way you only get one cachemiss per level in the tree. HASHALIGN is set to 32 bytes just because my laptop has 32byte cachelines, other cpus has 64bytes or even 128bytes. hashbit_t is an u8 because in my tests I havn't see any gain by going to more bits. The only purpose of hashbits is to "cheat" a bit during lookup, store a part of the hash-value in the datastructure so you can easily determine if a member even can be a possible match by comparing the hashbits to the hashvalue you are searching for, this eliminates a lot of cachemisses (without this the datastructure will behave more like a linked list). I've tested with diffrent number of bits for hashbits but I havn't seen any real gain by going over 7 bits which is what is currently used, and the last bit is used as status bit for ASSURED. > With the default (32, u8) settings NUMENTRY looks too small on 64bit > CPUs: Yes it's too small, luckily I don't know of any 64bit cpus that have 32byte cachelines :) Lets say we keep hashbits as an u8 then we get the following results for diffrent cacheline sizes (I believe the intel p4 has 128byte cachelines) HASHALIGN = 32 hashbits_t = u8 32bit CPU: NUMENTRY = (32 - 4)/(1+4) = 5 PADNUM = 32 - 4 - 5*(1+4) = 3 64bit CPU: NUMENTRY = (32 - 8)/(1+8) = 2 PADNUM = 32 - 8 - 2*(1+8) = 6 HASHALIGN = 64 hashbits_t = u8 32bit CPU: NUMENTRY = (64 - 4)/(1 + 4) = 12 PADNUM = 64 - 4 - 12*(1 + 4) = 0 64bit CPU: NUMENTRY = (64 - 8)/(1 + 8) = 6 PADNUM = 64 - 8 - 6*(1 + 8) = 2 HASHALIGN = 128 hashbits_t = u8 32bit CPU: NUMENTRY = (128 - 4)/(1 +4) = 24 PADNUM = 128 - 4 - 24*(1 + 4) = 4 64bit CPU: NUMENTRY = (128 - 8)/(1 + 8) = 13 PADNUM = 128 - 8 - 13*(1 + 8) = 3 When HASHALIGN goes up we need to reduce HASHSHIFT in order to make sure we don't try to allocate > 1 page (usually 4kB) of memory. > > > - DoS against hashtrie by non-random tuples: single fixed destination IP, > > > port and successive source IP, port numbers. (I don't think the current > > > max 7 levels (childs) can survive such an attack.) [snip] > > 2.4 million entries in the hashtrie with fixed dstip, port and 1024 > > srcip's with random ports and a maxdepth of 3. I'd say the jenkins hash > > is doing its job quite nicely, wasn't expecting such good results. > > The hashtrie tends to grow wider and not deeper: > > level: childs (max) entries (max) > 0 64 320 > 1 4160 20800 > 2 266304 1331520 > 3 17043520 85217600 It grows wider if and only if we get hashvalues that aren't too similar in the high bits. If that happens we grow deeper instead of wider. But it seems the jenkins hash is really good at distributing the changes over all of the bits in the hashvalue. (it's just a little bit on the heavy side, it uses plenty of cpu) > > I'm almost starting to think there's a major bug in there somewhere. > > I need to write some code to walk the tree and calculate the usage of > > each level to see that it's actually 100% for all buckets that has a > > child. > > Yes, that could help to spot the problem. But if there are clashes > (due to using to small parts of the hash key, i.e hashbits_t = u8), then > the node can be fairly low utilized and still there is a need to expand by > new childs. Small hashbits_t doesn't have anything to do with it, hashbits_t is just for accelerating lookups. If we have a low number of struct hashentry per child-node we'll probably fill them up faster since there's fewer of them to be filled. And if we have a low NUMENTRY they will also be filled faster. > > Couldn't this lead to the situation where we evict an entry early on the > > path, and then that slot gets reused for another entry that's also > > unassured, and it repeats... > > You mean, we could always evict the newest unassured entries instead of > the oldest ones? The algorithm could be extended so that we'd walk the > whole tree and register the oldest unassured entry in the path. I'm more > worried that we'd check too litle number of entries as the tree is shorter > than I thought. The problem is how do you know which entry along a path is the oldest? Entries aren't always added to the front like with the current linked lists in the hashtable. > > > I have been thinking on wether we could use simply, separated, unordered > > > locking: > [...] > > The only case I can think about where it might matter is the case of > > simultaneous open from both sides with the same source/destination > > ports, dns (some clients issue requests from port 53), games and ipsec > > isakmp come to mind. > > That can be fixed by checking that we undo the insertion for the same > conntrack entry we were unable to add at the second step. What I meant is the following situation: clients A & B are both located behind NAT and are trying to connect to each other. Both bind to the same port, lets say X and try to connect to each others external ip. Currently if the packets from both clients arrive at the same time at an SMP firewall we add a conntrack entry for one of the clients but drop the second one because of the global lock. With Rustys lock ordering we keep this behaviour since both buckets are locked in the same order for both packets. With the "unordered independent locking" of buckets we might end up with: these happen at the same time: packet from client A hashes into bucket C, lock and add entry. packet from client B hashes into bucket D, lock and add entry. and after that these happen at the same time: packet from client A hashes into bucket D, lock and find entry, remove entry in bucket C packet from client B hashes into bucket C, lock and find entry, remove entry in bucket D This leads to no conntrack entry created for any of the clients connection attempts. The solution would be "ordered independent locking" which locks and adds entries based on bucket number that the tuplex hash into. That way we still keep the current behaviour while simplifying the locking, no chance of deadlocks. -- /Martin [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-05 17:48 ` Martin Josefsson @ 2006-03-06 13:15 ` Jozsef Kadlecsik 2006-03-07 18:33 ` Martin Josefsson 0 siblings, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-03-06 13:15 UTC (permalink / raw) To: Martin Josefsson; +Cc: Netfilter Development Mailinglist Hi Martin On Sun, 5 Mar 2006, Martin Josefsson wrote: > HASHSHIFT and HASHNUM controls the number of struct hashentry per child. > The original idea was to allocate 1 page (4kB) for each child but I got > better results with 2kB, which means it isn't as wide as it was from the > start. Did you get faster lookups/insertions/deletions with half of the page size? > HASHALIGN is the size of struct hashentry, the idea is that HASHALIGN > should be equal to the cacheline size of the cpu it's running on. That > way you only get one cachemiss per level in the tree. > HASHALIGN is set to 32 bytes just because my laptop has 32byte > cachelines, other cpus has 64bytes or even 128bytes. As far as I see, the structure should satisfy "contradicting" requirements. We want an as low tree as possible, thus minimising the cost of the operations. That includes the number of levels of the childs and the number of slots in a bucket and the latter depends on the HASHALIGN, hashbits_t parameters and the members of the hashentry structure. So in theory we should maximize HASHSHIFT and minimize NUMENTRY. However we do not want a preallocated, huge, sparse hash either, and therefore it is hard to tell the best values. > hashbit_t is an u8 because in my tests I havn't see any gain by going to > more bits. > The only purpose of hashbits is to "cheat" a bit during lookup, store a > part of the hash-value in the datastructure so you can easily determine > if a member even can be a possible match by comparing the hashbits to > the hashvalue you are searching for, this eliminates a lot of > cachemisses (without this the datastructure will behave more like a > linked list). This structure is quite tricky :-). > > Yes, that could help to spot the problem. But if there are clashes > > (due to using to small parts of the hash key, i.e hashbits_t = u8), then > > the node can be fairly low utilized and still there is a need to expand by > > new childs. > > Small hashbits_t doesn't have anything to do with it, hashbits_t is just > for accelerating lookups. If we have a low number of struct hashentry > per child-node we'll probably fill them up faster since there's fewer of > them to be filled. And if we have a low NUMENTRY they will also be > filled faster. But how can we explain the so few entries in the buckets? Your numbers suggest that most of the buckets are pretty empty, as if there were two slots full from the 64 in the DoS case. Or we can say that the DoS has got an effect: the jenkins hash were unable to produce good enough keys. Or using the 32 bits from the 96 internal ones is not sufficient. But in the non-DoS random pattern case, there was 819200 entries / 27345 child-nodes =~ 29 entries/child-nodes, that's still around 10% utilization. So it looks it's not the jenkins hash which produces the sparse tree. > The problem is how do you know which entry along a path is the oldest? > Entries aren't always added to the front like with the current linked > lists in the hashtable. We could check the timer of the entry: the oldest one is that which would time out nearest in the future. [...] > these happen at the same time: > packet from client A hashes into bucket C, lock and add entry. > packet from client B hashes into bucket D, lock and add entry. > > and after that these happen at the same time: > packet from client A hashes into bucket D, lock and find entry, remove > entry in bucket C > packet from client B hashes into bucket C, lock and find entry, remove > entry in bucket D > > This leads to no conntrack entry created for any of the clients > connection attempts. > > The solution would be "ordered independent locking" which locks and adds > entries based on bucket number that the tuplex hash into. > That way we still keep the current behaviour while simplifying the > locking, no chance of deadlocks. That looks perfect! Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-06 13:15 ` Jozsef Kadlecsik @ 2006-03-07 18:33 ` Martin Josefsson 2006-03-08 6:34 ` Patrick Schaaf ` (2 more replies) 0 siblings, 3 replies; 40+ messages in thread From: Martin Josefsson @ 2006-03-07 18:33 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 5339 bytes --] On Mon, 2006-03-06 at 14:15 +0100, Jozsef Kadlecsik wrote: > Hi Martin Hi Jozsef > > HASHSHIFT and HASHNUM controls the number of struct hashentry per child. > > The original idea was to allocate 1 page (4kB) for each child but I got > > better results with 2kB, which means it isn't as wide as it was from the > > start. > > Did you get faster lookups/insertions/deletions with half of the page > size? Yes I did, but I don't trust my testresults 100% (run on my laptop) because of the fact that my laptop gets better results if I put the padding at the start of struct hashentry than if I put the padding at the end. > As far as I see, the structure should satisfy "contradicting" > requirements. We want an as low tree as possible, thus minimising the cost > of the operations. That includes the number of levels of the childs and > the number of slots in a bucket and the latter depends on the HASHALIGN, > hashbits_t parameters and the members of the hashentry structure. So in > theory we should maximize HASHSHIFT and minimize NUMENTRY. However we do > not want a preallocated, huge, sparse hash either, and therefore it is > hard to tell the best values. Maximizing HASHSHIFT makes us allocate a lot for each node. And minimizing HASHNUM makes struct hashentry smaller, thus giving a _lot_ of buckets in each node. Will having say 200 buckets with 3 entries in each bucket be better than 50 buckets with 12 entries in each bucket? Same number of entries in the node but one child-pointer per bucket, which might lead to even more child-nodes in the case where we have 200 buckets that fill up rather quick since they only hold a small number of entries. But on the other hand we hash to 200 buckets instead of 50 buckets so the total fill-rate of the node is the same for both cases. And having lots of entries in a bucket doesn't really cause a performance degradation as long as you keep the bucket not larger than the cacheline size and it has to be cacheline aligned. Having lots of buckets will give an even flatter tree. I think I've tested this as well about a year ago but I can't remember the results. > This structure is quite tricky :-). Yes it is :) Image how I felt when I got it described to me in conversations on irc. There's been a lot of experimentations and failures along the path and there's still a lot of things to test and rewrite. > > Small hashbits_t doesn't have anything to do with it, hashbits_t is just > > for accelerating lookups. If we have a low number of struct hashentry > > per child-node we'll probably fill them up faster since there's fewer of > > them to be filled. And if we have a low NUMENTRY they will also be > > filled faster. > > But how can we explain the so few entries in the buckets? Your numbers > suggest that most of the buckets are pretty empty, as if there were two > slots full from the 64 in the DoS case. Yes, it's weird. Havn't had time to write the code to traverse the produce stats from the tree yet, who knows, tonight may be the night when that happens :) > Or we can say that the DoS has got an effect: the jenkins hash were > unable to produce good enough keys. Or using the 32 bits from the 96 > internal ones is not sufficient. That might be the case. > But in the non-DoS random pattern case, there was 819200 entries / 27345 > child-nodes =~ 29 entries/child-nodes, that's still around 10% > utilization. So it looks it's not the jenkins hash which produces the > sparse tree. I'll see what I can dig out from the entries in the nodes, it might be very unbalanced. > > The problem is how do you know which entry along a path is the oldest? > > Entries aren't always added to the front like with the current linked > > lists in the hashtable. > > We could check the timer of the entry: the oldest one is that which would > time out nearest in the future. Ouch, this is exactly what we don't want to do. If we are under attack we don't want to start looking at entries to determine which entries to evict, that would result in lots of cachemisses which slows things down a lot. The current eviction algorithm tries to evict entries along the path that the new entry will use, and respect the level of the entries so you don't end up evicting entries from the root-node all the time, those free slots will get filled up with new entries rather quickly and we don't want to evict things we probably just added. But that doesn't say the current eviction algorithm does the right thing, it was just something me and Rusty came up with during the workshop as an aproximation. Fast batch-killing of entries, but try to not evict from the often walked parts of the path too often. It is best-effort, no need to bend over backwards just to do the right thing in all situations, it's not worth it in the end. > > The solution would be "ordered independent locking" which locks and adds > > entries based on bucket number that the tuplex hash into. > > That way we still keep the current behaviour while simplifying the > > locking, no chance of deadlocks. > > That looks perfect! It came to me after writing the description of the "unordered independent locking", it hit me like a punch in the face, it was so simple. -- /Martin [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-07 18:33 ` Martin Josefsson @ 2006-03-08 6:34 ` Patrick Schaaf 2006-03-12 18:49 ` Martin Josefsson 2006-03-30 8:28 ` Hashtrie testing2, dancing trees Amin Azez 2 siblings, 0 replies; 40+ messages in thread From: Patrick Schaaf @ 2006-03-08 6:34 UTC (permalink / raw) To: Martin Josefsson; +Cc: Netfilter Development Mailinglist, Jozsef Kadlecsik > > This structure is quite tricky :-). > > Yes it is :) Image how I felt when I got it described to me in > conversations on irc. Hmm. Evoke this feeling again. :) Now I'll describe an easy way (haha) to get rid of half of the tuples, for the case of no-NAT. I did that before, but if you are now revisiting conntracking basics, maybe it's time to retell: Notice that for the non-NAT case, the two tuples of a connection are fully symmetric in src(ip|port) vs. dst(ip|port). For ease of exposition, imagine each (ip|port) to be an 48-bit number. Given two numbers A and B, a function can easily be implemented: symsort: N x N -> N x N x Z2 given (a, b), if a <= b, result is (a, b, 0) given (a, b), if a > b, result is (b, a, 1) Given the symmetric tuples from both halves of one connection, this function returns the same A/B results in both cases, with a different flip bit (third result). So, when doing the hash lookup, run symsort first, and look for the resulting "sorted" tuple. The flip bit output of symsort needs to be taken into account when classifying original / other destination. Details... One would have a single tuple, instead of two, in struct ip_conntrack. For NATted connections, an additional tuple would be allocated on demand, referenced from ip_conntrack by a pointer, and also put into the hashes. When a non-NAT conntrack gets deleted, only one hash entry needs to be removed, avoiding the need for complex lock ordering schemes in that place. You can always tell you have this case, by looking whether the NAT-tuple-pointer is != 0. Downsides: - complexity - second tuple in separate memory, in the NAT case. Maybe more cache misses in some places Upsides in the non-NAT case: - half the memory for tuples - half the number of hash insertions and deletions with their assorted cache influences best regards Patrick ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-07 18:33 ` Martin Josefsson 2006-03-08 6:34 ` Patrick Schaaf @ 2006-03-12 18:49 ` Martin Josefsson 2006-03-14 11:35 ` Jozsef Kadlecsik 2006-03-23 11:27 ` Jozsef Kadlecsik 2006-03-30 8:28 ` Hashtrie testing2, dancing trees Amin Azez 2 siblings, 2 replies; 40+ messages in thread From: Martin Josefsson @ 2006-03-12 18:49 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 2922 bytes --] On Tue, 2006-03-07 at 19:33 +0100, Martin Josefsson wrote: Hi Jozsef > > But in the non-DoS random pattern case, there was 819200 entries / 27345 > > child-nodes =~ 29 entries/child-nodes, that's still around 10% > > utilization. So it looks it's not the jenkins hash which produces the > > sparse tree. > > I'll see what I can dig out from the entries in the nodes, it might be > very unbalanced. I've added some very simple debug-code and here's what I've found so far when testing with random src/dst ip/port. number of slots per hashtrie bucket: 5 ... Depth 4 - Number of children 0 Depth 4 - Number of buckets 0 Depth 4 - Number of not full buckets 0 Depth 3 - Number of children 0 Depth 3 - Number of buckets 7801856 Depth 3 - Number of entries 291689 (0%) Depth 3 - Number of not full buckets 7801856 Depth 2 - Number of children 121904 Depth 2 - Number of buckets 262144 Depth 2 - Number of entries 1121175 (85%) Depth 2 - Number of not full buckets 99698 Depth 1 - Number of children 4096 Depth 1 - Number of buckets 4096 Depth 1 - Number of entries 20480 (100%) Depth 1 - Number of not full buckets 0 Depth 0 - Number of children 64 Depth 0 - Number of buckets 64 Depth 0 - Number of entries 256 (100%) Depth 0 - Number of not full buckets 0 If we look at depth 2 we can see the following: We have 121904 buckets that are "overfull", thus leading to an expansion. (121904 * 5 + 291689) / (121904 * 5) = 1.48 = 148% And we have 262144 - 121904 - 99698 = 40542 buckets that are 100% full. And we have 99698 not full buckets with a number of entries: 1121175 - (121904 + 40542) * 5 = 308945 entries 308945 / (99698 * 5) = 0.62 = 62% usage Summary: 121904 buckets that are 148% used. 40542 buckets that are 100% used. 99698 buckets that are 62% used. So the jenkins hash doesn't seem to be able to distribute the entries very well. This leads to the allocation of 121904 children for just 291689 entries, that's a huge waste of memory. But even if we are able to come up with a better hash algorithm for this we'll still have this problem when we start expanding deeper, the size of the tree simply explodes. One idea I've played with earlier is to change the number of buckets per child depending on the depth, so we have larger children at the top of the tree and then they get smaller further down. I'll have to revive that code (should be somewhere in svn) in order to say if it helped the memory usage or not, I know it made things a bit slower. I've also tried making the buckets larger and decreasing HASHSHIFT in order to not exceed 4kB allocations, this allows the buckets to absorb the imperfectness of the hashing and memory usage goes down a _lot_. But... this is will simply destroy performance on machines with small cachelines as it will lead to lots of cachemisses per bucket. Any ideas? -- /Martin [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-12 18:49 ` Martin Josefsson @ 2006-03-14 11:35 ` Jozsef Kadlecsik 2006-03-23 11:27 ` Jozsef Kadlecsik 1 sibling, 0 replies; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-03-14 11:35 UTC (permalink / raw) To: Martin Josefsson; +Cc: Netfilter Development Mailinglist Hi Martin, On Sun, 12 Mar 2006, Martin Josefsson wrote: > I've added some very simple debug-code and here's what I've found so far > when testing with random src/dst ip/port. [...] > Summary: > 121904 buckets that are 148% used. > 40542 buckets that are 100% used. > 99698 buckets that are 62% used. > > So the jenkins hash doesn't seem to be able to distribute the entries > very well. I wrote first such words out of frustation, but actually I think the jenkins hash is fine. It was selected after a lot of testing, checking, comparison. There might of course be flaws in it, but > This leads to the allocation of 121904 children for just 291689 entries, > that's a huge waste of memory. But even if we are able to come up with a > better hash algorithm for this we'll still have this problem when we > start expanding deeper, the size of the tree simply explodes. we may simply see the consequences of the inherent behaviour of hashtree *and* the goodness of the jenkins hash: as part of the hash keys clash (they do because HASHSHIFT is relatively small) and as the tree grows, there will be too many empty branches (the whole hash key is pretty unique, there are to *few* near-clashes at the higher levels). But that's an assumption only. > One idea I've played with earlier is to change the number of buckets per > child depending on the depth, so we have larger children at the top of > the tree and then they get smaller further down. I'll have to revive > that code (should be somewhere in svn) in order to say if it helped the > memory usage or not, I know it made things a bit slower. Usually we trade speed for memory. Double hashing does not help really either, because it slows down the lookups too :-( - I learnt it at ipset. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-12 18:49 ` Martin Josefsson 2006-03-14 11:35 ` Jozsef Kadlecsik @ 2006-03-23 11:27 ` Jozsef Kadlecsik 2006-03-23 21:07 ` Martin Josefsson 1 sibling, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-03-23 11:27 UTC (permalink / raw) To: Martin Josefsson; +Cc: Netfilter Development Mailinglist Hi Martin, On Sun, 12 Mar 2006, Martin Josefsson wrote: [...] > I've also tried making the buckets larger and decreasing HASHSHIFT in > order to not exceed 4kB allocations, this allows the buckets to absorb > the imperfectness of the hashing and memory usage goes down a _lot_. > But... this is will simply destroy performance on machines with small > cachelines as it will lead to lots of cachemisses per bucket. > > Any ideas? I took your code and messed with it: added input parameters (hashsize/number of elements), HASHNUM-dependent variants of hashtrie, added Paul Hsiesh's superfasthash function and an implementation of the shared-node fast hash from a SIGCOMM article. Then ran the different methods with increasing number of elements (table is half, full, two and four times more entry than hashsize) and created some graphs by gnuplot. The results are at http://people.netfilter.org/kadlec/hashtrie/ What I could conclude is that hashtable requires the least memory and the fastest at inserting elements. Hashtrie is better in looking up entries when there are more than two times entries in the hash than the hashsize of the hashtable. Something is required to improve hashtrie, so that it'd require less memory. Without losing speed, but possibly gaining, at insertion. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-23 11:27 ` Jozsef Kadlecsik @ 2006-03-23 21:07 ` Martin Josefsson 2006-03-25 8:39 ` Jozsef Kadlecsik 0 siblings, 1 reply; 40+ messages in thread From: Martin Josefsson @ 2006-03-23 21:07 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 3674 bytes --] On Thu, 2006-03-23 at 12:27 +0100, Jozsef Kadlecsik wrote: > Hi Martin, Hi Jozsef > I took your code and messed with it: added input parameters > (hashsize/number of elements), HASHNUM-dependent variants of hashtrie, > added Paul Hsiesh's superfasthash function and an implementation of the > shared-node fast hash from a SIGCOMM article. Then ran the different > methods with increasing number of elements (table is half, full, two and > four times more entry than hashsize) and created some graphs by gnuplot. Wonderful, finally someone playing with it :) > The results are at http://people.netfilter.org/kadlec/hashtrie/ Wow, that's some nice work. Must have taken quite a bit of time to rewrite my crappy code (started out as a single small .c file :) into something useful for testing and then performing all those tests. A lot of your changes should go into svn, you have full freedom to change whatever you want. I see that you have tested with HASHSHIFT 3, 4, 5, 6 and a 32byte cachelinesize. That gives: HASHSHIFT = 3 HASHNUM (buckets per child) = 8 childsize = HASHNUM * bucketsize (cachelinesize) = 256 bytes HASHSHIFT = 4 HASHNUM (buckets per child) = 16 childsize = HASHNUM * bucketsize (cachelinesize) = 512 bytes HASHSHIFT = 5 HASHNUM (buckets per child) = 32 childsize = HASHNUM * bucketsize (cachelinesize) = 1024 bytes HASHSHIFT = 6 HASHNUM (buckets per child) = 64 childsize = HASHNUM * bucketsize (cachelinesize) = 2048 bytes The original idea was to allocate PAGE_SIZE (4096 on i386) bytes at a time so it would be interesting to test HASHSHIFT = 7 (assuming a 32byte cachelinesize) Another idea is to have varible number of buckets (HASHNUM) in the nodes, more buckets at the top of the trie and then decrease the number of buckets as we descend deeper into the trie. I've tried this earlier but with lowered performance, I don't remember anything about the change in memory-usage, and I don't trust my memory so it should be tested again. HASHSHIFT=8 or 9 or even larger at depth 0 would be interesting to test. Allocating more than PAGE_SIZE makes it an order-1 or larger allocation in kernel which is slower and more likely to fail but the the root-node it doesn't matter if it takes a little longer to allocate, and it's usually performed just after bootup when memory isn't very fragmented so it shouldn't fail. > What I could conclude is that hashtable requires the least memory and the > fastest at inserting elements. Hashtrie is better in looking up entries > when there are more than two times entries in the hash than the hashsize > of the hashtable. Yes, this matches the results I've seen as well. I've also run it under the cachegrind module of valgrind and among other things it keeps statistics on the number of instructions executed, and hashtrie executes a lot more instructions than the plain hashtable during lookup, yet hashtrie gets better performance in some cases due to the fact that it gets less cachemisses. > Something is required to improve hashtrie, so that it'd require less > memory. Without losing speed, but possibly gaining, at insertion. Yes, something is needed, right now I don't really see any solution to the memory-usage problem. The problem is less severe with larger cachelinesizes than 32bytes. 64byte cachelines gives a lot lower memory usage, 128bytes like the P4 has decreases the memoryusage even more (you have to decrease HASHSHIFT in order to not allocate too much memory for each node). Now I'm going to think about the other hashtrie email I havn't responded to yet :) -- /Martin [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-23 21:07 ` Martin Josefsson @ 2006-03-25 8:39 ` Jozsef Kadlecsik 2006-03-28 12:26 ` Jozsef Kadlecsik 0 siblings, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-03-25 8:39 UTC (permalink / raw) To: Martin Josefsson; +Cc: Netfilter Development Mailinglist Hi Martin, On Thu, 23 Mar 2006, Martin Josefsson wrote: [...] > A lot of your changes should go into svn, you have full freedom to > change whatever you want. OK, I'll commit it sometime at the weekend. :-) > I see that you have tested with HASHSHIFT 3, 4, 5, 6 and a 32byte > cachelinesize. > That gives: > > HASHSHIFT = 3 > HASHNUM (buckets per child) = 8 > childsize = HASHNUM * bucketsize (cachelinesize) = 256 bytes [...] > The original idea was to allocate PAGE_SIZE (4096 on i386) bytes at a > time so it would be interesting to test HASHSHIFT = 7 (assuming a 32byte > cachelinesize) I just wanted to see the effect of different HASHSHIFT values. But I'll add HASHSHIFT = 7, that's just some new lines. > Another idea is to have varible number of buckets (HASHNUM) in the > nodes, more buckets at the top of the trie and then decrease the number > of buckets as we descend deeper into the trie. I've tried this earlier > but with lowered performance, I don't remember anything about the change > in memory-usage, and I don't trust my memory so it should be tested > again. > > HASHSHIFT=8 or 9 or even larger at depth 0 would be interesting to test. I'm going into that direction: probably even higher HASHSHIF value at depth = 0 was fairer. The slow insert time compared to hashtable may come from the preallocated large hash in hashtable. > Yes, this matches the results I've seen as well. I've also run it under > the cachegrind module of valgrind and among other things it keeps > statistics on the number of instructions executed, and hashtrie executes > a lot more instructions than the plain hashtable during lookup, yet > hashtrie gets better performance in some cases due to the fact that it > gets less cachemisses. I'm thinking on an array-version, where instead of chaining of pointers, basically an array of ip_conntrack_tuple is used in the hashtable to avoid clashing. And IPv6 should be taken into account, too. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2 (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) 2006-03-25 8:39 ` Jozsef Kadlecsik @ 2006-03-28 12:26 ` Jozsef Kadlecsik 0 siblings, 0 replies; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-03-28 12:26 UTC (permalink / raw) To: Martin Josefsson; +Cc: Netfilter Development Mailinglist Hi Martin, On Sat, 25 Mar 2006, Jozsef Kadlecsik wrote: > > A lot of your changes should go into svn, you have full freedom to > > change whatever you want. > > OK, I'll commit it sometime at the weekend. :-) [...] The new version finally is in svn :-) The new results and comments can be found at http://people.netfilter.org/kadlec/hashtrie A short summary: - sfhash does indeed faster than jhash when CPU-dependent optimization is enabled - hastrie with variable hash sizes at the different level can help, surprisingly both in memory and speed. I tested three variants, there may easily be other ones which are even better. - hasharray is very good on the price of some memory. But very good, and simple. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2, dancing trees 2006-03-07 18:33 ` Martin Josefsson 2006-03-08 6:34 ` Patrick Schaaf 2006-03-12 18:49 ` Martin Josefsson @ 2006-03-30 8:28 ` Amin Azez 2006-03-31 18:43 ` Jozsef Kadlecsik 2 siblings, 1 reply; 40+ messages in thread From: Amin Azez @ 2006-03-30 8:28 UTC (permalink / raw) To: Martin Josefsson; +Cc: Netfilter Development Mailinglist Just for interest, and I don't know how well tuned for processor cache it is, but Hans Reisers dancing trees are build to solve some of this problem; dancing trees keep themselves balanced. I realise that filesystem timescales are not the same as packet processing timecales, I merely mention it for completeness. Sam ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Hashtrie testing2, dancing trees 2006-03-30 8:28 ` Hashtrie testing2, dancing trees Amin Azez @ 2006-03-31 18:43 ` Jozsef Kadlecsik 0 siblings, 0 replies; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-03-31 18:43 UTC (permalink / raw) To: Amin Azez; +Cc: Netfilter Development Mailinglist, Martin Josefsson On Thu, 30 Mar 2006, Amin Azez wrote: > Just for interest, and I don't know how well tuned for processor cache > it is, but Hans Reisers dancing trees are build to solve some of this > problem; dancing trees keep themselves balanced. > > I realise that filesystem timescales are not the same as packet > processing timecales, I merely mention it for completeness. I pondered over trees but excluded the whole family as too slow for our purposes. We aim to process packets near at wire speed and that implies the fastest algorithms. A little complexity can have dire consequences - I was quite surprised that the shared node method was so weak compared to the hashes/tries when it required just a little bit more internal maintenance. I believe keeping a tree balanced requires too expensive operations which we cannot afford. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-17 8:18 ` Jozsef Kadlecsik 2006-02-17 8:45 ` Martin Josefsson @ 2006-02-17 8:50 ` Patrick McHardy 1 sibling, 0 replies; 40+ messages in thread From: Patrick McHardy @ 2006-02-17 8:50 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira Ayuso, Yasuyuki Kozakai Jozsef Kadlecsik wrote: > On Thu, 16 Feb 2006, Patrick McHardy wrote: > >>Actually it would still not be unique if connections live >>shorter than a jiffy and are resurrected. > > > I can imagine such a situation only when the conntrack table is full and > we are forced to drop unassured connections - when we are in trouble > anyway. I guess it depends on how fast the everything is. Someone at the workshop made the point that given how fast speed increases, it will probably be a realistic problem at some point. But I was never convinced of the current half-assed approach to a unique ID anyway, its either unique or its not. So in my opinion, we could eliminate it if we find a better solution for the dumping problem. >>>Some clever solution should only be found to spread the expectations >>>over multiple, separatedly locked buckets... >> >>I've talked to Harald about this and as a start we can start by >>allowing masks only for the source part of the tuple. That >>means we can hash by destinations. > > > That's actually fine! Now I should complete that hashtrie/hashtable > testing I have been planning... I wanted to work on the expectation lookup thing soon, I'll let you know once I have something working. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-02-16 20:09 ` Patrick McHardy 2006-02-17 8:18 ` Jozsef Kadlecsik @ 2006-03-30 8:31 ` Amin Azez 2006-03-31 1:11 ` Patrick McHardy 1 sibling, 1 reply; 40+ messages in thread From: Amin Azez @ 2006-03-30 8:31 UTC (permalink / raw) To: Patrick McHardy Cc: Harald Welte, Netfilter Development Mailinglist, Yasuyuki Kozakai, Pablo Neira Ayuso Patrick McHardy wrote: > Jozsef Kadlecsik wrote: > >>On Thu, 16 Feb 2006, Patrick McHardy wrote: >> >> >>>>(jiffies, tuples) would be unique even in that case. >>> >>>Thats true. But what is the advantage over using the counter? > > > Actually it would still not be unique if connections live > shorter than a jiffy and are resurrected. In which case would there be any harm in it not being unqiue, if it were the same connection resurrected, why try to avoid having the same id? Sam ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-03-30 8:31 ` Amin Azez @ 2006-03-31 1:11 ` Patrick McHardy 2006-03-31 18:35 ` Jozsef Kadlecsik 0 siblings, 1 reply; 40+ messages in thread From: Patrick McHardy @ 2006-03-31 1:11 UTC (permalink / raw) To: Amin Azez Cc: Harald Welte, Netfilter Development Mailinglist, Yasuyuki Kozakai, Pablo Neira Ayuso Amin Azez wrote: > Patrick McHardy wrote: > >> Jozsef Kadlecsik wrote: >> >>> On Thu, 16 Feb 2006, Patrick McHardy wrote: >>> >>> >>>>> (jiffies, tuples) would be unique even in that case. >>>> >>>> >>>> Thats true. But what is the advantage over using the counter? >> >> >> >> Actually it would still not be unique if connections live >> shorter than a jiffy and are resurrected. > > > In which case would there be any harm in it not being unqiue, if it were > the same connection resurrected, why try to avoid having the same id? Besides solving an implementation problem, so ID was meant as a long-term unique identifier. For the life-time of a connection the tuples are already a unique identifier. It's amazing, I've never had so many discussions about a single 32 bit field :) ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-03-31 1:11 ` Patrick McHardy @ 2006-03-31 18:35 ` Jozsef Kadlecsik 2006-03-31 18:44 ` Patrick McHardy 0 siblings, 1 reply; 40+ messages in thread From: Jozsef Kadlecsik @ 2006-03-31 18:35 UTC (permalink / raw) To: Patrick McHardy Cc: Harald Welte, Pablo Neira Ayuso, Netfilter Development Mailinglist, Amin Azez, Yasuyuki Kozakai On Fri, 31 Mar 2006, Patrick McHardy wrote: > >> Actually it would still not be unique if connections live > >> shorter than a jiffy and are resurrected. > > > > In which case would there be any harm in it not being unqiue, if it were > > the same connection resurrected, why try to avoid having the same id? > > Besides solving an implementation problem, so ID was meant as a > long-term unique identifier. For the life-time of a connection > the tuples are already a unique identifier. > > It's amazing, I've never had so many discussions about a single > 32 bit field :) It's amazingly hard to accept we need it :-). Putting aside the implementation problem, there are so many information already available in the conntrack entries: tuples, packet and byte counters, state, timeout value. Those together can serve as a short-term unique identifier, even after the entry is actually deleted. Not bulletproof, but in practice those can be sufficient. I could not resist... :-)) Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-03-31 18:35 ` Jozsef Kadlecsik @ 2006-03-31 18:44 ` Patrick McHardy 2006-04-01 19:31 ` Harald Welte 0 siblings, 1 reply; 40+ messages in thread From: Patrick McHardy @ 2006-03-31 18:44 UTC (permalink / raw) To: Jozsef Kadlecsik Cc: Harald Welte, Pablo Neira Ayuso, Netfilter Development Mailinglist, Amin Azez, Yasuyuki Kozakai Jozsef Kadlecsik wrote: > On Fri, 31 Mar 2006, Patrick McHardy wrote: > > >>>>Actually it would still not be unique if connections live >>>>shorter than a jiffy and are resurrected. >>> >>>In which case would there be any harm in it not being unqiue, if it were >>>the same connection resurrected, why try to avoid having the same id? >> >>Besides solving an implementation problem, so ID was meant as a >>long-term unique identifier. For the life-time of a connection >>the tuples are already a unique identifier. >> >>It's amazing, I've never had so many discussions about a single >>32 bit field :) > > > It's amazingly hard to accept we need it :-). > > Putting aside the implementation problem, there are so many information > already available in the conntrack entries: tuples, packet and byte > counters, state, timeout value. Those together can serve as a short-term > unique identifier, even after the entry is actually deleted. Not > bulletproof, but in practice those can be sufficient. > > I could not resist... :-)) Me neither :) I was just pointing out the idea behind the ID, not necessarily condoning it. After discussing it with Rusty in Sevilla, it became clear that we can provide guarantees similar to filesystems without it, which seems to be good enough. So I have absolutely nothing against removing it. In fact, IIRC the last discussion stopped after I had sent a patch to remove it. I need to dig out that patch again :) ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-03-31 18:44 ` Patrick McHardy @ 2006-04-01 19:31 ` Harald Welte 2006-04-06 11:02 ` Patrick McHardy 0 siblings, 1 reply; 40+ messages in thread From: Harald Welte @ 2006-04-01 19:31 UTC (permalink / raw) To: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 1151 bytes --] On Fri, Mar 31, 2006 at 08:44:56PM +0200, Patrick McHardy wrote: > Me neither :) I was just pointing out the idea behind the ID, not > necessarily condoning it. After discussing it with Rusty in Sevilla, > it became clear that we can provide guarantees similar to filesystems > without it, which seems to be good enough. So I have absolutely > nothing against removing it. In fact, IIRC the last discussion > stopped after I had sent a patch to remove it. I need to dig out that > patch again :) I didn't contribute to any of these discussions anymore, since I already made my opinion on this quite clear very early on: I don't have any problem whatsoever with non-uniqueness ;) So if there's now a majority of people who want to delete the ID: Go for it :) -- - Harald Welte <laforge@netfilter.org> http://netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie [-- Attachment #2: Type: application/pgp-signature, Size: 191 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-04-01 19:31 ` Harald Welte @ 2006-04-06 11:02 ` Patrick McHardy 2006-04-11 16:09 ` Amin Azez 0 siblings, 1 reply; 40+ messages in thread From: Patrick McHardy @ 2006-04-06 11:02 UTC (permalink / raw) To: Harald Welte; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 645 bytes --] Harald Welte wrote: > So if there's now a majority of people who want to delete the ID: Go for > it :) Found the patch again. What it does is: - note entry of next conntrack to be dumped and keep a reference to it - when continuing, look for the conntrack and continue at it if its still there - if not, dump the entire bucket again In theory we could end up in an endless loop if the conntrack entry we're keeping the reference to is deleted everytime we want to continue dumping. It shouldn't be triggerable intentionally because of the jenkins hash though. If there are no objections I'll port it to nf_conntrack_netlink and submit it. [-- Attachment #2: x --] [-- Type: text/plain, Size: 1853 bytes --] diff --git a/net/ipv4/netfilter/ip_conntrack_netlink.c b/net/ipv4/netfilter/ip_conntrack_netlink.c index e0b5926..5a1769d 100644 --- a/net/ipv4/netfilter/ip_conntrack_netlink.c +++ b/net/ipv4/netfilter/ip_conntrack_netlink.c @@ -387,38 +387,52 @@ nfattr_failure: static int ctnetlink_done(struct netlink_callback *cb) { DEBUGP("entered %s\n", __FUNCTION__); + if (cb->args[1]) + ip_conntrack_put(cb->args[1]); return 0; } static int ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb) { - struct ip_conntrack *ct = NULL; + struct ip_conntrack *ct; struct ip_conntrack_tuple_hash *h; struct list_head *i; - u_int32_t *id = (u_int32_t *) &cb->args[1]; DEBUGP("entered %s, last bucket=%lu id=%u\n", __FUNCTION__, cb->args[0], *id); read_lock_bh(&ip_conntrack_lock); - for (; cb->args[0] < ip_conntrack_htable_size; cb->args[0]++, *id = 0) { + for (; cb->args[0] < ip_conntrack_htable_size; cb->args[0]++) { +restart: list_for_each_prev(i, &ip_conntrack_hash[cb->args[0]]) { h = (struct ip_conntrack_tuple_hash *) i; if (DIRECTION(h) != IP_CT_DIR_ORIGINAL) continue; ct = tuplehash_to_ctrack(h); - if (ct->id <= *id) - continue; + if (cb->args[1]) { + if (ct == cb->args[1]) { + ip_conntrack_put(cb->args[1]); + cb->args[1] = NULL; + } else + continue; + } if (ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq, IPCTNL_MSG_CT_NEW, - 1, ct) < 0) + 1, ct) < 0) { + nf_conntrack_get(&ct->ct_general); + cb->args[1] = ct; goto out; - *id = ct->id; + } + } + if (cb->args[1]) { + ip_conntrack_put(cb->args[1]); + cb->args[1] = NULL; + goto restart; } } -out: +out: read_unlock_bh(&ip_conntrack_lock); DEBUGP("leaving, last bucket=%lu id=%u\n", cb->args[0], *id); ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-04-06 11:02 ` Patrick McHardy @ 2006-04-11 16:09 ` Amin Azez 2006-04-11 16:17 ` Patrick McHardy 0 siblings, 1 reply; 40+ messages in thread From: Amin Azez @ 2006-04-11 16:09 UTC (permalink / raw) To: Patrick McHardy; +Cc: Netfilter Development Mailinglist Patrick McHardy wrote: > Harald Welte wrote: > >>So if there's now a majority of people who want to delete the ID: Go for >>it :) > > > Found the patch again. What it does is: > > - note entry of next conntrack to be dumped and keep a reference to it > - when continuing, look for the conntrack and continue at it if its > still there > - if not, dump the entire bucket again > > In theory we could end up in an endless loop if the conntrack entry > we're keeping the reference to is deleted everytime we want to > continue dumping. Why not defer the ip_conntrack_put until after ctnetlink_fill_info, would that avoid the problem altogether? It's no longer the entry of the next conntrack but the entry of the last conntrack by the time it gets deleted then...? Or have I misunderstood the way in which the loop would occur? Sam ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 2006-04-11 16:09 ` Amin Azez @ 2006-04-11 16:17 ` Patrick McHardy [not found] ` <443CA579.3030908@ufomechanic.net> 0 siblings, 1 reply; 40+ messages in thread From: Patrick McHardy @ 2006-04-11 16:17 UTC (permalink / raw) To: Amin Azez; +Cc: Netfilter Development Mailinglist Amin Azez wrote: > Patrick McHardy wrote: > >> Found the patch again. What it does is: >> >> - note entry of next conntrack to be dumped and keep a reference to it >> - when continuing, look for the conntrack and continue at it if its >> still there >> - if not, dump the entire bucket again >> >> In theory we could end up in an endless loop if the conntrack entry >> we're keeping the reference to is deleted everytime we want to >> continue dumping. > > > Why not defer the ip_conntrack_put until after ctnetlink_fill_info, > would that avoid the problem altogether? > > It's no longer the entry of the next conntrack but the entry of the last > conntrack by the time it gets deleted then...? Or have I misunderstood > the way in which the loop would occur? Yes, the loop is interrupted when the skb is full and continued when userspace issues the next recvmsg(). The reference is kept to find the point where to continue. ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <443CA579.3030908@ufomechanic.net>]
* Re: [PATCH 4/4] first conntrack ID must be 1 not 2 [not found] ` <443CA579.3030908@ufomechanic.net> @ 2006-04-12 18:30 ` Patrick McHardy 0 siblings, 0 replies; 40+ messages in thread From: Patrick McHardy @ 2006-04-12 18:30 UTC (permalink / raw) To: Amin Azez; +Cc: Netfilter Development Mailinglist Amin Azez wrote: > Patrick McHardy wrote: > >>> It's no longer the entry of the next conntrack but the entry of the last >>> conntrack by the time it gets deleted then...? Or have I misunderstood >>> the way in which the loop would occur? >>> >> >> >> Yes, the loop is interrupted when the skb is full and continued when >> userspace issues the next recvmsg(). The reference is kept to find >> the point where to continue. >> >> > Did you mean "yes, you misunderstood" or "yes, good idea"? Yes, you misunderstood :) > I thought that what would theoretically cause the problem if the > next-conntrack is deleted was the fact that conntrack-put is called > before ctnetlink_fill_info. Its only a marker, once we have found it in the list we don't need that reference anymore. > The problem may not be entirely theoretical if the conntrack is being > deleted by a userspace conntrack client in response to what it reads > when enumerating conntracks. (The conntrack may spring back into > existance as an resurrected established connection - and also may > therefore be likely in the case of a full contrack table where > connections flip-flop their resurrected conntrack entry, > Rather than issue warnings to conntrack developers to avoid certain > kinds of design, it would be better to avoid this. The marker points to the next entry that beed dumped yet, so enumerate-and-delete won't affect it. It is also not the ID of the connection but the pointer to the conntrack entry that is used, so resurrections don't affect it in any way. The problem is in my opinion purely theoretical so far: - the bucket beeing dumped must contain enough entries to exceed the size of a single netlink message (very unlikely) - when continuing to dump, the entry we wanted to dump next must be gone - and there must still be enough entries in the bucket to exceed the skb. For an endless loop we must have a steady refill of the bucket with new conntrack entries, otherwise at some point it won't exceed the skb anymore and we can move on to the next bucket. The jenkins hash should prevent this situation. ^ permalink raw reply [flat|nested] 40+ messages in thread
end of thread, other threads:[~2006-04-12 18:30 UTC | newest]
Thread overview: 40+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-02-13 2:41 [PATCH 4/4] first conntrack ID must be 1 not 2 Pablo Neira Ayuso
2006-02-13 11:20 ` Harald Welte
2006-02-16 8:33 ` Patrick McHardy
2006-02-16 8:47 ` Jozsef Kadlecsik
2006-02-16 9:02 ` Patrick McHardy
2006-02-16 9:11 ` Jozsef Kadlecsik
2006-02-16 9:14 ` Patrick McHardy
2006-02-16 9:36 ` Jozsef Kadlecsik
2006-02-16 20:09 ` Patrick McHardy
2006-02-17 8:18 ` Jozsef Kadlecsik
2006-02-17 8:45 ` Martin Josefsson
2006-02-17 9:30 ` Jozsef Kadlecsik
2006-02-17 18:41 ` Jozsef Kadlecsik
2006-03-04 16:23 ` Hashtrie testing (was: Re: [PATCH 4/4] first conntrack ID must be 1 not 2) Martin Josefsson
2006-03-05 9:49 ` Jozsef Kadlecsik
2006-03-05 13:24 ` Martin Josefsson
2006-03-04 20:11 ` Hashtrie testing2 " Martin Josefsson
2006-03-05 11:24 ` Jozsef Kadlecsik
2006-03-05 17:48 ` Martin Josefsson
2006-03-06 13:15 ` Jozsef Kadlecsik
2006-03-07 18:33 ` Martin Josefsson
2006-03-08 6:34 ` Patrick Schaaf
2006-03-12 18:49 ` Martin Josefsson
2006-03-14 11:35 ` Jozsef Kadlecsik
2006-03-23 11:27 ` Jozsef Kadlecsik
2006-03-23 21:07 ` Martin Josefsson
2006-03-25 8:39 ` Jozsef Kadlecsik
2006-03-28 12:26 ` Jozsef Kadlecsik
2006-03-30 8:28 ` Hashtrie testing2, dancing trees Amin Azez
2006-03-31 18:43 ` Jozsef Kadlecsik
2006-02-17 8:50 ` [PATCH 4/4] first conntrack ID must be 1 not 2 Patrick McHardy
2006-03-30 8:31 ` Amin Azez
2006-03-31 1:11 ` Patrick McHardy
2006-03-31 18:35 ` Jozsef Kadlecsik
2006-03-31 18:44 ` Patrick McHardy
2006-04-01 19:31 ` Harald Welte
2006-04-06 11:02 ` Patrick McHardy
2006-04-11 16:09 ` Amin Azez
2006-04-11 16:17 ` Patrick McHardy
[not found] ` <443CA579.3030908@ufomechanic.net>
2006-04-12 18:30 ` Patrick McHardy
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.