From: Rusty Russell <rusty@rustcorp.com.au>
To: Martin Josefsson <gandalf@wlug.westbo.se>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Rolf Fokkens <fokkensr@fokkensr.vertis.nl>,
Harald Welte <laforge@netfilter.org>,
Netfilter-devel <netfilter-devel@lists.netfilter.org>,
Patrick Schaaf <bof@bof.de>
Subject: Re: PATCH: extra conntrack stats
Date: Thu, 01 May 2003 14:05:52 +1000
Message-ID: <20030501050235.F0DCE2C04F@lists.samba.org>
In-Reply-To: Your message of "01 May 2003 01:05:03 +0200." <1051743903.8213.188.camel@tux.rsn.bth.se>
In message <1051743903.8213.188.camel@tux.rsn.bth.se> you write:
> On Wed, 2003-04-30 at 13:19, Martin Josefsson wrote:
> - Switch to using hlists for the hashtable
OK. Please implement hlist_for_each_entry(), though, a la
list_for_each_entry().
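A minimal userspace sketch of what such a macro could look like, modelled
on list_for_each_entry(). The hlist types below are simplified copies, and
struct demo_entry and demo_sum() are hypothetical names used only for
illustration. __typeof__ is a GNU extension, so this assumes gcc or clang.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified copies of the hlist types: a single-pointer list head. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Recover the containing structure from a pointer to its hlist_node. */
#define hlist_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Iterate over the entries on a chain.
 * tpos: pointer to the containing type, set on each iteration
 * pos:  struct hlist_node cursor
 */
#define hlist_for_each_entry(tpos, pos, head, member)			\
	for ((pos) = (head)->first;					\
	     (pos) &&							\
	     ((tpos) = hlist_entry((pos), __typeof__(*(tpos)), member), 1); \
	     (pos) = (pos)->next)

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first;

	assert(n && h);
	first = h->first;
	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Hypothetical entry type, for illustration only. */
struct demo_entry {
	struct hlist_node node;
	int key;
};

/* Walk one chain and add up the keys found on it. */
static int demo_sum(struct hlist_head *head)
{
	struct demo_entry *e;
	struct hlist_node *pos;
	int sum = 0;

	hlist_for_each_entry(e, pos, head, node)
		sum += e->key;
	return sum;
}
```

The caller only writes the loop body; the node-to-entry conversion is
hidden in the macro, which is the convenience list_for_each_entry()
already gives list_head users.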
> - Rearrange struct ip_conntrack to be more cachefriendly if possible
Leave 'til last, I think, since that structure will be changing
anyway. The most important thing is that the next pointer and the
tuple are at the start (i.e. only one cacheline touched per hash chain
entry).
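The layout constraint can be checked at compile time. This is only a
sketch: the 32-byte line size, the pared-down tuple and every demo_ name
are assumptions, not the real ip_conntrack definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ASSUMED_CACHELINE 32	/* assumption: 32-byte cache lines */

/* Hypothetical, pared-down tuple. */
struct demo_tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint8_t protonum;
};

struct demo_hash_entry {
	struct demo_hash_entry *next;	/* followed on every chain step */
	struct demo_tuple tuple;	/* compared on every chain step */
	unsigned long cold[16];		/* stand-in for everything else */
};

/* Bytes a lookup has to read from each entry it walks past. */
#define DEMO_HOT_BYTES \
	(offsetof(struct demo_hash_entry, tuple) + sizeof(struct demo_tuple))

static_assert(offsetof(struct demo_hash_entry, next) == 0,
	      "chain pointer must come first");
static_assert(DEMO_HOT_BYTES <= ASSUMED_CACHELINE,
	      "next pointer and tuple must share one cache line");
```

The check only holds up at run time if each entry also starts on a
cache-line boundary, which is an allocation concern this sketch leaves
out.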
> - Add prefetching in list-searching
Can be done, but it's a micro-optimization and you'd want to check
carefully that recent gcc versions don't already do this anyway.
> - Turn protocol_list into an array
Or hardcode the three builtins and use the list for others.
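A sketch of that lookup, assuming the three builtins are TCP, UDP and
ICMP. The descriptor struct, the ct_ names and the generic fallback are
hypothetical; only the protocol numbers are real (IANA-assigned).

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PROTO_ICMP	1
#define DEMO_PROTO_TCP	6
#define DEMO_PROTO_UDP	17

/* Hypothetical, pared-down protocol descriptor. */
struct ct_proto {
	unsigned char num;
	const char *name;
	struct ct_proto *next;	/* only used for non-builtin protocols */
};

static struct ct_proto ct_proto_tcp = { DEMO_PROTO_TCP, "tcp", NULL };
static struct ct_proto ct_proto_udp = { DEMO_PROTO_UDP, "udp", NULL };
static struct ct_proto ct_proto_icmp = { DEMO_PROTO_ICMP, "icmp", NULL };
static struct ct_proto ct_proto_generic = { 0, "unknown", NULL };

/* Everything that is not builtin still lives on a list. */
static struct ct_proto *ct_extra_protos;

static void ct_proto_register(struct ct_proto *p)
{
	assert(p != NULL);
	p->next = ct_extra_protos;
	ct_extra_protos = p;
}

static struct ct_proto *ct_find_proto(unsigned char num)
{
	struct ct_proto *p;

	/* Common case: no list walk, no pointer chasing. */
	switch (num) {
	case DEMO_PROTO_TCP:
		return &ct_proto_tcp;
	case DEMO_PROTO_UDP:
		return &ct_proto_udp;
	case DEMO_PROTO_ICMP:
		return &ct_proto_icmp;
	}

	/* Rare case: search the registered extras. */
	for (p = ct_extra_protos; p; p = p->next)
		if (p->num == num)
			return p;

	return &ct_proto_generic;
}
```

Compared with a 256-entry array this costs no extra memory, and the
switch keeps the hot path for the common protocols free of list walks.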
> - Switch to a better hashfunction
Yes! See comments below on secret hashing...
> - Remove pointless timer-updating
We could revisit this altogether: go for one timer which sweeps the
hash chains, with an "expiry" time on each entry. You end up with some
icky deletion issues, but...
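A single-threaded sketch of the sweep idea, with all locking (and hence
the deletion issues) deliberately left out. The table size and the demo_
names are hypothetical; the wrap-safe comparison follows the same idea
as the kernel's time_after().

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BUCKETS 8

struct demo_ct {
	struct demo_ct *next;
	unsigned long expires;	/* absolute expiry time, in ticks */
};

static struct demo_ct *demo_table[DEMO_BUCKETS];

/* True if a is later than b, even across counter wraparound. */
#define demo_time_after(a, b)	((long)((b) - (a)) < 0)

static void demo_insert(struct demo_ct *ct, unsigned int bucket,
			unsigned long expires)
{
	assert(bucket < DEMO_BUCKETS);
	ct->expires = expires;
	ct->next = demo_table[bucket];
	demo_table[bucket] = ct;
}

/*
 * Body of the one periodic timer: unlink every entry whose expiry
 * time has been reached. Returns the number of entries unlinked.
 */
static unsigned int demo_sweep(unsigned long now)
{
	unsigned int i, reaped = 0;

	for (i = 0; i < DEMO_BUCKETS; i++) {
		struct demo_ct **pp = &demo_table[i];

		while (*pp) {
			struct demo_ct *ct = *pp;

			if (!demo_time_after(ct->expires, now)) {
				*pp = ct->next;	/* expired: unlink */
				ct->next = NULL;
				reaped++;
			} else {
				pp = &ct->next;
			}
		}
	}
	return reaped;
}
```

Refreshing a connection becomes a plain store to ct->expires, with no
timer to delete and re-add per packet. The price is that entries outlive
their expiry by up to one sweep interval.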
> - Rework locking to be finer grained, start with per bucket spinlocks
> (goal: RCU?)
Well, might as well go straight to RCU for the infrastructure
locking.
It's a little tricky. RCU can be used for the read side, but since
writes are not uncommon (I always guesstimated 1 in 10), we still
probably want per-chain locks. Hmm, actually, since that bloats the
hash, let's start with one lock and see how it goes.
The protection of the conntrack objects themselves should become a
lock per conntrack I think. We currently use the timer lock as a form
of synchronization: if we have a conntrack.lock we should use that.
> - Remove tcp_lock if using per bucket spinlocks, otherwise move it into
> the entries
Agreed: use conntrack.lock.
> - Remove pointless rehashing of tuples
Hmmm, does this happen?
> - Rework overload support (early_drop)
OK, you've probably seen my "hash with secret key" scheme.
Unfortunately, it relies on being able to grab the network brlock to
stop all activity while it redoes the hash. But Dave is removing the
brlock, so this becomes more tricky (i.e. readers have to be aware that
there could be two hash tables, ick).
It might still be worth it though (this same trick would allow us to
resize the hash through /proc). Needs more thought.
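To illustrate what "hash with secret key" means, here is a sketch of a
keyed tuple hash. It is not the scheme referred to above: the mixer is a
stand-in (the 32-bit finalizer from MurmurHash3), and the tuple layout
and demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, pared-down tuple. */
struct demo_tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint8_t protonum;
};

/* Stand-in mixer; each step is invertible, so no input bits are lost. */
static uint32_t demo_mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}

/* The secret seeds the hash; every tuple field is folded in after it. */
static uint32_t demo_tuple_hash(const struct demo_tuple *t, uint32_t secret)
{
	uint32_t h = demo_mix32(secret ^ t->src_ip);

	h = demo_mix32(h ^ t->dst_ip);
	h = demo_mix32(h ^ (((uint32_t)t->src_port << 16) | t->dst_port));
	h = demo_mix32(h ^ t->protonum);
	return h;
}

static uint32_t demo_bucket(const struct demo_tuple *t, uint32_t secret,
			    uint32_t nbuckets)
{
	assert(nbuckets != 0);
	return demo_tuple_hash(t, secret) % nbuckets;
}
```

Because the bucket depends on a value the sender cannot see, a set of
tuples chosen to collide under one secret is not expected to collide
under another. Changing the secret (or nbuckets) moves every entry,
which is why the whole table has to be rehashed while nothing else is
looking at it.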
> - Avoid as many memory-writes as possible, no need to dirty cachelines if
> we don't have to
Well, yes, but it's usually secondary after correctness.
> - Eliminate listhelp.h and lockhelp.h by request from hch
Yeah, list.h is more sophisticated now, and we have lock debugging.
> - Try to shrink struct ip_conntrack
Ignoring NAT for the moment, and using a 32-bit arch:
    struct nf_conntrack ct_general;
8 bytes: hard to shrink.
    struct ip_conntrack_tuple_hash tuplehash[IP_CT_DIR_MAX];
2 x 28 bytes: we can get rid of one.
    unsigned long status;
8 bytes. Needs to be ulong for bitops.
    struct timer_list timeout;
48 bytes: could be one ulong (time for expiry).
    struct list_head sibling_list;
8 bytes. Hard to remove without getting tricky...
    unsigned int expecting;
4 bytes. Maybe could merge this with upper bits of status, maybe...
    struct ip_conntrack_expect *master;
4 bytes. Needed.
    struct ip_conntrack_helper *helper;
4 bytes. Needed.
    struct nf_ct_info infos[IP_CT_NUMBER];
7 * 4 bytes. We could eliminate this by adding an skb field to hold
the state.
    union ip_conntrack_proto proto;
8 bytes, will get bigger with tcp tracking.
    union ip_conntrack_help help;
16 bytes, could use flags in status for ftp's seq_aft_nl_set and cut
to 8 bytes.
So, we're about 192 bytes. We could get to 80 bytes.
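The two totals can be reproduced from the per-field figures above. The
"target" column is one reading of the suggestions, assuming every one of
them is applied (including the tentative merge of expecting into status)
and that the expiry ulong takes 4 bytes; the struct and function names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct field_size {
	const char *name;
	unsigned int now;	/* bytes quoted above */
	unsigned int target;	/* bytes if the suggestion is applied */
};

static const struct field_size fields[] = {
	{ "ct_general",    8,  8 },	/* hard to shrink */
	{ "tuplehash",    56, 28 },	/* drop one of the two */
	{ "status",        8,  8 },
	{ "timeout",      48,  4 },	/* one ulong expiry time */
	{ "sibling_list",  8,  8 },
	{ "expecting",     4,  0 },	/* merged into status (maybe) */
	{ "master",        4,  4 },
	{ "helper",        4,  4 },
	{ "infos",        28,  0 },	/* state kept in an skb field */
	{ "proto",         8,  8 },
	{ "help",         16,  8 },	/* ftp flags moved into status */
};

#define NFIELDS (sizeof(fields) / sizeof(fields[0]))

static_assert(NFIELDS == 11, "one row per member discussed above");

static unsigned int total_now(void)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < NFIELDS; i++)
		sum += fields[i].now;
	return sum;
}

static unsigned int total_target(void)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < NFIELDS; i++)
		sum += fields[i].target;
	return sum;
}
```

With these rows total_now() comes to 192 and total_target() to 80,
matching the figures in the summary line.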
Cheers,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.