RAM and conntrack performance


* RAM and conntrack performance
@ 2003-10-28 15:10 Herve Eychenne
  2003-11-03  8:12 ` Harald Welte
  0 siblings, 1 reply; 20+ messages in thread
From: Herve Eychenne @ 2003-10-28 15:10 UTC (permalink / raw)
  To: Netfilter Development

 Hi everyone,

Can someone post a state of the art summary for netfilter conntrack
(and maybe NAT) performance tweaking?
The only things I'm currently aware of are:
- modprobe ip_conntrack hashsize=$HASHSIZE
- echo $CONNTRACK_MAX > /proc/sys/net/ipv4/ip_conntrack_max

I think it would be good to end up with a small document which would
give every detail about how to choose optimal values for HASHSIZE and
CONNTRACK_MAX, and every other means of getting the best out of the
conntracking/NAT system...

Here are things I've collected so far, that it would be good to have
in this little document. I have questions, also:
- CONNTRACK_MAX and HASHSIZE get default values at boot time.
  By default, CONNTRACK_MAX = n * 64, where n is the RAM size in MB,
  am I right?
  What about HASHSIZE default value? How to read it at runtime?
  What is the exact link between these 2 values?
- HASHSIZE should be an odd number, and even better: a prime number.
  What happens when you set it to an even number, or a non-prime number?
  Why enable people to set even and non-prime numbers at all?
- Default values are "reasonable" for a typical host, but we may
  increase them on heavily loaded firewalling-only systems, right?
  Which values are the "best"? I.e., can someone give a formula with
  these potential parameters (if pertinent):
  - total RAM size
  - size of the memory that should be left for non-conntrack data in
    the kernel and userspace in general (what is a reasonable value for
    a firewall doing only firewalling with very few applications
    running, and how to measure that at runtime?)
  - number of rules, connections rate, etc.
- CONNTRACK_MAX can be modified at run time with /proc. What does it
  do exactly (when shrunk, when extended)?
  When you modify CONNTRACK_MAX, should you also modify HASHSIZE
  accordingly? Why? How?
- Is it possible to modify HASHSIZE at runtime when ip_conntrack is
  not compiled as a module? If not, shouldn't we enable this with
  /proc, like CONNTRACK_MAX?
- Do any of these operations currently (or possibly, if soon
  implemented) lead to some rehashing at runtime?
  I suppose it would be quite slow... How long does/would it take?
  How to proceed to keep current conntrack entries at runtime as much
  as possible? (I suppose unloading ip_conntrack module and
  reinserting it with another hashsize value clears the table...)

Please comment...

 Herve

-- 
 _
(°=  Hervé Eychenne
//)
v_/_ WallFire project:  http://www.wallfire.org/

* Re: RAM and conntrack performance
  2003-10-28 15:10 RAM and conntrack performance Herve Eychenne
@ 2003-11-03  8:12 ` Harald Welte
  2003-11-25 15:35   ` Herve Eychenne
  0 siblings, 1 reply; 20+ messages in thread
From: Harald Welte @ 2003-11-03  8:12 UTC (permalink / raw)
  To: Herve Eychenne; +Cc: Netfilter Development

On Tue, Oct 28, 2003 at 04:10:32PM +0100, Herve Eychenne wrote:
>  Hi everyone,

Hi Herve!

> 
> Can someone post a state of the art summary for netfilter conntrack
> (and maybe NAT) performance tweaking?
> The only things I'm currently aware of are:
> - modprobe ip_conntrack hashsize=$HASHSIZE
> - echo $CONNTRACK_MAX > /proc/sys/net/ipv4/ip_conntrack_max

you shouldn't need to tweak anything else.  Recent kernels have the
jenkins2b hash instead of our old one, and hash distribution should thus
be more optimal.

> I think it would be good to end up with a small document which would
> give every detail about how to choose optimal values for HASHSIZE and
> CONNTRACK_MAX, and every other mean to get the best out of the
> conntracking/NAT system...

I guess there hasn't been any performance testing.  Ideally you'd have
as many buckets as you have conntrack entries in the system.  However,
every bucket will 
> 
> Here are things I've collected so far, that it would be good to have
> in this little document. I have questions, also:
> - CONNTRACK_MAX and HASHSIZE get default values at boot time.
>   By default, CONNTRACK_MAX = n * 64, where n is the RAM size in MB,
>   am I right?

well, it's true on i386.  See the algorithm below.

>   What about HASHSIZE default value? How to read it at runtime?
>   What is the exact link between these 2 values?

	/* Idea from tcp.c: use 1/16384 of memory.  On i386: 32MB
	 * machine has 256 buckets.  >= 1GB machines have 8192 buckets. */
	if (hashsize) {
		ip_conntrack_htable_size = hashsize;
	} else {
		ip_conntrack_htable_size
			= (((num_physpages << PAGE_SHIFT) / 16384)
			   / sizeof(struct list_head));
		if (num_physpages > (1024 * 1024 * 1024 / PAGE_SIZE))
			ip_conntrack_htable_size = 8192;
		if (ip_conntrack_htable_size < 16)
			ip_conntrack_htable_size = 16;
	}
	ip_conntrack_max = 8 * ip_conntrack_htable_size;

I guess it's hard to describe the algorithm any better in written
language.
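
For readers who want to try numbers without booting a kernel, the default
sizing above can be mirrored in a small userspace C sketch. This is only an
illustration, not the kernel code: default_hashsize(), default_conntrack_max()
and LIST_HEAD_SIZE are made-up names, RAM is passed in bytes instead of being
derived from num_physpages, and sizeof(struct list_head) is assumed to be
8 bytes as on i386.

```c
#include <stdint.h>

/* Assumption: sizeof(struct list_head) on i386 (two 4-byte pointers). */
#define LIST_HEAD_SIZE 8u

/* Userspace mirror of the default sizing: spend 1/16384 of RAM on
 * buckets, use 8192 buckets above 1GB, never go below 16. */
uint32_t default_hashsize(uint64_t ram_bytes)
{
	uint64_t size = (ram_bytes / 16384) / LIST_HEAD_SIZE;

	if (ram_bytes > 1024ULL * 1024 * 1024)
		size = 8192;
	if (size < 16)
		size = 16;
	return (uint32_t)size;
}

/* The default connection limit is eight entries per bucket. */
uint32_t default_conntrack_max(uint64_t ram_bytes)
{
	return 8 * default_hashsize(ram_bytes);
}
```

With 512MB this gives 4096 buckets and a limit of 32768 entries, which agrees
with the "n * 64" rule of thumb (512 * 64 = 32768).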

> - HASHSIZE should be an odd number, and even better: a prime number.
>   What happens when you set it to an even number, or a non-prime number?

hash distribution will be less optimal.

>   Why enable people to set even and non-prime numbers at all?

because we're lazy (and it doesn't cause a malfunction)

> - Default values are "reasonnable" for a typical host, but we may
>   increase them on high-loaded firewalling-only systems, right?

yes.

>   Which values are the "best"? I.e., can someone give a formula with
>   this potential parameters (if pertinent):
>   - total RAM size
>   - size of the memory that should be left for non-conntrack data in
>     the kernel and userspace in general (what is a reasonnable value for
>     a firewall doing only firewalling with very few applications
>     running, and how to measure that at runtime?)
>   - number of rules, connections rate, etc.

This is not a fixed formula. If it was, we could just do it
automatically that way.  In the ideal case, you have a machine _just_
doing packet filtering (i.e. almost no userspace running, at least none
that would have a growing memory consumption like proxies, ...).  Then
you put a decent amount of memory into that box, and use all but 64MB
(or 128MB) for conntrack (which can easily be half a gig of RAM
considering today's memory prices).

size_of_mem_available_for_ct = 
	ip_conntrack_max*sizeof(struct ip_conntrack) +
	hashsize*sizeof(struct list_head)

struct ip_conntrack is about 300 bytes (depending on your compile-time
configuration, see the printout at module load time).  struct list_head
is 2 times the size of a pointer on the respective arch.  on i386 it's 8
bytes total.
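
The memory formula above translates directly into code. The following C
sketch is an illustration under assumptions: conntrack_mem_bytes() is a
made-up helper, and the two structure sizes are passed in as parameters
because the real values depend on the architecture and the compile-time
configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Memory used by a full conntrack table plus its bucket array:
 * conntrack_max * sizeof(struct ip_conntrack)
 *   + hashsize * sizeof(struct list_head) */
uint64_t conntrack_mem_bytes(uint64_t conntrack_max, uint64_t hashsize,
			     uint64_t entry_size, uint64_t list_head_size)
{
	assert(entry_size > 0 && list_head_size > 0);
	return conntrack_max * entry_size + hashsize * list_head_size;
}
```

For example, 32768 entries of 300 bytes plus 4096 buckets of 8 bytes come to
9863168 bytes, a little under 10MB.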

> - CONNTRACK_MAX can be modified at run time with /proc. What does it
>   do exactly (when shinked, when extended)?
>   When you modify CONNTRACK_MAX, should you also modify HASHSIZE
>   accordingly? Why? How?

it increases the counter of maximum allowed conntrack entries.
yes, you should also modify the hash size, since now the average number
of conntrack entries per hash bucket is increasing
(ip_conntrack_max/hashsize in the optimal case) and thus we need to
iterate over more list entries per conntrack lookup.  Having a large
hashsize is not bad at all - it will just occupy 
hashsize*sizeof(struct list_head) bytes of non-swappable kernel memory,
whether you have any connections or not. 
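
The trade-off described here (longer lists per bucket versus memory pinned by
the bucket array) can be put into numbers with a short C sketch. Both helper
names are hypothetical, and the list_head size is a parameter because it
differs between architectures.

```c
#include <assert.h>
#include <stdint.h>

/* Average number of entries per bucket when the table is full. */
double avg_chain_length(uint64_t conntrack_max, uint64_t hashsize)
{
	assert(hashsize > 0);
	return (double)conntrack_max / (double)hashsize;
}

/* Memory pinned by the bucket array alone, whether used or not. */
uint64_t bucket_array_bytes(uint64_t hashsize, uint64_t list_head_size)
{
	return hashsize * list_head_size;
}
```

Raising the limit from 32768 to 131072 while keeping 4096 buckets takes the
average list length from 8 to 32 entries; the bucket array itself stays at
32768 bytes on i386.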

> - Is it possible to modify HASHSIZE at runtime when ip_conntrack is
>   not compiled as a module? If not, shouldn't we enable this with
>   /proc, like CONNTRACK_MAX?

no.  It is non-trivial to change the hash size after we have conntrack
entries in the table.  It would mean we'd need to re-hash all already
existing connections.  

With 2.6.x you should be able to set hashsize at boottime using the new
module parameter stuff (which I haven't yet looked into, sorry).

> - Does any of these operations currently (or possibly, if soon
>   implemented) lead to some rehashing at runtime?

no.  the hash is just initialized with some random values at the time we
receive the first packet.  This is to make the hash function not
guessable from the outside (and thus less likely to be attacked).

>   I suppose it would be quite slow... How long does/would it take?

no idea.

>   How to proceed to keep current conntrack entries at runtime as much
>   as possible? (I suppose unloading ip_conntrack module and
>   reinserting it with another hashsize value clears the table...)

yes.  You just don't do that.  You configure your firewall, and put it
in place.  You should know your network traffic beforehand and configure
it correctly.

>  Herve

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

* Re: RAM and conntrack performance
  2003-11-03  8:12 ` Harald Welte
@ 2003-11-25 15:35   ` Herve Eychenne
  2003-11-25 20:57     ` Harald Welte
  0 siblings, 1 reply; 20+ messages in thread
From: Herve Eychenne @ 2003-11-25 15:35 UTC (permalink / raw)
  To: Harald Welte, Netfilter Development

On Mon, Nov 03, 2003 at 09:12:40AM +0100, Harald Welte wrote:

> On Tue, Oct 28, 2003 at 04:10:32PM +0100, Herve Eychenne wrote:

 Hi!

Thank you very much for your detailed answer, Harald.
Sorry for the delay. I'm currently writing this little document, based mainly
on your answers.

> > I think it would be good to end up with a small document which would
> > give every detail about how to choose optimal values for HASHSIZE and
> > CONNTRACK_MAX, and every other mean to get the best out of the
> > conntracking/NAT system...

> I guess there hasn't been any performance testing.  Ideally you'd have
> as many buckets as you have conntrack entries in the system.  However,
> every bucket will 

Something was lost in space... Will? ;-)

> > Here are things I've collected so far, that it would be good to have
> > in this little document. I have questions, also:
> > - CONNTRACK_MAX and HASHSIZE get default values at boot time.
> >   By default, CONNTRACK_MAX = n * 64, where n is the RAM size in MB,
> >   am I right?

> well, it's true on i386.
> See the algorithm below.

> >   What about HASHSIZE default value? How to read it at runtime?

So, it cannot be read at runtime, I suppose... It would be really nice,
though... would /proc be ok?

> >   What is the exact link between these 2 values?

> 	/* Idea from tcp.c: use 1/16384 of memory.  On i386: 32MB
> 	 * machine has 256 buckets.  >= 1GB machines have 8192 buckets. */
> 	if (hashsize) {
> 		ip_conntrack_htable_size = hashsize;
> 	} else {
> 		ip_conntrack_htable_size
> 			= (((num_physpages << PAGE_SHIFT) / 16384)
> 			   / sizeof(struct list_head));
> 		if (num_physpages > (1024 * 1024 * 1024 / PAGE_SIZE))
> 			ip_conntrack_htable_size = 8192;

We could put a "else" here.
BTW, why this hard limit of 8192? On really high-speed and high-loaded networks,
you may perfectly want to set to an upper value...

> 		if (ip_conntrack_htable_size < 16)
> 			ip_conntrack_htable_size = 16;
> 	}
> 	ip_conntrack_max = 8 * ip_conntrack_htable_size;

> I guess it's hard to describe the algorithm any better in written
> language.

> > - HASHSIZE should be an odd number, and even better: a prime number.
> >   What happens when you set it to an even number, or a non-prime number?

> hash distribution will be less optimal.

But reading the algorithm, hashsize is never automatically set to a
prime number... but an even one. So how do you explain
that I have 4091 (which is probably a prime number, right?) buckets on
my system by default?

> >   Why enable people to set even and non-prime numbers at all?

> because we're lazy (and it doesn't cause a malfunction)

Laziness is the mother of all vices. ;-)

> >   Which values are the "best"? I.e., can someone give a formula with
> >   this potential parameters (if pertinent):
> >   - total RAM size
> >   - size of the memory that should be left for non-conntrack data in
> >     the kernel and userspace in general (what is a reasonnable value for
> >     a firewall doing only firewalling with very few applications
> >     running, and how to measure that at runtime?)
> >   - number of rules, connections rate, etc.

> This is not a fixed formula. If it was, we could just do it
> automatically that way.

No, because we don't know the amount of memory potentially used by
non-conntrack data.

> In the ideal case, you have a machine _just_
> doing packet filtering (i.e. almost no userspace running, at least none
> that would have a growing memory consumption like proxies, ...).  Then
> you put a decent amount of memory into that box, and use all but 64MB
> (or 128MB) for conntrack (which can easily be half a gig of ram
> considering todays memory prices).

> size_of_mem_available_for_ct = 
> ip_conntrack_max*sizeof(struct ip_conntrack) +
> hashsize*sizeof(struct list_head)

> struct ip_conntrack is about 300 bytes (depending on your compile-time
> configuration, see the printout at module load time).  struct list_head
> is 2 times the size of a pointer on the respective arch.  on i386 it's 8
> bytes total.

So on i386,
size_of_mem_available_for_ct =~
300 * ip_conntrack_max + hashsize * 8 =~
300 * ip_conntrack_max + ip_conntrack_max =~
300 * ip_conntrack_max =~
300 * RAM / 16384 =~
RAM / 55 by default
On a firewall-only machine (without proxies), this is not much, as
we could run with
ip_conntrack_max = (RAM - 128MB) / 300

So, on a firewall-only machine with 512MB and 128MB "reserved" for
non-conntrack things (which is really big already for a firewall in
console mode), we could have 40 times more conntrack entries
than the default value without any problem. Interesting.
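
The estimate above can be reproduced with a few lines of C. This is a sketch
under the same simplification as the text (the bucket array is ignored and
only the per-entry size counts); max_entries_for_ram() is a made-up name and
300 bytes per entry is the approximate figure quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Number of conntrack entries that fit in RAM once `reserved_bytes`
 * are set aside for everything else; 0 if nothing is left over. */
uint64_t max_entries_for_ram(uint64_t ram_bytes, uint64_t reserved_bytes,
			     uint64_t entry_size)
{
	assert(entry_size > 0);
	if (ram_bytes <= reserved_bytes)
		return 0;
	return (ram_bytes - reserved_bytes) / entry_size;
}
```

With 512MB of RAM, 128MB reserved and 300 bytes per entry this yields 1342177
entries, about 40 times the default limit of 32768.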

> > - CONNTRACK_MAX can be modified at run time with /proc. What does it
> >   do exactly (when shrinked, when extended)?

You don't really answer my question: what happens when you set
conntrack_max to a number smaller than the number of currently stored
conntrack entries? I suppose conntrack entries are deleted? According to
which criteria?

> >   When you modify CONNTRACK_MAX, should you also modify HASHSIZE
> >   accordingly? Why? How?

> it increases the counter of maximum allowed conntrack entries.
> yes, you should also modify the hash size, since now the average number
> of conntrack entries per hash bucket is increasing
> (ip_conntrack_max/hashsize in the optimal case) and thus we need to
> iterate over more list entries per conntrack lookup.  Having a large
> hashsize is not bad at all - it will just occupy 
> hashzize*sizeof(struct list_head) bytes of non-swappable kernel memory,
> whether you have any connections or not. 

Yes, but globally, if we have 
conntrack_max = 8 * hashsize,
size_of_mem_available_for_ct =~
300 * ip_conntrack_max + hashsize * 8 =~
300 * ip_conntrack_max + ip_conntrack_max =~
300 * ip_conntrack_max

But if we take conntrack_max = hashsize,
size_of_mem_available_for_ct is still around 300 * ip_conntrack_max
(on my system, it is not 300, but exactly 292)
So I simply think that on firewall-only machines with 512MB, we should
simply use conntrack_max = hashsize without any questioning.

Oh, BTW, what happens if hashsize > conntrack_max?
And what happens exactly when the number of active sessions exceeds
conntrack max?

> >   How to proceed to keep current conntrack entries at runtime as much
> >   as possible? (I suppose unloading ip_conntrack module and
> >   reinserting it with another hashsize value clears the table...)

> yes.  You just don't do that.  You configure your firewall, and put it
> in place.  You should know your network traffic beforehand and configure
> it correctly.

That's not always that simple. Suppose you're working for a company for which
availability and performance are critical... and suppose the growing
network traffic forces you to increase your bandwidth by a factor of about 10.
Well, in these sorts of cases, you certainly want to avoid rebooting
(and losing connections) too often, believe me.
Yes, netfilter is sometimes used in these kinds of companies. And
yes, I sometimes happen to do some work for them.
And no, I can hardly give you any names. ;-)

 Herve

-- 
 _
(°=  Hervé Eychenne
//)
v_/_ WallFire project:  http://www.wallfire.org/

* Re: RAM and conntrack performance
  2003-11-25 15:35   ` Herve Eychenne
@ 2003-11-25 20:57     ` Harald Welte
  2003-11-26  3:42       ` RAM and conntrack performance: first draft of the document is online Herve Eychenne
  0 siblings, 1 reply; 20+ messages in thread
From: Harald Welte @ 2003-11-25 20:57 UTC (permalink / raw)
  To: Herve Eychenne; +Cc: Netfilter Development

On Tue, Nov 25, 2003 at 04:35:43PM +0100, Herve Eychenne wrote:
> On Mon, Nov 03, 2003 at 09:12:40AM +0100, Harald Welte wrote:
> 
> > On Tue, Oct 28, 2003 at 04:10:32PM +0100, Herve Eychenne wrote:
> 
>  Hi!
> 
> Thank you very much for your detailled answer, Harald.
> Sorry for the delay. I'm currently writing this little document, based mainly
> on your answers.
> 
> > > I think it would be good to end up with a small document which would
> > > give every detail about how to choose optimal values for HASHSIZE and
> > > CONNTRACK_MAX, and every other mean to get the best out of the
> > > conntracking/NAT system...
> 
> > I guess there hasn't been any performance testing.  Ideally you'd have
> > as many buckets as you have conntrack entries in the system.  However,
> > every bucket will 
> 
> Something was lost in space... Will? ;-)

hm. don't remember what i wanted to say.   oh, yes. every bucket will
occupy some space, whether there are any connections in that bucket or
not.

> > >   What about HASHSIZE default value? How to read it at runtime?
> 
> So, it cannot be read at runtime, I suppose... It would be really nice,
> though... would /proc be ok?

yes. It is printed at startup via syslog, however.

> We could put a "else" here.
> BTW, why this hard limit of 8192? On really high-speed and high-loaded
> networks, you may perfectly want to set to an upper value...

yes, and you can if you do so by hand.  however, just because a system
has loads of ram, it doesn't mean it will actually do lots of
connections... there are people using computers for something else than
firewalling ;)

> > > - HASHSIZE should be an odd number, and even better: a prime number.
> > >   What happens when you set it to an even number, or a non-prime number?
> 
> > hash distribution will be less optimal.
> 
> But reading the algorithm, hashsize is never automatically set to a
> prime number... but an even one. So how do you explain
> that I have 4091 (which is probably a prime number, right?) buckets on
> my system by default?

maybe you're running a different kernel?

> > > - CONNTRACK_MAX can be modified at run time with /proc. What does it
> > >   do exactly (when shrinked, when extended)?
> 
> You don't really answer to my question: what happens when you set
> conntrack_max to a smaller number than the currently stored conntrack
> entries? I suppose conntrack entries are deleted? According to which
> criterias?

no, there are none deleted.  we just skip creating new ones until the
number has dropped below the limit.  There is no special case for that,
we just check >= conntrack_max at conntrack allocation time.
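
The behaviour described here (lowering the limit deletes nothing, new entries
are simply refused while the count is at or above the limit) can be modelled
in a few lines of C. This is a simplified sketch and not the kernel code:
struct ct_table, ct_set_max() and ct_may_allocate() are hypothetical names,
and eviction of old entries is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of the table state. */
struct ct_table {
	unsigned int count;	/* entries currently tracked */
	unsigned int max;	/* configured limit */
};

/* Changing the limit never touches existing entries. */
void ct_set_max(struct ct_table *t, unsigned int new_max)
{
	assert(t != NULL);
	t->max = new_max;
}

/* Allocation-time check: refuse a new entry while count >= max. */
int ct_may_allocate(const struct ct_table *t)
{
	assert(t != NULL);
	return t->count < t->max;
}
```

In this model, a table holding 100 entries keeps all 100 after the limit is
lowered to 50; it only starts accepting new entries again once the count has
dropped below 50 through normal expiry.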

> But if we take conntrack_max = hashsize,
> size_of_mem_available_for_ct is still around 300 * ip_conntrack_max
> (on my system, it is not 300, but exactly 292)
> So I simply think that on firewall-only machines with 512Mo, we should
> simply use conntrack_max = hashsize without any questioning.

yes.  but just because your suse or redhat default packetfilter script
modprobes ip_conntrack, there is no way we can assume that this is a
firewall-only machine.

> Oh, BTW, what happens if hashsize > conntrack_max?

nothing.  you will waste memory by keeping empty buckets.

> And what happens exactly when the number of active sessions exceeds
> conntrack max?

at this time, please read the comments in the code.  we try to evict old
unconfirmed conntracks.

> > yes.  You just don't do that.  You configure your firewall, and put it
> > in place.  You should know your network traffic beforehand and configure
> > it correctly.
> 
> That's not always that simple. Suppose you're working for a company for which
> availability and performance is critical... and suppose the growing
> network traffic forces you to increase your bandwidth by about 10.
> Well, in these sort of cases, you certainly want to avoid to reboot
> (and loose connections) too much, believe me.
> Yes, netfilter is sometimes used in these kind of companies. And
> yes, I sometimes happen to do some missions for them.
> And no, I can hardly give you any names. ;-)

well, patches are welcome ;)

>  Herve
 
-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-25 20:57     ` Harald Welte
@ 2003-11-26  3:42       ` Herve Eychenne
  2003-11-26  4:13         ` Henrik Nordstrom
  2003-11-26 11:36         ` Harald Welte
  0 siblings, 2 replies; 20+ messages in thread
From: Herve Eychenne @ 2003-11-26  3:42 UTC (permalink / raw)
  To: Harald Welte, Netfilter Development

On Tue, Nov 25, 2003 at 09:57:23PM +0100, Harald Welte wrote:

 Hi,

> > Thank you very much for your detailled answer, Harald.
> > Sorry for the delay. I'm currently writing this little document, based mainly
> > on your answers.
> > 
> > > > I think it would be good to end up with a small document which would
> > > > give every detail about how to choose optimal values for HASHSIZE and
> > > > CONNTRACK_MAX, and every other mean to get the best out of the
> > > > conntracking/NAT system...
> > 
> > > I guess there hasn't been any performance testing.  Ideally you'd have
> > > as many buckets as you have conntrack entries in the system.  However,
> > > every bucket will 
> > 
> > Something was lost in space... Will? ;-)

> hm. don't remember what i wanted to say.   oh, yes. every bucket will
> occupy some space, whether there are any connections in that bucket or
> not.

Yes, but that is really negligible (2 * size_of_pointer * HASHSIZE).

> > > >   What about HASHSIZE default value? How to read it at runtime?

> > So, it cannot be read at runtime, I suppose... It would be really nice,
> > though... would /proc be ok?

> yes. It is printed at startup via syslog, however.

Syslog can be enough for humans, but not for scripts...
I think you can add "make hashsize value available through /proc" to
the TODO list (whose size is unfortunately ever growing ;-)).

> > We could put a "else" here.
> > BTW, why this hard limit of 8192? On really high-speed and high-loaded
> > networks, you may perfectly want to set to an upper value...

> yes, and you can if you do so by hand.  however, just because a system
> has loads of ram, it doesn't mean it will actually do lots of
> connections... there are people using computers for something else than
> firewalling ;)

Of course.  That was just a statement for a specific configuration, and
this must be decided by a human being.

> > > > - HASHSIZE should be an odd number, and even better: a prime number.
> > > >   What happens when you set it to an even number, or a non-prime number?

> > > hash distribution will be less optimal.

> > But reading the algorithm, hashsize is never automatically set to a
> > prime number... but an even one. So how do you explain
> > that I have 4091 (which is probably a prime number, right?) buckets on
> > my system by default?

> maybe you're running a different kernel?

Debian standard kernel.  Maybe they are patching netfilter?  These are smart
guys! ;-)

> > > > - CONNTRACK_MAX can be modified at run time with /proc. What does it
> > > >   do exactly (when shrinked, when extended)?
> > 
> > You don't really answer to my question: what happens when you set
> > conntrack_max to a smaller number than the currently stored conntrack
> > entries? I suppose conntrack entries are deleted? According to which
> > criterias?

> no, there are none deleted.  we just skip creating new ones until the
> number has dropped below the limit.  There is no special case for that,
> we just chek >= conntrack_max at conntrack allocation time.

Don't you think it would be good to shrink the lists immediately?
Waiting until the number has dropped below the limit can take days...

> > But if we take conntrack_max = hashsize,
> > size_of_mem_available_for_ct is still around 300 * ip_conntrack_max
> > (on my system, it is not 300, but exactly 292)
> > So I simply think that on firewall-only machines with 512Mo, we should
> > simply use conntrack_max = hashsize without any questioning.

> yes.  but just because your suse or redhat default packetfilter script
> modprobes ip_conntrack, there is no way we can assume that this is a
> firewall-only machine.

Of course.  Once more, I didn't propose that this should be done
automatically, I just wanted to know if someone had any objection to
that statement.

> > > yes.  You just don't do that.  You configure your firewall, and put it
> > > in place.  You should know your network traffic beforehand and configure
> > > it correctly.

> > That's not always that simple. Suppose you're working for a company for which
> > availability and performance is critical... and suppose the growing
> > network traffic forces you to increase your bandwidth by about 10.
> > Well, in these sort of cases, you certainly want to avoid to reboot
> > (and loose connections) too much, believe me.
> > Yes, netfilter is sometimes used in these kind of companies. And
> > yes, I sometimes happen to do some missions for them.
> > And no, I can hardly give you any names. ;-)

> well, patches are welcome ;)

Yet another TODO++...

Oh, I nearly forgot... The little document about conntrack/NAT tuning
is located at http://www.wallfire.org/misc/netfilter_conntrack_perf.txt
for the moment.
Corrections and ideas are welcome.

 Herve

-- 
 _
(°=  Hervé Eychenne
//)
v_/_ WallFire project:  http://www.wallfire.org/

* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-26  3:42       ` RAM and conntrack performance: first draft of the document is online Herve Eychenne
@ 2003-11-26  4:13         ` Henrik Nordstrom
  2003-11-27  4:56           ` Herve Eychenne
  2003-11-26 11:36         ` Harald Welte
  1 sibling, 1 reply; 20+ messages in thread
From: Henrik Nordstrom @ 2003-11-26  4:13 UTC (permalink / raw)
  To: Herve Eychenne; +Cc: Harald Welte, Netfilter Development

On Wed, 26 Nov 2003, Herve Eychenne wrote:

> Debian standard kernel.  Maybe they are patching netfilter?  These are smart
> guys! ;-)

Or maybe you/they have a prime set in modules.conf?

Regards
Henrik

* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-26  4:13         ` Henrik Nordstrom
@ 2003-11-27  4:56           ` Herve Eychenne
  2003-11-28 11:00             ` Willy Tarreau
  0 siblings, 1 reply; 20+ messages in thread
From: Herve Eychenne @ 2003-11-27  4:56 UTC (permalink / raw)
  To: Henrik Nordstrom, Laurence J. Lane; +Cc: Harald Welte, Netfilter Development

On Wed, Nov 26, 2003 at 05:13:49AM +0100, Henrik Nordstrom wrote:

> On Wed, 26 Nov 2003, Herve Eychenne wrote:

> > Debian standard kernel.  Maybe they are patching netfilter?  These are smart
> > guys! ;-)

> Or maybe you/they have a prime set in modules.conf?

I looked at both modules.conf and Debian kernel source (patched, but
netfilter code seems unaffected), and could find nothing that explains
why I have 4091 buckets (which is indeed a prime number, that's cool)
by default instead of 4096 (I have 512MB).

So it's a bit strange.
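
Whether a given bucket count is prime is easy to check by trial division. The
C sketch below is an illustration only; is_prime() and
largest_prime_at_most() are made-up helpers and have nothing to do with how
the kernel picks its value.

```c
#include <assert.h>
#include <stdint.h>

/* Trial division; fine for numbers the size of a bucket count. */
int is_prime(uint32_t n)
{
	if (n < 2)
		return 0;
	for (uint32_t d = 2; (uint64_t)d * d <= n; d++)
		if (n % d == 0)
			return 0;
	return 1;
}

/* Largest prime that does not exceed n (n must be at least 2). */
uint32_t largest_prime_at_most(uint32_t n)
{
	assert(n >= 2);
	while (!is_prime(n))
		n--;
	return n;
}
```

This confirms that 4091 is prime and 4096 is not. Note that rounding 4096 down
to the nearest prime this way would give 4093, not 4091, so a "round to a
prime" step would not by itself explain the value seen in syslog.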

Debian iptables package maintainer TO'ed.

Context: I have a Debian testing (sarge) with kernel-image-2.4.22-1-686
package and 512MB of RAM.  I should then logically get ip_conntrack
module initialized with 4096 buckets (size of the netfilter conntrack hash
table, that should be computed automatically by netfilter code according
to the amount of RAM), but I can read 4091 in syslog message.
4091 is better (prime number), but I cannot understand why I get this value
instead of 4096, as nothing particular is done for the moment in
netfilter code to ensure that the computed value will be a prime number.

For further reading about the subject, you can read
http://www.wallfire.org/misc/netfilter_conntrack_perf.txt

Any ideas, Laurence?

 Herve

-- 
 _
(°=  Hervé Eychenne
//)
v_/_ WallFire project:  http://www.wallfire.org/

* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-27  4:56           ` Herve Eychenne
@ 2003-11-28 11:00             ` Willy Tarreau
  0 siblings, 0 replies; 20+ messages in thread
From: Willy Tarreau @ 2003-11-28 11:00 UTC (permalink / raw)
  To: Herve Eychenne
  Cc: Henrik Nordstrom, Laurence J. Lane, Harald Welte,
	Netfilter Development

On Thu, Nov 27, 2003 at 05:56:20AM +0100, Herve Eychenne wrote:
 
> Context: I have a Debian testing (sarge) with kernel-image-2.4.22-1-686
> package and 512MB of RAM.  I should then logically get ip_conntrack
> module initialized with 4096 buckets (size of the netfilter conntrack hash
> table, that should be computed automatically by netfilter code according
> to the amount of RAM), but I can read 4091 in syslog message.
> 4091 is better (prime number), but I cannot understand why I get this value
> instead of 4096, as nothing particular is done for the moment in
> netfilter code to ensure that the computed value will be a prime number.

Perhaps you have a small portion of this RAM dedicated to a video RAM so
that the amount of system ram is slightly lower than 512 MB (eg: 510 MB).
Then, dividing this would give you something which is not a power of 2.
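
This explanation can be checked by running the default formula backwards. The
C sketch below assumes the i386 value of 8 bytes for sizeof(struct list_head)
and the 1/16384 ratio from the code quoted earlier in the thread;
min_ram_for_buckets() and ram_shortfall() are hypothetical helper names.

```c
#include <stdint.h>

/* Assumption: sizeof(struct list_head) on i386. */
#define LIST_HEAD_SIZE 8u

/* Smallest RAM size (bytes) for which the default formula
 * (ram / 16384) / LIST_HEAD_SIZE yields `buckets`. */
uint64_t min_ram_for_buckets(uint64_t buckets)
{
	return buckets * 16384 * LIST_HEAD_SIZE;
}

/* How far below `nominal_ram` that smallest size lies; 0 if not below. */
uint64_t ram_shortfall(uint64_t nominal_ram, uint64_t buckets)
{
	uint64_t needed = min_ram_for_buckets(buckets);

	return needed < nominal_ram ? nominal_ram - needed : 0;
}
```

For 4091 buckets the result is 655360 bytes: under these assumptions, usable
RAM that falls short of 512MB by more than 512KB and at most 640KB gives
exactly 4091 buckets.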

Cheers,
Willy

* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-26  3:42       ` RAM and conntrack performance: first draft of the document is online Herve Eychenne
  2003-11-26  4:13         ` Henrik Nordstrom
@ 2003-11-26 11:36         ` Harald Welte
  2003-11-26 16:26           ` Patrick McHardy
                             ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Harald Welte @ 2003-11-26 11:36 UTC (permalink / raw)
  To: Herve Eychenne; +Cc: Netfilter Development

On Wed, Nov 26, 2003 at 04:42:32AM +0100, Herve Eychenne wrote:

> Yes, but that is really negligible (2 * size_of_pointer * HASHSIZE).

well, sizeof(void *) is 4 bytes on most archs... two times is 8.  so if
you have let's say 100k buckets, that's 800k non-swappable kernel
memory...

> > > So, it cannot be read at runtime, I suppose... It would be really nice,
> > > though... would /proc be ok?
> 
> > yes. It is printed at startup via syslog, however.
> 
> Syslog can be enough for humans, but not for scripts...
> I think you can add "make hashsize value available through /proc" to
> the TODO list (whose size is unfortunately ever growing ;-)).

i'd rather write a patch than add it to the todo list.  adding and
removing that item from the list would be about the same amount of work,
i guess.

> > maybe you're running a different kernel?
> 
> Debian standard kernel.  Maybe they are patching netfilter?  These are smart
> guys! ;-)

IIRC Debian still has 2.4.18, which had a different hashing algorithm

> > no, there are none deleted.  we just skip creating new ones until the
> > number has dropped below the limit.  There is no special case for that,
> > we just check >= conntrack_max at conntrack allocation time.
> 
> Don't you think it would be good to shrink the lists immediately?
> Waiting til the number has dropped below the limit can take days...

well, it might be a good idea.  but I somehow doubt this is a valid
scenario.  And if we would shrink the list:  how do we select which
entries to evict?  I'd rather wait for ctnetlink to appear in mainstream
kernels and then leave that to a userspace process.
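
The behaviour quoted above (new entries are refused while the count is at
or above the limit, and nothing is deleted when the limit is lowered) can
be sketched as follows. struct ct_table and both function names are
hypothetical stand-ins for illustration, not the actual ip_conntrack code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the global conntrack accounting. */
struct ct_table {
    unsigned int count;  /* entries currently tracked */
    unsigned int max;    /* limit, may be lowered at run time */
};

/* Account for a new entry; refuse if the table is at or over the limit. */
bool ct_try_alloc(struct ct_table *t)
{
    if (t->count >= t->max)
        return false;    /* no eviction: the new connection is simply not tracked */
    t->count++;
    return true;
}

/* Called when an entry times out or is otherwise destroyed. */
void ct_release(struct ct_table *t)
{
    if (t->count > 0)
        t->count--;
}
```

Lowering max below count changes nothing immediately; allocations only
succeed again once enough existing entries have expired.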

> > yes.  but just because your suse or redhat default packetfilter script
> > modprobes ip_conntrack, there is no way we can assume that this is a
> > firewall-only machine.
> 
> Of course.  Once more, I didn't propose that this should be done
> automatically, I just wanted to know if someone had any objection to
> that statement.

ah, ok.

>  Herve
> (°=  Hervé Eychenne

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie


* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-26 11:36         ` Harald Welte
@ 2003-11-26 16:26           ` Patrick McHardy
  2003-11-27 11:10             ` Harald Welte
  2003-11-27  3:33           ` Herve Eychenne
  2003-11-27  4:14           ` [PATCH] Re: hashsize available through /proc was " Herve Eychenne
  2 siblings, 1 reply; 20+ messages in thread
From: Patrick McHardy @ 2003-11-26 16:26 UTC
  To: Harald Welte; +Cc: Herve Eychenne, Netfilter Development

Harald Welte wrote:

>>Don't you think it would be good to shrink the lists immediately?
>>Waiting til the number has dropped below the limit can take days...
>>    
>>
>
>well, it might be a good idea.  but I somehow doubt this is a valid
>scenario.  And if we would shrink the list:  how do we select which
>entries to evict?  I'd rather wait for ctnetlink to appear in mainstream
>kernels and then leave that to a userspace process.
>  
>

PF uses "adaptive timeouts" to scale down timeouts when the table gets full.
IIRC, until some threshold is reached all entries have 100% of their normal
timeouts; from then on they are scaled down, reaching 0% for a completely
full table. I've been thinking about adding this to ip_conntrack for some
time because I often have problems with my roommate's edonkey overflowing
the conntrack table. I decided to experiment with it once the timeout
handling of conntrack is changed from a per-conntrack timer to a global
cleanup timer. I recall someone wanted to make a patch for this some time
ago to prevent timer storms; do you know whether anyone is currently
working on this? Otherwise I might just do both...
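
A minimal sketch of the scaling described above, assuming a linear ramp
between a start threshold and the table limit. The function name and its
parameters are hypothetical; this is neither PF's code nor an existing
ip_conntrack interface.

```c
/*
 * Scale a timeout according to table fill level:
 *   entries <= start          -> 100% of the normal timeout
 *   start < entries < limit   -> linearly reduced
 *   entries >= limit          -> 0
 */
unsigned long adaptive_timeout(unsigned long timeout,
                               unsigned int entries,
                               unsigned int start,
                               unsigned int limit)
{
    if (limit <= start || entries <= start)
        return timeout;              /* below threshold (or bad config) */
    if (entries >= limit)
        return 0;                    /* table completely full */
    /* 64-bit intermediate avoids overflow in the multiplication */
    return (unsigned long)((unsigned long long)timeout *
                           (limit - entries) / (limit - start));
}
```

For example, with a 600 second timeout, a threshold of 6000 entries and a
limit of 10000, a table holding 8000 entries would get 300 seconds.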

Best regards,
Patrick


* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-26 16:26           ` Patrick McHardy
@ 2003-11-27 11:10             ` Harald Welte
  0 siblings, 0 replies; 20+ messages in thread
From: Harald Welte @ 2003-11-27 11:10 UTC
  To: Patrick McHardy; +Cc: Herve Eychenne, Netfilter Development

On Wed, Nov 26, 2003 at 05:26:17PM +0100, Patrick McHardy wrote:
> I recall someone wanted to make a patch for this some time ago to
> prevent timer storms, do you have any information if anyone is currently
> working on this ? Otherwise I might just do both ..

I think somebody had already written such a patch (gandalf?), however
he didn't see any significant performance difference.

> Best regards,
> Patrick

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/

* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-26 11:36         ` Harald Welte
  2003-11-26 16:26           ` Patrick McHardy
@ 2003-11-27  3:33           ` Herve Eychenne
  2003-11-27  9:56             ` Henrik Nordstrom
  2003-11-30 22:25             ` Harald Welte
  2003-11-27  4:14           ` [PATCH] Re: hashsize available through /proc was " Herve Eychenne
  2 siblings, 2 replies; 20+ messages in thread
From: Herve Eychenne @ 2003-11-27  3:33 UTC
  To: Harald Welte, Netfilter Development

On Wed, Nov 26, 2003 at 12:36:45PM +0100, Harald Welte wrote:

> On Wed, Nov 26, 2003 at 04:42:32AM +0100, Herve Eychenne wrote:

> > Yes, but that is really negligible (2 * size_of_pointer * HASHSIZE).

> well, sizof(void *) is 4 bytes on most archs... two times is 8.  so if
> you have let's say 100k buckets, that's 800k non-swappable kernel
> memory...

Which is really not that much when you have 512 MB... (0.0015 %)

> > > maybe you're running a different kernel?

> > Debian standard kernel.  Maybe they are patching netfilter?  These are smart
> > guys! ;-)

> IIRC debian has still 2.4.18 which had a different hashing algorithm

Standard Debian stable, maybe. But you may want to run testing, or
sid, and I run testing (sarge), so I have a 2.4.22.

> > > no, there are none deleted.  we just skip creating new ones until the
> > > number has dropped below the limit.  There is no special case for that,
> > > we just chek >= conntrack_max at conntrack allocation time.

> > Don't you think it would be good to shrink the lists immediately?
> > Waiting til the number has dropped below the limit can take days...

> well, it might be a good idea.  but I somehow doubt this is a valid
> scenario.  And if we would shrink the list:  how do we select which
> entries to evict?

That seems relatively simple to me:
- reduce timeouts on all entries proportionally
- sort the entries by order of importance (state, timeout (time to
  live), protocol (icmp ping/pong, udp, tcp), maybe unprivileged ports
  matter less, etc.). Then evict "bad scores" first.
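
A rough sketch of the second idea, assuming a simple additive score. The
struct, the weights and the function names are all hypothetical and only
meant to show the shape of such a policy, not a proposal for real values.

```c
#include <stdlib.h>

enum ct_proto { CT_ICMP, CT_UDP, CT_TCP };

/* Hypothetical, simplified view of one tracked connection. */
struct ct_entry {
    enum ct_proto proto;
    int established;        /* non-zero once traffic was seen both ways */
    unsigned int ttl;       /* seconds left before the entry times out */
    unsigned short dport;   /* destination port (ignored for ICMP) */
};

/* Higher score = more important. The weights are arbitrary examples. */
int ct_score(const struct ct_entry *e)
{
    int score = 0;

    switch (e->proto) {
    case CT_TCP:  score += 30; break;
    case CT_UDP:  score += 20; break;
    case CT_ICMP: score += 10; break;
    }
    if (e->established)
        score += 40;
    if (e->proto != CT_ICMP && e->dport < 1024)
        score += 10;        /* privileged ports matter a bit more */
    if (e->ttl > 60)
        score += 5;         /* entries about to expire anyway go first */
    return score;
}

static int by_score(const void *a, const void *b)
{
    int sa = ct_score(a), sb = ct_score(b);
    return (sa > sb) - (sa < sb);
}

/* Sort ascending by score, so the first elements are evicted first. */
void ct_sort_for_eviction(struct ct_entry *v, size_t n)
{
    qsort(v, n, sizeof(*v), by_score);
}
```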

> I'd rather wait for ctnetlink to appear in mainstream
> kernels and then leave that to a userspace process.

Maybe such a job (probably happening while network is stressed) is
better done in kernel space? That doesn't seem so complicated...

 Herve

-- 
 _
(°=  Hervé Eychenne
//)
v_/_ WallFire project:  http://www.wallfire.org/


* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-27  3:33           ` Herve Eychenne
@ 2003-11-27  9:56             ` Henrik Nordstrom
  2003-11-30 22:25             ` Harald Welte
  1 sibling, 0 replies; 20+ messages in thread
From: Henrik Nordstrom @ 2003-11-27  9:56 UTC
  To: Herve Eychenne; +Cc: Netfilter Development

On Thu, 27 Nov 2003, Herve Eychenne wrote:

> Which is really not that much when you have 512 MB... (0.0015 %)

Err... 800K of 512M is .15%, which is not insignificant if you do not plan
on using that memory. If each subsystem of the kernel did the same "just in
case", there would be very little memory left for the user.
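
The arithmetic can be checked with a small sketch, assuming 4-byte
pointers and 512 MB = 512 * 1024 * 1024 bytes. Both helper names are
hypothetical.

```c
#include <stddef.h>

/* Memory used by the bucket array: one list head (two pointers) per bucket. */
unsigned long long bucket_bytes(unsigned long long buckets, size_t ptr_size)
{
    return buckets * 2ULL * (unsigned long long)ptr_size;
}

/* Share of total RAM, expressed as a percentage. */
double percent_of_ram(unsigned long long bytes, unsigned long long ram_bytes)
{
    return 100.0 * (double)bytes / (double)ram_bytes;
}
```

With 100000 buckets this gives 800000 bytes, which is about 0.149% of
512 MB. The figure 0.0015 is the same ratio written as a fraction rather
than as a percentage.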

Regards
Henrik


* Re: RAM and conntrack performance: first draft of the document is online
  2003-11-27  3:33           ` Herve Eychenne
  2003-11-27  9:56             ` Henrik Nordstrom
@ 2003-11-30 22:25             ` Harald Welte
  1 sibling, 0 replies; 20+ messages in thread
From: Harald Welte @ 2003-11-30 22:25 UTC
  To: Herve Eychenne; +Cc: Netfilter Development

On Thu, Nov 27, 2003 at 04:33:30AM +0100, Herve Eychenne wrote:

> > well, it might be a good idea.  but I somehow doubt this is a valid
> > scenario.  And if we would shrink the list:  how do we select which
> > entries to evict?
> 
> That seems relatively simple to me:
> - reduce timeouts on every entries proportionally
> - sort the entries by order of importance (state, timeout (time to
>   live), protocol (icmp ping/pong, udp, tcp), maybe unprivileged ports
>   matter less, etc.). Then evict "bad scores" first.

well, but how do you set those 'scores' or the 'importance'?  somebody
running a packet filter in front of an important DNS server will care
more about UDP than somebody else with a large ftp server.  And you
definitely don't want to add a sophisticated user-configurable interface
for this rare case.

I'd rather say we provide a mechanism for userspace:  
1) limiting ip_conntrack_max via sysctl()
2) evicting entries via ctnetlink, based on whatever choice a userspace
   program might want.

> > I'd rather wait for ctnetlink to appear in mainstream
> > kernels and then leave that to a userspace process.
> 
> Maybe such a job (probably happening while network is stressed) is
> better done in kernel space? That doesn't seem so complicated...

mh.  when the network is stressed you want to add additional pressure by
reducing the number of conntracks?  doesn't really sound like a
reasonable thing to me.  Also, ordering and prioritizing the list would
have to be done with a WRITE_LOCK on ip_conntrack_lock... again
something that wouldn't be wise if your network is stressed.

>  Herve
-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/

* [PATCH] Re: hashsize available through /proc was RAM and conntrack performance: first draft of the document is online
  2003-11-26 11:36         ` Harald Welte
  2003-11-26 16:26           ` Patrick McHardy
  2003-11-27  3:33           ` Herve Eychenne
@ 2003-11-27  4:14           ` Herve Eychenne
  2003-11-27 10:09             ` Henrik Nordstrom
  2003-11-27 11:14             ` Harald Welte
  2 siblings, 2 replies; 20+ messages in thread
From: Herve Eychenne @ 2003-11-27  4:14 UTC
  To: Harald Welte, Netfilter Development

[-- Attachment #1: Type: text/plain, Size: 1508 bytes --]

On Wed, Nov 26, 2003 at 12:36:45PM +0100, Harald Welte wrote:

> On Wed, Nov 26, 2003 at 04:42:32AM +0100, Herve Eychenne wrote:

> > > > So, it cannot be read at runtime, I suppose... It would be really nice,
> > > > though... would /proc be ok?
> > 
> > > yes. It is printed at startup via syslog, however.
> > 
> > Syslog can be enough for humans, but not for scripts...
> > I think you can add "make hashsize value available through /proc" to
> > the TODO list (whose size is unfortunately ever growing ;-)).

> i'd rather write a patch than add it to the todo list.  adding and
> removing that item from the list would be about the same amount of work,
> i guess.

I had a quick look at the existing code in ip_conntrack_core.c.

At first I would have been happy to write a small patch, but I'm not really a
kernel guy and the register_sysctl_table API seems _completely crappy_ to
me.

So I took the risk of making a fool of myself in public and wrote something
anyway, but I'm unsure about my patch. I'm especially unsure about the binary
ID of the ctl_table... I took NET_IP_CONNTRACK_MAX + 1 = 2090 because I could
find no occurrence of 2090 used for sysctl in the whole kernel tree... but it
seems crappy and hazardous. Who the hell is in charge of ensuring the
uniqueness of each sysctl binary entry? Where's the list?

I didn't even take the time to compile the attached patch, but with the help
of the gods it will hopefully work.
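
For the "scripts" use case, a minimal userspace sketch of reading such a
value. read_int_file() is a hypothetical helper, and the path
/proc/sys/net/ipv4/ip_conntrack_hashsize is only an assumption: it would
exist if the attached patch were applied as written, next to the existing
ip_conntrack_max entry.

```c
#include <stdio.h>

/*
 * Read a single integer from a text file such as a /proc/sys entry.
 * Returns 0 on success and stores the number in *value, -1 on error.
 */
int read_int_file(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;               /* entry missing: patch not applied? */
    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

A caller would pass the assumed path and fall back to parsing syslog if
the function returns -1.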

 Herve

-- 
 _
(°=  Hervé Eychenne
//)
v_/_ WallFire project:  http://www.wallfire.org/

[-- Attachment #2: ip_conntrack_core.c.patch --]
[-- Type: text/plain, Size: 1016 bytes --]

--- ip_conntrack_core.c.old	2003-11-27 04:59:57.000000000 +0100
+++ ip_conntrack_core.c.new	2003-11-27 05:09:30.000000000 +0100
@@ -1349,15 +1349,23 @@
     SO_ORIGINAL_DST, SO_ORIGINAL_DST+1, &getorigdst,
     0, NULL };
 
+#ifdef CONFIG_SYSCTL
+
 #define NET_IP_CONNTRACK_MAX 2089
 #define NET_IP_CONNTRACK_MAX_NAME "ip_conntrack_max"
 
-#ifdef CONFIG_SYSCTL
+#define NET_IP_CONNTRACK_HASHSIZE 2090
+#define NET_IP_CONNTRACK_HASHSIZE_NAME "ip_conntrack_hashsize"
+
 static struct ctl_table_header *ip_conntrack_sysctl_header;
 
 static ctl_table ip_conntrack_table[] = {
-	{ NET_IP_CONNTRACK_MAX, NET_IP_CONNTRACK_MAX_NAME, &ip_conntrack_max,
-	  sizeof(ip_conntrack_max), 0644,  NULL, proc_dointvec },
+	{ NET_IP_CONNTRACK_MAX, NET_IP_CONNTRACK_MAX_NAME,
+	  &ip_conntrack_max, sizeof(ip_conntrack_max), 0644,
+	  NULL, proc_dointvec },
+	{ NET_IP_CONNTRACK_HASHSIZE, NET_IP_CONNTRACK_HASHSIZE_NAME,
+	  &ip_conntrack_htable_size, sizeof(ip_conntrack_htable_size), 0444,
+	  NULL, proc_dointvec },
  	{ 0 }
 };
 


* Re: [PATCH] Re: hashsize available through /proc was RAM and conntrack performance: first draft of the document is online
  2003-11-27  4:14           ` [PATCH] Re: hashsize available through /proc was " Herve Eychenne
@ 2003-11-27 10:09             ` Henrik Nordstrom
  2003-11-27 10:13               ` Henrik Nordstrom
  2003-11-27 11:38               ` Herve Eychenne
  2003-11-27 11:14             ` Harald Welte
  1 sibling, 2 replies; 20+ messages in thread
From: Henrik Nordstrom @ 2003-11-27 10:09 UTC
  To: Herve Eychenne; +Cc: Harald Welte, Netfilter Development

On Thu, 27 Nov 2003, Herve Eychenne wrote:

> First I would have been happy to write a small patch, but I'm not really a
> kernel guy and register_sysctl_table API seems _completely crappy_ to
> me.

It is not that bad compared to the alternatives..

> So I took the risk of making a fool of myself in public and wrote something, though,
> but I'm unsure about my patch. Especially unsure about the Binary ID of
> the ctl_table... I took NET_IP_CONNTRACK_MAX + 1 = 2090 because I could find
> no occurrence of 2090 used for sysctl in the whole kernel tree... but it
> seems crappy and hazardous. Who the hell is in charge of ensuring the
> uniqueness of each sysctl binary entry? Where's the list?

You also need to make the sysctl read-only.. you cannot change the
ip_conntrack hash size while conntrack is running. Doing so would be
seriously hazardous.

Also, what kernel version did you do this in? Your source does not seem to
match either 2.4.22 or 2.6.0-test10.. in the sources I see, the conntrack
sysctls are all in ip_conntrack_standalone.c, not ip_conntrack_core.c..
(looks like you only got the filename wrong..  the sources seem to match
ip_conntrack_standalone.c even if the header says
ip_conntrack_core.c.old/new)


* Re: [PATCH] Re: hashsize available through /proc was RAM and conntrack performance: first draft of the document is online
  2003-11-27 10:09             ` Henrik Nordstrom
@ 2003-11-27 10:13               ` Henrik Nordstrom
  2003-11-27 11:38               ` Herve Eychenne
  1 sibling, 0 replies; 20+ messages in thread
From: Henrik Nordstrom @ 2003-11-27 10:13 UTC
  To: Herve Eychenne; +Cc: Netfilter Development

On Thu, 27 Nov 2003, Henrik Nordstrom wrote:

> You also need to make the sysctl read-only.. you can not change the 
> ip_conntrack hash size while conntrack is running. If you do there will be 
> serious hazard.

Looking again I see that you did take care of this. Sorry.

Regards
Henrik


* Re: [PATCH] Re: hashsize available through /proc was RAM and conntrack performance: first draft of the document is online
  2003-11-27 10:09             ` Henrik Nordstrom
  2003-11-27 10:13               ` Henrik Nordstrom
@ 2003-11-27 11:38               ` Herve Eychenne
  2003-11-27 11:57                 ` Henrik Nordstrom
  1 sibling, 1 reply; 20+ messages in thread
From: Herve Eychenne @ 2003-11-27 11:38 UTC
  To: Henrik Nordstrom; +Cc: Harald Welte, Netfilter Development

On Thu, Nov 27, 2003 at 11:09:03AM +0100, Henrik Nordstrom wrote:

> On Thu, 27 Nov 2003, Herve Eychenne wrote:

Anyway, Harald had already made a patch for pom without telling,
and Patrick discussed it some hours ago.

> Also what kernel version did you do this in? Your source does not seem to
> match either 2.4.22 or 2.6.0-test10.. in the sources I see the conntrack
> sysctls are all in ip_conntrack_standalone.c not ip_conntrack_core.c..
> (looks like you got the filename wrong only..  the sources seem to match
> ip_conntrack_standalone.c even if the header say 
> ip_conntrack_core.c.new/orig)

I worked on 2.4.22 Debian kernel sources (patched), but in vanilla
kernel 2.4.22, the files you're talking about are identical to the
Debian sources (which contain only a few patches).
And I maintain that my patch is against ip_conntrack_core.c...

This is crazy, as I'm aware that Harald's patch was against
ip_conntrack_standalone.c... but I just downloaded a fresh 2.4.22
kernel from kernel.org, and I can swear that the sysctls are in
ip_conntrack_core.c (and not ip_conntrack_standalone.c).

rv@comet:/usr/src/linux-2.4.22/net/ipv4/netfilter$ grep sysctl *
ip_conntrack_core.c:#include <linux/sysctl.h>
ip_conntrack_core.c:static struct ctl_table_header *ip_conntrack_sysctl_header;
ip_conntrack_core.c:unregister_sysctl_table(ip_conntrack_sysctl_header);
ip_conntrack_core.c:    ip_conntrack_sysctl_header
ip_conntrack_core.c:  = register_sysctl_table(ip_conntrack_root_table, 0);
ip_conntrack_core.c:    if (ip_conntrack_sysctl_header == NULL) {
ip_queue.c:#include <linux/sysctl.h>
ip_queue.c:static int sysctl_maxlen = IPQ_QMAX_DEFAULT;
ip_queue.c:static struct ctl_table_header *ipq_sysctl_header;
ip_queue.c:     { NET_IPQ_QMAX, NET_IPQ_QMAX_NAME, &sysctl_maxlen,
ip_queue.c:  sizeof(sysctl_maxlen), 0644, NULL, proc_dointvec },
ip_queue.c:  ipq_sysctl_header = register_sysctl_table(ipq_root_table, 0);
ip_queue.c:             goto cleanup_sysctl;
ip_queue.c:cleanup_sysctl:
ip_queue.c:     unregister_sysctl_table(ipq_sysctl_header);
ipt_REJECT.c:   /* FIXME: Use sysctl number. --RR */
ipt_ULOG.c: *            nlgroup now global (sysctl)
rv@comet:/usr/src/linux-2.4.22/net/ipv4/netfilter$

Does anyone have an explanation, before I go nuts?

 Herve

-- 
 _
(°=  Hervé Eychenne
//)
v_/_ WallFire project:  http://www.wallfire.org/


* Re: [PATCH] Re: hashsize available through /proc was RAM and conntrack performance: first draft of the document is online
  2003-11-27 11:38               ` Herve Eychenne
@ 2003-11-27 11:57                 ` Henrik Nordstrom
  0 siblings, 0 replies; 20+ messages in thread
From: Henrik Nordstrom @ 2003-11-27 11:57 UTC
  To: Herve Eychenne; +Cc: Netfilter Development

On Thu, 27 Nov 2003, Herve Eychenne wrote:

> I worked on 2.4.22 Debian kernel sources (patched), but in vanilla
> kernel 2.4.22, the files you're talking about are identical to the
> Debian sources (which contain only a few patches).
> And I maintain that my patch is against ip_conntrack_core.c...

Right.. my sources have current patch-o-matic applied.. 
80_ip_conntrack-proc.patch moves these to ip_conntrack_standalone.c

Regards
Henrik


* Re: [PATCH] Re: hashsize available through /proc was RAM and conntrack performance: first draft of the document is online
  2003-11-27  4:14           ` [PATCH] Re: hashsize available through /proc was " Herve Eychenne
  2003-11-27 10:09             ` Henrik Nordstrom
@ 2003-11-27 11:14             ` Harald Welte
  1 sibling, 0 replies; 20+ messages in thread
From: Harald Welte @ 2003-11-27 11:14 UTC
  To: Herve Eychenne; +Cc: Netfilter Development

On Thu, Nov 27, 2003 at 05:14:52AM +0100, Herve Eychenne wrote:
> I had a quick look at the existing code in ip_conntrack_core.c.
> 
> First I would have been happy to write a small patch, but I'm not really a
> kernel guy and register_sysctl_table API seems _completely crappy_ to
> me.

;)

> 
> So I took the risk of making a fool of myself in public and wrote something, though,
> but I'm unsure about my patch. Especially unsure about the Binary ID of
> the ctl_table... I took NET_IP_CONNTRACK_MAX + 1 = 2090 because I could find
> no occurrence of 2090 used for sysctl in the whole kernel tree... but it
> seems crappy and hazardous. Who the hell is in charge of ensuring the
> uniqueness of each sysctl binary entry? Where's the list?

the list is in include/linux/sysctl.h.  And sysctl-by-numbers has been
deprecated anyway; nobody should assume that sysctl via number
is safe anymore...

> So I didn't even take the time to compile the attached patch, but with the help of
> the gods it will hopefully work.

I've written up a patch that ensures it is exported only read-only,
please have a look at
http://cvs.netfilter.org/netfilter/patch-o-matic/pending/76_conntrack_bucket_sysctl.patch

>  Herve

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
