Re: IPtables performance (fwd)

All of lore.kernel.org
 help / color / mirror / Atom feed

* Re: IPtables performance (fwd)
@ 2003-04-22 11:04 Chris Wilson
  2003-04-27 13:02 ` Harald Welte
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2003-04-22 11:04 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Robert Olsson, John McEleney

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3513 bytes --]

Dear Sirs,

We have been having a problem with the performance of iptables in loading
large and complex rulesets for a long time. We have tracked it down to the
mark_source_chains kernel function, which takes about 15 seconds to
execute with a fairly large and complex ruleset on decent hardware.

We discovered Robert Olsson's patch which greatly improves the efficiency
of this function (he claims up to 1000x, we have seen around 80x with our
rulesets). This was last submitted to this list on 6th March 2002
(http://lists.netfilter.org/pipermail/netfilter-devel/2002-March/006994.html).

I cannot find any replies to this message in the archives, and the TODO 
list entry was removed in revision 1.65 with the message "todo update". We 
are really interested in this patch and would like to see it applied.

I have attached Robert's original patch which he has tested extensively,
and also a slightly cleaned-up version which we are using now and which
looks all right to me and seems to work (although I am still not sure
under what circumstances pos could be less than 0 or greater than
newinfo->size).

Please could you let me know whether you would consider including this 
patch in Netfilter.

Cheers, Chris.

[P.S. Harald: Robert sends his regards =)]
-- 
   ___ __     _
 / __// / ,__(_)_  | Chris Wilson -- UNIX Firewall Lead Developer |
/ (_ / ,\/ _/ /_ \ | NetServers.co.uk http://www.netservers.co.uk |
\ _//_/_/_//_/___/ | 21 Signet Court, Cambridge, UK. 01223 576516 |

---------- Forwarded message ----------
Date: 17 Apr 2003 15:42:15 +0200
From: Robert Olsson <robban@robtex.com>
To: Chris Wilson <chris@netservers.co.uk>
Subject: Re: IPtables performance

On Thu, 2003-04-17 at 12:13, Chris Wilson wrote:
> Dear Mr Olsson,
> 
> I read with interest about your patch for improving the rule load 
> performance of iptables by changing the algorithm used by 
> mark_source_chains. We are having exactly the same problem and would like 
> to try your patch.
> 
> However the latest version I can find is at 
> http://lists.netfilter.org/pipermail/netfilter-devel/2002-March/006994.html, 
> dated 6th March 2002. Since this has been HTML-ified, it no longer applies 
> as a working patch. I'm going to work on reintegrating it with the latest 
> iptables patch-o-matic, but first I wanted to check whether you were:
> 
> - aware of any problems with that version of the patch (since I couldn't 
>   find much discussion about it)

attached last version. problem with prior versions was that it didn't
care much about different chains (INPUT/FORWARD etc).
current version has been in production at our main firewall (we are an
ISP) with 3000 rules or something. really interlinked, takes a month to
insert without my patch :)

> - still interested in, developing or testing the patch (and how much has 
>   it been tested?)
> - happy for the code to be used under GPL

i would be happy if the code was used in netfilter

> - happy for us to resubmit the updated patch to the Netfilter developers 
>   (with your name still in the credits, of course)

sure you have my blessing. give my regards to harald :).
i hate having to patch each new version myself.

> 
> Look forward to hearing from you soon,

please keep me informed what happens

> 
> Cheers, Chris.
> -- 
>    ___ __     _
>  / __// / ,__(_)_  | Chris Wilson -- UNIX Firewall Lead Developer |
> / (_ / ,\/ _/ /_ \ | NetServers.co.uk http://www.netservers.co.uk |
> \ _//_/_/_//_/___/ | 21 Signet Court, Cambridge, UK. 01223 576516 |
> 


[-- Attachment #2: cleaned-up patch --]
[-- Type: TEXT/PLAIN, Size: 1698 bytes --]

This is my updated version of Robert Olsson's patch, against 2.4.20
with patch-o-matic-20030107.

<chris@netservers.co.uk> 17/04/03

--- linux/net/ipv4/netfilter/ip_tables.c	Wed Apr 16 18:44:45 2003
+++ linux-nfspeed/net/ipv4/netfilter/ip_tables.c	Thu Apr 17 11:36:34 2003
@@ -8,4 +8,9 @@
  * 	- increase module usage count as soon as we have rules inside
  * 	  a table
+ *
+ *  6 Mar 2002 Robert Olsson <robban@robtex.com>
+ * 17 Apr 2003 Chris  Wilson <chris@netservers.co.uk>
+ *     - mark_source_chains speedup for complex chains
+ *
  */
 #include <linux/config.h>
@@ -503,4 +508,7 @@
 	unsigned int hook;
 
+	/* keep track of where we have been: */
+	unsigned char *been = vmalloc(newinfo->size);
+
 	/* No recursion; use packet counter to save back ptrs (reset
 	   to 0 as we leave), and comefrom to save source hook bitmask */
@@ -515,4 +523,5 @@
 		/* Set initial back pointer. */
 		e->counters.pcnt = pos;
+		memset(been, 0, newinfo->size);
 
 		for (;;) {
@@ -523,4 +532,5 @@
 				printk("iptables: loop hook %u pos %u %08X.\n",
 				       hook, pos, e->comefrom);
+				vfree(been);
 				return 0;
 			}
@@ -570,8 +580,12 @@
 				int newpos = t->verdict;
 
-				if (strcmp(t->target.u.user.name,
+				if ( (pos < 0 || pos >= newinfo->size
+				      || !been[pos]) 
+				    && strcmp(t->target.u.user.name,
 					   IPT_STANDARD_TARGET) == 0
 				    && newpos >= 0) {
 					/* This a jump; chase it. */
+					if (pos >= 0 && pos < newinfo->size)
+						been[pos]++;
 					duprintf("Jump rule %u -> %u\n",
 						 pos, newpos);
@@ -589,4 +603,5 @@
 		duprintf("Finished chain %u\n", hook);
 	}
+	vfree(been);
 	return 1;
 }

[-- Attachment #3: original patch --]
[-- Type: TEXT/PLAIN, Size: 1828 bytes --]

--- oldlinux/net/ipv4/netfilter/ip_tables.c	Sat Mar  2 23:24:52 2002
+++ linux/net/ipv4/netfilter/ip_tables.c	Sun Mar  3 00:25:27 2002
@@ -7,6 +7,10 @@
  * 19 Jan 2002 Harald Welte <laforge@gnumonks.org>
  * 	- increase module usage count as soon as we have rules inside
  * 	  a table
+ *
+ * 2 Mar 2002 Robert Olsson <robban@robtex.com>
+ *	- mark_source_chains speedup for complex chains
+ *     
  */
 #include <linux/config.h>
 #include <linux/skbuff.h>
@@ -500,10 +504,12 @@
 mark_source_chains(struct ipt_table_info *newinfo, unsigned int valid_hooks)
 {
 	unsigned int hook;
-
+	/* keep track of where we have been: */
+	unsigned char *been=vmalloc(newinfo->size);
 	/* No recursion; use packet counter to save back ptrs (reset
 	   to 0 as we leave), and comefrom to save source hook bitmask */
 	for (hook = 0; hook < NF_IP_NUMHOOKS; hook++) {
+		memset(been,0,newinfo->size);
 		unsigned int pos = newinfo->hook_entry[hook];
 		struct ipt_entry *e
 			= (struct ipt_entry *)(newinfo->entries + pos);
@@ -521,6 +527,7 @@
 			if (e->comefrom & (1 << NF_IP_NUMHOOKS)) {
 				printk("iptables: loop hook %u pos %u %08X.\n",
 				       hook, pos, e->comefrom);
+				vfree(been);
 				return 0;
 			}
 			e->comefrom
@@ -567,11 +574,12 @@
 				pos += size;
 			} else {
 				int newpos = t->verdict;
-
-				if (strcmp(t->target.u.user.name,
+				if (((0<pos&&pos<newinfo->size)?(been[pos]<1):1)
+					&&strcmp(t->target.u.user.name,
 					   IPT_STANDARD_TARGET) == 0
 				    && newpos >= 0) {
 					/* This a jump; chase it. */
+					if (0<pos&&newpos<newinfo->size) been[pos]++;
 					duprintf("Jump rule %u -> %u\n",
 						 pos, newpos);
 				} else {
@@ -587,6 +595,7 @@
 		next:
 		duprintf("Finished chain %u\n", hook);
 	}
+	vfree(been);
 	return 1;
 }
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-04-22 11:04 IPtables performance (fwd) Chris Wilson
@ 2003-04-27 13:02 ` Harald Welte
  2003-04-27 14:57   ` Robert Olsson
  2003-04-28 12:25   ` Chris Wilson
  0 siblings, 2 replies; 15+ messages in thread
From: Harald Welte @ 2003-04-27 13:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: netfilter-devel, Robert Olsson, John McEleney

[-- Attachment #1: Type: text/plain, Size: 1974 bytes --]

On Tue, Apr 22, 2003 at 12:04:26PM +0100, Chris Wilson wrote:
> Dear Sirs,
> 
> We have been having a problem with the performance of iptables in loading
> large and complex rulesets for a long time. We have tracked it down to the
> mark_source_chains kernel function, which takes about 15 seconds to
> execute with a fairly large and complex ruleset on decent hardware.

yup, we are aware of this problem.  However, most people tend to execute
iptables 1000 times for inserting 1000 rules, which is obviously the
wrong way to do it.  This way you end up doing 1000 ruleset-loads and
1000 loop checks.  iptables-restore takes away that burden.

> We discovered Robert Olsson's patch which greatly improves the efficiency
> of this function (he claims up to 1000x, we have seen around 80x with our
> rulesets). This was last submitted to this list on 6th March 2002
> (http://lists.netfilter.org/pipermail/netfilter-devel/2002-March/006994.html).

I didn't accept his previous patches, because it didn't cover all cases,
i.e. there are loops that wouldn't have been detected by his patch.

However, the last patch seems to have changed, so I need to do some
analysis whether it would still detect all loops or not.

> Please could you let me know whether you would consider including this 
> patch in Netfilter.

I will consider any optimization as soon as anybody can guarantee me
that it still detects all cases of loops ;)

I'll put your patch into patch-o-matic 'extra' and add a Status:
Experimental line.

> [P.S. Harald: Robert sends his regards =)]

thanks.

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-04-27 13:02 ` Harald Welte
@ 2003-04-27 14:57   ` Robert Olsson
  2003-05-02 12:57     ` Andre Uratsuka Manoel
  2003-04-28 12:25   ` Chris Wilson
  1 sibling, 1 reply; 15+ messages in thread
From: Robert Olsson @ 2003-04-27 14:57 UTC (permalink / raw)
  To: Harald Welte; +Cc: Chris Wilson, netfilter-devel, John McEleney

On Sun, 2003-04-27 at 15:02, Harald Welte wrote:
> On Tue, Apr 22, 2003 at 12:04:26PM +0100, Chris Wilson wrote:
> > Dear Sirs,
> > 
> > We have been having a problem with the performance of iptables in loading
> > large and complex rulesets for a long time. We have tracked it down to the
> > mark_source_chains kernel function, which takes about 15 seconds to
> > execute with a fairly large and complex ruleset on decent hardware.
> 
> yup, we are aware of this problem.  However, most people tend to execute
> iptables 1000 times for inserting 1000 rules, which is obviously the
> wrong way to do it.  This way you end up doing 1000 ruleset-loads and
> 1000 loop checks.  iptables-restore takes away that burden.

I wish it was that easy, but it was equally slow to me when I tried it a
long time ago, so my analysis was that this function is called for every
single rule-insertion even for iptables-restore?

> 
> > We discovered Robert Olsson's patch which greatly improves the efficiency
> > of this function (he claims up to 1000x, we have seen around 80x with our
> > rulesets). This was last submitted to this list on 6th March 2002
> > (http://lists.netfilter.org/pipermail/netfilter-devel/2002-March/006994.html).
> 
> I didn't accept his previous patches, because it didn't cover all cases,
> i.e. there are loops that wouldn't have been detected by his patch.
> 
> However, the last patch seems to have changed, so I need to do some
> analysis whether it would still detect all loops or not.
> 
> > Please could you let me know whether you would consider including this 
> > patch in Netfilter.
> 
> I will consider any optimization as soon as anybody can guarantee me
> that it still detects all cases of loops ;)

I have tested thoroughly all kinds of loops and it detects it. I will of
course never guarantee anything :)


> 
> I'll put your patch into patch-o-matic 'extra' and add a Status:
> Experimental line.
> 
> > [P.S. Harald: Robert sends his regards =)]
> 
> thanks.
> 

I guess most people have a simple tree-like iptables, but some have
rulesets with an exponentially growing number of possible paths.
We are very thankful for you reinvestigating this issue.
Thank you!

Regards
Robert Olsson


> -- 
> - Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
> ============================================================================
>   "Fragmentation is like classful addressing -- an interesting early
>    architectural error that shows how much experimentation was going
>    on while IP was being designed."                    -- Paul Vixie

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-04-27 13:02 ` Harald Welte
  2003-04-27 14:57   ` Robert Olsson
@ 2003-04-28 12:25   ` Chris Wilson
  2003-04-29 14:34     ` Harald Welte
  1 sibling, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2003-04-28 12:25 UTC (permalink / raw)
  To: Harald Welte; +Cc: netfilter-devel, Robert Olsson, John McEleney

Hi Harald,

> yup, we are aware of this problem.  However, most people tend to execute
> iptables 1000 times for inserting 1000 rules, which is obviously the
> wrong way to do it.  This way you end up doing 1000 ruleset-loads and
> 1000 loop checks.  iptables-restore takes away that burden.

We have already been using iptables-restore since last year, with a big
increase in performance, but the basic problem still remains. Without this
patch however, it still takes 15 seconds in mark_source_chains every time
we change the ruleset on a large firewall. This is a significant delay
when it needs to happen every time a VPN user connects, for example,
before they can send traffic.

> However, the last patch seems to have changed, so I need to do some
> analysis whether it would still detect all loops or not.
> 
> I will consider any optimization as soon as anybody can guarantee me
> that it still detects all cases of loops ;)

I don't claim to understand exactly how mark_source_chains works, but here
is my argument that the patch does detect all loops. 

It appears to me that loop check made by mark_source_chains is
deterministic, so once we've visited a particular rule, and determined
that there are no loops in the paths leading from it, then the rule is
_safe_, and we don't need to check it again.

On the other hand, detection of a loop leads to immediate return with a
fatal error, so that rule will never be touched again either.

However, the current algorithm in mark_source_chains is to check the rule
once for every path leading _to_ it, despite the fact that each and every
one of these checks will return precisely the same verdict.

All that Robert's patch does is to add an array of "been" flags, one per
rule, which tells us if the rule is a jump and we've checked it before.  
Since we're still running mark_source_chains and didn't return earlier,
the rule did not cuase a loop then, and if nobody's changed the rules then
it still does not cause a loop, and doesn't need to be checked again.

> I'll put your patch into patch-o-matic 'extra' and add a Status:
> Experimental line.

Thanks =)

By the way, this code looks like it could be simplified:

[net/ipv4/netfilter/ip_tables.c:mark_source_chains()]

                                        e->comefrom ^= (1<<NF_IP_NUMHOOKS);
#ifdef DEBUG_IP_FIREWALL_USER
                                        if (e->comefrom
                                            & (1 << NF_IP_NUMHOOKS)) {
                                                duprintf("Back unset "
                                                         "on hook %u "
                                                         "rule %u\n",
                                                         hook, pos);
                                        }
#endif

because the condition inside the #ifdef will always be true, so we don't 
need to check it.

Cheers, Chris.
-- 
   ___ __     _
 / __// / ,__(_)_  | Chris Wilson -- UNIX Firewall Lead Developer |
/ (_ / ,\/ _/ /_ \ | NetServers.co.uk http://www.netservers.co.uk |
\ _//_/_/_//_/___/ | 21 Signet Court, Cambridge, UK. 01223 576516 |

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-04-28 12:25   ` Chris Wilson
@ 2003-04-29 14:34     ` Harald Welte
  2003-05-02 18:32       ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: Harald Welte @ 2003-04-29 14:34 UTC (permalink / raw)
  To: Chris Wilson; +Cc: netfilter-devel, Robert Olsson, John McEleney

[-- Attachment #1: Type: text/plain, Size: 2631 bytes --]

On Mon, Apr 28, 2003 at 01:25:59PM +0100, Chris Wilson wrote:
> We have already been using iptables-restore since last year, with a big
> increase in performance, but the basic problem still remains. Without this
> patch however, it still takes 15 seconds in mark_source_chains every time
> we change the ruleset on a large firewall. This is a significant delay
> when it needs to happen every time a VPN user connects, for example,
> before they can send traffic.

Yes, I can imagine.

> I don't claim to understand exactly how mark_source_chains works, but here
> is my argument that the patch does detect all loops. 

I have to admit that I don't exactly remember all the details about
mark_source_chains without re-investigating it as well.  I did that now,
so I think I'm fit for further discussion  ;)

> It appears to me that loop check made by mark_source_chains is
> deterministic, so once we've visited a particular rule, and determined
> that there are no loops in the paths leading from it, then the rule is
> _safe_, and we don't need to check it again.

You are missing the other half of the functionality of
mark_source_chains:  It needs to write down (into the 'comefrom' field),
from which hooks the particular rule can be called.

There are matches (like the 'owner' match), that can only be used at
certain hooks (in our example: There is no socket owner at
NF_IP_FORWARD).

> However, the current algorithm in mark_source_chains is to check the rule
> once for every path leading _to_ it, despite the fact that each and every
> one of these checks will return precisely the same verdict.

Yes, this is (as stated above) to save for every rule from which hook it
could possibly be called.

> All that Robert's patch does is to add an array of "been" flags, one per
> rule, which tells us if the rule is a jump and we've checked it before.  
> Since we're still running mark_source_chains and didn't return earlier,
> the rule did not cuase a loop then, and if nobody's changed the rules then
> it still does not cause a loop, and doesn't need to be checked again.

So how does this make sure that every rule will be flagged with the hook
numbers of all paths leading to it?

> Cheers, Chris.

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-04-27 14:57   ` Robert Olsson
@ 2003-05-02 12:57     ` Andre Uratsuka Manoel
  0 siblings, 0 replies; 15+ messages in thread
From: Andre Uratsuka Manoel @ 2003-05-02 12:57 UTC (permalink / raw)
  To: Robert Olsson
  Cc: Harald Welte, Chris Wilson, netfilter-devel@lists.netfilter.org,
	John McEleney

On 27 Apr 2003, Robert Olsson wrote:

> On Sun, 2003-04-27 at 15:02, Harald Welte wrote:
> > On Tue, Apr 22, 2003 at 12:04:26PM +0100, Chris Wilson wrote:
> > > Dear Sirs,
> > > 
> > > We have been having a problem with the performance of iptables in loading
> > > large and complex rulesets for a long time. We have tracked it down to the
> > > mark_source_chains kernel function, which takes about 15 seconds to
> > > execute with a fairly large and complex ruleset on decent hardware.
> > 
> > yup, we are aware of this problem.  However, most people tend to execute
> > iptables 1000 times for inserting 1000 rules, which is obviously the
> > wrong way to do it.  This way you end up doing 1000 ruleset-loads and
> > 1000 loop checks.  iptables-restore takes away that burden.
> 
> I wish it was that easy, but it was equally slow to me when I tried it a
> long time ago, so my analysis was that this function is called for every
> single rule-insertion even for iptables-restore?

	My profiling of iptables-restore indicates that on my rule set, 
about half of the time is spent memcpy'ing inside insert_rule after the 
handle is recreated. The other half of the time is spent mostly inside 
populate_cache, which is rebuilt after every rule insertion. My test took 
5 minutes to run, of which only 40 seconds were kernel time. Total: 16 
thousand rules, 1000 chains.

	Andre

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-04-29 14:34     ` Harald Welte
@ 2003-05-02 18:32       ` Chris Wilson
  2003-05-02 20:14         ` Harald Welte
  2003-05-02 20:15         ` Andre Uratsuka Manoel
  0 siblings, 2 replies; 15+ messages in thread
From: Chris Wilson @ 2003-05-02 18:32 UTC (permalink / raw)
  To: Harald Welte; +Cc: netfilter-devel, Robert Olsson, John McEleney, Ben McKeegan

Hi Harald,

> I have to admit that I don't exactly remember all the details about
> mark_source_chains without re-investigating it as well.  I did that now,
> so I think I'm fit for further discussion  ;)

I know it's late on Friday evening and I'm in a rush to get home, but I'd
like to give everyone the weekend to think about this, so I'm going to
venture into unknown territory and risk making a fool of myself in public =)

> > It appears to me that loop check made by mark_source_chains is
> > deterministic, so once we've visited a particular rule, and determined
> > that there are no loops in the paths leading from it, then the rule is
> > _safe_, and we don't need to check it again.
> 
> You are missing the other half of the functionality of
> mark_source_chains:  It needs to write down (into the 'comefrom' field),
> from which hooks the particular rule can be called.
> 
> There are matches (like the 'owner' match), that can only be used at
> certain hooks (in our example: There is no socket owner at
> NF_IP_FORWARD).

I agree completely that it's important for the kernel to verify that the 
ruleset does not break this condition by calling rules which are not 
valid from a given hook.

However, it seems to me that it's only necessary to visit each rule once 
_and_ set the appropriate bit in the 'comefrom' bitmask. Once the bit is 
set, there's no use setting it again.

So it's only necessary to visit each rule once per hook, in order both to 
detect loops and mark the valid source chains (hooks).

Therefore, if the patched code does actually succeed in visiting each rule 
once, and detecting loops, and setting the appropriate bit in comefrom, 
then it's functionally equivalent to the original code, right?

Thanks for including the patch in pom/extra, we look forward to having it 
well tested. =)

Cheers, Chris.
-- 
   ___ __     _
 / __// / ,__(_)_  | Chris Wilson -- UNIX Firewall Lead Developer |
/ (_ / ,\/ _/ /_ \ | NetServers.co.uk http://www.netservers.co.uk |
\ _//_/_/_//_/___/ | 21 Signet Court, Cambridge, UK. 01223 576516 |

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-05-02 18:32       ` Chris Wilson
@ 2003-05-02 20:14         ` Harald Welte
  2003-05-03 10:34           ` Chris Wilson
  2003-05-02 20:15         ` Andre Uratsuka Manoel
  1 sibling, 1 reply; 15+ messages in thread
From: Harald Welte @ 2003-05-02 20:14 UTC (permalink / raw)
  To: Chris Wilson; +Cc: netfilter-devel, Robert Olsson, John McEleney, Ben McKeegan

[-- Attachment #1: Type: text/plain, Size: 1352 bytes --]

On Fri, May 02, 2003 at 07:32:29PM +0100, Chris Wilson wrote:

> I agree completely that it's important for the kernel to verify that the 
> ruleset does not break this condition by calling rules which are not 
> valid from a given hook.

great :)

> However, it seems to me that it's only necessary to visit each rule once 
> _and_ set the appropriate bit in the 'comefrom' bitmask. Once the bit is 
> set, there's no use setting it again.

true.

> So it's only necessary to visit each rule once per hook, in order both to 
> detect loops and mark the valid source chains (hooks).

also true.  (given that loops could be detected while still only visiting
every rule once)

> Therefore, if the patched code does actually succeed in visiting each rule 
> once, and detecting loops, and setting the appropriate bit in comefrom, 
> then it's functionally equivalent to the original code, right?

yes. Are you claiming that it does so?

> Cheers, Chris.

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-05-02 18:32       ` Chris Wilson
  2003-05-02 20:14         ` Harald Welte
@ 2003-05-02 20:15         ` Andre Uratsuka Manoel
  2003-05-03 10:32           ` Chris Wilson
  1 sibling, 1 reply; 15+ messages in thread
From: Andre Uratsuka Manoel @ 2003-05-02 20:15 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Harald Welte, netfilter-devel@lists.netfilter.org, Robert Olsson,
	John McEleney, Ben McKeegan



	I also may be making a fool of myself (not a first), but my take
is this:

	The traditional algorithm for finding loops in graphs is:

	paint every vertex white
	for every vertex (in this case, for every hook)
		visit(vertex)
	visit(vertex) is:
		if vertex is gray, return LOOP
		if vertex is black, return NO_LOOPS
		if vertex is white
			paint vertex gray
			for every arc leaving from vertex
				if (visit(arc->destination) == LOOP)
					return LOOP
			paint vertex black.
		return NO_LOOPS


	This takes 1 visit to every vertex and 1 verification for every 
arc, but that's it.

	If you have to also colour chains according to what hooks they
came from, the easiest way would be to use about the same algorithm to
traverse the chains, marking the comefrom bitfield. For that you have to
paint the chains white (unvisited) every time (or you use a bitfield and
set it only once).

	for every vertice:
		paint vertex white
	vertex := vertex pointed to by hook
	visit_and_mark(vertex)

	visit_and_mark(vertex) is:
		if (vertex is white)
			mark(vertex)
			paint vertex gray
			for every arc leaving from vertex
				visit_and_mark(vertex)

			
	One visit per vertex downstream from a hook, one check per arc.
Vertices that are used by more than one hook would be visited n times, but
no vertice would be visited more times than twice. It takes O(n+m).

	So, in pathological cases, you would have 2 visits to every chain
and every rule, to find loops, plus 6 visits to every chain and every rule
(supposing every rule is used by every hook), but it is still linear time, 
no matter how complex the ruleset is.


> > > It appears to me that loop check made by mark_source_chains is
> > > deterministic, so once we've visited a particular rule, and determined
> > > that there are no loops in the paths leading from it, then the rule is
> > > _safe_, and we don't need to check it again.
> > 
> > You are missing the other half of the functionality of
> > mark_source_chains:  It needs to write down (into the 'comefrom' field),
> > from which hooks the particular rule can be called.
> > 
> > There are matches (like the 'owner' match), that can only be used at
> > certain hooks (in our example: There is no socket owner at
> > NF_IP_FORWARD).
> 
> I agree completely that it's important for the kernel to verify that the 
> ruleset does not break this condition by calling rules which are not 
> valid from a given hook.
> 
> However, it seems to me that it's only necessary to visit each rule once 
> _and_ set the appropriate bit in the 'comefrom' bitmask. Once the bit is 
> set, there's no use setting it again.
> 
> So it's only necessary to visit each rule once per hook, in order both to 
> detect loops and mark the valid source chains (hooks).
> 
> Therefore, if the patched code does actually succeed in visiting each rule 
> once, and detecting loops, and setting the appropriate bit in comefrom, 
> then it's functionally equivalent to the original code, right?
> 
> Thanks for including the patch in pom/extra, we look forward to having it 
> well tested. =)
> 
> Cheers, Chris.
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-05-02 20:15         ` Andre Uratsuka Manoel
@ 2003-05-03 10:32           ` Chris Wilson
  2003-05-03 22:39             ` Andre Uratsuka Manoel
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2003-05-03 10:32 UTC (permalink / raw)
  To: Andre Uratsuka Manoel
  Cc: Harald Welte, netfilter-devel@lists.netfilter.org, Robert Olsson,
	John McEleney, Ben McKeegan

Hi Andre,

> 	I also may be making a fool of myself (not a first), but my take
> is this:

It looks like you have found some very efficient algorithms (not to 
mention a nice visual metaphor). Would you mind if I try to reimplement 
mark_source_chains using these algorithms?

I'm also wondering if the two steps, loop-detection and colouring, should 
be combined as they currently are for brevity and performance, or 
separated for simplicity and intuitive correctness?

Cheers, Chris.
-- 
   ___ __     _
 / __// / ,__(_)_  | Chris Wilson -- UNIX Firewall Lead Developer |
/ (_ / ,\/ _/ /_ \ | NetServers.co.uk http://www.netservers.co.uk |
\ _//_/_/_//_/___/ | 21 Signet Court, Cambridge, UK. 01223 576516 |

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-05-02 20:14         ` Harald Welte
@ 2003-05-03 10:34           ` Chris Wilson
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Wilson @ 2003-05-03 10:34 UTC (permalink / raw)
  To: Harald Welte; +Cc: netfilter-devel, Robert Olsson, John McEleney, Ben McKeegan

Hi Harald,

> > Therefore, if the patched code does actually succeed in visiting each rule 
> > once, and detecting loops, and setting the appropriate bit in comefrom, 
> > then it's functionally equivalent to the original code, right?
> 
> yes. Are you claiming that it does so?

I think so, but I'm still working on it so no promises yet =)

How about I rewrite the function using Andre's suggested algorithms, so
that we know it's both correct and optimally efficient?

Cheers, Chris.
-- 
   ___ __     _
 / __// / ,__(_)_  | Chris Wilson -- UNIX Firewall Lead Developer |
/ (_ / ,\/ _/ /_ \ | NetServers.co.uk http://www.netservers.co.uk |
\ _//_/_/_//_/___/ | 21 Signet Court, Cambridge, UK. 01223 576516 |

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-05-03 10:32           ` Chris Wilson
@ 2003-05-03 22:39             ` Andre Uratsuka Manoel
  2003-05-04  1:09               ` Harald Welte
  2003-05-28 19:20               ` Harald Welte
  0 siblings, 2 replies; 15+ messages in thread
From: Andre Uratsuka Manoel @ 2003-05-03 22:39 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Harald Welte, netfilter-devel@lists.netfilter.org, Robert Olsson,
	John McEleney, Ben McKeegan

On Sat, 3 May 2003, Chris Wilson wrote:

> Hi Andre,
> 
> > 	I also may be making a fool of myself (not a first), but my take
> > is this:
> 
> It looks like you have found some very efficient algorithms (not to 
> mention a nice visual metaphor). Would you mind if I try to reimplement 
> mark_source_chains using these algorithms?

	Unfortunately, I cannot claim to be that smart, since not even the
colors are my idea... 

	Be my guest to implement it. I thought of doing it myself, but I'm
still looking around and trying to understand all those structures and
pointers and I am still trying to find my way around.

	Anyway, I was wondering, there is a variation that I think would
be more efficient, and would avoid traversing the trees again for every
hook.

	What is new is a stack of chains, initially empty, and you change
the traversal slightly to put chains on the stack.

	Here is the pseudo-code for the algorithm:

-------------------------------------------------
	for every chain:
		paint it white 
		set chain->comefrom to 0 

	for every hook:
		if dfs(chain pointed to by hook) returns LOOP
			return LOOP

	// here we know there are no loops and our stack have the chains 
	// in a nice order. Let's mark comefroms

	for every hook:
		hook->chain->comefrom |= (1<<HOOK_NUM)

	while there are chains on stack:
		get chain from top of stack
		for every rule in chain:
			comefrom of rule's target |= chain->comefrom

	dfs(chain) is:
		if color of chain is gray (that is, undecided)
			return LOOP
		if color of chain is black (that is, loop-free)
			return NO_LOOP
		paint chain gray
		for every rule:
			if (dfs(target of rule == LOOP)
				return LOOP
		paint chain black
		push chain
		return NO_LOOP
--------------------------------------------

	The explanation of it all is this: 

	First, to detect a loop, you paint vertices according to its
current state and you traverse the graph depth-first. Each vertex may be:
	
	- white: not yet seen
	- black: the graphs leaving from it are known not to have loops
	- gray: still being considered. 

	If you visit a chain and it is gray, then there is a loop, since
the chain being visited is a predecessor of the previous one. If you
visited every arc coming out of a vertex (rules coming out of a chain) and
you didn't detect a loop, then you paint this vertex black.

	Ok, now about marking the comefrom field: I was thinking that you
had to visit the vertices of the tree more than once because I was
thinking that you can't know what shade you use to paint a vertex at the
first time you visit it, since there may be more than one source for a
given vertex, and those sources may belong to different hooks. If you
visit a vertex from hook OUTPUT, for instance, and find out that it was
already marked as coming from hook INPUT, you still have to visit every
chain below it to mark them as coming from OUTPUT (if you use the
coloring, you visit them only once for every hook, but it is still once
*for every hook*).

	But there is a better way: what if you only visited a chain after
every chain that could be a source to it was already visited and had the
right comefrom field? 

	The way to do this is to visit every chain, starting from the 
hooks and put them on a stack after each of its children were visited. 
Since we have already done that to find loops, we can use that first 
traversal to make that stack.

	Here is an example:

-----------------------
Chains:
	INPUT -> A, G
	OUTPUT -> B 
	FORWARD -> D, C
	A -> B, F, G
	B -> C
	C -> G, H
	D -> E, H
	E -> F
	F -> nothing
	G -> nothing
	H -> nothing

Order of traversal:
	dfs(INPUT)
		dfs(A)
			dfs(B)
				dfs (C)
					dfs(G)
					dfs(H)
			dfs(F)
			dfs(G) returns immediately since G is black.
		dfs(G). G is black.
	dfs(OUTPUT)
		dfs(B). B is black.
	dfs(FORWARD)
		dfs(D)
			dfs(E)
			dfs(H). H is black
		dfs(C). C is black
		

	Order on stack:
	G H C B F A INPUT OUTPUT E D FORWARD


Now you begin popping:
	
	FORWARD: you set D and C as coming from FORWARD
	D: you mark E and H as coming from FORWARD
	E: you mark F as coming from FORWARD
	OUTPUT: you mark B as coming from OUTPUT
	INPUT: you mark A and G as coming from INPUT
	A: you mark B, F, G as coming from INPUT
		-- notice that B is OUTPUT + INPUT, and 
		-- F is FORWARD + INPUT
	F: no children to mark
	B: you mark C as OUTPUT + INPUT
		- now C is OUTPUT + INPUT + FORWARD
	C: you mark G and H as OUTPUT + INPUT + FORWARD
	H: no children to mark
	G: no children to mark

	A: INPUT
	B: INPUT OUTPUT
	C: INPUT OUTPUT FORWARD
	D: FORWARD
	E: FORWARD
	F: INPUT FORWARD
	G: INPUT OUTPUT FORWARD
	H: INPUT OUTPUT FORWARD
	
> I'm also wondering if the two steps, loop-detection and colouring, should 
> be combined as they currently are for brevity and performance, or 
> separated for simplicity and intuitive correctness?

	I don't think there is a way to have them really combined. I think
you have to visit the tree at least twice (discounting the first one in
which you paint the chains white), because of the problem mentioned above.  
I think that the method above will give you the theoretical minimum number
of operations possible. I may be wrong, of course.

	BTW, one thing to look at is how deep the recursion will be. If
someone decides to make a very deep tree, we may exhaust the system stack. 
We may have to use a recursion depth counter or change from recursion to 
iteration.

	Regards,
	Andre

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-05-03 22:39             ` Andre Uratsuka Manoel
@ 2003-05-04  1:09               ` Harald Welte
  2003-05-28 19:20               ` Harald Welte
  1 sibling, 0 replies; 15+ messages in thread
From: Harald Welte @ 2003-05-04  1:09 UTC (permalink / raw)
  To: Andre Uratsuka Manoel
  Cc: Chris Wilson, netfilter-devel@lists.netfilter.org, Robert Olsson,
	John McEleney, Ben McKeegan

[-- Attachment #1: Type: text/plain, Size: 1344 bytes --]

On Sat, May 03, 2003 at 07:39:46PM -0300, Andre Uratsuka Manoel wrote:
> 	Be my guest to implement it. I thought of doing it myself, but I'm
> still looking around and trying to understand all those structures and
> pointers and I am still trying to find my way around.

yes, the iptables structure is fairly complex :(

> 	Anyway, I was wondering, there is a variation that I think would
> be more efficient, and would avoid traversing the trees again for every
> hook.
> 
> 	What is new is a stack of chains, initially empty, and you change
> the traversal slightly to put chains on the stack.

I don't like this idea, exactly because of the recursion problem.
The linux kernel has a very small stack (IIRC 8k on x86), and we really
don't want to introduce any limitations on the chain depth inside a
ruleset.

I will try to have an optimized version (similar to what you've been
proposing in your last mails) by tomorrow.

> 	Regards,
> 	Andre

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-05-03 22:39             ` Andre Uratsuka Manoel
  2003-05-04  1:09               ` Harald Welte
@ 2003-05-28 19:20               ` Harald Welte
  2003-05-28 22:28                 ` Andre Uratsuka Manoel
  1 sibling, 1 reply; 15+ messages in thread
From: Harald Welte @ 2003-05-28 19:20 UTC (permalink / raw)
  To: Andre Uratsuka Manoel
  Cc: Chris Wilson, netfilter-devel@lists.netfilter.org, Robert Olsson,
	John McEleney, Ben McKeegan

[-- Attachment #1: Type: text/plain, Size: 1260 bytes --]

On Sat, May 03, 2003 at 07:39:46PM -0300, Andre Uratsuka Manoel wrote:

> 	Be my guest to implement it. I thought of doing it myself, but I'm
> still looking around and trying to understand all those structures and
> pointers and I am still trying to find my way around.

While a long train ride I recently had the time to think about your
algorithm and I really like it.  I even started with an implementation,
but came to the following problem:

> 	Here is the pseudo-code for the algorithm:
> 
> -------------------------------------------------
> 	for every chain:
> 		paint it white 
> 		set chain->comefrom to 0 

how can we safely iterate over the whole table (which might contain
loops) and paint all chains white?  I mean, if there is a loop in it,
already the initialization-to-white function would loop indefinitely.

any ideas?

> 	Regards,
> 	Andre

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: IPtables performance (fwd)
  2003-05-28 19:20               ` Harald Welte
@ 2003-05-28 22:28                 ` Andre Uratsuka Manoel
  0 siblings, 0 replies; 15+ messages in thread
From: Andre Uratsuka Manoel @ 2003-05-28 22:28 UTC (permalink / raw)
  To: Harald Welte
  Cc: Chris Wilson, netfilter-devel@lists.netfilter.org, Robert Olsson,
	John McEleney, Ben McKeegan

On Wed, 28 May 2003, Harald Welte wrote:

> On Sat, May 03, 2003 at 07:39:46PM -0300, Andre Uratsuka Manoel wrote:
> 
> > 	Be my guest to implement it. I thought of doing it myself, but I'm
> > still looking around and trying to understand all those structures and
> > pointers and I am still trying to find my way around.
> 
> While a long train ride I recently had the time to think about your
> algorithm and I really like it.  I even started with an implementation,
> but came to the following problem:
> 
> > 	Here is the pseudo-code for the algorithm:
> > 
> > -------------------------------------------------
> > 	for every chain:
> > 		paint it white 
> > 		set chain->comefrom to 0 
> 
> how can we safely iterate over the whole table (which might contain
> loops) and paint all chains white?  I mean, if there is a loop in it,
> already the initialization-to-white function would loop indefinitely.
> 
> any ideas?

	You could IPT_ENTRY_ITERATE. It would be fast and hassle-free...

	Andre

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2003-05-28 22:28 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-04-22 11:04 IPtables performance (fwd) Chris Wilson
2003-04-27 13:02 ` Harald Welte
2003-04-27 14:57   ` Robert Olsson
2003-05-02 12:57     ` Andre Uratsuka Manoel
2003-04-28 12:25   ` Chris Wilson
2003-04-29 14:34     ` Harald Welte
2003-05-02 18:32       ` Chris Wilson
2003-05-02 20:14         ` Harald Welte
2003-05-03 10:34           ` Chris Wilson
2003-05-02 20:15         ` Andre Uratsuka Manoel
2003-05-03 10:32           ` Chris Wilson
2003-05-03 22:39             ` Andre Uratsuka Manoel
2003-05-04  1:09               ` Harald Welte
2003-05-28 19:20               ` Harald Welte
2003-05-28 22:28                 ` Andre Uratsuka Manoel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.