iptables as a state machine

All of lore.kernel.org
 help / color / mirror / Atom feed

* iptables as a state machine
@ 2004-10-01  2:39 David S. Miller
  2004-10-01  3:47 ` shemminger
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: David S. Miller @ 2004-10-01  2:39 UTC (permalink / raw)
  To: netfilter-devel

So while waiting for a kernel build to finish I was sniffing
around the iptables code thinking about how one might optimize
the lookups.

The ipt targets are simple enough already, and I think this
shows the elegance of the core iptables design.

It's the main ip header + indev + outdev matching that is
the real memory and cpu cycle killer in these paths.  Once
we match the target, the verdict is determined quickly.

This made me think about how in the old days gcc's instruction
scheduler was computationally complex.  This was fixed by
representing the processor as a state machine, and this is how
the gcc insn scheduler works currently.  It's very fast and all
of the complexity is in building the state machine which is done
at gcc build time.

I think iptables core IP header + indev + outdev match is a
state machine problem as well.  Such a state machine can be
made extremely small memory wise.  The lookup can be something
like running a berkeley packet filter on the frame.  Except
that instead of a "yes or no" answer we get a pointer to a
target.

I think this would run as efficiently, if not more so, than the
various B-tree based schemes and it certainly would consume much
less memory.

The only trick is making the state machine building fast, and
where to put it (kernel or user space).

Comments?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01  2:39 iptables as a state machine David S. Miller
@ 2004-10-01  3:47 ` shemminger
  2004-10-01  4:01   ` David S. Miller
  2004-10-02  8:44   ` Roberto Nibali
  2004-10-01 20:06 ` Gonzalo A. Arana
  2004-10-02 21:01 ` Tobias DiPasquale
  2 siblings, 2 replies; 16+ messages in thread
From: shemminger @ 2004-10-01  3:47 UTC (permalink / raw)
  To: David S. Miller; +Cc: netfilter-devel

Have you looked at nf-hipac.  It works and is fast but is limited
to 2.4 right now.  It's on my to do list...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01  3:47 ` shemminger
@ 2004-10-01  4:01   ` David S. Miller
  2004-10-01  8:24     ` Thomas Heinz
  2004-10-01 11:12     ` Henrik Nordstrom
  2004-10-02  8:44   ` Roberto Nibali
  1 sibling, 2 replies; 16+ messages in thread
From: David S. Miller @ 2004-10-01  4:01 UTC (permalink / raw)
  To: shemminger; +Cc: netfilter-devel

On Thu, 30 Sep 2004 20:47:49 -0700 (PDT)
shemminger@osdl.org wrote:

> Have you looked at nf-hipac.  It works and is fast but is limited
> to 2.4 right now.  It's on my to do list...

nf-hipac is what I was indirectly referring to when I
said "B-Tree based schemes". :-)

nf-hipac is fast for lookups, but it does so at a huge
cost in memory usage.  It uses B-Trees and prefix
expansion, which is a well known technique for this
task.

How prefix expansion works is basically like the following.
Say you had a rule which matches on address "10.0.0.0/24"
Prefix expansion would duplicate this rule 256 times, one
for every address in that prefix'd range.

	10.0.0.1
	10.0.0.2
	...
	10.0.0.255

Then you can use whatever data structure you wish to do
an exact match lookup.  The nf-hipac folks choose a B-Tree
although at this point you could simply use a plain hash
as well.

It is a classic example, in fact, of a memory usage vs.
lookup performance tradeoff.

I warn you if you go looking at the existing nf-hipac, it's
big and complex :-)  My dislike of nf-hipac's approach is
one of the reasons I went looking for other possible schemes.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01  4:01   ` David S. Miller
@ 2004-10-01  8:24     ` Thomas Heinz
  2004-10-01 19:46       ` David S. Miller
  2004-10-01 11:12     ` Henrik Nordstrom
  1 sibling, 1 reply; 16+ messages in thread
From: Thomas Heinz @ 2004-10-01  8:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netfilter-devel, shemminger

[-- Attachment #1: Type: text/plain, Size: 2633 bytes --]

Hi David

Your statement about the algorithmic approach of nf-hipac
is wrong.

You wrote:
> nf-hipac is what I was indirectly referring to when I
> said "B-Tree based schemes". :-)
> 
> nf-hipac is fast for lookups, but it does so at a huge
> cost in memory usage.

There are indeed cases where the memory consumption of
the currently implemented scheme grows huge. Nevertheless,
your explanation of why this is the case is wrong.
 From your explanation, one must conclude that nf-hipac
is not even capable of handling _any_ real world rule set.

> It uses B-Trees and prefix
> expansion, which is a well known technique for this
> task.
> 
> How prefix expansion works is basically like the following.
> Say you had a rule which matches on address "10.0.0.0/24"
> Prefix expansion would duplicate this rule 256 times, one
> for every address in that prefix'd range.
> 
> 	10.0.0.1
> 	10.0.0.2
> 	...
> 	10.0.0.255
> 
> Then you can use whatever data structure you wish to do
> an exact match lookup.  The nf-hipac folks choose a B-Tree
> although at this point you could simply use a plain hash
> as well.

Sorry David, but that is completely wrong. Doing packet
classification this way would _not_ scale, even for the
simplest rule bases.

nf-hipac views the rules geometrically. Each rule is a
hypercube in d-dimensional space (d = number of matches).
This representation is powerful enough to express each
"iptables-like" rule as one "nf-hipac" rule.
The classification is based on solving one-dimensional
problems and thus recursively reducing the d-dimensional
problem to a (d-1)-dimensional one (and so on).

> It is a classic example, in fact, of a memory usage vs.
> lookup performance tradeoff.

Its indeed a tradeoff but it's a big difference if your
rules are simply points in d-dimensional space (as you
described) or hypercubes. The latter is obviously more
expressive.

> I warn you if you go looking at the existing nf-hipac, it's
> big and complex :-)  My dislike of nf-hipac's approach is
> one of the reasons I went looking for other possible schemes.

It's of course fine to think about alternative approaches.
Did I get you right that you intend to model packet
classification as regular expression matching on the packet
header and then compute the minimal automata to represent
the expression rule base?

This approach was already considered very early in history
of packet classification. Even more complex matchings as
context free grammars have been used. Nonetheless, even
regular expressions have been found to not being able to
cope with high performance demands of todays rule bases.

Regards,

Thomas

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 251 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01  8:24     ` Thomas Heinz
@ 2004-10-01 19:46       ` David S. Miller
  2004-10-01 20:26         ` Thomas Heinz
  2004-10-01 20:33         ` Stephen Hemminger
  0 siblings, 2 replies; 16+ messages in thread
From: David S. Miller @ 2004-10-01 19:46 UTC (permalink / raw)
  To: Thomas Heinz; +Cc: netfilter-devel, shemminger

On Fri, 01 Oct 2004 10:24:23 +0200
Thomas Heinz <creatix@hipac.org> wrote:

> Your statement about the algorithmic approach of nf-hipac
> is wrong.

Thanks for the correction.

> This approach was already considered very early in history
> of packet classification. Even more complex matchings as
> context free grammars have been used. Nonetheless, even
> regular expressions have been found to not being able to
> cope with high performance demands of todays rule bases.

Any pointers to papers on this topic?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01 19:46       ` David S. Miller
@ 2004-10-01 20:26         ` Thomas Heinz
  2004-10-01 20:33         ` Stephen Hemminger
  1 sibling, 0 replies; 16+ messages in thread
From: Thomas Heinz @ 2004-10-01 20:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: netfilter-devel, shemminger

[-- Attachment #1: Type: text/plain, Size: 1339 bytes --]

You wrote:
> Thanks for the correction.

You're welcome.

>>This approach was already considered very early in history
>>of packet classification. Even more complex matchings as
>>context free grammars have been used. Nonetheless, even
>>regular expressions have been found to not being able to
>>cope with high performance demands of todays rule bases.
> 
> Any pointers to papers on this topic?

http://citeseer.ist.psu.edu/rd/0%2C56411%2C1%2C0.25%2CDownload/http://citeseer.ist.psu.edu/compress/0/papers/cs/364/http:zSzzSzwww.cs.wustl.eduzSzcszSztechreportszSz1995zSzwucs-95-21.ps.gz/jayaram95efficient.ps
This paper describes the use of an optimized LR parser
for packet classification. Note that it's from 1995.

As for regular expressions, any theoretical computer
science textbook describes the way how to construct
deterministic finite automata from regular expressions
and how to compute the equivalent minimal automaton.
This approach is for example also implemented by flex.

One of the first approaches towards packet
classification was the design of dedicated virtual
machines similar to what is used in compiler
technology.

As the demand for high performance packet classification
grew, one came up with the so-called packet classification
problem which is the foundation of todays firewalling
rule sets.

Regards,

Thomas

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 251 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01 19:46       ` David S. Miller
  2004-10-01 20:26         ` Thomas Heinz
@ 2004-10-01 20:33         ` Stephen Hemminger
  1 sibling, 0 replies; 16+ messages in thread
From: Stephen Hemminger @ 2004-10-01 20:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: Thomas Heinz, netfilter-devel

On Fri, 2004-10-01 at 12:46 -0700, David S. Miller wrote:
> On Fri, 01 Oct 2004 10:24:23 +0200
> Thomas Heinz <creatix@hipac.org> wrote:
> 
> > Your statement about the algorithmic approach of nf-hipac
> > is wrong.
> 
> Thanks for the correction.
> 
> > This approach was already considered very early in history
> > of packet classification. Even more complex matchings as
> > context free grammars have been used. Nonetheless, even
> > regular expressions have been found to not being able to
> > cope with high performance demands of todays rule bases.
> 
> Any pointers to papers on this topic?

Current state described here:

http://www.netfilter.org/documentation/conferences/nf-workshop-2004-summary.html#AEN525

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01  4:01   ` David S. Miller
  2004-10-01  8:24     ` Thomas Heinz
@ 2004-10-01 11:12     ` Henrik Nordstrom
  2004-10-01 12:06       ` Henrik Nordstrom
  1 sibling, 1 reply; 16+ messages in thread
From: Henrik Nordstrom @ 2004-10-01 11:12 UTC (permalink / raw)
  To: David S. Miller; +Cc: netfilter-devel, shemminger

On Thu, 30 Sep 2004, David S. Miller wrote:

> How prefix expansion works is basically like the following.
> Say you had a rule which matches on address "10.0.0.0/24"
> Prefix expansion would duplicate this rule 256 times, one
> for every address in that prefix'd range.
>
> 	10.0.0.1
> 	10.0.0.2
> 	...
> 	10.0.0.255

this is not what hipac does.

hipac works on ranges, not prefixes. In the decisiontree it collects the 
list or unique ranges for the active rules of the current dimention

By example:

If the ruleset contains the following destinations, in priority by the 
order listed:

    10.0.0.45-10.0.0.64
    10.0.0.0/24
    10.0.0.87

then the decision tree contains

    10.0.0.0-10.0.0.44
    10.0.0.45-10.0.0.64
    10.0.0.65-10.0.0.255

    (10.0.0.87 is hided by the /24, and does not exists in the hipac tree)

if 10.0.0.87 is moved up one line to unhide it, or there is matches in 
later dimentions making this range active then the tree will contain

    10.0.0.0-10.0.0.44
    10.0.0.35-10.0.0.64
    10.0.0.65-10.0.0.86
    10.0.0.87
    10.0.0.88-10.0.0.255

It is true that the current tree algorith is a bit memory hungry due to a 
lot of information being duplicated when there partially overlapping or 
partly identical rules, but Michael is working on a new generation where 
the tree is optimal with no duplicated information. This complication in 
memory usage is from the lookup being applied in multiple dimenstions 
(destination, source, protocol, ports, ...).

Example:

   iptables -A FORWARD -d 192.168.1.54 -s 10.0.0.0/24 -j LOG
   iptables -A FORWARD -d 192.168.1.54 -s 10.0.0.0/24 -j ACCEPT
   iptables -A FORWARD -d 192.168.1.87 -s 10.0.0.0/24 -j LOG
   iptables -A FORWARD -d 192.168.1.87 -s 10.0.0.0/24 -j ACCEPT
   iptables -A FORWARD -d 192.168.1.0/24 -s 10.0.0.0/24 -j REJECT

In the current algorithm this becomes

Destination tree (ROOT):

   192.168.1.0  - 192.168.1.53  -> Source tree 1
   192.168.1.54                 -> Source tree 2
   192.168.1.55 - 192.168.86    -> Source tree 3
   192.168.1.87                 -> Source tree 4
   192.168.1.88 - 192.168.1.255 -> Source tree 5

Source tree 1:

   10.0.0.0-10.0.0.255  -j REJECT

Source tree 2:

   10.0.0.0-10.0.0.255  -j LOG, ACCEPT

Source tree 3:

   10.0.0.0-10.0.0.255  -j REJECT

Source tree 4:

   10.0.0.0-10.0.0.255  -j LOG, ACCEPT

Source tree 5:

   10.0.0.0-10.0.0.255  -j REJECT

while in the new algorithm all the identical subtrees will be eleminated, 
resulting in a hipac ruleset which looks like

Destination tree (ROOT):

   192.168.1.0  - 192.168.1.53  -> Source tree 1
   192.168.1.54                 -> Source tree 2
   192.168.1.55 - 192.168.86    -> Source tree 1
   192.168.1.87                 -> Source tree 2
   192.168.1.88 - 192.168.1.255 -> Source tree 1

Source tree 1:

   10.0.0.0-10.0.0.255  -j REJECT

Source tree 2:

   10.0.0.0-10.0.0.255  -j LOG, ACCEPT

and all this while still allowing for very efficient insert/delete 
operations.

what the iptables core is solving is the range location problem of a 
multidimensional lookup (destination, source, ports, protocols etc...). 
The user entered ruleset information is backward compatible with iptables 
for convenience, but already allows you to specify ranges rather than 
prefixes and it is not unlikely it will become even more expressive in the 
future to allow for much leaner ruleset specifications.

Regards
Henrik

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01 11:12     ` Henrik Nordstrom
@ 2004-10-01 12:06       ` Henrik Nordstrom
  0 siblings, 0 replies; 16+ messages in thread
From: Henrik Nordstrom @ 2004-10-01 12:06 UTC (permalink / raw)
  To: David S. Miller; +Cc: netfilter-devel

There was some trivial typos in my message which some may find confusing. 
See below for the corrections.

On Fri, 1 Oct 2004, Henrik Nordstrom wrote:

> On Thu, 30 Sep 2004, David S. Miller wrote:
>
>> How prefix expansion works is basically like the following.
>> Say you had a rule which matches on address "10.0.0.0/24"
>> Prefix expansion would duplicate this rule 256 times, one
>> for every address in that prefix'd range.
>> 
>> 	10.0.0.1
>> 	10.0.0.2
>> 	...
>> 	10.0.0.255
>
> this is not what hipac does.
>
> hipac works on ranges, not prefixes. In the decisiontree it collects the list 
> or unique ranges for the active rules of the current dimention
>
> By example:
>
> If the ruleset contains the following destinations, in priority by the order 
> listed:
>
>   10.0.0.45-10.0.0.64
>   10.0.0.0/24
>   10.0.0.87
>
> then the decision tree contains
>
>   10.0.0.0-10.0.0.44
>   10.0.0.45-10.0.0.64
>   10.0.0.65-10.0.0.255
>
>   (10.0.0.87 is hided by the /24, and does not exists in the hipac tree)
>
> if 10.0.0.87 is moved up one line to unhide it, or there is matches in later 
> dimentions making this range active then the tree will contain
>
>   10.0.0.0-10.0.0.44
>   10.0.0.35-10.0.0.64
>   10.0.0.65-10.0.0.86
>   10.0.0.87
>   10.0.0.88-10.0.0.255


The above should obviously read

   10.0.0.0-10.0.0.44
   10.0.0.45-10.0.0.64
   10.0.0.65-10.0.0.86
   10.0.0.87
   10.0.0.88-10.0.0.255

(45, not 35)


> It is true that the current tree algorith is a bit memory hungry due to a lot 
> of information being duplicated when there partially overlapping or partly 
> identical rules, but Michael is working on a new generation where the tree is 
> optimal with no duplicated information. This complication in memory usage is 
> from the lookup being applied in multiple dimenstions (destination, source, 
> protocol, ports, ...).
>
> Example:
>
>  iptables -A FORWARD -d 192.168.1.54 -s 10.0.0.0/24 -j LOG
>  iptables -A FORWARD -d 192.168.1.54 -s 10.0.0.0/24 -j ACCEPT
>  iptables -A FORWARD -d 192.168.1.87 -s 10.0.0.0/24 -j LOG
>  iptables -A FORWARD -d 192.168.1.87 -s 10.0.0.0/24 -j ACCEPT
>  iptables -A FORWARD -d 192.168.1.0/24 -s 10.0.0.0/24 -j REJECT
>
> In the current algorithm this becomes
>
>
> Destination tree (ROOT):
>
>  192.168.1.0  - 192.168.1.53  -> Source tree 1
>  192.168.1.54                 -> Source tree 2
>  192.168.1.55 - 192.168.86    -> Source tree 3
>  192.168.1.87                 -> Source tree 4
>  192.168.1.88 - 192.168.1.255 -> Source tree 5
>
> Source tree 1:
>
>  10.0.0.0-10.0.0.255  -j REJECT
>
> Source tree 2:
>
>  10.0.0.0-10.0.0.255  -j LOG, ACCEPT
>
> Source tree 3:
>
>  10.0.0.0-10.0.0.255  -j REJECT
>
> Source tree 4:
>
>  10.0.0.0-10.0.0.255  -j LOG, ACCEPT
>
> Source tree 5:
>
>  10.0.0.0-10.0.0.255  -j REJECT
>
>
> while in the new algorithm all the identical subtrees will be eleminated, 
> resulting in a hipac ruleset which looks like
>
>
> Destination tree (ROOT):
>
>  192.168.1.0  - 192.168.1.53  -> Source tree 1
>  192.168.1.54                 -> Source tree 2
>  192.168.1.55 - 192.168.86    -> Source tree 1
>  192.168.1.87                 -> Source tree 2
>  192.168.1.88 - 192.168.1.255 -> Source tree 1
>
> Source tree 1:
>
>  10.0.0.0-10.0.0.255  -j REJECT
>
> Source tree 2:
>
>  10.0.0.0-10.0.0.255  -j LOG, ACCEPT
>
>
>
> and all this while still allowing for very efficient insert/delete 
> operations.
>
>
>
> what the iptables core is solving is the range location problem of a 
> multidimensional lookup (destination, source, ports, protocols etc...). The 
> user entered ruleset information is backward compatible with iptables for 
> convenience, but already allows you to specify ranges rather than prefixes 
> and it is not unlikely it will become even more expressive in the future to 
> allow for much leaner ruleset specifications.


Correction:

what the hipac core is solving ...


Regards
Henrik

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01  3:47 ` shemminger
  2004-10-01  4:01   ` David S. Miller
@ 2004-10-02  8:44   ` Roberto Nibali
  2004-10-02 14:42     ` Henrik Nordstrom
                       ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: Roberto Nibali @ 2004-10-02  8:44 UTC (permalink / raw)
  To: shemminger; +Cc: netfilter-devel

> Have you looked at nf-hipac.  It works and is fast but is limited
> to 2.4 right now.  It's on my to do list...

I've started porting it as well, except a few locking issues and some 
Makefile hazzle I should be done with it. I can send it to you in private.

Having seen a lot of packet classification and matching schemes lately 
I'm a bit in favour of the IDD based approach described in [1]. The 
proof of concept implementation, which lacks a big amount of features 
still, can be found at [2]. I probably favour that since I teach Linear 
Algebra at college myself :)

An last but not least, I've written a short overview paper on the 3 
currently available "alternatives" to netfilter [3], which are "tc 
action", "nf-hipac" and "compact filter". However I unfortunately was 
required to write it in german; so it might not be of much use to a lot 
of people on this list.

[1] http://www.cs.auc.dk/~mixxel/papers/icn04.pdf
[2] http://www.cs.aau.dk/~mixxel/cf/
[3] 
http://pubwww.hsz-t.ch/~rnibali/downloads/FACHSTUDIUM/seminar/Ersatz_fuer_iptables_als_filter/iptables_ersatz.pdf

Best regards,
Roberto Nibali, ratz
-- 
echo 
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-02  8:44   ` Roberto Nibali
@ 2004-10-02 14:42     ` Henrik Nordstrom
  2004-10-04 10:04     ` Jozsef Kadlecsik
  2004-10-04 15:51     ` Stephen Hemminger
  2 siblings, 0 replies; 16+ messages in thread
From: Henrik Nordstrom @ 2004-10-02 14:42 UTC (permalink / raw)
  To: Roberto Nibali; +Cc: netfilter-devel, shemminger

On Sat, 2 Oct 2004, Roberto Nibali wrote:

>> Have you looked at nf-hipac.  It works and is fast but is limited
>> to 2.4 right now.  It's on my to do list...
>
> I've started porting it as well, except a few locking issues and some 
> Makefile hazzle I should be done with it. I can send it to you in private.
>
> Having seen a lot of packet classification and matching schemes lately I'm a 
> bit in favour of the IDD based approach described in [1]. The proof of 
> concept implementation, which lacks a big amount of features still, can be 
> found at [2]. I probably favour that since I teach Linear Algebra at college 
> myself :)
>
> An last but not least, I've written a short overview paper on the 3 currently 
> available "alternatives" to netfilter [3], which are "tc action", "nf-hipac" 
> and "compact filter". However I unfortunately was required to write it in 
> german; so it might not be of much use to a lot of people on this list.
>
> [1] http://www.cs.auc.dk/~mixxel/papers/icn04.pdf
> [2] http://www.cs.aau.dk/~mixxel/cf/
> [3] http://pubwww.hsz-t.ch/~rnibali/downloads/FACHSTUDIUM/seminar/Ersatz_fuer_iptables_als_filter/iptables_ersatz.pdf

For what it is worth I have made a rough babelfish + minor cleanups 
translation of your document at 
http://hno.homeip.net/~henrik/iptables_ersatz/ which may help (or 
sometimes confuse) those of us (myself included) who can not read 
German...

There is also a HTML version of the original German version.

Regards
Henrik

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-02  8:44   ` Roberto Nibali
  2004-10-02 14:42     ` Henrik Nordstrom
@ 2004-10-04 10:04     ` Jozsef Kadlecsik
  2004-10-04 15:51     ` Stephen Hemminger
  2 siblings, 0 replies; 16+ messages in thread
From: Jozsef Kadlecsik @ 2004-10-04 10:04 UTC (permalink / raw)
  To: Roberto Nibali; +Cc: netfilter-devel, shemminger

On Sat, 2 Oct 2004, Roberto Nibali wrote:

> > Have you looked at nf-hipac.  It works and is fast but is limited
> > to 2.4 right now.  It's on my to do list...
>
> I've started porting it as well, except a few locking issues and some
> Makefile hazzle I should be done with it. I can send it to you in private.
>
> Having seen a lot of packet classification and matching schemes lately
> I'm a bit in favour of the IDD based approach described in [1]. The
> proof of concept implementation, which lacks a big amount of features
> still, can be found at [2]. I probably favour that since I teach Linear
> Algebra at college myself :)

Unfortunately the time to generate the rulesets grows rapidly by the
number of rules and that can cripple cf in practice.

It's definitely not a comprehensive solution, still worth to mention ipset
as an alternative.

There are different kind of sets already supported by ipset:

- ipmap: uses a memory range, where each bit represents one IP address.
  Up to to 65535 (B-class network) IP addresses can be stored. Can store
  network addresses of the same size as well.

- portmap: the same as ipmap, but for port numbers

- macipmap: in a memory range each 8 bytes represents one IP and a MAC
  addresses

- iphash: IP/network addresses stored in a hash where clashes are resolved
  by double hashing and rehashing

Any (set, set-element) can be bound to another set which can then be
matched in iptables by a single match. An example: let's assume we
administer a full A class network (:-) and want to set up the rules for
every server on the network, expressed in sets:

- the ipmap set "toplevel" stores the addresses of our in-use B-class
  networks
- the ipmap sets "b-class-0" ... "b-class-255" stores the IP addresses of
  the servers in the different B-class networks

  We bound every (toplevel, entry) to the corresponding "b-class-n" set.

- portmap sets "service-model-n" contain the ports of the services offered

  We bound the entries in "b-class-n" to the corresponding
  "service-model-n" sets.

Now the single iptables rule

  .. -m set --set toplevel dst,dst,dst -m state --state NEW

will match any new request targeted to any service on any server. The
computation cost of the matching:

- match the destination (network) address in set "toplevel"
- find the binding value corresponding to (toplevel, entry)
- match the destination address in set "b-class-n" we found
- find the binding value corresponding to (b-class-n, entry)
- match the destination port in service-model-n

That is three bitmap and two hash lookups.

Set elements/bindings can be added/deleted fast, save/restore
functionality is supported. One can also swap sets without the need to
modify referring iptables rules or set bindings.

The new version I'm working on it adds the save/restore support, better
iphash type, minimalized syscalls and better binding support compared to
the old version.

It's a pragmatical solution and of course does not address the
rule-handling in iptables itself at all.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-02  8:44   ` Roberto Nibali
  2004-10-02 14:42     ` Henrik Nordstrom
  2004-10-04 10:04     ` Jozsef Kadlecsik
@ 2004-10-04 15:51     ` Stephen Hemminger
  2 siblings, 0 replies; 16+ messages in thread
From: Stephen Hemminger @ 2004-10-04 15:51 UTC (permalink / raw)
  To: Roberto Nibali; +Cc: netfilter-devel

On Sat, 2004-10-02 at 10:44 +0200, Roberto Nibali wrote:
> > Have you looked at nf-hipac.  It works and is fast but is limited
> > to 2.4 right now.  It's on my to do list...
> 
> I've started porting it as well, except a few locking issues and some 
> Makefile hazzle I should be done with it. I can send it to you in private.

Great, I planned to convert it to RCU and get rid of the netdevice
unregister changes.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01  2:39 iptables as a state machine David S. Miller
  2004-10-01  3:47 ` shemminger
@ 2004-10-01 20:06 ` Gonzalo A. Arana
  2004-10-02 21:01 ` Tobias DiPasquale
  2 siblings, 0 replies; 16+ messages in thread
From: Gonzalo A. Arana @ 2004-10-01 20:06 UTC (permalink / raw)
  To: netfilter-devel

Content-Transfer-Encoding: 7bit


On Thu, 30 Sep 2004 19:39:55 -0700
"David S. Miller" <davem@davemloft.net> wrote:

> 
> So while waiting for a kernel build to finish I was sniffing
> around the iptables code thinking about how one might optimize
> the lookups.
> 
> The ipt targets are simple enough already, and I think this
> shows the elegance of the core iptables design.
> 
> It's the main ip header + indev + outdev matching that is
> the real memory and cpu cycle killer in these paths.  Once
> we match the target, the verdict is determined quickly.

Are you trying to optimize a single rule matching?  Or the traveral of an entire chain?

> ...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-01  2:39 iptables as a state machine David S. Miller
  2004-10-01  3:47 ` shemminger
  2004-10-01 20:06 ` Gonzalo A. Arana
@ 2004-10-02 21:01 ` Tobias DiPasquale
  2004-10-02 21:52   ` Thomas Heinz
  2 siblings, 1 reply; 16+ messages in thread
From: Tobias DiPasquale @ 2004-10-02 21:01 UTC (permalink / raw)
  To: David S. Miller, netfilter-devel

On Thu, 30 Sep 2004 19:39:55 -0700, David S. Miller <davem@davemloft.net> wrote:
> I think iptables core IP header + indev + outdev match is a
> state machine problem as well.  Such a state machine can be
> made extremely small memory wise.  The lookup can be something
> like running a berkeley packet filter on the frame.  Except
> that instead of a "yes or no" answer we get a pointer to a
> target.

What about using a n-ary PATRICIA trie to solve this problem? That
would yield O(1)-time matching of rules and the data pointer for each
node in the tree could be the list of targets that apply to that
particular IP/subnet? Not sure how ranges would work yet, though if
they didn't fit into a CIDR block...

-- 
[ Tobias DiPasquale ]
0x636f6465736c696e67657240676d61696c2e636f6d

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: iptables as a state machine
  2004-10-02 21:01 ` Tobias DiPasquale
@ 2004-10-02 21:52   ` Thomas Heinz
  0 siblings, 0 replies; 16+ messages in thread
From: Thomas Heinz @ 2004-10-02 21:52 UTC (permalink / raw)
  To: Tobias DiPasquale; +Cc: netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 1326 bytes --]

Hi Tobias

You wrote:
> What about using a n-ary PATRICIA trie to solve this problem? That
> would yield O(1)-time matching of rules and the data pointer for each
> node in the tree could be the list of targets that apply to that
> particular IP/subnet? Not sure how ranges would work yet, though if
> they didn't fit into a CIDR block...

Maybe you want to have a look at the following paper.
http://tiny-tera.stanford.edu/~nickm/papers/classification_tutorial_01.pdf

It provides a very good overview of a wide range of different
approaches towards packet classification.

There are two basic trie-based approaches presented in the paper.
1) hierarchical tries
    lookup time:        O(w^d)
    storage complexity: O(n * d * w)

2) set-pruning tries
    lookup time:        O(d * w)
    storage complexity: O(n^d * d * w)

n: number of rules
d: number of dimensions (such as source ip, dest. ip, protocol)
w: bit width of largest header field

As you can see from the performance complexity both approaches
are not desirable for d > 1.

BTW, every range in [0, 2^w - 1] can be expressed by at most
2*(w-1) prefixes. Hence, if you have a single rule with d range
matches, you need at most (2*(w-1))^d rules with prefix matches
to express the _single_ range rule.

Sounds not very promising, does it? ;-)

Regards,

Thomas

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 251 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2004-10-04 15:51 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-10-01  2:39 iptables as a state machine David S. Miller
2004-10-01  3:47 ` shemminger
2004-10-01  4:01   ` David S. Miller
2004-10-01  8:24     ` Thomas Heinz
2004-10-01 19:46       ` David S. Miller
2004-10-01 20:26         ` Thomas Heinz
2004-10-01 20:33         ` Stephen Hemminger
2004-10-01 11:12     ` Henrik Nordstrom
2004-10-01 12:06       ` Henrik Nordstrom
2004-10-02  8:44   ` Roberto Nibali
2004-10-02 14:42     ` Henrik Nordstrom
2004-10-04 10:04     ` Jozsef Kadlecsik
2004-10-04 15:51     ` Stephen Hemminger
2004-10-01 20:06 ` Gonzalo A. Arana
2004-10-02 21:01 ` Tobias DiPasquale
2004-10-02 21:52   ` Thomas Heinz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.