[RFC] High Performance Packet Classification (HPPC) to succeed HIPAC

netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC] High Performance Packet Classification (HPPC) to succeed HIPAC
@ 2007-09-07  9:09 Steven Van Acker
  2007-09-07 10:57 ` Patrick McHardy
  2007-09-07 22:45 ` Thomas Heinz
  0 siblings, 2 replies; 6+ messages in thread
From: Steven Van Acker @ 2007-09-07  9:09 UTC (permalink / raw)
  To: Netfilter Development Mailinglist

Hi,

I'm a user of nf-HIPAC (http://www.hipac.org) and I've tried to monitor it's
progress into the main kernel tree. Unfortunately, the work on nf-HIPAC seems to
have stopped.  The last update of the website is from November 8th 2005, which
is almost 2 years ago.

Unless there is some work being done on integrating HIPAC with iptables (or
xtables), I would like to give this a go.

The main problem I see with this right now is that the patch is way too big to
analyze and rework (at least for me). So instead, I would like to reimplement
everything from scratch and integrate it with iptables from the very start.

Both these situation sketches are based on my (currently) narrow perception of
how things seem to work:

current situation
-----------------

For every packet traversing a chain, all rules are evaluated untill one reaches
a terminal decision. The worst case scenario is that no rules in the chain
match and all of them need to be traversed.

Each rule has 5 "fixed" specifiers that can be seen as dimensions: source and
destination IP addresses, protocol number and incoming/outgoing interfaces.

Every time a rule needs to be evaluated, these 5 dimensions are checked first.
(ip_packet_match())

The complexity of this is O(n), for n rules.

HIPAC
-----

In HIPAC, all rules are stored in a trie indexed by specifier. By walking down
such a trie given e.g. an IP address, all rules relevant to that IP address can
be found. Repeating this walk for every dimension and intersecting the results
(which are lists of rules), a single list is obtained that is relevant for the
packet to classify. All that remains to be done then, is go over those rules to
classify the packet.

The HIPAC authors claim that the complexity of the trie-lookup is O(log log N),
where N is the size of the search-space. For IPv4, (and log == log2), this would
be 5. The complexity of this way of filtering is not related to the amount of
rules stored in the trie, which makes it very fast for a large ruleset.

I don't understand the (apparently) complicated datastructure used for this in
HIPAC, or how it results in O(log log N) complexity, so I have formulated
another theory which should perform about the same.

HPPC (High Performance Packet Classification)
---------------------------------------------

Just like in HIPAC, I would like to refer to rules from a (radix) tree, indexed
by a specifier. Every level in this tree will split up the searchspace in 256
pieces, unless no nodes are stored below it. For IPv4 IP addresses, this means
an IP can be looked up by splitting it in 4 bytes, and using each of the bytes
as an index into an array at each level of the tree.

The complexity of this is O((log N) / 8), with N being 2^32, and is the fastest
and simplest way of looking up an IP address I can imagine without using alot of
memory.

Just like HIPAC, intersecting the lists of those lookups results in the final
list of rules to consider for that packet.

The same technique can be used to lookup port numbers (2 levels), mac-addresses
(6 levels), IPv6 addresses (16 levels) or pretty much anything else.

My questions:

Is anything like this already in the works ?
Would it be useful ?
Are there any technical reasons why this should not be implemented the way I'd
like to do it ?

kind regards,
-- Steven

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC] High Performance Packet Classification (HPPC) to succeed HIPAC
  2007-09-07  9:09 [RFC] High Performance Packet Classification (HPPC) to succeed HIPAC Steven Van Acker
@ 2007-09-07 10:57 ` Patrick McHardy
  2007-09-07 11:44   ` Steven Van Acker
  2007-09-07 22:45 ` Thomas Heinz
  1 sibling, 1 reply; 6+ messages in thread
From: Patrick McHardy @ 2007-09-07 10:57 UTC (permalink / raw)
  To: Steven Van Acker; +Cc: Netfilter Development Mailinglist

Steven Van Acker wrote:
> Hi,
> 
> I'm a user of nf-HIPAC (http://www.hipac.org) and I've tried to monitor it's
> progress into the main kernel tree. Unfortunately, the work on nf-HIPAC seems to
> have stopped.  The last update of the website is from November 8th 2005, which
> is almost 2 years ago.
> 
> Unless there is some work being done on integrating HIPAC with iptables (or
> xtables), I would like to give this a go.
> 
> The main problem I see with this right now is that the patch is way too big to
> analyze and rework (at least for me). So instead, I would like to reimplement
> everything from scratch and integrate it with iptables from the very start.


We're holding the netfilter workshop next week, and one of the
topics will be a possible merge of nf-hipac. If this doesn't
work out, we'll start looking into alternatives, but until then
I prefer to stay optimistic :)

So lets postpone this discussion for a week.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC] High Performance Packet Classification (HPPC) to succeed HIPAC
  2007-09-07 10:57 ` Patrick McHardy
@ 2007-09-07 11:44   ` Steven Van Acker
  0 siblings, 0 replies; 6+ messages in thread
From: Steven Van Acker @ 2007-09-07 11:44 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Netfilter Development Mailinglist

On Fri, Sep 07, 2007 at 12:57:16PM +0200, Patrick McHardy wrote:
> We're holding the netfilter workshop next week, and one of the
> topics will be a possible merge of nf-hipac. If this doesn't
> work out, we'll start looking into alternatives, but until then
> I prefer to stay optimistic :)
> 
> So lets postpone this discussion for a week.

Hi,

this sounds good, I can wait one more week :)

kind regards,
-- Steven

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC] High Performance Packet Classification (HPPC) to succeed HIPAC
  2007-09-07  9:09 [RFC] High Performance Packet Classification (HPPC) to succeed HIPAC Steven Van Acker
  2007-09-07 10:57 ` Patrick McHardy
@ 2007-09-07 22:45 ` Thomas Heinz
  2007-09-08 18:16   ` Steven Van Acker
  1 sibling, 1 reply; 6+ messages in thread
From: Thomas Heinz @ 2007-09-07 22:45 UTC (permalink / raw)
  To: netfilter-devel

Hi Steven

You wrote:
> In HIPAC, all rules are stored in a trie indexed by specifier. By
> walking down such a trie given e.g. an IP address, all rules relevant to
> that IP address can be found. Repeating this walk for every dimension
> and intersecting the results (which are lists of rules), a single list
> is obtained that is relevant for the packet to classify. All that
> remains to be done then, is go over those rules to classify the packet.

Well, no. HiPAC is not based on tries but interprets the packet 
classification problem as a geometric problem where each rule is 
represented by a d-dimensional orthotope (= rectangular hypercube, for d=2 
a rectangle) and a packet is a point in the d-dimensional space. Hence, 
packet classification is equivalent to finding the highest priority 
orthotope enclosing the point.

> The HIPAC authors claim that the complexity of the trie-lookup is O(log
> log N), where N is the size of the search-space. For IPv4, (and log ==
> log2), this would be 5.

With 'this' you mean that log log N = 5, I guess. Well, using van Emde Boas 
trees also known as stratified trees in order to represent the bounded 
range location problem (subproblem of the packet classification problem in 
HiPAC), it is possible to achieve O(d log log N) lookup time for the 
d-dimensional packet classification problem. However, in practice this is 
not relevant as for practical range location problem sizes as occuring in 
the HiPAC data structure, a simpler data structure can be used yielding 
O(d log n) lookup time where n is the number of rules.

> Just like in HIPAC, I would like to refer to rules from a (radix) tree,
> indexed by a specifier. Every level in this tree will split up the
> searchspace in 256 pieces, unless no nodes are stored below it. For IPv4
> IP addresses, this means an IP can be looked up by splitting it in 4
> bytes, and using each of the bytes as an index into an array at each
> level of the tree.
>
> The complexity of this is O((log N) / 8), with N being 2^32, and is the
> fastest and simplest way of looking up an IP address I can imagine
> without using alot of memory.

The disadvantage of using tries is that you can only use prefixes for each 
dimension, e.g. port ranges have to be represented as a set of set of port 
prefixes. In the worst case you need 2(k-1) prefixes to represent a single 
range where k is the number of bits (e.g. 32 in case of IPv4 addresses). 
Hence, if you have a rule with d range matches, you end up with (2(k-1))^d 
prefix rules in the worst case (assuming that each match has bit width k).

Moreover, this approach does not scale well to multiple dimensions: either 
you obtain poor lookup times (hierarchical tries) or poor space complexity 
(set-pruning tries). See e.g.
http://tiny-tera.stanford.edu/~nickm/papers/classification_tutorial_01.pdf

Cheers,

Thomas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC] High Performance Packet Classification (HPPC) to succeed HIPAC
  2007-09-07 22:45 ` Thomas Heinz
@ 2007-09-08 18:16   ` Steven Van Acker
  2007-09-09 22:12     ` Thomas Heinz
  0 siblings, 1 reply; 6+ messages in thread
From: Steven Van Acker @ 2007-09-08 18:16 UTC (permalink / raw)
  To: Thomas Heinz; +Cc: netfilter-devel

On Sat, Sep 08, 2007 at 12:45:48AM +0200, Thomas Heinz wrote:
> Hi Steven

Hi Thomas! :)

> You wrote:
> > In HIPAC, all rules are stored in a trie indexed by specifier. By
> > walking down such a trie given e.g. an IP address, all rules relevant to
> > that IP address can be found. Repeating this walk for every dimension
> > and intersecting the results (which are lists of rules), a single list
> > is obtained that is relevant for the packet to classify. All that
> > remains to be done then, is go over those rules to classify the packet.
> 
> Well, no. HiPAC is not based on tries but interprets the packet 
> classification problem as a geometric problem where each rule is 
> represented by a d-dimensional orthotope (= rectangular hypercube, for d=2 
> a rectangle) and a packet is a point in the d-dimensional space. Hence, 
> packet classification is equivalent to finding the highest priority 
> orthotope enclosing the point.

the above was actually my interpretation of the thesis you wrote, but
your explanantion is obviously more correct :)

> > The HIPAC authors claim that the complexity of the trie-lookup is O(log
> > log N), where N is the size of the search-space. For IPv4, (and log ==
> > log2), this would be 5.
> 
> With 'this' you mean that log log N = 5, I guess. Well, using van Emde Boas 
> trees also known as stratified trees in order to represent the bounded 
> range location problem (subproblem of the packet classification problem in 
> HiPAC), it is possible to achieve O(d log log N) lookup time for the 
> d-dimensional packet classification problem. 

I see now how vBE trees can achieve O(log log N) (after reading the
wikipedia article). The tree is actually a trie, where each next layer
holds maximum half the entries of the previous layer. Concretely for
IPv4, this means the root of the tree contains 2^16 pointers to nodes of
the next layer (the first 16 bits of the address is used as index in
this array of the rootnode), the next layer has 2^8, then 2^4, 2^2 and
so on.

The way I suggested uses a fixed 8-bit wide key at each level.
(By the way, I'm not saying HiPAC does a bad job, I just never truly
understood it ;) We use HiPAC on our central firewall and are very happy
with its performance).

For IPv4, using fixed 8-bit wide keys takes 4 steps to lookup an IP
address. In HiPAC, it takes 5 steps.

But in IPv6, using HiPAC would take only 7 steps, compared to 16 steps
in using 8-bit indices.

However, a genuine vEB tree would need to store 2^64 pointers at the
first layer, which is more memory than any server I know of, uses.

> However, in practice this is 
> not relevant as for practical range location problem sizes as occuring in 
> the HiPAC data structure, a simpler data structure can be used yielding 
> O(d log n) lookup time where n is the number of rules.

Why is the lookup time dependant on the number of rules ?

> 
> > Just like in HIPAC, I would like to refer to rules from a (radix) tree,
> > indexed by a specifier. Every level in this tree will split up the
> > searchspace in 256 pieces, unless no nodes are stored below it. For IPv4
> > IP addresses, this means an IP can be looked up by splitting it in 4
> > bytes, and using each of the bytes as an index into an array at each
> > level of the tree.
> >
> > The complexity of this is O((log N) / 8), with N being 2^32, and is the
> > fastest and simplest way of looking up an IP address I can imagine
> > without using alot of memory.
> 
> The disadvantage of using tries is that you can only use prefixes for each 
> dimension, e.g. port ranges have to be represented as a set of set of port 
> prefixes. In the worst case you need 2(k-1) prefixes to represent a single 
> range where k is the number of bits (e.g. 32 in case of IPv4 addresses). 
> Hence, if you have a rule with d range matches, you end up with (2(k-1))^d 
> prefix rules in the worst case (assuming that each match has bit width k).

Do you mean 2^(k-1) prefixes as the worst case ?

Suppose I have a trie with 2 levels of 8-bit wide prefixes to represent the 
values 0 to 65535.

If I want to represent a range of 1 to 255, that means I need to store
the rule 255 times in the trie (index 0 at level 0, then 1-255 at level
1)

If I need 1 to 510, I would need to store 510 entries ([0][1] to
[0][255] and [1][0] to [1][254])

But if I need to represent a range that encompases an entire level 1
block, then I can just add the rule at a higher level.

The problem with space is very real here. vEB trees would need less
space, because as the depth increases, the blocksize decreases, so more
of the entries in the range can be grouped.

For IPv6, maybe a hybrid approach can be used: a couple fixed-width
prefixes for the first few levels, which then point to vEB trees.

> Moreover, this approach does not scale well to multiple dimensions: either 
> you obtain poor lookup times (hierarchical tries) or poor space complexity 
> (set-pruning tries). See e.g.
> http://tiny-tera.stanford.edu/~nickm/papers/classification_tutorial_01.pdf

This is only if you store multiple dimensions in a single trie, no ?
If you do lookups for each dimension separately, and then intersect the
results, the complexity is just O(d * log n)

kind regards,
-- Steven

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC] High Performance Packet Classification (HPPC) to succeed HIPAC
  2007-09-08 18:16   ` Steven Van Acker
@ 2007-09-09 22:12     ` Thomas Heinz
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Heinz @ 2007-09-09 22:12 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Steven Van Acker

You wrote:
> I see now how vBE trees can achieve O(log log N) (after reading the
> wikipedia article). The tree is actually a trie, where each next layer
> holds maximum half the entries of the previous layer. Concretely for
> IPv4, this means the root of the tree contains 2^16 pointers to nodes of
> the next layer (the first 16 bits of the address is used as index in
> this array of the rootnode), the next layer has 2^8, then 2^4, 2^2 and
> so on.

Well, an efficient implementation is a little more complicated. You may 
want to read http://citeseer.ist.psu.edu/483355.html

Essentially, a stratified tree consists of a top and a bottom part. The top 
part is again a stratified tree containing the higher N/2 bits of the 
values and the bottom part is a dictionary (hash) which maps each entry i 
in the top tree to a stratified tree containing the lower N/2 bits of all 
values containing i in the higher N/2 bits. This construction continues 
recursively.

> (By the way, I'm not saying HiPAC does a bad job, I just never truly
> understood it ;) We use HiPAC on our central firewall and are very happy
> with its performance).

Nice to hear that :)

> However, a genuine vEB tree would need to store 2^64 pointers at the
> first layer, which is more memory than any server I know of, uses.

No, the paper I mentioned above provides an implementation of stratified 
trees with space complexity O(n) where n is the the number of values 
inserted into the stratified tree.

> > However, in practice this is
> > not relevant as for practical range location problem sizes as occuring
> > in the HiPAC data structure, a simpler data structure can be used
> > yielding O(d log n) lookup time where n is the number of rules.
>
> Why is the lookup time dependant on the number of rules ?

Well, HiPAC does not implement vEB/stratified trees but uses (depending on 
the release) a static B-Tree with implicit layout (no pointers) or simply 
binary search on an array. Since both data structures are not based on a 
bounded universe, the lookup time depends on the number of rules. Hence 
O(d log n) time for the lookup. In practice, they are more efficient. In 
the beginning, we tried a highly tuned stratified tree implementation and 
it lost against our optimized btree variant which to our surprise again 
lost against an efficient binary search implementation.

Actually, the efficiency highly depends on the problem size. Since the 
problem that is solved by these data structures occurs very often in the 
HiPAC data structure, most instances are very small even for huge rule 
sets.

> > The disadvantage of using tries is that you can only use prefixes for
> > each dimension, e.g. port ranges have to be represented as a set of
> > set of port prefixes. In the worst case you need 2(k-1) prefixes to
> > represent a single range where k is the number of bits (e.g. 32 in
> > case of IPv4 addresses). Hence, if you have a rule with d range
> > matches, you end up with (2(k-1))^d prefix rules in the worst case
> > (assuming that each match has bit width k).
>
> Do you mean 2^(k-1) prefixes as the worst case ?

No, any range [a, b] where a, b \in [0, 2^(k-1)] can be expressed by at 
most 2(k-1) prefixes. Just to avoid any confusion. With prefix I mean the 
following. Let k = 4 and our bit prefix be e.g. <01>. Then this prefix 
represents all numbers <0100>...<0111> (4-7).

If you have a rule with d dimensions where each match (dimension) is an 
interval you have to use (2(k-1))^d prefix rules in the worst case to 
represent this rule set (all possible combinations).

> Suppose I have a trie with 2 levels of 8-bit wide prefixes to represent
> the values 0 to 65535.
>
> If I want to represent a range of 1 to 255, that means I need to store
> the rule 255 times in the trie (index 0 at level 0, then 1-255 at level
> 1)
>
> If I need 1 to 510, I would need to store 510 entries ([0][1] to
> [0][255] and [1][0] to [1][254])

You can store prefixes in the trie. Thus, you don't have to store [0..255] 
in the second level under 0 in the first level as the leaf is full. To 
understand this, consider a trie where each node represents a single bit. 
The rest is just optimization.

Note that even with this "trick" tries are highly inefficient for packet 
classification.

The one-dimensional packet classification problem based on intervals is 
actually very simple. You are given n rules of the form [a, b] and a 
unique priority for each rule. Now, these intervals may intersect each 
other but we can easily provide at most 2n + 1 intervals such that none of 
these intervals is intersected by any of our n rules. Check the ascii art 
to see this.

     |-----------------|
             |------------------|
  |-------------------------------------|

|-|--|-------|---------|--------|-------|---------------------------|
0                                                                2^k - 1

Each of the adjacent intervals on the lowest line can be associated with 
the highest priority overlapping interval (rule) and hence the 
one-dimensional packet classification problem boils down to storing the 
right end points in a data structure D which supports the locate operation 
such that
locate(x): returns the smallest y >= x such that y \in D
This problem is also called range location problem.

Efficient data structures are (depending on the problem size): array 
(binary search), btree, vEB (requires bounded universe).

> > Moreover, this approach does not scale well to multiple dimensions:
> > either you obtain poor lookup times (hierarchical tries) or poor space
> > complexity (set-pruning tries). See e.g.
> > http://tiny-tera.stanford.edu/~nickm/papers/classification_tutorial_01
> >.pdf
>
> This is only if you store multiple dimensions in a single trie, no ?
> If you do lookups for each dimension separately, and then intersect the
> results, the complexity is just O(d * log n)

Sorry to disappoint you but no. Each of the d lookups may return O(n) rules 
which you have to intersect and this intersection cannot be done in O(d 
log n) time.

Cheers,

Thomas

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2007-09-09 22:12 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-07  9:09 [RFC] High Performance Packet Classification (HPPC) to succeed HIPAC Steven Van Acker
2007-09-07 10:57 ` Patrick McHardy
2007-09-07 11:44   ` Steven Van Acker
2007-09-07 22:45 ` Thomas Heinz
2007-09-08 18:16   ` Steven Van Acker
2007-09-09 22:12     ` Thomas Heinz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).