netdev.vger.kernel.org archive mirror
* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification
       [not found] <20020926.020602.75761707.davem@redhat.com>
@ 2002-09-26 12:03 ` jamal
  2002-09-26 20:23   ` Roberto Nibali
  0 siblings, 1 reply; 5+ messages in thread
From: jamal @ 2002-09-26 12:03 UTC (permalink / raw)
  To: David S. Miller; +Cc: ratz, ak, niv, linux-kernel, netdev


It would be nice if people would start cc'ing networking-related
discussions to netdev. I missed the first part of the discussion
but i take it the NF-HIPAC authors posted a patch. BTW, I emailed the authors
when i read the paper but never heard back.
What i wanted from the authors was a comparison against one of the tc
classifiers, not iptables.

On Thu, 26 Sep 2002, David S. Miller wrote:

> You are talking about a lot of independent things, but I'm going
> to defer my contributions until we have actual code people can
> start plugging netfilter into if they want.
>

I hacked some code using the traffic control framework around OLS time;
there are a lot of ideas i haven't incorporated yet. Too many hacks, too
little time ;-> I think this is what i may have shown Roberto on my
laptop over a drink.
I probably wouldn't have put this code out if my complaints about
netfilter weren't ignored.
And you know what happens when you start writing poetry: I ended up
worrying about more than just the performance problems of iptables; for
example, the code i have now makes it easy to extend the path a packet
takes using simple policies.
The code i have is based around the tc framework. One thing i liked about
netfilter is the idea of targets being separate modules, so the code i
have in fact makes use of netfilter targets.
I plan on revisiting this code at some point, maybe this weekend now that
i am reminded of it ;->
Take a look:
http://www.cyberus.ca/~hadi/patches/action.DESCRIPTION

> About using syslog to record messages, that is doomed to failure,
> implement log messages via netlink and use that to log the events
> instead.
>

Agreed, you need a netlink to syslog converter.
Netlink is king -- all the policies in the above code are netlink
controlled. All events are also netlink transported. You don't have to send
every little message you see; netlink allows you to batch and you could
easily do a Nagle-like algorithm. Next steps are a distributed version
of netlink..
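
The batching idea can be sketched roughly as follows. This is only an
illustration, not the code from the patch: the class name, the `send`
callback and the `max_batch` / `max_delay` parameters are all hypothetical,
and nothing here talks to netlink. Messages are queued and handed over in
one batch once enough have piled up or the oldest one has waited too long.

```python
import time
from typing import Callable, List


class NagleLikeBatcher:
    """Coalesce small event messages into batches before sending them.

    Hypothetical helper for illustration only: `send` stands in for
    whatever would actually transmit a batch (e.g. over a netlink socket).
    """

    def __init__(self, send: Callable[[List[bytes]], None],
                 max_batch: int = 32, max_delay: float = 0.01,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send
        self._max_batch = max_batch      # flush when this many are queued
        self._max_delay = max_delay      # or when the oldest is this old (s)
        self._clock = clock              # injectable so it can be tested
        self._pending: List[bytes] = []
        self._first_queued_at = 0.0

    def submit(self, msg: bytes) -> None:
        """Queue one message; flush if the batch is full or too old."""
        if not self._pending:
            self._first_queued_at = self._clock()
        self._pending.append(msg)
        too_old = self._clock() - self._first_queued_at >= self._max_delay
        if len(self._pending) >= self._max_batch or too_old:
            self.flush()

    def flush(self) -> None:
        """Send everything queued so far as a single batch."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._send(batch)


if __name__ == "__main__":
    batches: List[List[bytes]] = []
    batcher = NagleLikeBatcher(batches.append, max_batch=4, max_delay=1.0)
    for i in range(10):
        batcher.submit(b"event %d" % i)
    batcher.flush()                      # push out the final partial batch
    print([len(b) for b in batches])     # [4, 4, 2]
```

Note that the age check only runs when a new message arrives; a real
implementation would also need a timer to flush a stale batch when the
event stream goes quiet.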

cheers,
jamal


* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification
  2002-09-26 12:03 ` [ANNOUNCE] NF-HIPAC: High Performance Packet Classification jamal
@ 2002-09-26 20:23   ` Roberto Nibali
  2002-09-27 13:57     ` jamal
  0 siblings, 1 reply; 5+ messages in thread
From: Roberto Nibali @ 2002-09-26 20:23 UTC (permalink / raw)
  To: jamal; +Cc: niv, linux-kernel, netdev

Hello Jamal,

[took out AK and DaveM since I know they both read netdev and this reply
is not really of any relevance to them]

> It would be nice if people would start cc'ing networking-related
> discussions to netdev. I missed the first part of the discussion
> but i take it the NF-HIPAC authors posted a patch. BTW, I emailed the authors

Yes, your assumption is correct and sorry for missing the cc once again.

> when i read the paper but never heard back.
> What i wanted from the authors was a comparison against one of the tc
> classifiers, not iptables.

I will contact you privately on this issue since I'm about to conduct 
tests this weekend.

> I hacked some code using the traffic control framework around OLS time;
> there are a lot of ideas i haven't incorporated yet. Too many hacks, too
> little time ;-> I think this is what i may have shown Roberto on my
> laptop over a drink.

Exactly (even wearing a netfilter T-shirt).

> I probably wouldn't have put this code out if my complaints about
> netfilter weren't ignored.
> And you know what happens when you start writing poetry: I ended up
> worrying about more than just the performance problems of iptables; for
> example, the code i have now makes it easy to extend the path a packet
> takes using simple policies.

Great, I remember some of your postings about the netfilter framework.

> The code i have is based around the tc framework. One thing i liked about
> netfilter is the idea of targets being separate modules, so the code i
> have in fact makes use of netfilter targets.
> I plan on revisiting this code at some point, maybe this weekend now that
> i am reminded of it ;->

Excellent, this could make it into my test suites as well.

> Take a look:
> http://www.cyberus.ca/~hadi/patches/action.DESCRIPTION

I did, I simply didn't find the time to do it.

> Agreed, you need a netlink to syslog converter.
> Netlink is king -- all the policies in the above code are netlink
> controlled. All events are also netlink transported. You don't have to send
> every little message you see; netlink allows you to batch and you could
> easily do a Nagle-like algorithm. Next steps are a distributed version
> of netlink..

Is there a code architecture draft somewhere?

Best regards,
Roberto Nibali, ratz
-- 
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc


* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification
       [not found]         ` <20020926140430.E14485@wotan.suse.de>
@ 2002-09-26 20:49           ` Roberto Nibali
  0 siblings, 0 replies; 5+ messages in thread
From: Roberto Nibali @ 2002-09-26 20:49 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David S. Miller, niv, linux-kernel, jamal, netdev

> For iptables/ipchain you need to write hierarchical/port range rules
> in this case and try to terminate searches early.

We're still trying to find the correct mathematical functions to do
this. Trust me, it is not so easy: the mapping of the port matrix and
the network flow through many stacked packet filters and firewalls
generates a rather complex graph (partly bigraph (LVS-DR for example))
which has complex structures (redundancy and parallelisations). It's not
that we could sit down and implement a fw-script for our packet filters;
the fw-script is being generated through a meta-fw layer that knows
about the surrounding network nodes.

> But yes, we also found that the L2 cache is limiting here
> (ip_conntrack has the same problem) 

I think this weekend I will do my tests also measuring some cpu 
performance counters with oprofile, such as DATA_READ_MISS, CODE CACHE 
MISS and NONCACHEABLE_MEMORY_READS.

> At least  that is easily fixed. Just increase the LOG_BUF_LEN parameter
> in kernel/printk.c

Tests showed that this only helps in peak situations; I think we should
simply forget about printk().

> Alternatively don't use slow printk, but nfnetlink to report bad packets
> and print from user space. That should scale much better.

Yes, and there are a few things that my colleague found out during his
tests (actually pretty straightforward things):

1. A big log buffer is only useful for riding out peaks
2. A big log buffer while having high CPU load doesn't help at all
3. The smaller the message, the better (binary logging thus is an
    advantage)
4. Logging via printk() is extremely expensive, because of the
    conversions and whatnot. A rough estimate would be 12500 clock
    cycles for a log entry generated by printk(). This means that on a
    PIII/450 a log entry needs 0.000028s, which leads to the
    following observation: with 36000pps which should all be logged,
    you will end up with a system having 100% CPU load and being 0% idle.
5. The kernel should log a binary stream, and so should the daemon that
    needs to fetch the data. If you want to convert the binary to human
    readable format, you start a process with low prio or do it on-demand.
6. Ideally the log daemon should be preemptible to get a defined time
    slice to do its job.
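
The arithmetic behind point 4 can be checked with a few lines. This is
just a sketch: the function names are made up for illustration, and it
assumes "PIII/450" means a 450 MHz clock and that the 12500-cycle estimate
is the whole per-entry cost.

```python
def log_cost_seconds(cycles_per_entry: float, cpu_hz: float) -> float:
    """Time spent producing one log entry, in seconds."""
    return cycles_per_entry / cpu_hz


def saturation_rate_pps(cycles_per_entry: float, cpu_hz: float) -> float:
    """Log entries per second at which logging alone uses every cycle."""
    return cpu_hz / cycles_per_entry


CYCLES_PER_PRINTK = 12_500      # rough estimate quoted in point 4
PIII_450_HZ = 450e6             # assumption: PIII/450 = 450 MHz

cost = log_cost_seconds(CYCLES_PER_PRINTK, PIII_450_HZ)
rate = saturation_rate_pps(CYCLES_PER_PRINTK, PIII_450_HZ)

print(f"{cost * 1e6:.1f} us per log entry")       # 27.8 us, i.e. ~0.000028s
print(f"{rate:.0f} entries/s saturate the CPU")   # 36000
```

So at 36000 logged packets per second, every cycle of the CPU would go
into printk() alone, before any filtering or forwarding work is done.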

Some test results conducted by a coworker of mine (Achim Gsell):

Max pkt rate the system can log without losing more than 1% of the messages:
----------------------------------------------------------------------------


kernel:       Linux 2.4.19-gentoo-r7 (low latency scheduling)

daemon:       syslog-ng (nice 0), logbufsiz=16k, pkts=10*10000, CPU=PIII/450
packet-len:   64            256           512           1024

              2873pkt/s     3332pkt/s     3124pkt/s     3067pkt/s
              1.4 Mb/s      6.6Mb/s       12.2Mb/s      23.9Mb/s

daemon:       syslog-ng (nice 0), logbufsiz=16k, pkts=10*10000, CPU=PIVM/1.7
packet-len:   64            256           512           1024

              7808pkt/s     7807pkt/s     7806pkt/s         pkt/s
              3.8 Mb/s      15.2Mb/s      30.5Mb/s          Mb/s

----------------------------------------------------------------------------

daemon:       cat /proc/kmsg > kernlog, logbufsiz=16k, pkts=10*10000, CPU=PIII/450
packet-len:   64            256           512           1024

              4300pkt/s                                 3076pkt/s
              2.1 Mb/s                                  24.0Mb/s

daemon:       ulogd (nlbufsize=4k, qthreshold=1), pkts=10*10000, CPU=PIII/450
packet-len:   64            256           512           1024

              4097pkt/s                                 4097pkt/s
              2.0 Mb/s                                  32  Mb/s

daemon:       ulogd (nlbufsize=2^17 - 1, qthreshold=1), pkts=10*10000, CPU=PIII/450
packet-len:   64            256           512           1024

              6576pkt/s                                 5000pkt/s
              3.2 Mb/s                                  38  Mb/s

daemon:       ulogd (nlbufsize=64k, qthreshold=1), pkts=1*10000, CPU=PIII/450
packet-len:   64            256           512           1024

                                                            pkt/s
                                                        4.0 Mb/s

daemon:       ulogd (nlbufsize=2^17 - 1, qthreshold=50), pkts=10*10000, CPU=PIII/450
packet-len:   64            256           512           1024

              6170pkt/s                                 5000pkt/s
              3.0 Mb/s                                  38  Mb/s
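
For anyone wanting to relate the pkt/s and Mb/s rows, here is a small
sketch. It assumes (this is a guess, not stated in the results) that
packet-len is in bytes and that "Mb/s" means 2^20 bits per second. Under
that assumption most of the figures above are reproduced to within
rounding, though not all of them (5000 pkt/s at 1024 bytes comes out
nearer 39 than 38). The function name is made up for illustration.

```python
def throughput_mbit(pkts_per_s: float, packet_len_bytes: int) -> float:
    """Convert a packet rate to Mb/s, taking 1 Mb = 2**20 bits (assumed)."""
    return pkts_per_s * packet_len_bytes * 8 / 2**20


# (pkt/s, packet length, Mb/s as reported) taken from the syslog-ng rows
samples = [
    (2873, 64, 1.4),
    (3124, 512, 12.2),
    (3067, 1024, 23.9),
    (7808, 64, 3.8),
]

for pps, length, reported in samples:
    computed = throughput_mbit(pps, length)
    print(f"{pps:5d} pkt/s x {length:4d} B -> {computed:6.2f} Mb/s "
          f"(reported {reported})")
```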


Best regards,
Roberto Nibali, ratz
-- 
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc


* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification
  2002-09-26 20:23   ` Roberto Nibali
@ 2002-09-27 13:57     ` jamal
  0 siblings, 0 replies; 5+ messages in thread
From: jamal @ 2002-09-27 13:57 UTC (permalink / raw)
  To: Roberto Nibali; +Cc: linux-kernel, netdev



On Thu, 26 Sep 2002, Roberto Nibali wrote:

> Is there a code architecture draft somewhere?

You mean for what i posted? Dont you think i already went beyond
the classical open source model by putting out a user guide? ;-> ;->
Just ask me questions in private and i'll try and help.

cheers,
jamal


* Re: [ANNOUNCE] NF-HIPAC: High Performance Packet Classification
       [not found] <Pine.LNX.3.96.1020930133306.20863A-100000@gatekeeper.tmr.com>
@ 2002-10-02 17:37 ` Roberto Nibali
  0 siblings, 0 replies; 5+ messages in thread
From: Roberto Nibali @ 2002-10-02 17:37 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: linux-kernel, netdev

Hi,

>>I will do a new round of testing this weekend for a speech I'll be 
>>giving. This time I will include ipchains, iptables (of course I am 
>>willing to apply every interesting patch regarding hash table 
>>optimisation and whatnot you want me to test), nf-hipac, the OpenBSD pf 
>>and of course the work done by Jamal.
>  
> Look forward to any info you can provide.

Unfortunately (as always) there were tons of delays that didn't allow me
to finish the complete test suite as I hoped I could, but I sent some
information off this list to Jamal and the nf-hipac guys about previous
test results. See below. I hope I can do more tests this weekend ...

> I particularly like that nf-hipac can be put in and tried in one-to-one
> comparison, that leaves an easy route to testing and getting confidence in
> the code.

Yes, and it was very convincing after the first few tests. Some preliminary
tests with raw TCP throughput have given me the following really cool results:

TCP RAW throughput 100Mbit/s max MTU:
-------------------------------------
ratz@laphish:~/netperf-2.2pl2 > ./netperf -H 192.168.1.141 -p 6666 -l 60
TCP STREAM TEST to 192.168.1.141
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/s

  87380  16384  16384    60.01      88.03 <------
ratz@laphish:~/netperf-2.2pl2 >


TCP RAW throughput 100Mbit/s max MTU with 10000 non-matching rules + 1 
last matching rule at the end of the FORWARD chain [iptables]:
----------------------------------------------------------------------
ratz@laphish:~/netperf-2.2pl2 > ./netperf -H 192.168.1.141 -p 6666 -l 60
TCP STREAM TEST to 192.168.1.141
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

  87380  16384  16384    60.12       3.28 <------
ratz@laphish:~/netperf-2.2pl2 >


TCP RAW throughput 100Mbit/s max MTU with 10000 non-matching rules + 1 
last matching rule at the end of the FORWARD chain [nf-hipac]:
----------------------------------------------------------------------
ratz@laphish:~/netperf-2.2pl2 > ./netperf -H 192.168.1.141 -p 6666 -l 60
TCP STREAM TEST to 192.168.1.141
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

  87380  16384  16384    60.03      85.78 <------
ratz@laphish:~/netperf-2.2pl2 >
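
To put the three netperf runs side by side, a quick sketch that expresses
each ruleset result as a share of the no-rules baseline. The numbers are
the throughput figures from the runs above; the variable and function
names are only for illustration.

```python
BASELINE_MBIT = 88.03            # run without any rules

RULESET_RUNS_MBIT = {            # 10000 non-matching rules + 1 matching
    "iptables": 3.28,
    "nf-hipac": 85.78,
}


def relative_throughput(measured: float, baseline: float) -> float:
    """Fraction of the baseline throughput that survives the ruleset."""
    return measured / baseline


for name, mbit in RULESET_RUNS_MBIT.items():
    share = relative_throughput(mbit, BASELINE_MBIT)
    print(f"{name:9s} {mbit:6.2f} Mbit/s  {share:6.1%} of baseline")
```

That works out to roughly 3.7% of the baseline for iptables and roughly
97.4% for nf-hipac in this particular setup.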


For nf-hipac I also have some statistics:
-----------------------------------------
bloodyhell:/var/FWTEST/nf-hipac # cat /proc/net/nf-hipac
nf-hipac statistics
-------------------

Maximum available memory:          65308672 bytes

Currently used memory:             1764160 bytes

INPUT:
   - INPUT chain is empty

FORWARD:
   - Number of rules:                 10002
   - Total size:                    1033010 bytes
   - Total size (allocated):        1764160 bytes
   - Termrule size:                   80016 bytes
   - Termrule size (allocated):      320064 bytes
   - Number of btrees:                30007
     * number of u32 btrees:          10003
       + distribution of u32 btrees:
                                     [     2,      4]:   10002
                                     [ 16384,  32768]:       1
     * number of u16 btrees:          10002
       + distribution of u16 btrees:
                                     [    1,     2]:   10002
     * number of u8 btrees:           10002
       + distribution of u8 btrees:
                                     [  2,   4]:      18

OUTPUT:
   - OUTPUT chain is empty

bloodyhell:/var/FWTEST/nf-hipac #

Roberto Nibali, ratz
-- 
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc
