From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Dan"
Date: Wed, 24 Jan 2007 17:22:14 +0000
Subject: [LARTC] Thoughput
Message-Id: <006f01c73fdc$30f4c0d0$92de4270$@eu>
MIME-Version: 1
Content-Type: multipart/mixed; boundary="===============0558200135=="
List-Id:
To: lartc@vger.kernel.org

This is a multipart message in MIME format.

--===============0558200135==
Content-Type: multipart/alternative; boundary="----=_NextPart_000_0070_01C73FDC.30F4C0D0"
Content-Language: en-gb

This is a multipart message in MIME format.

------=_NextPart_000_0070_01C73FDC.30F4C0D0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi,

I am after a feel of the throughput capabilities for TC and Iptables in
comparison to dedicated hardware. I have heard talk about 1Gb+ throughput
with minimal performance impact using 50ish TC rules and 100+ Iptables rules.

Is there anyone here running large throughput / large configurations, and if
so, what sort of figures?

Regards

Dan

------=_NextPart_000_0070_01C73FDC.30F4C0D0
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable


------=_NextPart_000_0070_01C73FDC.30F4C0D0--

--===============0558200135==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

--===============0558200135==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marek Kierdelewicz
Date: Thu, 25 Jan 2007 14:40:16 +0000
Subject: Re: [LARTC] Thoughput
Message-Id: <20070125154016.2a45cd30@localhost>
List-Id:
References: <006f01c73fdc$30f4c0d0$92de4270$@eu>
In-Reply-To: <006f01c73fdc$30f4c0d0$92de4270$@eu>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: lartc@vger.kernel.org

> Hi,

Hi

> I am after a feel of the throughput capabilities for TC and Iptables
> in comparison to dedicated hardware. I have heard talk about 1Gb+
> throughput with minimal performance impact using 50ish TC rules and
> 100+ Iptables rules.

More important than bandwidth is packets per second. Calculate your
average packet size (measure bandwidth and packets in some time window and
calculate per second values).

It's not the number of rules (tc or firewall) that matters most but their
composition. You should use hashing tc filters when possible and the "set"
iptables module (instead of many iptables rules) to offload the CPU.

If you don't need connection tracking (NAT and stuff) - disable it.

> Is there anyone here running large throughput / large
> configurations, and if so, what sort of figures?

You can easily achieve 600k pps on AMD 64 x2 5200 with mean 70% CPU
utilization at peak hours. You must bind the IRQs of the NICs to different
cores (look in /proc/irq/NUM/smp_affinity) to achieve symmetric load on both
cores (sometimes it's difficult).

Similar speed can be achieved with a Xeon 3.2GHz with HT (the old one). I
haven't tested new Xeons in the network field and I'm curious myself how they
would manage.
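The measurement suggested above (bytes and packets over a time window, then
per second values) can be sketched roughly as follows. This is only an
illustration: the Sample type and the rates() function are made-up names, and
the counter values are invented. On Linux the cumulative counters could be
read from /proc/net/dev, which is an assumption about where you take them from.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    """One reading of an interface's cumulative counters (hypothetical type)."""
    seconds: float  # time of the reading, in seconds
    bytes_: int     # cumulative byte counter at that time
    packets: int    # cumulative packet counter at that time


def rates(first: Sample, second: Sample) -> dict:
    """Return per-second rates and mean packet size between two readings."""
    elapsed = second.seconds - first.seconds
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    d_bytes = second.bytes_ - first.bytes_
    d_packets = second.packets - first.packets
    return {
        "pps": d_packets / elapsed,
        "bits_per_second": d_bytes * 8 / elapsed,
        # Avoid dividing by zero on an idle interface.
        "avg_packet_bytes": d_bytes / d_packets if d_packets else 0.0,
    }


if __name__ == "__main__":
    # Invented counters, taken 10 seconds apart.
    before = Sample(0.0, 1_000_000, 2_000)
    after = Sample(10.0, 6_000_000, 12_000)
    print(rates(before, after))
```

A longer window smooths out bursts, so sampling across peak hours gives a more
useful packets per second figure than a single short interval.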
One can put more cores and more NICs into the box and achieve even more
throughput. The problem is balancing load between the cores. Your setup will
only be as effective as its most heavily used core. I think that using Cisco
EtherChannel (or any other bonding/trunking technique that allows round robin
traffic distribution between physical links) would allow the ideal
distribution of load between cores. Has anyone tried this?

cheers,
Marek Kierdelewicz

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ivan Vladimirov
Date: Fri, 02 Feb 2007 09:32:30 +0000
Subject: Re: [LARTC] Thoughput
Message-Id: <45C3052E.60105@netwlan.net>
MIME-Version: 1
Content-Type: multipart/mixed; boundary="===============1998203615=="
List-Id:
References: <006f01c73fdc$30f4c0d0$92de4270$@eu>
In-Reply-To: <006f01c73fdc$30f4c0d0$92de4270$@eu>
To: lartc@vger.kernel.org

--===============1998203615==
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Marek Kierdelewicz wrote:
>> Hi,
>
> Hi
>
>> I am after a feel of the throughput capabilities for TC and
>> Iptables in comparison to dedicated hardware. I have heard talk
>> about 1Gb+ throughput with minimal performance impact using 50ish
>> TC rules and 100+ Iptables rules.
>
> More important than bandwidth is packets per second. Calculate
> your average packet size (measure bandwidth and packets in some time
> window and calculate per second values).
>
> It's not the number of rules (tc or firewall) that matters most but
> their composition. You should use hashing tc filters when possible
> and the "set" iptables module (instead of many iptables rules) to
> offload the CPU.
>
> If you don't need connection tracking (NAT and stuff) - disable it.
>
>
>> Is there anyone here running large throughput / large
>> configurations, and if so, what sort of figures?
>
> You can easily achieve 600k pps on AMD 64 x2 5200 with mean 70% CPU
> utilization at peak hours. You must bind the IRQs of the NICs to
> different cores (look in /proc/irq/NUM/smp_affinity) to achieve
> symmetric load on both cores (sometimes it's difficult).
>
> Similar speed can be achieved with a Xeon 3.2GHz with HT (the old
> one). I haven't tested new Xeons in the network field and I'm
> curious myself how they would manage.
>
> One can put more cores and more NICs into the box and achieve even
> more throughput. The problem is balancing load between the cores.
> Your setup will only be as effective as its most heavily used core.
> I think that using Cisco EtherChannel (or any other bonding/trunking
> technique that allows round robin traffic distribution between
> physical links) would allow the ideal distribution of load between
> cores. Has anyone tried this?
>
> cheers, Marek Kierdelewicz

I have done some research in this area on a new Xeon 5130 with 4 NICs
and was able to achieve about 1.5 Gbit/s of bidirectional traffic
using one processor, bonding the 2 external NICs to a Catalyst 3750
EtherChannel.
My server has 8000 tc rules and 2000 ipset rules.
It is also possible to use CPU set to achieve better IRQ balancing
over multiple CPUs, if you can find the software for Linux.
Average packet size was 476 bytes.
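As a rough cross-check, those two figures can be turned into a packet rate.
This is a sketch only: it assumes the 1.5 Gbit is a per-second total over both
directions and that 476 bytes is the mean over the same traffic, neither of
which is spelled out above. The function name is made up for the example.

```python
def packets_per_second(bits_per_second: float, avg_packet_bytes: float) -> float:
    """Convert a bit rate and a mean packet size into packets per second."""
    if avg_packet_bytes <= 0:
        raise ValueError("average packet size must be positive")
    return bits_per_second / (avg_packet_bytes * 8)


if __name__ == "__main__":
    # Figures from the message: about 1.5 Gbit/s and 476-byte average packets.
    pps = packets_per_second(1.5e9, 476)
    print(f"{pps:,.0f} packets per second")
```

Under those assumptions this comes out a little under 400k packets per second.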
I have set the IRQs in the following way:

---------------------------
|   core1    |   core2    |
---------------------------
      |            |
  eth0 eth2    eth1 eth3

This was the optimal solution.
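The layout above can be expressed as hex bitmasks of the kind written to
/proc/irq/NUM/smp_affinity. A minimal sketch, assuming core1 is CPU 0 and
core2 is CPU 1 (that numbering is an assumption, as are the names
affinity_mask, masks_for_layout and LAYOUT). The IRQ number of each NIC would
have to be looked up in /proc/interrupts and is deliberately not filled in.

```python
def affinity_mask(cpus) -> str:
    """Build the hex CPU bitmask for a set of CPU numbers (bit N = CPU N)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


# Assumed mapping: core1 -> CPU 0, core2 -> CPU 1.
LAYOUT = {"eth0": 0, "eth2": 0, "eth1": 1, "eth3": 1}


def masks_for_layout(layout: dict) -> dict:
    """Return the single-CPU mask each NIC's IRQ would be bound to."""
    return {nic: affinity_mask([cpu]) for nic, cpu in layout.items()}


if __name__ == "__main__":
    for nic, mask in sorted(masks_for_layout(LAYOUT).items()):
        # The value would be written to /proc/irq/NUM/smp_affinity,
        # where NUM is that NIC's IRQ number.
        print(nic, mask)
```

Pinning each NIC to exactly one core keeps its interrupts from moving between
cores, which is what the two-NICs-per-core split relies on.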



--===============1998203615==--