* SMP load balancing of softirqs
@ 2009-01-29  9:58 Tore Anderson
  2009-01-29 10:49 ` Thomas Jacob
  2009-01-31 15:17 ` Vlado Držík
  0 siblings, 2 replies; 7+ messages in thread
From: Tore Anderson @ 2009-01-29  9:58 UTC (permalink / raw)
  To: netfilter

Hello,

I've got a "router on a stick" with about 6000 iptables rules.
Connection tracking is in use, including a few protocol helpers.  The
hardware is a SunFire X4100 with 4x e1000 NICs and two AMD 275 CPUs
(dual-core).  It was running a 2.6 kernel (early .20ies).

A while back I noticed a performance problem: the process ksoftirqd/1
was using 100% of its respective CPU core (#1), and there was severe
packet loss.  The forwarding rate was around 600 Mbps / 110 Kpps, so
nothing that the NIC shouldn't be able to handle.  The other CPU cores
were mostly idle.  I found out that I could move the problem around to
ksoftirqd/{0,2,3} by changing the smp_affinity parameter for eth0's IRQ,
so that the interrupts were handled by a different CPU core.  I found no
way to balance the softirqs across all four CPU cores.
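
For reference, a minimal sketch of the smp_affinity mechanism described
above, assuming the usual /proc/irq/<N>/smp_affinity interface, a
machine with at most 32 CPUs, and root privileges for the write.  The
helper names affinity_mask and pin_irq, and the IRQ number in the
comment, are made up for illustration.

```python
def affinity_mask(cpus):
    """Return the hex bitmask that selects the given CPU indices."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu  # bit N set means CPU N may service the IRQ
    return format(mask, "x")


def pin_irq(irq, cpus, proc_root="/proc"):
    """Write the mask for `cpus` to <proc_root>/irq/<irq>/smp_affinity (needs root)."""
    with open(f"{proc_root}/irq/{irq}/smp_affinity", "w") as f:
        f.write(affinity_mask(cpus))


# Hypothetical usage: move IRQ 24 (a made-up number; look up the real one
# for eth0 in /proc/interrupts) to CPU core 2:
#     pin_irq(24, [2])
```

This only moves one IRQ's work from core to core, which matches the
behaviour reported here; it does not spread a single IRQ's softirq load
over several cores.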

The workaround I ended up with was to simply connect all four NICs and
join them together in a bonded ethernet device (LAG), making sure the
switch load-balanced incoming packets equally amongst all four LAG
members, and also use smp_affinity to make sure the interrupts for each
NIC are handled by separate CPUs.  It works well enough - I assume I've
roughly quadrupled the maximum capacity of the router compared to using
a single NIC, even though I'm wasting switch ports since I can at most
utilise half of the interfaces' max bandwidth.

Anyway, now I'm considering getting a 10G aggregation switch and
connecting the router to it.  The high port cost of 10 GbE
interfaces/switch ports rules out using the same trick, so I was
wondering if anyone else has had a problem with this behaviour and
found another way to deal with it, one that enables full utilisation
of an SMP system even if the router has only one network interface?

Best regards,
-- 
Tore Anderson
Redpill Linpro AS - http://www.redpill-linpro.com/


* Re: SMP load balancing of softirqs
  2009-01-29  9:58 SMP load balancing of softirqs Tore Anderson
@ 2009-01-29 10:49 ` Thomas Jacob
  2009-01-29 11:23   ` Tore Anderson
  2009-01-29 17:33   ` Rick Jones
  2009-01-31 15:17 ` Vlado Držík
  1 sibling, 2 replies; 7+ messages in thread
From: Thomas Jacob @ 2009-01-29 10:49 UTC (permalink / raw)
  To: Tore Anderson; +Cc: netfilter

On Thu, 2009-01-29 at 10:58 +0100, Tore Anderson wrote:
> The workaround I ended up with was to simply connect all four NICs and
> join them together in a bonded ethernet device (LAG), making sure the
> switch load-balanced incoming packets equally amongst all four LAG
> members, and also use smp_affinity to make sure the interrupts for each
> NIC are handled by separate CPUs.

I'm guessing that would be the standard approach... I don't know
whether it is possible or advisable to balance softirqs.

> Anyway, now I'm considering getting a 10G aggregation switch and
> connecting the router to it.  The high port cost of 10 GbE
> interfaces/switch ports rules out using the same trick, so I was
> wondering if anyone else has had a problem with this behaviour and
> found another way to deal with it, one that enables full utilisation
> of an SMP system even if the router has only one network interface?

Some newer NICs (some of Intel's for instance) support several packet
queues to make it possible to deal with just this problem.

Check out http://lwn.net/Articles/289137/ for a start...

It would be great if you'd let the list know of the results should you
try to use one of the multiqueue NICs for a netfilter firewall, I for
one am very curious...


* Re: SMP load balancing of softirqs
  2009-01-29 10:49 ` Thomas Jacob
@ 2009-01-29 11:23   ` Tore Anderson
  2009-01-29 12:06     ` Thomas Jacob
  2009-01-29 17:33   ` Rick Jones
  1 sibling, 1 reply; 7+ messages in thread
From: Tore Anderson @ 2009-01-29 11:23 UTC (permalink / raw)
  To: Thomas Jacob; +Cc: netfilter

* Thomas Jacob

> Some newer NICs (some of Intel's for instance) support several packet
> queues to make it possible to deal with just this problem.
> 
> Check out http://lwn.net/Articles/289137/ for a start...

Interesting link, thanks!  However, I was under the impression that the
problem is incoming (RX) frames, which cause an interrupt to be raised
on a certain CPU (core), which in turn causes the frame to be processed
by the NET_RX softirq handler on that particular CPU.
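
On kernels that provide /proc/softirqs (believed to be newer than the
early 2.6.20s used on this router), the imbalance described here would
show up as one CPU column dominating the NET_RX row.  A small sketch of
reading that file follows; the function names and the SAMPLE numbers
are invented for illustration and are not measurements from this
router.

```python
def parse_softirqs(text):
    """Parse /proc/softirqs-style text into {softirq_name: {cpu_name: count}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        name, _, rest = ln.partition(":")
        counts = [int(x) for x in rest.split()]
        table[name.strip()] = dict(zip(cpus, counts))
    return table


def busiest_cpu(table, softirq="NET_RX"):
    """Return the CPU name with the highest count for the given softirq."""
    row = table[softirq]
    return max(row, key=row.get)


# Made-up sample in the same layout as /proc/softirqs.
SAMPLE = """\
                    CPU0       CPU1       CPU2       CPU3
      NET_TX:         10         12          9         11
      NET_RX:       1200    9876543        800        950
"""

if __name__ == "__main__":
    print(busiest_cpu(parse_softirqs(SAMPLE)))
```

On a live system the text would come from open("/proc/softirqs").read()
instead of SAMPLE.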

The multiqueue patch seems to be about being able to submit outgoing (TX)
frames to multiple hardware queues.  So I don't think it will make much
of a difference for me?

> It would be great if you'd let the list know of the results should you
> try to use one of the multiqueue NICs for a netfilter firewall, I for
> one am very curious...

I'll remember that.  Thanks again!

Best regards,
-- 
Tore Anderson
Redpill Linpro AS - http://www.redpill-linpro.com/


* Re: SMP load balancing of softirqs
  2009-01-29 11:23   ` Tore Anderson
@ 2009-01-29 12:06     ` Thomas Jacob
  2009-01-29 20:26       ` Tore Anderson
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Jacob @ 2009-01-29 12:06 UTC (permalink / raw)
  To: Tore Anderson; +Cc: netfilter

On Thu, 2009-01-29 at 12:23 +0100, Tore Anderson wrote:
> * Thomas Jacob
> 
> > Some newer NICs (some of Intel's for instance) support several packet
> > queues to make it possible to deal with just this problem.
> > 
> > Check out http://lwn.net/Articles/289137/ for a start...
> 
> Interesting link, thanks!  However, I was under the impression that the
> problem is incoming (RX) frames, which cause an interrupt to be raised
> on a certain CPU (core), which in turn causes the frame to be processed
> by the NET_RX softirq handler on that particular CPU.

Correct... maybe this info isn't quite up to date...

> The multiqueue patch seems to be about being able to submit outgoing (TX)
> frames to multiple hardware queues.  So I don't think it will make much
> of a difference for me?

Don't know about the current state of general Linux driver support for
this, but the Intel hardware also supports multiple RX queues:

http://www.intel.com/network/connectivity/resources/technologies/10_gigabit_ethernet.htm


and their current ixgbe driver docs say:


RSS - Receive Side Scaling (or multiple queues for receives)
------------------------------------------------------------
Valid Range: 0 - 16
0 = disables RSS
1 = enables RSS and sets the descriptor queue count to 16 or the number
    of online cpus, whichever is less.
2-16 = enables RSS, with 2-16 queues
Default Value: 1
RSS also affects the number of transmit queues allocated on 2.6.23 and
newer kernels with CONFIG_NETDEVICES_MULTIQUEUE set in the kernel
.config file.  CONFIG_NETDEVICES_MULTIQUEUE only exists from 2.6.23 to
2.6.26.  Other options enable multiqueue in 2.6.27 and newer kernels.


MQ - Multi Queue
----------------
Valid Range: 0, 1
0 = Disables Multiple Queue support
1 = Enabled Multiple Queue support (a prerequisite for RSS)
Default Value: 1
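
The RSS parameter rules quoted above can be restated as a small
function.  This is only a sketch of the documented behaviour, not code
from the ixgbe driver; the name rss_queue_count is made up, and
treating "RSS disabled" as a single receive queue is an assumption.

```python
def rss_queue_count(rss, online_cpus, max_queues=16):
    """Number of receive queues implied by the RSS module parameter."""
    if not 0 <= rss <= max_queues:
        raise ValueError("RSS must be in the range 0-16")
    if rss == 0:
        return 1  # assumption: RSS disabled means one receive queue
    if rss == 1:
        # default: 16 queues or the number of online CPUs, whichever is less
        return min(max_queues, online_cpus)
    return rss  # 2-16: explicit queue count


if __name__ == "__main__":
    # A four-core box like the X4100 in this thread, default setting.
    print(rss_queue_count(1, online_cpus=4))
```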


* Re: SMP load balancing of softirqs
  2009-01-29 10:49 ` Thomas Jacob
  2009-01-29 11:23   ` Tore Anderson
@ 2009-01-29 17:33   ` Rick Jones
  1 sibling, 0 replies; 7+ messages in thread
From: Rick Jones @ 2009-01-29 17:33 UTC (permalink / raw)
  To: Thomas Jacob; +Cc: Tore Anderson, netfilter

> Some newer NICs (some of Intel's for instance) support several packet
> queues to make it possible to deal with just this problem.

I think it may be more than just "some" newer/newish 10Gig NICs actually.
Besides Intel, Chelsio and Neterion and Broadcom come to mind, and probably
Myricom and Solarflare (did they merge with someone?) and certainly others.  I
think that Cisco and QLogic also offer 10G NICs these days.

> It would be great if you'd let the list know of the results should you
> try to use one of the multiqueue NICs for a netfilter firewall, I for
> one am very curious...

IIRC there are sort of "two" multiqueues - there is the older, more established
"inbound" multiqueue stuff - what Microsoft has everyone calling RSS or Receive
Side Scaling (or am I mixing terms?) that only affects inbound packets.

Then there is "tx (transmit) multiqueue" which is rather newer (first in 2.6.26 
kernels?) and still "evolving."

If you are forwarding traffic, you probably want both, and likely as not
want to be on as current (bleeding edge) a kernel and NIC drivers as you can
stomach.

rick jones


* Re: SMP load balancing of softirqs
  2009-01-29 12:06     ` Thomas Jacob
@ 2009-01-29 20:26       ` Tore Anderson
  0 siblings, 0 replies; 7+ messages in thread
From: Tore Anderson @ 2009-01-29 20:26 UTC (permalink / raw)
  To: Thomas Jacob; +Cc: netfilter

* Thomas Jacob

> Don't know about the current state of general Linux driver support for
> this, but the Intel hardware also supports multiple RX queues:
> 
> http://www.intel.com/network/connectivity/resources/technologies/10_gigabit_ethernet.htm

This (MSI-X and/or Receive-Side Scaling) looks like exactly the kind of
functionality I need.  Thank you very much for the pointer!  I'll make
sure to let you know how it goes if I get one of these cards in my lab.

Best regards,
-- 
Tore Anderson
Redpill Linpro AS - http://www.redpill-linpro.com/
Tel: +47 21 54 41 27


* Re: SMP load balancing of softirqs
  2009-01-29  9:58 SMP load balancing of softirqs Tore Anderson
  2009-01-29 10:49 ` Thomas Jacob
@ 2009-01-31 15:17 ` Vlado Držík
  1 sibling, 0 replies; 7+ messages in thread
From: Vlado Držík @ 2009-01-31 15:17 UTC (permalink / raw)
  To: Tore Anderson; +Cc: netfilter


Tore Anderson  wrote:
> Hello,
> 
> I've got a "router on a stick" with about 6000 iptables rules.
> Connection tracking is in use, including a few protocol helpers.  The
> hardware is a SunFire X4100 with 4x e1000 NICs and two AMD 275 CPUs
> (dual-core).  It was running a 2.6 kernel (early .20ies).
> 
> A while back I noticed a performance problem: the process ksoftirqd/1
> was using 100% of its respective CPU core (#1), and there was severe
> packet loss.  The forwarding rate was around 600 Mbps / 110 Kpps, so

First of all, I'm very surprised that you can reach that packet rate
with so many rules, and I'd focus on that too. Do you really need that
many? What are you using them for? Traffic accounting, per-IP filters?
There are iptables extensions that can do accounting more efficiently,
and per-IP filters can be converted to hash-based ipsets. There is also
always room for ruleset optimization (making it tree-based rather than
a flat list) to limit the number of rules a packet needs to traverse.
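
To illustrate why a hash-based set helps, here is a toy comparison in
Python. It is not how iptables or ipset are implemented; the 6000
addresses are generated for the example and the function names are
made up. A flat list of per-IP rules costs one comparison per rule in
the worst case, while a hash lookup costs roughly the same no matter
how many addresses are in the set.

```python
import ipaddress


def flat_match(rules, addr):
    """Linear scan, like a flat chain of per-IP rules.

    Returns (matched, number_of_comparisons).
    """
    for i, rule_addr in enumerate(rules, start=1):
        if rule_addr == addr:
            return True, i
    return False, len(rules)


def set_match(ipset, addr):
    """Hash lookup, in the spirit of a hash-based ipset."""
    return addr in ipset


# 6000 generated addresses standing in for 6000 per-IP rules.
RULES = [ipaddress.ip_address("10.0.0.0") + i for i in range(6000)]
RULE_SET = frozenset(RULES)

if __name__ == "__main__":
    worst = RULES[-1]
    print(flat_match(RULES, worst))   # the last rule needs a full traversal
    print(set_match(RULE_SET, worst))
```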

> nothing that the NIC shouldn't be able to handle.  The other CPU cores
> were mostly idle.  I found out that I could move the problem around to
> ksoftirqd/{0,2,3} by changing the smp_affinity parameter for eth0's IRQ,
> so that the interrupts were handled by a different CPU core.  I found no
> way to balance the softirqs across all four CPU cores.
> 
You are right. Nowadays the bottleneck is more often the packet rate
the CPU is able to handle than the NIC (for non-router loads LSO could
help: http://en.wikipedia.org/wiki/Large_segment_offload ).

> The workaround I ended up with was to simply connect all four NICs and
> join them together in a bonded ethernet device (LAG), making sure the
> switch load-balanced incoming packets equally amongst all four LAG
> members, 
I was forced to use the same practice on our 2x quad-core router. We had
traffic on 2x 1 Gb NICs and saw performance problems at much lower
rates (80 Kpps). But we are mostly doing traffic shaping and NAT.
I'm using xmit_hash_policy=layer2+3 on the bonding devices. I wanted to
keep packets belonging to the same flow on the same NIC to get better
cache locality and avoid problems with ordering. (It would be nice to
compare this to the usual rr method.)
I'd also recommend playing around with the NIC's coalescing settings
(ethtool -C) and the rx buffer and backlog.
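
A rough sketch of the idea behind a layer2+3 style hash: the transmit
slave is picked from the MAC and IP addresses, so every packet between
the same pair of hosts leaves on the same NIC. This is a simplification
and not the exact formula the bonding driver uses; pick_slave,
mac_to_int and the addresses are made up for illustration.

```python
import ipaddress


def mac_to_int(mac):
    """Parse 'aa:bb:cc:dd:ee:ff' into an integer."""
    return int(mac.replace(":", ""), 16)


def pick_slave(src_mac, dst_mac, src_ip, dst_ip, n_slaves):
    """Pick a slave index from layer 2 and layer 3 addresses (simplified)."""
    mac_part = (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) & 0xFF
    ip_part = (int(ipaddress.ip_address(src_ip))
               ^ int(ipaddress.ip_address(dst_ip))) & 0xFFFF
    return (ip_part ^ mac_part) % n_slaves


if __name__ == "__main__":
    # The same address pair always maps to the same slave out of four.
    print(pick_slave("00:11:22:33:44:55", "66:77:88:99:aa:bb",
                     "192.0.2.10", "198.51.100.7", 4))
```

Because XOR is symmetric, both directions between two hosts map to the
same index in this sketch. Note that this only covers what the router
transmits; which NIC a packet arrives on is decided by the switch.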

The problem I can see is that Linux is not able to hand over the
processing of packets coming from one NIC (or queue) to more than one
softirq, so it ends up using just one.
The only real solution seems to be RSS, so that packets are separated
into independent TCP flows, which can then be handled by separate
CPUs/softirqs.

I'd also like to test RSS on our machine. Does anyone have experience
with it on various HW (e1000, bnx2, ...) and recent kernels?


> and also use smp_affinity to make sure the interrupts for each
> NIC are handled by separate CPUs.  It works well enough - I assume I've
> roughly quadrupled the maximum capacity of the router compared to using
> a single NIC, even though I'm wasting switch ports since I can at most
> utilise half of the interfaces' max bandwidth.
> 
> Anyway, now I'm considering getting a 10G aggregation switch and
> connecting the router to it.  The high port cost of 10 GbE
> interfaces/switch ports rules out using the same trick, so I was
> wondering if anyone else has had a problem with this behaviour and
> found another way to deal with it, one that enables full utilisation
> of an SMP system even if the router has only one network interface?
> 
> Best regards,

