* Network multiqueue question
From: George B. @ 2010-04-15 16:58 UTC
To: netdev
I am in need of a little education on multiqueue and was wondering if
someone here might be able to help me.
Given the Intel igb network driver, it appears I can do something like:
tc qdisc add dev eth0 root handle 1: multiq
which works and reports 4 bands: dev eth0 root refcnt 4 bands 4/4
But our network is a little more complicated. Above the ethernet we
have the bonding driver, which uses mode 2 (balance-xor) bonding with
two ethernet slaves. Then we have vlans on the bond interface. Our
production traffic is on a vlan, and resource contention is an issue
since these are busy machines.
It is my understanding that the vlan driver became multiqueue aware in
2.6.32 (we are currently using 2.6.31).
It would seem that the first thing the kernel would encounter with
traffic headed out would be the vlan interface, and then the bond
interface, and then the physical ethernet interface. Is that correct?
So with my kernel, I would seem to get no utility from multiq on the
ethernet interface if the vlan interface is going to be a
single-threaded bottleneck. What about the bond driver? Is it
currently multiqueue aware?
I am trying to get some sort of logical picture of how all these things
interact with each other, so I can make things a little more efficient
and reduce resource contention in the application while still using the
network ports/interfaces efficiently.
If someone feels up to the task of sending a little education my way,
I would be most appreciative. There doesn't seem to be a whole lot of
documentation floating around about multiqueue other than a blurb of
text in the kernel tree and David's presentation from last year.
Thanks!
George
* Re: Network multiqueue question
From: Eric Dumazet @ 2010-04-15 17:47 UTC
To: George B.; +Cc: netdev
Hi George
Vlan is multiqueue aware, but bonding unfortunately is not at the
moment.
We could make it 'multiqueue' (a patch was submitted by Oleg A.
Arkhangelsky a while ago), but the bonding xmit routine needs to take a
central lock, shared by all queues, so it won't be very efficient...
Since this bothers me a bit, I will probably work on this in the near
future (adding real multiqueue capability and RCU to the bonding fast
paths).
Ref: http://permalink.gmane.org/gmane.linux.network/152987
* Re: Network multiqueue question
From: Jay Vosburgh @ 2010-04-15 18:09 UTC
To: Eric Dumazet; +Cc: George B., netdev
Eric Dumazet <eric.dumazet@gmail.com> wrote:
>Hi George
>
>Vlan is multiqueue aware, but bonding unfortunately is not at the
>moment.
>
>We could make it 'multiqueue' (a patch was submitted by Oleg A.
>Arkhangelsky a while ago), but the bonding xmit routine needs to take a
>central lock, shared by all queues, so it won't be very efficient...
The lock is a read lock, so theoretically it should be possible
to enter the bonding transmit function on multiple CPUs at the same
time. The lock may thrash around, though.
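To make the pattern concrete, here is a toy kernel-style sketch of a
transmit path funnelled through one shared rwlock. The names are made
up and this is not the actual bonding code; it only illustrates why the
lock word itself becomes the hot spot:

/* Toy sketch, illustrative only (not the real bonding code): every TX
 * queue funnels through one shared rwlock. */
#include <linux/netdevice.h>
#include <linux/spinlock.h>

struct toy_bond {
        rwlock_t lock;                    /* shared by all TX queues */
        struct net_device *active;        /* slave currently used for TX */
};

static netdev_tx_t toy_bond_xmit(struct sk_buff *skb, struct net_device *dev)
{
        struct toy_bond *bond = netdev_priv(dev);

        /* Readers may run concurrently, but read_lock()/read_unlock()
         * are atomic ops on the same lock word, so its cache line
         * ping-pongs between CPUs under load. */
        read_lock(&bond->lock);
        if (bond->active) {
                skb->dev = bond->active;
                dev_queue_xmit(skb);
        } else {
                dev_kfree_skb(skb);
        }
        read_unlock(&bond->lock);

        return NETDEV_TX_OK;
}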
>Since this bothers me a bit, I will probably work on this in the near
>future (adding real multiqueue capability and RCU to the bonding fast
>paths).
>
>Ref: http://permalink.gmane.org/gmane.linux.network/152987
The question I have about it (and the above patch), is: what
does multi-queue "awareness" really mean for a bonding device? How does
allocating a bunch of TX queues help, given that the determination of
the transmitting device hasn't necessarily been made?
I haven't had the chance to acquire some multi-queue network
cards and check things out with bonding, so I'm not really sure how it
should work. Should the bond look, from a multi-queue perspective, like
the largest slave, or should it look like the sum of the slaves? Some
of this may be mode-specific, as well.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
* Re: Network multiqueue question
From: Eric Dumazet @ 2010-04-15 18:41 UTC
To: Jay Vosburgh; +Cc: George B., netdev
On Thursday 15 April 2010 at 11:09 -0700, Jay Vosburgh wrote:
> The lock is a read lock, so theoretically it should be possible
> to enter the bonding transmit function on multiple CPUs at the same
> time. The lock may thrash around, though.
>
Yes, and with 10Gb cards this is a limiting factor if you want to send
14 million packets per second ;)
read_lock() is one atomic op, dirtying the lock's cache line.
read_unlock() is another atomic op, dirtying the cache line again (if
contended).
In active-passive mode, using RCU should be really easy, given that net
devices are already RCU compatible. That way, each CPU only reads the
bonding state, without writing to any shared memory.
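As a purely hypothetical illustration (a sketch, not a patch, with
made-up names), the RCU variant of the same toy transmit path could
look like this; the hot path only reads the active-slave pointer, and
the control path publishes a new one on failover:

/* Hypothetical RCU variant of the same toy sketch (not a real patch):
 * the xmit path only reads, so nothing shared is dirtied per packet. */
#include <linux/netdevice.h>
#include <linux/rcupdate.h>

struct toy_bond_rcu {
        struct net_device *active;        /* published by the control path */
};

static netdev_tx_t toy_bond_xmit_rcu(struct sk_buff *skb,
                                     struct net_device *dev)
{
        struct toy_bond_rcu *bond = netdev_priv(dev);
        struct net_device *slave;

        rcu_read_lock();                  /* no atomic op, no write */
        slave = rcu_dereference(bond->active);
        if (likely(slave)) {
                skb->dev = slave;
                dev_queue_xmit(skb);
        } else {
                dev_kfree_skb(skb);
        }
        rcu_read_unlock();

        return NETDEV_TX_OK;
}

/* Control path, e.g. on failover: publish the new slave, then wait for
 * readers that may still hold the old pointer. */
static void toy_bond_set_active(struct toy_bond_rcu *bond,
                                struct net_device *new_active)
{
        rcu_assign_pointer(bond->active, new_active);
        synchronize_rcu();
}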
> >Since this bothers me a bit, I will probably work on this in the near
> >future (adding real multiqueue capability and RCU to the bonding fast
> >paths).
> >
> >Ref: http://permalink.gmane.org/gmane.linux.network/152987
>
> The question I have about it (and the above patch), is: what
> does multi-queue "awareness" really mean for a bonding device? How does
> allocating a bunch of TX queues help, given that the determination of
> the transmitting device hasn't necessarily been made?
>
Well, this problem was also taken into account for vlan; you might take
a look at this commit:
commit 669d3e0babb40018dd6e78f4093c13a2eac73866
Author: Vasu Dev <vasu.dev@intel.com>
Date: Tue Mar 23 14:41:45 2010 +0000
vlan: adds vlan_dev_select_queue
This is required to correctly select vlan tx queue for a driver
supporting multi tx queue with ndo_select_queue implemented since
currently selected vlan tx queue is unaligned to selected queue by
real net_devce ndo_select_queue.
Unaligned vlan tx queue selection causes thrash with higher vlan
tx lock contention for least fcoe traffic and wrong socket tx
queue_mapping for ixgbe having ndo_select_queue implemented.
-v2
As per Eric Dumazet <eric.dumazet@gmail.com> comments, mirrored vlan
net_device_ops to have them with and without vlan_dev_select_queue and
then select according to real dev ndo_select_queue present or not for a
vlan net_device. This is to completely skip vlan_dev_select_queue
calling for real net_device not supporting ndo_select_queue.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
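For reference, the helper that commit adds boils down to roughly the
following. This is a paraphrase rather than the exact tree code, and
vlan_dev_info() here is the 8021q driver's own helper for reaching the
underlying real device:

/* Rough paraphrase of the idea behind vlan_dev_select_queue (see the
 * tree for the exact code): forward queue selection to the real device
 * so the vlan queue matches the hardware queue. Only wired up when the
 * real device actually implements ndo_select_queue. */
static u16 vlan_dev_select_queue(struct net_device *dev, struct sk_buff *skb)
{
        struct net_device *real_dev = vlan_dev_info(dev)->real_dev;

        return real_dev->netdev_ops->ndo_select_queue(real_dev, skb);
}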
* Re: Network multiqueue question
From: George B. @ 2010-04-16 3:54 UTC
To: Jay Vosburgh; +Cc: netdev
On Thu, Apr 15, 2010 at 11:09 AM, Jay Vosburgh <fubar@us.ibm.com> wrote:
> The question I have about it (and the above patch), is: what
> does multi-queue "awareness" really mean for a bonding device? How does
> allocating a bunch of TX queues help, given that the determination of
> the transmitting device hasn't necessarily been made?
Good point.
> I haven't had the chance to acquire some multi-queue network
> cards and check things out with bonding, so I'm not really sure how it
> should work. Should the bond look, from a multi-queue perspective, like
> the largest slave, or should it look like the sum of the slaves? Some
> of this may be mode-specific, as well.
I would say that having the number of bands be either the number of
cores or 4, whichever is smaller, would be a good start. That is
probably fine for GigE. The network cards we have that support
multiqueue all have either 4 or 8 bands. In an optimal world, you would
have as many bands as are available at the physical ethernet level, but
changing those on the fly when the set of available interfaces changes
might be more trouble than it is worth.
Four or eight would seem to be a good number to start with, as I don't
think I have seen an ethernet card with fewer than 4. If you have fewer
than 4 CPUs there probably isn't much utility in having more bands than
processors; that utility rapidly diminishes as the number of bands
grows beyond the number of CPUs. At that point you have probably just
spent a lot of work building a bigger buffer.
I would be happy with 4 bands. I guess it just depends on where you
want the bottleneck. If you have 8 bands on the bond driver (another
reasonable alternative) and only 4 bands available for output, you
have just moved the contention down a layer to between the bond and
the ethernet driver. But I am a fan of moving the point of contention
as far away from the application interface as possible. If I have one
big lock around the bond driver and six things waiting to talk to the
network, those are six things that can't be doing anything else.
I would rather have the application handle its network task and get
back to other things. Now if you have 8 bands of bond and only 4
bands of ethernet, or even one band of ethernet, oh well. Maybe make
the number of bands (1 to 8) configurable by a driver option that can
be set explicitly and defaults to, say, 4?
Thanks for taking the time to answer.
George
* Re: Network multiqueue question
From: George B. @ 2010-04-16 4:00 UTC
To: Eric Dumazet; +Cc: netdev
On Thu, Apr 15, 2010 at 10:47 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Since this bothers me a bit, I will probably work on this in the near
> future (adding real multiqueue capability and RCU to the bonding fast
> paths).
>
> Ref: http://permalink.gmane.org/gmane.linux.network/152987
That would be great, and you would have my sincere thanks. In case
anyone is interested, what we do is take a pair of "top of rack"
switches and cluster them together so they appear as one switch. We
configure a LAG consisting of a port on each physical switch to a pair
of bonded interfaces on the server and use mode 2 bonding. In normal
operation, both interfaces are active. Should one switch experience a
power or interface failure, the server sees one of the interfaces fail
but just keeps working on the remaining interface. There is no
"failover" event going on.
Thanks,
George
* Re: Network multiqueue question
From: Eric Dumazet @ 2010-04-16 4:53 UTC
To: George B.; +Cc: netdev
On Thursday 15 April 2010 at 21:00 -0700, George B. wrote:
> That would be great, and you would have my sincere thanks. In case
> anyone is interested, what we do is take a pair of "top of rack"
> switches and cluster them together so they appear as one switch. We
> configure a LAG consisting of a port on each physical switch to a pair
> of bonded interfaces on the server and use mode 2 bonding. In normal
> operation, both interfaces are active. Should one switch experience a
> power or interface failure, the server sees one of the interfaces fail
> but just keeps working on the remaining interface. There is no
> "failover" event going on.
>
What kind of traffic do your machines manage, exactly?
On the server, do you use two ports of the same kind (same number of queues)?
* Re: Network multiqueue question
From: George B. @ 2010-04-16 7:28 UTC
To: Eric Dumazet; +Cc: netdev
On Thu, Apr 15, 2010 at 9:53 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> What kind of traffic do your machines manage, exactly?
Content to mobile devices (cell phones and such). More detail sent privately.
> On the server, do you use two ports of the same kind (same number of queues)?
Yes, same kind. We try to make everything identical. Fewer problems that way.
George