* Network multiqueue question
From: George B. @ 2010-04-15 16:58 UTC
To: netdev

I am in need of a little education on multiqueue and was wondering if
someone here might be able to help me.

Given the Intel igb network driver, it appears I can do something like:

  tc qdisc add dev eth0 root handle 1: multiq

which works and reports 4 bands:

  dev eth0 root refcnt 4 bands 4/4

But our network is a little more complicated. Above the ethernet we have
the bonding driver, which is using mode 2 bonding with two ethernet
slaves. Then we have vlans on the bond interface. Our production traffic
is on a vlan, and resource contention is an issue as these are busy
machines.

It is my understanding that the vlan driver became multiqueue aware in
2.6.32 (we are currently using 2.6.31).

It would seem that the first thing the kernel would encounter with
traffic headed out would be the vlan interface, then the bond interface,
and then the physical ethernet interface. Is that correct? So with my
kernel, I would seem to get no benefit from multiq on the ethernet
interface if the vlan interface is going to be a single-threaded
bottleneck. What about the bond driver? Is it currently multiqueue aware?

I am trying to get a logical picture of how all these things interact, so
I can reduce resource contention in the application while still using the
network ports/interfaces efficiently.

If someone feels up to the task of sending a little education my way, I
would be most appreciative. There doesn't seem to be a whole lot of
documentation floating around about multiqueue other than a blurb of text
in the kernel and David's presentation from last year.

Thanks!

George
* Re: Network multiqueue question
From: Eric Dumazet @ 2010-04-15 17:47 UTC
To: George B.; +Cc: netdev

On Thursday 15 April 2010 at 09:58 -0700, George B. wrote:
> I am in need of a little education on multiqueue and was wondering if
> someone here might be able to help me.
>
> Given the Intel igb network driver, it appears I can do something like:
>
>   tc qdisc add dev eth0 root handle 1: multiq
>
> which works and reports 4 bands:
>
>   dev eth0 root refcnt 4 bands 4/4
>
> But our network is a little more complicated. Above the ethernet we have
> the bonding driver, which is using mode 2 bonding with two ethernet
> slaves. Then we have vlans on the bond interface. Our production traffic
> is on a vlan, and resource contention is an issue as these are busy
> machines.
>
> It is my understanding that the vlan driver became multiqueue aware in
> 2.6.32 (we are currently using 2.6.31).
>
> It would seem that the first thing the kernel would encounter with
> traffic headed out would be the vlan interface, then the bond interface,
> and then the physical ethernet interface. Is that correct? So with my
> kernel, I would seem to get no benefit from multiq on the ethernet
> interface if the vlan interface is going to be a single-threaded
> bottleneck. What about the bond driver? Is it currently multiqueue aware?
>
> I am trying to get a logical picture of how all these things interact, so
> I can reduce resource contention in the application while still using the
> network ports/interfaces efficiently.
>
> If someone feels up to the task of sending a little education my way, I
> would be most appreciative. There doesn't seem to be a whole lot of
> documentation floating around about multiqueue other than a blurb of text
> in the kernel and David's presentation from last year.

Hi George

Vlan is multiqueue aware, but bonding unfortunately is not at the moment.

We could make it 'multiqueue' (a patch was submitted by Oleg A.
Arkhangelsky a while ago), but the bonding xmit routine needs to take a
central lock, shared by all queues, so it won't be very efficient...

Since this bothers me a bit, I will probably work on this in the near
future (adding real multiqueue capability and RCU to the bonding fast
paths).

Ref: http://permalink.gmane.org/gmane.linux.network/152987
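[To illustrate the contention Eric describes, here is a minimal, hypothetical
sketch of a bonding-style transmit path in which every TX queue of the virtual
device serializes on one shared rwlock. It is written against ~2.6.32-era
kernel interfaces; the structure layout, field names, and the helper
bond_sketch_pick_slave() are assumptions for illustration, not the real
bonding driver's code.]

#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/spinlock.h>

/* Hypothetical private data: one rwlock shared by every TX queue. */
struct bond_sketch_priv {
	rwlock_t lock;                 /* shared by all TX queues          */
	struct net_device *slaves[2];  /* slave devices, protected by lock */
	unsigned int nslaves;
};

/* Stand-in for the real mode-specific (e.g. XOR) slave selection. */
static struct net_device *bond_sketch_pick_slave(struct bond_sketch_priv *bond,
						 struct sk_buff *skb)
{
	if (!bond->nslaves)
		return NULL;
	return bond->slaves[skb_get_queue_mapping(skb) % bond->nslaves];
}

static netdev_tx_t bond_sketch_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct bond_sketch_priv *bond = netdev_priv(dev);
	struct net_device *slave;

	/* Even when callers arrive on different CPUs for different TX queues,
	 * they all dirty the same cache line here (and again at unlock),
	 * which is what limits scaling at millions of packets per second. */
	read_lock(&bond->lock);

	slave = bond_sketch_pick_slave(bond, skb);
	if (slave) {
		skb->dev = slave;
		dev_queue_xmit(skb);   /* hand off to the chosen slave */
	} else {
		dev_kfree_skb(skb);
	}

	read_unlock(&bond->lock);
	return NETDEV_TX_OK;
}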
* Re: Network multiqueue question
From: Jay Vosburgh @ 2010-04-15 18:09 UTC
To: Eric Dumazet; +Cc: George B., netdev

Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thursday 15 April 2010 at 09:58 -0700, George B. wrote:
>> [...]
>
> Hi George
>
> Vlan is multiqueue aware, but bonding unfortunately is not at the moment.
>
> We could make it 'multiqueue' (a patch was submitted by Oleg A.
> Arkhangelsky a while ago), but the bonding xmit routine needs to take a
> central lock, shared by all queues, so it won't be very efficient...

The lock is a read lock, so theoretically it should be possible to enter
the bonding transmit function on multiple CPUs at the same time. The lock
may thrash around, though.

> Since this bothers me a bit, I will probably work on this in the near
> future (adding real multiqueue capability and RCU to the bonding fast
> paths).
>
> Ref: http://permalink.gmane.org/gmane.linux.network/152987

The question I have about it (and the above patch) is: what does
multi-queue "awareness" really mean for a bonding device? How does
allocating a bunch of TX queues help, given that the determination of the
transmitting device hasn't necessarily been made?

I haven't had the chance to acquire some multi-queue network cards and
check things out with bonding, so I'm not really sure how it should work.
Should the bond look, from a multi-queue perspective, like the largest
slave, or should it look like the sum of the slaves? Some of this may be
mode-specific, as well.

-J

---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
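[For concreteness, one possible reading of "multiqueue awareness" for a
virtual device such as a bond is sketched below: advertise several TX queues
at allocation time and supply an ndo_select_queue hook so flows spread across
per-queue qdiscs and locks. This is purely illustrative and uses ~2.6.32-era
signatures; the names, the queue count, and the queue-selection rule are
assumptions, not the bonding driver's actual design or the patch under
discussion.]

#include <linux/etherdevice.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

#define BOND_SKETCH_TX_QUEUES 4	/* arbitrary choice for the sketch */

/* Hypothetical queue selection: fold the queue the upper layer recorded
 * onto our own TX queue count, so different flows can land on different
 * per-queue qdiscs and locks. */
static u16 bond_sketch_select_queue(struct net_device *dev,
				    struct sk_buff *skb)
{
	return skb_get_queue_mapping(skb) % dev->real_num_tx_queues;
}

static const struct net_device_ops bond_sketch_ops = {
	.ndo_select_queue = bond_sketch_select_queue,
	/* .ndo_start_xmit and the rest are omitted in this sketch */
};

/* Allocate the virtual device with several TX queues instead of one. */
static struct net_device *bond_sketch_alloc(const char *name)
{
	struct net_device *dev;

	dev = alloc_netdev_mq(0, name, ether_setup, BOND_SKETCH_TX_QUEUES);
	if (dev)
		dev->netdev_ops = &bond_sketch_ops;
	return dev;
}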
* Re: Network multiqueue question
From: Eric Dumazet @ 2010-04-15 18:41 UTC
To: Jay Vosburgh; +Cc: George B., netdev

On Thursday 15 April 2010 at 11:09 -0700, Jay Vosburgh wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> > Vlan is multiqueue aware, but bonding unfortunately is not at the moment.
> >
> > We could make it 'multiqueue' (a patch was submitted by Oleg A.
> > Arkhangelsky a while ago), but the bonding xmit routine needs to take a
> > central lock, shared by all queues, so it won't be very efficient...
>
> The lock is a read lock, so theoretically it should be possible to enter
> the bonding transmit function on multiple CPUs at the same time. The lock
> may thrash around, though.

Yes, and with 10Gb cards this is a limiting factor if you want to send 14
million packets per second ;)

read_lock() is one atomic op, dirtying the cache line.
read_unlock() is one atomic op, dirtying the cache line again (if
contended).

In active-passive mode, RCU use should be really easy, given netdevices
are already RCU compatible. This way, each CPU only reads bonding state,
without any memory writes.

> > Since this bothers me a bit, I will probably work on this in the near
> > future (adding real multiqueue capability and RCU to the bonding fast
> > paths).
> >
> > Ref: http://permalink.gmane.org/gmane.linux.network/152987
>
> The question I have about it (and the above patch) is: what does
> multi-queue "awareness" really mean for a bonding device? How does
> allocating a bunch of TX queues help, given that the determination of the
> transmitting device hasn't necessarily been made?

Well, it is a problem that was also taken into account with vlan; you
might take a look at this commit:

commit 669d3e0babb40018dd6e78f4093c13a2eac73866
Author: Vasu Dev <vasu.dev@intel.com>
Date:   Tue Mar 23 14:41:45 2010 +0000

    vlan: adds vlan_dev_select_queue

    This is required to correctly select vlan tx queue for a driver
    supporting multi tx queue with ndo_select_queue implemented since
    currently selected vlan tx queue is unaligned to selected queue by
    real net_devce ndo_select_queue.

    Unaligned vlan tx queue selection causes thrash with higher vlan tx
    lock contention for least fcoe traffic and wrong socket tx
    queue_mapping for ixgbe having ndo_select_queue implemented.

    -v2

    As per Eric Dumazet<eric.dumazet@gmail.com> comments, mirrored vlan
    net_device_ops to have them with and without vlan_dev_select_queue
    and then select according to real dev ndo_select_queue present or
    not for a vlan net_device. This is to completely skip
    vlan_dev_select_queue calling for real net_device not supporting
    ndo_select_queue.

    Signed-off-by: Vasu Dev <vasu.dev@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

> I haven't had the chance to acquire some multi-queue network cards and
> check things out with bonding, so I'm not really sure how it should work.
> Should the bond look, from a multi-queue perspective, like the largest
> slave, or should it look like the sum of the slaves? Some of this may be
> mode-specific, as well.
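[As a rough illustration of the RCU idea Eric mentions for an active-passive
fast path, the sketch below reads the current active slave with
rcu_read_lock()/rcu_dereference() instead of a shared rwlock, so the hot
transmit path writes no shared cache line. The struct and field names
(curr_active, etc.) are assumptions; the real bonding conversion would be
more involved. The slow failover path is assumed to publish the new active
slave with rcu_assign_pointer() and wait for a grace period before reuse.]

#include <linux/netdevice.h>
#include <linux/rcupdate.h>
#include <linux/skbuff.h>

/* Hypothetical bonding private data: the active slave pointer is updated
 * only by the (slow) failover path, using rcu_assign_pointer(). */
struct bond_sketch_ab_priv {
	struct net_device *curr_active;
};

static netdev_tx_t bond_sketch_ab_xmit(struct sk_buff *skb,
				       struct net_device *dev)
{
	struct bond_sketch_ab_priv *bond = netdev_priv(dev);
	struct net_device *slave;

	rcu_read_lock();               /* no atomic op, no cache-line write */
	slave = rcu_dereference(bond->curr_active);
	if (slave) {
		skb->dev = slave;
		dev_queue_xmit(skb);   /* forward on the active slave */
	} else {
		dev_kfree_skb(skb);    /* no active slave: drop */
	}
	rcu_read_unlock();

	return NETDEV_TX_OK;
}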
* Re: Network multiqueue question
From: George B. @ 2010-04-16 3:54 UTC
To: Jay Vosburgh; +Cc: netdev

On Thu, Apr 15, 2010 at 11:09 AM, Jay Vosburgh <fubar@us.ibm.com> wrote:

> The question I have about it (and the above patch) is: what does
> multi-queue "awareness" really mean for a bonding device? How does
> allocating a bunch of TX queues help, given that the determination of the
> transmitting device hasn't necessarily been made?

Good point.

> I haven't had the chance to acquire some multi-queue network cards and
> check things out with bonding, so I'm not really sure how it should work.
> Should the bond look, from a multi-queue perspective, like the largest
> slave, or should it look like the sum of the slaves? Some of this may be
> mode-specific, as well.

I would say that having the number of bands be either the number of cores
or 4, whichever is smaller, would be a good start. That is probably fine
for GigE. The network cards we have that support multiqueue have either 4
or 8 bands.

In an optimal world you would have as many bands as are available at the
physical ethernet level, but changing those on the fly when the set of
available interfaces changes might be more trouble than it is worth. Four
or eight would seem to be a good number to start with, as I don't think I
have seen an ethernet card with fewer than 4. If you have fewer than 4
CPUs, there probably isn't much utility in having more bands than
processors; or rather, that utility rapidly diminishes as the number of
bands grows beyond the number of CPUs. At that point you have probably
just spent a lot of work building a bigger buffer. I would be happy with
4 bands.

I guess it just depends on where you want the bottleneck. If you have 8
bands on the bond driver (another reasonable alternative) and only 4
bands available for output, you have just moved the contention down a
layer, to between the bond and the ethernet driver. But I am a fan of
moving the point of contention as far away from the application interface
as possible. If I have one big lock around the bond driver and six things
waiting to talk to the network, those are six things that can't be doing
anything else. I would rather have the application finish its network
task and get back to other things. And if you end up with 8 bands of bond
and only 4 bands of ethernet, or even one band of ethernet, oh well.

Maybe have 1 to 8 bands configurable by a driver option that can be set
explicitly and defaults to, say, 4?

Thanks for taking the time to answer.

George
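[George's suggestion of a configurable band count could look roughly like the
module-parameter sketch below, again against ~2.6.32-era APIs. The parameter
name, default, and clamping are assumptions for illustration, not an existing
bonding option.]

#include <linux/etherdevice.h>
#include <linux/module.h>
#include <linux/netdevice.h>

/* Hypothetical option: how many TX bands/queues the virtual device gets. */
static int tx_bands = 4;
module_param(tx_bands, int, 0444);
MODULE_PARM_DESC(tx_bands, "Number of TX queues to allocate (1-8, default 4)");

static struct net_device *bond_sketch_create(const char *name)
{
	unsigned int bands = tx_bands;

	/* Clamp the user's choice to the 1..8 range discussed above. */
	if (bands < 1)
		bands = 1;
	else if (bands > 8)
		bands = 8;

	/* Advertise 'bands' TX queues on the virtual device. */
	return alloc_netdev_mq(0, name, ether_setup, bands);
}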
* Re: Network multiqueue question
From: George B. @ 2010-04-16 4:00 UTC
To: Eric Dumazet; +Cc: netdev

On Thu, Apr 15, 2010 at 10:47 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Since this bothers me a bit, I will probably work on this in the near
> future (adding real multiqueue capability and RCU to the bonding fast
> paths).
>
> Ref: http://permalink.gmane.org/gmane.linux.network/152987

That would be great, and you would have my sincere thanks. In case anyone
is interested, what we do is take a pair of "top of rack" switches and
cluster them together so they appear as one switch. We configure a LAG
consisting of one port on each physical switch to a pair of bonded
interfaces on the server, and use mode 2 bonding. In normal operation
both interfaces are active. Should one switch experience a power or
interface failure, the server sees one of its interfaces fail but just
keeps working on the remaining interface. There is no "failover" event
going on.

Thanks,

George
* Re: Network multiqueue question
From: Eric Dumazet @ 2010-04-16 4:53 UTC
To: George B.; +Cc: netdev

On Thursday 15 April 2010 at 21:00 -0700, George B. wrote:
> On Thu, Apr 15, 2010 at 10:47 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> > Since this bothers me a bit, I will probably work on this in the near
> > future (adding real multiqueue capability and RCU to the bonding fast
> > paths).
> >
> > Ref: http://permalink.gmane.org/gmane.linux.network/152987
>
> That would be great, and you would have my sincere thanks. In case anyone
> is interested, what we do is take a pair of "top of rack" switches and
> cluster them together so they appear as one switch. We configure a LAG
> consisting of one port on each physical switch to a pair of bonded
> interfaces on the server, and use mode 2 bonding. In normal operation
> both interfaces are active. Should one switch experience a power or
> interface failure, the server sees one of its interfaces fail but just
> keeps working on the remaining interface. There is no "failover" event
> going on.

What kind of traffic do your machines handle, exactly?

On the server, do you use two ports of the same kind (same number of
queues)?
* Re: Network multiqueue question
From: George B. @ 2010-04-16 7:28 UTC
To: Eric Dumazet; +Cc: netdev

On Thu, Apr 15, 2010 at 9:53 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> What kind of traffic do your machines handle, exactly?

Content to mobile devices (cell phones and such). More detail sent
privately.

> On the server, do you use two ports of the same kind (same number of
> queues)?

Yes, same kind. We try to make everything identical; fewer problems that
way.

George