* Tx queue selection
From: Benjamin Herrenschmidt @ 2010-07-27 10:51 UTC (permalink / raw)
To: netdev
Hi folks !
I'm putting my newbie hat on ... :-)
While looking at our ehea driver (and in fact another upcoming driver
I'm helping with), I noticed it's using the "old style" multiqueue, i.e.
it doesn't use the alloc_netdev_mq() variant, creates one queue on the
Linux side, and makes its own selection of HW queue in start_xmit.
This has many drawbacks, obviously, such as not getting per-queue locks
etc...
Now, the mechanics of converting that to the new scheme are easy enough
to figure out by reading the code. However, where my lack of networking
background fails me is when it comes to the policy of choosing a Tx
queue.
ehea uses its own hash of the header, different from the "default" queue
selection in the net core. Looking at other drivers such as ixgbe, I see
that it can choose to use smp_processor_id() when a flag is set (whose
meaning I don't totally understand) or default to the core algorithm.
Now, while I can understand why it's a good idea to use the current
processor, in order to limit cache ping pong etc... I'm not really
confident I understand the pro/cons of using the hashing for tx. I
understand that the net core can play interesting games with associating
sockets with queues etc... but I'm a bit at a loss when it comes to
deciding what's best for this driver. I suppose I could start by
implementing my own queue selection based on what ehea does today but I
have the nasty feeling that's going to be sub-optimal :-)
So I would very much appreciate (and reward with free beer at the next
conference) if somebody could give me a bit of a heads up on how things
are expected to be done there, pro/cons, perf impact etc...
Thanks in advance !
Cheers,
Ben.
* Re: Tx queue selection
From: Neil Horman @ 2010-07-27 11:50 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: netdev
On Tue, Jul 27, 2010 at 08:51:07PM +1000, Benjamin Herrenschmidt wrote:
> Hi folks !
>
> I'm putting my newbie hat on ... :-)
>
> While looking at our ehea driver (and in fact another upcoming driver
> I'm helping with), I noticed it's using the "old style" multiqueue, i.e.
> it doesn't use the alloc_netdev_mq() variant, creates one queue on the
> Linux side, and makes its own selection of HW queue in start_xmit.
>
> This has many drawbacks, obviously, such as not getting per-queue locks
> etc...
>
> Now, the mechanics of converting that to the new scheme are easy enough
> to figure out by reading the code. However, where my lack of networking
> background fails me is when it comes to the policy of choosing a Tx
> queue.
>
> ehea uses its own hash of the header, different from the "default" queue
> selection in the net core. Looking at other drivers such as ixgbe, I see
> that it can choose to use smp_processor_id() when a flag is set (whose
> meaning I don't totally understand) or default to the core algorithm.
>
IIRC, that's Intel's Flow Director feature. I've not used it, but from what I
understand, Flow Director is a technology that allows a card to isolate flows to
and from a socket to a single CPU, i.e. the CPU which handles the transmission
of frames will be the CPU that handles frames destined for that flow as well.
To do this they do some additional analysis and steering configuration in the
driver, using a user space utility as well. I don't think it makes much sense to
use smp_processor_id() in your tx hashing if you don't have some sort of specific
feature like that. The other drivers which implement ndo_select_queue don't do
it; they hash on the skb header like the core does, and make modifications to
that based on hardware capabilities.
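
A minimal userspace sketch of the policy described above: pick the queue from the current CPU only when a flow-steering feature is enabled, and otherwise keep whatever queue the core would have chosen. All names here (DEMO_FLAG_FLOW_STEERING, demo_select_queue, and its parameters) are hypothetical and are not taken from ixgbe or the net core; the CPU number is passed in explicitly so the example does not depend on kernel APIs.

```c
#include <stdint.h>

/* Hypothetical flag standing in for a driver's "flow steering enabled" bit. */
#define DEMO_FLAG_FLOW_STEERING 0x1u

/*
 * Choose a TX queue index in [0, num_queues).
 *
 * flags              - driver feature flags (assumption: one steering bit)
 * cpu                - CPU the transmit is running on
 * core_default_queue - queue the core's header hash would have picked
 * num_queues         - number of TX queues, must be non-zero
 */
static uint16_t demo_select_queue(unsigned int flags, unsigned int cpu,
                                  uint16_t core_default_queue,
                                  uint16_t num_queues)
{
    /* With steering hardware, transmitting CPU and queue stay paired. */
    if (flags & DEMO_FLAG_FLOW_STEERING)
        return (uint16_t)(cpu % num_queues);

    /* Without it, defer to the hash-based default. */
    return core_default_queue;
}
```

The point of the split is that the per-CPU choice only pays off when the hardware also steers the matching receive traffic back to that CPU; without that, the hash-based default is the safer choice.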
HTH
Neil
* Re: Tx queue selection
From: Eric Dumazet @ 2010-07-27 11:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: netdev
On Tuesday, 27 July 2010 at 20:51 +1000, Benjamin Herrenschmidt wrote:
> Hi folks !
>
> I'm putting my newbie hat on ... :-)
>
> While looking at our ehea driver (and in fact another upcoming driver
> I'm helping with), I noticed it's using the "old style" multiqueue, i.e.
> it doesn't use the alloc_netdev_mq() variant, creates one queue on the
> Linux side, and makes its own selection of HW queue in start_xmit.
>
> This has many drawbacks, obviously, such as not getting per-queue locks
> etc...
>
> Now, the mechanics of converting that to the new scheme are easy enough
> to figure out by reading the code. However, where my lack of networking
> background fails me is when it comes to the policy of choosing a Tx
> queue.
>
> ehea uses its own hash of the header, different from the "default" queue
> selection in the net core. Looking at other drivers such as ixgbe, I see
> that it can choose to use smp_processor_id() when a flag is set (whose
> meaning I don't totally understand) or default to the core algorithm.
>
> Now, while I can understand why it's a good idea to use the current
> processor, in order to limit cache ping pong etc... I'm not really
> confident I understand the pro/cons of using the hashing for tx. I
> understand that the net core can play interesting games with associating
> sockets with queues etc... but I'm a bit at a loss when it comes to
> deciding what's best for this driver. I suppose I could start by
> implementing my own queue selection based on what ehea does today but I
> have the nasty feeling that's going to be sub-optimal :-)
>
> So I would very much appreciate (and reward with free beer at the next
> conference) if somebody could give me a bit of a heads up on how things
> are expected to be done there, pro/cons, perf impact etc...
I am not sure ndo_select_queue() is really needed these days. It was
introduced before the core network stack was able to use a socket-provided hash.
The tx queue selection done by default (skb_tx_hash()) should be fine.
bnx2, for example, doesn't provide an ndo_select_queue().
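
As a rough illustration of hash-based default selection, the sketch below hashes a flow's addresses and ports and scales the 32-bit result into the queue range with a multiply and shift. This is a userspace approximation, not the kernel's skb_tx_hash(): the struct, the function names, and the FNV-1a style mixing step are all assumptions chosen for the example, and the real core uses its own hash inputs and hash function.

```c
#include <stdint.h>

/* Hypothetical flow key: IPv4 addresses and L4 ports. */
struct demo_flow {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* FNV-1a style mixing of one 32-bit word, byte by byte (stand-in hash). */
static uint32_t demo_mix(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Hash the whole flow key into 32 bits. */
static uint32_t demo_flow_hash(const struct demo_flow *f)
{
    uint32_t h = 2166136261u;

    h = demo_mix(h, f->saddr);
    h = demo_mix(h, f->daddr);
    h = demo_mix(h, ((uint32_t)f->sport << 16) | f->dport);
    return h;
}

/* Scale the hash into [0, num_queues) without a division. */
static uint16_t demo_tx_queue(const struct demo_flow *f, uint16_t num_queues)
{
    return (uint16_t)(((uint64_t)demo_flow_hash(f) * num_queues) >> 32);
}
```

Because the queue depends only on the flow key, every packet of a given flow lands on the same TX queue, which keeps the flow in order while different flows spread across queues.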
* Re: Tx queue selection
From: Ben Hutchings @ 2010-07-27 19:31 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: netdev
On Tue, 2010-07-27 at 20:51 +1000, Benjamin Herrenschmidt wrote:
> Hi folks !
>
> I'm putting my newbie hat on ... :-)
>
> While looking at our ehea driver (and in fact another upcoming driver
> I'm helping with), I noticed it's using the "old style" multiqueue, i.e.
> it doesn't use the alloc_netdev_mq() variant, creates one queue on the
> Linux side, and makes its own selection of HW queue in start_xmit.
>
> This has many drawbacks, obviously, such as not getting per-queue locks
> etc...
>
> Now, the mechanics of converting that to the new scheme are easy enough
> to figure out by reading the code. However, where my lack of networking
> background fails me is when it comes to the policy of choosing a Tx
> queue.
>
> ehea uses its own hash of the header, different from the "default" queue
> selection in the net core. Looking at other drivers such as ixgbe, I see
> that it can choose to use smp_processor_id() when a flag is set (whose
> meaning I don't totally understand) or default to the core algorithm.
>
> Now, while I can understand why it's a good idea to use the current
> processor, in order to limit cache ping pong etc... I'm not really
> confident I understand the pro/cons of using the hashing for tx. I
> understand that the net core can play interesting games with associating
> sockets with queues etc... but I'm a bit at a loss when it comes to
> deciding what's best for this driver. I suppose I could start by
> implementing my own queue selection based on what ehea does today but I
> have the nasty feeling that's going to be sub-optimal :-)
>
> So I would very much appreciate (and reward with free beer at the next
> conference) if somebody could give me a bit of a heads up on how things
> are expected to be done there, pro/cons, perf impact etc...
In the past Dave has recommended against implementing
ndo_select_queue().
When forwarding between multiqueue interfaces, we expect the input
device to spread traffic out between RX queues and we then use the
corresponding TX queue on output (assuming equal numbers of queues on
interfaces). Thus we should easily avoid contention on TX queues.
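
The forwarding case can be sketched in userspace as follows: if the packet carries the index of the RX queue it arrived on, reuse it for TX, folding it into range when the output device has fewer queues; otherwise fall back to scaling a flow hash. The struct and function names are hypothetical, and the fallback is only an assumed stand-in for the core's default selection.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-packet metadata relevant to queue selection. */
struct demo_pkt {
    bool     rx_queue_recorded; /* did the input device record its RX queue? */
    uint16_t rx_queue;          /* RX queue index, valid only if recorded */
    uint32_t flow_hash;         /* 32-bit flow hash used as a fallback */
};

/* Pick a TX queue in [0, num_tx_queues); num_tx_queues must be non-zero. */
static uint16_t demo_forward_tx_queue(const struct demo_pkt *p,
                                      uint16_t num_tx_queues)
{
    /* Forwarded traffic: mirror the RX queue, wrapping if TX has fewer. */
    if (p->rx_queue_recorded)
        return (uint16_t)(p->rx_queue % num_tx_queues);

    /* Locally generated traffic: scale the flow hash into range. */
    return (uint16_t)(((uint64_t)p->flow_hash * num_tx_queues) >> 32);
}
```

With equal queue counts on both interfaces the mapping is one to one, so CPUs that handle different RX queues never contend for the same TX queue.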
For endpoints, the situation is more complex. Ideally we would have one
IRQ, one RX queue and one TX queue per processor; we would let each
processor send on its own TX queue and NICs would automatically steer RX
packets to the RX queue for wherever the receiving thread will be
scheduled. In practice the NIC doesn't know that and even if it does we
can easily introduce reordering. This also depends on the driver being
able to control affinity of its IRQs.
ixgbe, in conjunction with the firmware 'Flow Director' feature, attempts
to implement this, using the last TX queue for the flow as an indicator
of which RX queue to use, but so far as I can see it punts on the
reordering issue. It sets affinity 'hints' and apparently requires that
irqbalance follows these.
Another approach is to assume that when a receiving thread is regularly
woken up by packet reception on a given CPU then it will tend to be
scheduled and to transmit on the same flow from that CPU. On that basis
we should set the TX queue for a connected socket to match the RX queue
it last received on. (See
<http://article.gmane.org/gmane.linux.network/158477>.) It's not clear
whether this is really true.
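
A minimal sketch of that second approach, under the assumption that a connected socket can cache a TX queue index: each received packet updates the cached index from its RX queue, and transmit uses the cached value when one exists. The names (demo_sock, demo_sock_note_rx, and so on) are invented for illustration and do not correspond to the kernel's socket structures or to the patch referenced above.

```c
#include <stdint.h>

#define DEMO_NO_QUEUE (-1)

/* Hypothetical socket state: only the cached TX queue matters here. */
struct demo_sock {
    int tx_queue; /* DEMO_NO_QUEUE until something has been received */
};

static void demo_sock_init(struct demo_sock *sk)
{
    sk->tx_queue = DEMO_NO_QUEUE;
}

/* On receive: remember the RX queue, folded into the TX queue range. */
static void demo_sock_note_rx(struct demo_sock *sk, uint16_t rx_queue,
                              uint16_t num_tx_queues)
{
    sk->tx_queue = rx_queue % num_tx_queues;
}

/* On transmit: prefer the cached queue, else use the hash-based choice. */
static uint16_t demo_sock_tx_queue(const struct demo_sock *sk,
                                   uint16_t hashed_queue)
{
    if (sk->tx_queue == DEMO_NO_QUEUE)
        return hashed_queue;
    return (uint16_t)sk->tx_queue;
}
```

Note that a flow whose RX queue changes will also change TX queue mid-stream, which is one way the reordering concern mentioned earlier can show up.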
Receive Flow Steering implements the steering entirely in software, but
AFAIK does nothing for the TX side; it seems mostly targeted at
single-queue NICs.
I will shortly be proposing some changes that I hope will allow at least
some multiqueue NIC drivers to move closer to that ideal.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.