xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] netif.h: Document xen-net{back, front} multi-queue feature
@ 2014-02-17 18:01 Andrew J. Bennieston
  2014-02-17 18:06 ` David Vrabel
  2014-02-19 11:06 ` Ian Campbell
  0 siblings, 2 replies; 8+ messages in thread
From: Andrew J. Bennieston @ 2014-02-17 18:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew J. Bennieston, --cc=paul.durrant, wei.liu2, ian.campbell,
	david.vrabel

From: "Andrew J. Bennieston" <andrew.bennieston@citrix.com>

Document the multi-queue feature in terms of XenStore keys to be written
by the backend and by the frontend.

Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
---
 xen/include/public/io/netif.h |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
index d7fb771..90be2fc 100644
--- a/xen/include/public/io/netif.h
+++ b/xen/include/public/io/netif.h
@@ -69,6 +69,27 @@
  */
 
 /*
+ * Multiple transmit and receive queues:
+ * If supported, the backend will write "multi-queue-max-queues" and set its
+ * value to the maximum supported number of queues.
+ * Frontends that are aware of this feature and wish to use it can write the
+ * key "multi-queue-num-queues", set to the number they wish to use.
+ *
+ * Queues replicate the shared rings and event channels, and
+ * "feature-split-event-channels" is required when using multiple queues.
+ *
+ * For frontends requesting just one queue, the usual event-channel and
+ * ring-ref keys are written as before, simplifying the backend processing
+ * to avoid distinguishing between a frontend that doesn't understand the
+ * multi-queue feature, and one that does, but requested only one queue.
+ *
+ * Frontends requesting two or more queues must not write the toplevel
+ * event-channel and ring-ref keys, instead writing them under sub-keys having
+ * the name "queue-N" where N is the integer ID of the queue for which those
+ * keys belong. Queues are indexed from zero.
+ */
+
+/*
  * "feature-no-csum-offload" should be used to turn IPv4 TCP/UDP checksum
  * offload off or on. If it is missing then the feature is assumed to be on.
  * "feature-ipv6-csum-offload" should be used to turn IPv6 TCP/UDP checksum
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] netif.h: Document xen-net{back, front} multi-queue feature
  2014-02-17 18:01 [PATCH] netif.h: Document xen-net{back, front} multi-queue feature Andrew J. Bennieston
@ 2014-02-17 18:06 ` David Vrabel
  2014-02-19 11:06 ` Ian Campbell
  1 sibling, 0 replies; 8+ messages in thread
From: David Vrabel @ 2014-02-17 18:06 UTC (permalink / raw)
  To: Andrew J. Bennieston; +Cc: xen-devel, wei.liu2, ian.campbell

On 17/02/14 18:01, Andrew J. Bennieston wrote:
> From: "Andrew J. Bennieston" <andrew.bennieston@citrix.com>
> 
> Document the multi-queue feature in terms of XenStore keys to be written
> by the backend and by the frontend.
> 
> Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] netif.h: Document xen-net{back, front} multi-queue feature
  2014-02-17 18:01 [PATCH] netif.h: Document xen-net{back, front} multi-queue feature Andrew J. Bennieston
  2014-02-17 18:06 ` David Vrabel
@ 2014-02-19 11:06 ` Ian Campbell
  2014-02-19 11:47   ` Andrew Bennieston
  1 sibling, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2014-02-19 11:06 UTC (permalink / raw)
  To: Andrew J. Bennieston; +Cc: xen-devel, --cc=paul.durrant, wei.liu2, david.vrabel

On Mon, 2014-02-17 at 18:01 +0000, Andrew J. Bennieston wrote:
> From: "Andrew J. Bennieston" <andrew.bennieston@citrix.com>
> 
> Document the multi-queue feature in terms of XenStore keys to be written
> by the backend and by the frontend.
> 
> Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
> ---
>  xen/include/public/io/netif.h |   21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
> index d7fb771..90be2fc 100644
> --- a/xen/include/public/io/netif.h
> +++ b/xen/include/public/io/netif.h
> @@ -69,6 +69,27 @@
>   */
>  
>  /*
> + * Multiple transmit and receive queues:
> + * If supported, the backend will write "multi-queue-max-queues" and set its
> + * value to the maximum supported number of queues.
> + * Frontends that are aware of this feature and wish to use it can write the
> + * key "multi-queue-num-queues", set to the number they wish to use.
> + *
> + * Queues replicate the shared rings and event channels, and
> + * "feature-split-event-channels" is required when using multiple queues.
> + *
> + * For frontends requesting just one queue, the usual event-channel and
> + * ring-ref keys are written as before, simplifying the backend processing
> + * to avoid distinguishing between a frontend that doesn't understand the
> + * multi-queue feature, and one that does, but requested only one queue.
> + *
> + * Frontends requesting two or more queues must not write the toplevel
> + * event-channel and ring-ref keys, instead writing them under sub-keys having
> + * the name "queue-N" where N is the integer ID of the queue for which those
> + * keys belong. Queues are indexed from zero.

If "feature-split-event-channels" is required then I think what should
be written is queue-N/event-channel-{tx,rx} and
queue-N/{tx,rx}-ring-ref, rather than queue-N/{event-channel,ring-ref}
as the final paragraph sort of implies?

(what a shame we have event-channel-DIR and DIR-ring-ref, oh well!)

Is it required to have the same number of RX and TX queues?

Are there any other properties/behaviours which should be documented,
e.g. relating to the selection of which queue to use for a given frame
(on either TX or RX)? If not and it is up to the relevant end to do what
it wants then I think it would be useful to say so.

> + */
> +
> +/*
>   * "feature-no-csum-offload" should be used to turn IPv4 TCP/UDP checksum
>   * offload off or on. If it is missing then the feature is assumed to be on.
>   * "feature-ipv6-csum-offload" should be used to turn IPv6 TCP/UDP checksum

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] netif.h: Document xen-net{back, front} multi-queue feature
  2014-02-19 11:06 ` Ian Campbell
@ 2014-02-19 11:47   ` Andrew Bennieston
  2014-02-19 12:02     ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Bennieston @ 2014-02-19 11:47 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, --cc=paul.durrant, wei.liu2, david.vrabel

On 19/02/14 11:06, Ian Campbell wrote:
> On Mon, 2014-02-17 at 18:01 +0000, Andrew J. Bennieston wrote:
>> From: "Andrew J. Bennieston" <andrew.bennieston@citrix.com>
>>
>> Document the multi-queue feature in terms of XenStore keys to be written
>> by the backend and by the frontend.
>>
>> Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
>> ---
>>   xen/include/public/io/netif.h |   21 +++++++++++++++++++++
>>   1 file changed, 21 insertions(+)
>>
>> diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
>> index d7fb771..90be2fc 100644
>> --- a/xen/include/public/io/netif.h
>> +++ b/xen/include/public/io/netif.h
>> @@ -69,6 +69,27 @@
>>    */
>>
>>   /*
>> + * Multiple transmit and receive queues:
>> + * If supported, the backend will write "multi-queue-max-queues" and set its
>> + * value to the maximum supported number of queues.
>> + * Frontends that are aware of this feature and wish to use it can write the
>> + * key "multi-queue-num-queues", set to the number they wish to use.
>> + *
>> + * Queues replicate the shared rings and event channels, and
>> + * "feature-split-event-channels" is required when using multiple queues.
>> + *
>> + * For frontends requesting just one queue, the usual event-channel and
>> + * ring-ref keys are written as before, simplifying the backend processing
>> + * to avoid distinguishing between a frontend that doesn't understand the
>> + * multi-queue feature, and one that does, but requested only one queue.
>> + *
>> + * Frontends requesting two or more queues must not write the toplevel
>> + * event-channel and ring-ref keys, instead writing them under sub-keys having
>> + * the name "queue-N" where N is the integer ID of the queue for which those
>> + * keys belong. Queues are indexed from zero.
>
> If "feature-split-event-channels" is required then I think what should
> be written is queue-N/event-channel-{tx,rx} and
> queue-N/{tx,rx}-ring-ref, rather than queue-N/{event-channel,ring-ref}
> as the final paragraph sort of implies?
I can change the wording to make this more clear.

>
> (what a shame we have event-channel-DIR and DIR-ring-ref, oh well!)
>
> Is it required to have the same number of RX and TX queues?
Strictly speaking, no. But the implementation assumes this to be the 
case. Since the code already sets up one pair, simply looping over this 
sufficient times to create N pairs was a fairly sane approach to this. 
In practice, if you have an asymmetry between RX and TX queues, you will 
end up hitting a bottleneck sooner in one direction than the other, 
which seems impractical.

>
> Are there any other properties/behaviours which should be documented,
> e.g. relating to the selection of which queue to use for a given frame
> (on either TX or RX)? If not and it is up to the relevant end to do what
> it wants then I think it would be useful to say so.
There are no other requirements. The current implementation will 
transmit anything it cannot hash sensibly on queue 0, but this is an 
arbitrary choice (albeit a sensible one, since queue 0 should always 
exist). I'll document this.

>
>> + */
>> +
>> +/*
>>    * "feature-no-csum-offload" should be used to turn IPv4 TCP/UDP checksum
>>    * offload off or on. If it is missing then the feature is assumed to be on.
>>    * "feature-ipv6-csum-offload" should be used to turn IPv6 TCP/UDP checksum
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] netif.h: Document xen-net{back, front} multi-queue feature
  2014-02-19 11:47   ` Andrew Bennieston
@ 2014-02-19 12:02     ` Ian Campbell
  2014-02-20 10:58       ` Andrew Bennieston
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2014-02-19 12:02 UTC (permalink / raw)
  To: Andrew Bennieston; +Cc: xen-devel, --cc=paul.durrant, wei.liu2, david.vrabel

On Wed, 2014-02-19 at 11:47 +0000, Andrew Bennieston wrote:
> On 19/02/14 11:06, Ian Campbell wrote:
> > On Mon, 2014-02-17 at 18:01 +0000, Andrew J. Bennieston wrote:
> >> From: "Andrew J. Bennieston" <andrew.bennieston@citrix.com>
> >>
> >> Document the multi-queue feature in terms of XenStore keys to be written
> >> by the backend and by the frontend.
> >>
> >> Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
> >> ---
> >>   xen/include/public/io/netif.h |   21 +++++++++++++++++++++
> >>   1 file changed, 21 insertions(+)
> >>
> >> diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
> >> index d7fb771..90be2fc 100644
> >> --- a/xen/include/public/io/netif.h
> >> +++ b/xen/include/public/io/netif.h
> >> @@ -69,6 +69,27 @@
> >>    */
> >>
> >>   /*
> >> + * Multiple transmit and receive queues:
> >> + * If supported, the backend will write "multi-queue-max-queues" and set its
> >> + * value to the maximum supported number of queues.
> >> + * Frontends that are aware of this feature and wish to use it can write the
> >> + * key "multi-queue-num-queues", set to the number they wish to use.
> >> + *
> >> + * Queues replicate the shared rings and event channels, and
> >> + * "feature-split-event-channels" is required when using multiple queues.
> >> + *
> >> + * For frontends requesting just one queue, the usual event-channel and
> >> + * ring-ref keys are written as before, simplifying the backend processing
> >> + * to avoid distinguishing between a frontend that doesn't understand the
> >> + * multi-queue feature, and one that does, but requested only one queue.
> >> + *
> >> + * Frontends requesting two or more queues must not write the toplevel
> >> + * event-channel and ring-ref keys, instead writing them under sub-keys having
> >> + * the name "queue-N" where N is the integer ID of the queue for which those
> >> + * keys belong. Queues are indexed from zero.
> >
> > If "feature-split-event-channels" is required then I think what should
> > be written is queue-N/event-channel-{tx,rx} and
> > queue-N/{tx,rx}-ring-ref, rather than queue-N/{event-channel,ring-ref}
> > as the final paragraph sort of implies?
> I can change the wording to make this more clear.

Thanks.

> >
> > (what a shame we have event-channel-DIR and DIR-ring-ref, oh well!)
> >
> > Is it required to have the same number of RX and TX queues?
> Strictly speaking, no. But the implementation assumes this to be the 
> case. Since the code already sets up one pair, simply looping over this 
> sufficient times to create N pairs was a fairly sane approach to this. 
> In practice, if you have an asymmetry between RX and TX queues, you will 
> end up hitting a bottleneck sooner in one direction than the other, 
> which seems impractical.

OK. So we should either mandate that there will always be the same
number, if that is going to be the case, or we should design the
xenstore layout to be extensible if not.

It sounds reasonable to me to mandate they be the same, but I don't
really know.

> > Are there any other properties/behaviours which should be documented,
> > e.g. relating to the selection of which queue to use for a given frame
> > (on either TX or RX)? If not and it is up to the relevant end to do what
> > it wants then I think it would be useful to say so.
> There are no other requirements. The current implementation will 
> transmit anything it cannot hash sensibly on queue 0, but this is an 
> arbitrary choice (albeit a sensible one, since queue 0 should always 
> exist). I'll document this.

What is this undocumented/unnegotiated "hash" you speak of ;-)

Ian.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] netif.h: Document xen-net{back, front} multi-queue feature
  2014-02-19 12:02     ` Ian Campbell
@ 2014-02-20 10:58       ` Andrew Bennieston
  2014-02-20 11:04         ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Bennieston @ 2014-02-20 10:58 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, --cc=paul.durrant, wei.liu2, david.vrabel

On 19/02/14 12:02, Ian Campbell wrote:
> On Wed, 2014-02-19 at 11:47 +0000, Andrew Bennieston wrote:
>> On 19/02/14 11:06, Ian Campbell wrote:
>>> On Mon, 2014-02-17 at 18:01 +0000, Andrew J. Bennieston wrote:
>>>> From: "Andrew J. Bennieston" <andrew.bennieston@citrix.com>
>>>>
>>>> Document the multi-queue feature in terms of XenStore keys to be written
>>>> by the backend and by the frontend.
>>>>
>>>> Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
>>>> ---
>>>>    xen/include/public/io/netif.h |   21 +++++++++++++++++++++
>>>>    1 file changed, 21 insertions(+)
>>>>
>>>> diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
>>>> index d7fb771..90be2fc 100644
>>>> --- a/xen/include/public/io/netif.h
>>>> +++ b/xen/include/public/io/netif.h
>>>> @@ -69,6 +69,27 @@
>>>>     */
>>>>
>>>>    /*
>>>> + * Multiple transmit and receive queues:
>>>> + * If supported, the backend will write "multi-queue-max-queues" and set its
>>>> + * value to the maximum supported number of queues.
>>>> + * Frontends that are aware of this feature and wish to use it can write the
>>>> + * key "multi-queue-num-queues", set to the number they wish to use.
>>>> + *
>>>> + * Queues replicate the shared rings and event channels, and
>>>> + * "feature-split-event-channels" is required when using multiple queues.
>>>> + *
>>>> + * For frontends requesting just one queue, the usual event-channel and
>>>> + * ring-ref keys are written as before, simplifying the backend processing
>>>> + * to avoid distinguishing between a frontend that doesn't understand the
>>>> + * multi-queue feature, and one that does, but requested only one queue.
>>>> + *
>>>> + * Frontends requesting two or more queues must not write the toplevel
>>>> + * event-channel and ring-ref keys, instead writing them under sub-keys having
>>>> + * the name "queue-N" where N is the integer ID of the queue for which those
>>>> + * keys belong. Queues are indexed from zero.
>>>
>>> If "feature-split-event-channels" is required then I think what should
>>> be written is queue-N/event-channel-{tx,rx} and
>>> queue-N/{tx,rx}-ring-ref, rather than queue-N/{event-channel,ring-ref}
>>> as the final paragraph sort of implies?
>> I can change the wording to make this more clear.
>
> Thanks.
>
>>>
>>> (what a shame we have event-channel-DIR and DIR-ring-ref, oh well!)
>>>
>>> Is it required to have the same number of RX and TX queues?
>> Strictly speaking, no. But the implementation assumes this to be the
>> case. Since the code already sets up one pair, simply looping over this
>> sufficient times to create N pairs was a fairly sane approach to this.
>> In practice, if you have an asymmetry between RX and TX queues, you will
>> end up hitting a bottleneck sooner in one direction than the other,
>> which seems impractical.
>
> OK. So we should either mandate that there will always be the same
> number, if that is going to be the case, or we should design the
> xenstore layout to be extensible if not.
>
> It sounds reasonable to me to mandate they be the same, but I don't
> really know.
>
>>> Are there any other properties/behaviours which should be documented,
>>> e.g. relating to the selection of which queue to use for a given frame
>>> (on either TX or RX)? If not and it is up to the relevant end to do what
>>> it wants then I think it would be useful to say so.
>> There are no other requirements. The current implementation will
>> transmit anything it cannot hash sensibly on queue 0, but this is an
>> arbitrary choice (albeit a sensible one, since queue 0 should always
>> exist). I'll document this.
>
> What is this undocumented/unnegotiated "hash" you speak of ;-)
Each end must decide how to split packets across queues. This decision 
is made by the transmitting guest, and the receiving guest will simply 
take packets on any queue and deal with them as before. This is why it 
is unnegotiated; the two ends could be using completely different 
mechanisms for this, or could always choose a single queue, with no 
adverse effect on anything except performance (i.e. if they do this 
badly, they will not benefit from the full value of multiple queues).

These properties should probably be documented, though. I'll add a 
paragraph to the effect of the one above. A complication arises if we 
wish to support Windows frontends in a logical manner, since Windows 
expects to be able to control the mapping of packets to RX queues. This 
means that for a Windows guest, we may need to write additional Xenstore 
keys to specify these parameters to the backend. However, that is not 
considered here because

a] the default behaviour of decoupling the TX and RX queue mappings is 
not affected by this; a Windows frontend would specifically request an
algorithm, while other frontends may remain silent about this.

b] Windows also wants the hash value (used to select the queue, via the 
Toeplitz hash) to be transmitted along with each packet. This would 
require either an extension to the ring protocol to pass this value in 
an extra slot, or some other mechanism - either of which would require 
separate negotiation anyway.

For these reasons, the default behaviour is to _not_ negotiate any 
hashing between the back- and frontends, leaving each free to do the 
most appropriate thing. I'd like to avoid writing unnecessary stuff into 
Xenstore, preferring to have a sensible default behaviour that can be 
overridden for the few specific cases that require something different.

As such, I think the docs should, for now, say something along the lines of:
   Mapping of packets to queues is considered to be a function of the
   transmitting system (backend or frontend) and is not negotiated
   between the two. Guests are free to transmit packets on any queue
   they choose, provided it has been set up correctly.

Andrew.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] netif.h: Document xen-net{back, front} multi-queue feature
  2014-02-20 10:58       ` Andrew Bennieston
@ 2014-02-20 11:04         ` Ian Campbell
  2014-02-21 10:10           ` Paul Durrant
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2014-02-20 11:04 UTC (permalink / raw)
  To: Andrew Bennieston; +Cc: xen-devel, paul.durrant, wei.liu2, david.vrabel

On Thu, 2014-02-20 at 10:58 +0000, Andrew Bennieston wrote:
> As such, I think the docs should, for now, say something along the lines of:
>    Mapping of packets to queues is considered to be a function of the
>    transmitting system (backend or frontend) and is not negotiated
>    between the two. Guests are free to transmit packets on any queue
>    they choose, provided it has been set up correctly.

That's the sort of thing I was looking for, thanks.

(fixing Paul's CC at last...)

Ian.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] netif.h: Document xen-net{back, front} multi-queue feature
  2014-02-20 11:04         ` Ian Campbell
@ 2014-02-21 10:10           ` Paul Durrant
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Durrant @ 2014-02-21 10:10 UTC (permalink / raw)
  To: Ian Campbell, Andrew Bennieston
  Cc: xen-devel@lists.xenproject.org, Wei Liu, David Vrabel

> -----Original Message-----
> From: Ian Campbell
> Sent: 20 February 2014 11:04
> To: Andrew Bennieston
> Cc: xen-devel@lists.xenproject.org; Wei Liu; Paul Durrant; David Vrabel
> Subject: Re: [PATCH] netif.h: Document xen-net{back,front} multi-queue
> feature
> 
> On Thu, 2014-02-20 at 10:58 +0000, Andrew Bennieston wrote:
> > As such, I think the docs should, for now, say something along the lines of:
> >    Mapping of packets to queues is considered to be a function of the
> >    transmitting system (backend or frontend) and is not negotiated
> >    between the two. Guests are free to transmit packets on any queue
> >    they choose, provided it has been set up correctly.
> 
> That's the sort of thing I was looking for, thanks.
> 
> (fixing Paul's CC at last...)
> 

That all sounds fine. There will be some changes when we attempt to implement Toeplitz and Windows RSS (Receive Side Scaling)...

Briefly, the Windows stack provides a hash key at start of day which needs to be fed to the backend (which I guess we'll do via xenstore) so that it can use it in its calculation. The stack also provides a hash->queue mapping table which is updated on a fairly infrequent basis (so we can probably still use xenstore for that) to dictate which TCP flows coming from the backend should appear on which queue (and hence get processed by which frontend vCPU). The other complication is that the actual hash value needs to be passed to the stack with each TCP packet, so I suspect we'll have to use a new 'extra segment' type to pass that across.

All of this, as the same suggests, is (guest) receive side. There is no connection with transmit side, although I believe Windows will ensure that the transmissions of any particular TCP flow will occur on the same CPU that the hash->queue mapping mandates for reception (modulo a bit of mismatch whenever the table is updated).

  Paul

> Ian.
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-02-21 10:10 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-02-17 18:01 [PATCH] netif.h: Document xen-net{back, front} multi-queue feature Andrew J. Bennieston
2014-02-17 18:06 ` David Vrabel
2014-02-19 11:06 ` Ian Campbell
2014-02-19 11:47   ` Andrew Bennieston
2014-02-19 12:02     ` Ian Campbell
2014-02-20 10:58       ` Andrew Bennieston
2014-02-20 11:04         ` Ian Campbell
2014-02-21 10:10           ` Paul Durrant

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).