xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Bennieston <andrew.bennieston@citrix.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: xen-devel@lists.xenproject.org, --cc=paul.durrant@citrix.com,
	wei.liu2@citrix.com, david.vrabel@citrix.com
Subject: Re: [PATCH] netif.h: Document xen-net{back, front} multi-queue feature
Date: Thu, 20 Feb 2014 10:58:07 +0000	[thread overview]
Message-ID: <5305DFBF.1030909@citrix.com> (raw)
In-Reply-To: <1392811341.29739.24.camel@kazak.uk.xensource.com>

On 19/02/14 12:02, Ian Campbell wrote:
> On Wed, 2014-02-19 at 11:47 +0000, Andrew Bennieston wrote:
>> On 19/02/14 11:06, Ian Campbell wrote:
>>> On Mon, 2014-02-17 at 18:01 +0000, Andrew J. Bennieston wrote:
>>>> From: "Andrew J. Bennieston" <andrew.bennieston@citrix.com>
>>>>
>>>> Document the multi-queue feature in terms of XenStore keys to be written
>>>> by the backend and by the frontend.
>>>>
>>>> Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
>>>> ---
>>>>    xen/include/public/io/netif.h |   21 +++++++++++++++++++++
>>>>    1 file changed, 21 insertions(+)
>>>>
>>>> diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
>>>> index d7fb771..90be2fc 100644
>>>> --- a/xen/include/public/io/netif.h
>>>> +++ b/xen/include/public/io/netif.h
>>>> @@ -69,6 +69,27 @@
>>>>     */
>>>>
>>>>    /*
>>>> + * Multiple transmit and receive queues:
>>>> + * If supported, the backend will write "multi-queue-max-queues" and set its
>>>> + * value to the maximum supported number of queues.
>>>> + * Frontends that are aware of this feature and wish to use it can write the
>>>> + * key "multi-queue-num-queues", set to the number they wish to use.
>>>> + *
>>>> + * Queues replicate the shared rings and event channels, and
>>>> + * "feature-split-event-channels" is required when using multiple queues.
>>>> + *
>>>> + * For frontends requesting just one queue, the usual event-channel and
>>>> + * ring-ref keys are written as before, simplifying the backend processing
>>>> + * to avoid distinguishing between a frontend that doesn't understand the
>>>> + * multi-queue feature, and one that does, but requested only one queue.
>>>> + *
>>>> + * Frontends requesting two or more queues must not write the toplevel
>>>> + * event-channel and ring-ref keys, instead writing them under sub-keys having
>>>> + * the name "queue-N" where N is the integer ID of the queue for which those
>>>> + * keys belong. Queues are indexed from zero.
>>>
>>> If "feature-split-event-channels" is required then I think what should
>>> be written is queue-N/event-channel-{tx,rx} and
>>> queue-N/{tx,rx}-ring-ref, rather than queue-N/{event-channel,ring-ref}
>>> as the final paragraph sort of implies?
>> I can change the wording to make this more clear.
>
> Thanks.
>
>>>
>>> (what a shame we have event-channel-DIR and DIR-ring-ref, oh well!)
>>>
>>> Is it required to have the same number of RX and TX queues?
>> Strictly speaking, no. But the implementation assumes this to be the
>> case. Since the code already sets up one pair, simply looping over this
>> sufficient times to create N pairs was a fairly sane approach to this.
>> In practice, if you have an asymmetry between RX and TX queues, you will
>> end up hitting a bottleneck sooner in one direction than the other,
>> which seems impractical.
>
> OK. So we should either mandate that there will always be the same
> number, if that is going to be the case, or we should design the
> xenstore layout to be extensible if not.
>
> It sounds reasonable to me to mandate they be the same, but I don't
> really know.
>
>>> Are there any other properties/behaviours which should be documented,
>>> e.g. relating to the selection of which queue to use for a given frame
>>> (on either TX or RX)? If not and it is up to the relevant end to do what
>>> it wants then I think it would be useful to say so.
>> There are no other requirements. The current implementation will
>> transmit anything it cannot hash sensibly on queue 0, but this is an
>> arbitrary choice (albeit a sensible one, since queue 0 should always
>> exist). I'll document this.
>
> What is this undocumented/unnegotiated "hash" you speak of ;-)
Each end must decide how to split packets across queues. This decision 
is made by the transmitting guest, and the receiving guest will simply 
take packets on any queue and deal with them as before. This is why it 
is unnegotiated; the two ends could be using completely different 
mechanisms for this, or could always choose a single queue, with no 
adverse effect on anything except performance (i.e. if they do this 
badly, they will not benefit from the full value of multiple queues).

These properties should probably be documented, though. I'll add a 
paragraph to the effect of the one above. A complication arises if we 
wish to support Windows frontends in a logical manner, since Windows 
expects to be able to control the mapping of packets to RX queues. This 
means that for a Windows guest, we may need to write additional Xenstore 
keys to specify these parameters to the backend. However, that is not 
considered here because

a] the default behaviour of decoupling the TX and RX queue mappings is 
not affected by this; a Windows frontend would specifically request an
algorithm, while other frontends may remain silent about this.

b] Windows also wants the hash value (used to select the queue, via the 
Toeplitz hash) to be transmitted along with each packet. This would 
require either an extension to the ring protocol to pass this value in 
an extra slot, or some other mechanism - either of which would require 
separate negotiation anyway.

For these reasons, the default behaviour is to _not_ negotiate any 
hashing between the back- and frontends, leaving each free to do the 
most appropriate thing. I'd like to avoid writing unnecessary stuff into 
Xenstore, preferring to have a sensible default behaviour that can be 
overridden for the few specific cases that require something different.

As such, I think the docs should, for now, say something along the lines of:
   Mapping of packets to queues is considered to be a function of the
   transmitting system (backend or frontend) and is not negotiated
   between the two. Guests are free to transmit packets on any queue
   they choose, provided it has been set up correctly.

Andrew.

  reply	other threads:[~2014-02-20 10:59 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-17 18:01 [PATCH] netif.h: Document xen-net{back, front} multi-queue feature Andrew J. Bennieston
2014-02-17 18:06 ` David Vrabel
2014-02-19 11:06 ` Ian Campbell
2014-02-19 11:47   ` Andrew Bennieston
2014-02-19 12:02     ` Ian Campbell
2014-02-20 10:58       ` Andrew Bennieston [this message]
2014-02-20 11:04         ` Ian Campbell
2014-02-21 10:10           ` Paul Durrant

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5305DFBF.1030909@citrix.com \
    --to=andrew.bennieston@citrix.com \
    --cc=--cc=paul.durrant@citrix.com \
    --cc=Ian.Campbell@citrix.com \
    --cc=david.vrabel@citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).