qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jonah Palmer <jonah.palmer@oracle.com>
To: Eugenio Perez Martin <eperezma@redhat.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, raphael@enfabrica.net,
	kwolf@redhat.com, hreitz@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, fam@euphon.net, stefanha@redhat.com,
	qemu-block@nongnu.org, schalla@marvell.com, leiyang@redhat.com,
	virtio-fs@lists.linux.dev, si-wei.liu@oracle.com,
	boris.ostrovsky@oracle.com
Subject: Re: [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support
Date: Mon, 25 Mar 2024 12:52:16 -0400	[thread overview]
Message-ID: <5b8494df-c15c-4d18-ad4a-74b5584429ce@oracle.com> (raw)
In-Reply-To: <CAJaqyWf-oS_Y7EgO7DrVxMj5Roe=yjbbU3tka=Yj3St1ALCvnw@mail.gmail.com>



On 3/22/24 7:18 AM, Eugenio Perez Martin wrote:
> On Thu, Mar 21, 2024 at 4:57 PM Jonah Palmer <jonah.palmer@oracle.com> wrote:
>>
>> The goal of these patches is to add support to a variety of virtio and
>> vhost devices for the VIRTIO_F_IN_ORDER transport feature. This feature
>> indicates that all buffers are used by the device in the same order in
>> which they were made available by the driver.
>>
>> These patches attempt to implement a generalized, non-device-specific
>> solution to support this feature.
>>
>> The core feature behind this solution is a buffer mechanism in the form
>> of GLib's GHashTable. The decision behind using a hash table was to
>> leverage their ability for quick lookup, insertion, and removal
>> operations. Given that our keys are simply numbers of an ordered
>> sequence, a hash table seemed like the best choice for a buffer
>> mechanism.
>>
>> ---------------------
>>
>> The strategy behind this implementation is as follows:
>>
>> We know that buffers that are popped from the available ring and enqueued
>> for further processing will always done in the same order in which they
>> were made available by the driver. Given this, we can note their order
>> by assigning the resulting VirtQueueElement a key. This key is a number
>> in a sequence that represents the order in which they were popped from
>> the available ring, relative to the other VirtQueueElements.
>>
>> For example, given 3 "elements" that were popped from the available
>> ring, we assign a key value to them which represents their order (elem0
>> is popped first, then elem1, then lastly elem2):
>>
>>       elem2   --  elem1   --  elem0   ---> Enqueue for processing
>>      (key: 2)    (key: 1)    (key: 0)
>>
>> Then these elements are enqueued for further processing by the host.
>>
>> While most devices will return these completed elements in the same
>> order in which they were enqueued, some devices may not (e.g.
>> virtio-blk). To guarantee that these elements are put on the used ring
>> in the same order in which they were enqueued, we can use a buffering
>> mechanism that keeps track of the next expected sequence number of an
>> element.
>>
>> In other words, if the completed element does not have a key value that
>> matches the next expected sequence number, then we know this element is
>> not in-order and we must stash it away in a hash table until an order
>> can be made. The element's key value is used as the key for placing it
>> in the hash table.
>>
>> If the completed element has a key value that matches the next expected
>> sequence number, then we know this element is in-order and we can push
>> it on the used ring. Then we increment the next expected sequence number
>> and check if the hash table contains an element at this key location.
>>
>> If so, we retrieve this element, push it to the used ring, delete the
>> key-value pair from the hash table, increment the next expected sequence
>> number, and check the hash table again for an element at this new key
>> location. This process is repeated until we're unable to find an element
>> in the hash table to continue the order.
>>
>> So, for example, say the 3 elements we enqueued were completed in the
>> following order: elem1, elem2, elem0. The next expected sequence number
>> is 0:
>>
>>      exp-seq-num = 0:
>>
>>       elem1   --> elem1.key == exp-seq-num ? --> No, stash it
>>      (key: 1)                                         |
>>                                                       |
>>                                                       v
>>                                                 ================
>>                                                 |key: 1 - elem1|
>>                                                 ================
>>      ---------------------
>>      exp-seq-num = 0:
>>
>>       elem2   --> elem2.key == exp-seq-num ? --> No, stash it
>>      (key: 2)                                         |
>>                                                       |
>>                                                       v
>>                                                 ================
>>                                                 |key: 1 - elem1|
>>                                                 |--------------|
>>                                                 |key: 2 - elem2|
>>                                                 ================
>>      ---------------------
>>      exp-seq-num = 0:
>>
>>       elem0   --> elem0.key == exp-seq-num ? --> Yes, push to used ring
>>      (key: 0)
>>
>>      exp-seq-num = 1:
>>
>>      lookup(table, exp-seq-num) != NULL ? --> Yes, push to used ring,
>>                                               remove elem from table
>>                                                       |
>>                                                       v
>>                                                 ================
>>                                                 |key: 2 - elem2|
>>                                                 ================
>>
>>      exp-seq-num = 2:
>>
>>      lookup(table, exp-seq-num) != NULL ? --> Yes, push to used ring,
>>                                               remove elem from table
>>                                                       |
>>                                                       v
>>                                                 ================
>>                                                 |   *empty*    |
>>                                                 ================
>>
>>      exp-seq-num = 3:
>>
>>      lookup(table, exp-seq-num) != NULL ? --> No, done
>>      ---------------------
>>
> 
> I think to use a hashtable to handle this has an important drawback:
> it hurts performance on the devices that are using right in-order
> because of hash calculus, to benefit devices that are using it badly
> by using descriptors out of order. We should use data structs that are
> as free as possible for the first, and we don't care to worse the
> experience of the devices that enable in_order and they shouldn't.
> 

Right, because if descriptors are coming in in-order, we still search 
the (empty) hash table.

Hmm... what if we introduced a flag to see if we actually should bother 
searching the hash table? That way we avoid the cost of searching when 
we really don't need to.

> So I suggest reusing vq->used_elems array vq. At each used descriptor
> written in the used ring, you know the next head is elem->index +
> elem->ndescs, so you can check if that element has been filled or not.
> If used, it needs to be flushed too. If not used, just return.
> 
> Of course virtqueue_flush also needs to take this into account.
> 
> What do you think, does it make sense to you?
> 

I'm having a bit of trouble understanding the suggestion here. Would you 
mind elaborating a bit more for me on this?

For example, say elem0, elem1, and elem2 were enqueued in-order (elem0 
being first, elem2 last) and then elem2 finishes first, elem1 second, 
and elem0 third. Given that these elements finish out-of-order, how 
would you handle these out-of-order elements using your suggestion?

Thanks :)

> Thanks!
> 
> 
>> Jonah Palmer (8):
>>    virtio: Define InOrderVQElement
>>    virtio: Create/destroy/reset VirtQueue In-Order hash table
>>    virtio: Define order variables
>>    virtio: Implement in-order handling for virtio devices
>>    virtio-net: in-order handling
>>    vhost-svq: in-order handling
>>    vhost/vhost-user: Add VIRTIO_F_IN_ORDER to vhost feature bits
>>    virtio: Add VIRTIO_F_IN_ORDER property definition
>>
>>   hw/block/vhost-user-blk.c          |   1 +
>>   hw/net/vhost_net.c                 |   2 +
>>   hw/net/virtio-net.c                |   6 +-
>>   hw/scsi/vhost-scsi.c               |   1 +
>>   hw/scsi/vhost-user-scsi.c          |   1 +
>>   hw/virtio/vhost-shadow-virtqueue.c |  15 ++++-
>>   hw/virtio/vhost-user-fs.c          |   1 +
>>   hw/virtio/vhost-user-vsock.c       |   1 +
>>   hw/virtio/virtio.c                 | 103 ++++++++++++++++++++++++++++-
>>   include/hw/virtio/virtio.h         |  20 +++++-
>>   net/vhost-vdpa.c                   |   1 +
>>   11 files changed, 145 insertions(+), 7 deletions(-)
>>
>> --
>> 2.39.3
>>
> 


  reply	other threads:[~2024-03-25 16:53 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-21 15:57 [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support Jonah Palmer
2024-03-21 15:57 ` [RFC 1/8] virtio: Define InOrderVQElement Jonah Palmer
2024-03-22  9:45   ` Eugenio Perez Martin
2024-03-25 17:08     ` Jonah Palmer
2024-03-25 19:12       ` Eugenio Perez Martin
2024-03-21 15:57 ` [RFC 2/8] virtio: Create/destroy/reset VirtQueue In-Order hash table Jonah Palmer
2024-03-21 15:57 ` [RFC 3/8] virtio: Define order variables Jonah Palmer
2024-03-21 15:57 ` [RFC 4/8] virtio: Implement in-order handling for virtio devices Jonah Palmer
2024-03-22 10:46   ` Eugenio Perez Martin
2024-03-25 17:34     ` Jonah Palmer
2024-03-25 19:45       ` Eugenio Perez Martin
2024-03-21 15:57 ` [RFC 5/8] virtio-net: in-order handling Jonah Palmer
2024-03-21 15:57 ` [RFC 6/8] vhost-svq: " Jonah Palmer
2024-03-21 15:57 ` [RFC 7/8] vhost/vhost-user: Add VIRTIO_F_IN_ORDER to vhost feature bits Jonah Palmer
2024-03-22 10:47   ` Eugenio Perez Martin
2024-03-21 15:57 ` [RFC 8/8] virtio: Add VIRTIO_F_IN_ORDER property definition Jonah Palmer
2024-03-22 10:48   ` Eugenio Perez Martin
2024-03-21 19:48 ` [RFC 0/8] virtio,vhost: Add VIRTIO_F_IN_ORDER support Dongli Zhang
2024-03-21 21:25   ` Jonah Palmer
2024-03-22 11:18 ` Eugenio Perez Martin
2024-03-25 16:52   ` Jonah Palmer [this message]
2024-03-25 20:33     ` Eugenio Perez Martin
2024-03-26 16:49       ` Jonah Palmer
2024-03-26 18:34         ` Eugenio Perez Martin
2024-03-26 19:01           ` Jonah Palmer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5b8494df-c15c-4d18-ad4a-74b5584429ce@oracle.com \
    --to=jonah.palmer@oracle.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=eperezma@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=leiyang@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=raphael@enfabrica.net \
    --cc=schalla@marvell.com \
    --cc=si-wei.liu@oracle.com \
    --cc=stefanha@redhat.com \
    --cc=virtio-fs@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).