Re: [PATCH V3 19/19] virtio_ring: add in order support

public inbox for virtualization@lists.linux-foundation.org
 help / color / mirror / Atom feed

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: xuanzhuo@linux.alibaba.com, eperezma@redhat.com,
	virtualization@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V3 19/19] virtio_ring: add in order support
Date: Wed, 2 Jul 2025 06:56:52 -0400	[thread overview]
Message-ID: <20250702064413-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACGkMEuzTYPcDMamptLMQpSZu3gWxYx1Sr2nJef+pyuo2m35XQ@mail.gmail.com>

On Wed, Jul 02, 2025 at 05:29:18PM +0800, Jason Wang wrote:
> On Tue, Jul 1, 2025 at 2:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Jun 16, 2025 at 04:25:17PM +0800, Jason Wang wrote:
> > > This patch implements in order support for both split virtqueue and
> > > packed virtqueue.
> >
> > I'd like to see more motivation for this work, documented.
> > It's not really performance, not as it stands, see below:
> >
> > >
> > > Benchmark with KVM guest + testpmd on the host shows:
> > >
> > > For split virtqueue: no obvious differences were noticed
> > >
> > > For packed virtqueue:
> > >
> > > 1) RX gets 3.1% PPS improvements from 6.3 Mpps to 6.5 Mpps
> > > 2) TX gets 4.6% PPS improvements from 8.6 Mpps to 9.0 Mpps
> > >
> >
> > That's a very modest improvement for a lot of code.
> > I also note you put in some batching just for in-order.
> > Which could also explain the gains maybe?
> > What if you just put in a simple implementation with no
> > batching tricks? do you still see a gain?
> 
> It is used to implement the batch used updating.
> 
> """
> Some devices always use descriptors in the same order in which they
> have been made available. These devices can offer the
> VIRTIO_F_IN_ORDER feature. If negotiated, this knowledge allows
> devices to notify the use of a batch of buffers to the driver by only
> writing out a single used ring entry with the id corresponding to the
> head entry of the descriptor chain describing the last buffer in the
> batch.
> """
> 
> DPDK implements this behavior, so it's a must for the virtio driver.
> 
> > Does any hardware implement this? Maybe that can demonstrate
> > bigger gains.
> 
> Maybe but I don't have one in my hand.
> 
> For performance, I think it should be sufficient as a starter. I can
> say in the next version something like "more optimizations could be
> done on top"

What are some optimizations you have in mind?



> Note that the patch that introduces packed virtqueue, there's not even
> any numbers:
> 
> commit 1ce9e6055fa0a9043405c5604cf19169ec5379ff
> Author: Tiwei Bie <tiwei.bie@intel.com>
> Date:   Wed Nov 21 18:03:27 2018 +0800
> 
>     virtio_ring: introduce packed ring support
> 
>     Introduce the packed ring support. Packed ring can only be
>     created by vring_create_virtqueue() and each chunk of packed
>     ring will be allocated individually. Packed ring can not be
>     created on preallocated memory by vring_new_virtqueue() or
>     the likes currently.
> 
>     Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>


I think the assumption there was that intel has hardware that
requires packed. That's why Dave merged this.

> >
> >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  drivers/virtio/virtio_ring.c | 423 +++++++++++++++++++++++++++++++++--
> > >  1 file changed, 402 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 27a9459a0555..21d456392ba0 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -70,11 +70,14 @@
> > >  enum vq_layout {
> > >       SPLIT = 0,
> > >       PACKED,
> > > +     SPLIT_IN_ORDER,
> > > +     PACKED_IN_ORDER,
> > >       VQ_TYPE_MAX,
> > >  };
> > >
> > >  struct vring_desc_state_split {
> > >       void *data;                     /* Data for callback. */
> > > +     u32 total_len;                  /* Buffer Length */
> > >
> > >       /* Indirect desc table and extra table, if any. These two will be
> > >        * allocated together. So we won't stress more to the memory allocator.
> > > @@ -84,6 +87,7 @@ struct vring_desc_state_split {
> > >
> > >  struct vring_desc_state_packed {
> > >       void *data;                     /* Data for callback. */
> > > +     u32 total_len;                  /* Buffer Length */
> > >
> > >       /* Indirect desc table and extra table, if any. These two will be
> > >        * allocated together. So we won't stress more to the memory allocator.
> >
> > We are bloating up the cache footprint for everyone,
> > so there's a chance of regressions.
> > Pls include benchmark for in order off, to make sure we
> > are not regressing.
> 
> Ok.
> 
> > How big was the ring?
> 
> 256.

that is very modest, you want to fill at least one cache way,
preferably more.

> > Worth trying with a biggish one, where there is more cache
> > pressure.
> 
> Ok.
> 
> >
> >
> > Why not have a separate state for in-order?
> 
> It can work.
> 
> >
> >
> >
> > > @@ -206,6 +210,12 @@ struct vring_virtqueue {
> > >
> > >       /* Head of free buffer list. */
> > >       unsigned int free_head;
> > > +
> > > +     /* Head of the batched used buffers, vq->num means no batching */
> > > +     unsigned int batch_head;
> > > +
> > > +     unsigned int batch_len;
> > > +
> >
> > Are these two only used for in-order? Please document that.
> 
> Yes, I will do that.
> 
> > I also want some documentation about the batching trickery
> > used please.
> > What is batched, when, how is batching flushed, why are we
> > only batching in-order ...
> 
> I'm not sure I get things like this, what you want seems to be the
> behaviour of the device which has been stated by the spec or I may
> miss something here.

"a single used ring entry with the id corresponding to the
 head entry of the descriptor chain describing the last buffer in the
 batch"
?

so together they form this used ring entry describing the last buffer?
"head" is the id and "len" the length?

maybe

	/*
	 * With IN_ORDER, devices write a single used ring entry with
	 * the id corresponding to the head entry of the descriptor chain
	 * describing the last buffer in the batch
	 */
	struct used_entry {
		u32 id;
		u32 len;
	} batch_last;

?




> >
> >
> >
> >
> > >       /* Number we've added since last sync. */
> > >       unsigned int num_added;
> > >
> > > @@ -256,10 +266,14 @@ static void vring_free(struct virtqueue *_vq);
> > >
> > >  #define to_vvq(_vq) container_of_const(_vq, struct vring_virtqueue, vq)
> > >
> > > -
> > >  static inline bool virtqueue_is_packed(const struct vring_virtqueue *vq)
> > >  {
> > > -     return vq->layout == PACKED;
> > > +     return vq->layout == PACKED || vq->layout == PACKED_IN_ORDER;
> > > +}
> > > +
> > > +static inline bool virtqueue_is_in_order(const struct vring_virtqueue *vq)
> > > +{
> > > +     return vq->layout == SPLIT_IN_ORDER || vq->layout == PACKED_IN_ORDER;
> > >  }
> > >
> > >  static bool virtqueue_use_indirect(const struct vring_virtqueue *vq,
> > > @@ -570,7 +584,7 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
> > >       struct vring_desc_extra *extra;
> > >       struct scatterlist *sg;
> > >       struct vring_desc *desc;
> > > -     unsigned int i, n, c, avail, descs_used, err_idx;
> > > +     unsigned int i, n, c, avail, descs_used, err_idx, total_len = 0;
> >
> >
> > I would add a comment here:
> >
> > /* Total length for in-order */
> > unsigned int total_len = 0;
> 
> Ok.
> 
> Thanks

next prev parent reply	other threads:[~2025-07-02 10:56 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-16  8:24 [PATCH V3 00/19] virtio_ring in order support Jason Wang
2025-06-16  8:24 ` [PATCH V3 01/19] virtio_ring: rename virtqueue_reinit_xxx to virtqueue_reset_xxx() Jason Wang
2025-06-24 10:35   ` Lei Yang
2025-06-16  8:25 ` [PATCH V3 02/19] virtio_ring: switch to use vring_virtqueue in virtqueue_poll variants Jason Wang
2025-06-16  8:25 ` [PATCH V3 03/19] virtio_ring: unify logic of virtqueue_poll() and more_used() Jason Wang
2025-06-16  8:25 ` [PATCH V3 04/19] virtio_ring: switch to use vring_virtqueue for virtqueue resize variants Jason Wang
2025-06-16  8:25 ` [PATCH V3 05/19] virtio_ring: switch to use vring_virtqueue for virtqueue_kick_prepare variants Jason Wang
2025-06-16  8:25 ` [PATCH V3 06/19] virtio_ring: switch to use vring_virtqueue for virtqueue_add variants Jason Wang
2025-06-16  8:25 ` [PATCH V3 07/19] virtio: " Jason Wang
2025-06-16  8:25 ` [PATCH V3 08/19] virtio_ring: switch to use vring_virtqueue for enable_cb_prepare variants Jason Wang
2025-06-16  8:25 ` [PATCH V3 09/19] virtio_ring: use vring_virtqueue for enable_cb_delayed variants Jason Wang
2025-06-16  8:25 ` [PATCH V3 10/19] virtio_ring: switch to use vring_virtqueue for disable_cb variants Jason Wang
2025-06-16  8:25 ` [PATCH V3 11/19] virtio_ring: switch to use vring_virtqueue for detach_unused_buf variants Jason Wang
2025-06-16  8:25 ` [PATCH V3 12/19] virtio_ring: use u16 for last_used_idx in virtqueue_poll_split() Jason Wang
2025-06-16  8:25 ` [PATCH V3 13/19] virtio_ring: introduce virtqueue ops Jason Wang
2025-07-01  6:28   ` Michael S. Tsirkin
2025-07-03  3:20     ` Jason Wang
2025-06-16  8:25 ` [PATCH V3 14/19] virtio_ring: determine descriptor flags at one time Jason Wang
2025-07-01  6:42   ` Michael S. Tsirkin
2025-07-01  7:25     ` Jason Wang
2025-06-16  8:25 ` [PATCH V3 15/19] virtio_ring: factor out core logic of buffer detaching Jason Wang
2025-06-16  8:25 ` [PATCH V3 16/19] virtio_ring: factor out core logic for updating last_used_idx Jason Wang
2025-06-16  8:25 ` [PATCH V3 17/19] virtio_ring: factor out split indirect detaching logic Jason Wang
2025-07-01  6:44   ` Michael S. Tsirkin
2025-06-16  8:25 ` [PATCH V3 18/19] virtio_ring: factor out split " Jason Wang
2025-07-01  6:45   ` Michael S. Tsirkin
2025-06-16  8:25 ` [PATCH V3 19/19] virtio_ring: add in order support Jason Wang
2025-07-01  6:56   ` Michael S. Tsirkin
2025-07-02  9:29     ` Jason Wang
2025-07-02 10:56       ` Michael S. Tsirkin [this message]
2025-07-03  3:13         ` Jason Wang
2025-07-03 13:49           ` Jonah Palmer
2025-07-07  3:28             ` Jason Wang
2025-06-17 12:29 ` [PATCH V3 00/19] virtio_ring " Eugenio Perez Martin
2025-07-01  6:58 ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250702064413-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=eperezma@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=virtualization@lists.linux.dev \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox