From: "Michael S. Tsirkin" <mst@redhat.com>
To: Si-Wei Liu <si-wei.liu@oracle.com>
Cc: Eugenio Perez Martin <eperezma@redhat.com>,
yangjiale <yangjiale133@163.com>,
Jason Wang <jasowang@redhat.com>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
Andrew.Boyer@amd.com
Subject: Re: [PATCH] VIRTIO: Update the desc 'flag' fied last in packed ring.
Date: Fri, 5 Jun 2026 20:11:42 -0400
Message-ID: <20260605200908-mutt-send-email-mst@kernel.org>
In-Reply-To: <5e1a10bd-2783-42ba-b443-853f12159756@oracle.com>
On Fri, Jun 05, 2026 at 11:50:36AM -0700, Si-Wei Liu wrote:
>
>
> On 6/5/2026 10:43 AM, Michael S. Tsirkin wrote:
> > On Fri, Jun 05, 2026 at 09:03:36AM -0700, Si-Wei Liu wrote:
> > >
> > > On 6/1/2026 11:04 PM, Eugenio Perez Martin wrote:
> > > > On Tue, Jun 2, 2026 at 6:34 AM yangjiale <yangjiale133@163.com> wrote:
> > > > > When a descriptor list spans across cache lines,
> > > > > updating the flag first can lead to a scenario where the device side
> > > > > perceives the flag as valid, yet the corresponding address and length
> > > > > fields remain unupdated—resulting in invalid values.
> > > > > Therefore, the flag field must be updated last.
> > > > >
> > > > > Signed-off-by: yangjiale <yangjiale133@163.com>
> > > > > ---
> > > > > drivers/virtio/virtio_ring.c | 8 ++++----
> > > > > 1 file changed, 4 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index fbca7ce1c6bf..036b4f90d30f 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -1688,6 +1688,10 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
> > > > > &addr, &len, premapped, attr))
> > > > > goto unmap_release;
> > > > >
> > > > > + desc[i].addr = cpu_to_le64(addr);
> > > > > + desc[i].len = cpu_to_le32(len);
> > > > > + desc[i].id = cpu_to_le16(id);
> > > > > +
> > > > > flags = cpu_to_le16(vq->packed.avail_used_flags |
> > > > > (++c == total_sg ? 0 : VRING_DESC_F_NEXT) |
> > > > > (n < out_sgs ? 0 : VRING_DESC_F_WRITE));
> > > > > @@ -1696,10 +1700,6 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
> > > > > else
> > > > > desc[i].flags = flags;
> > > > >
> > > > > - desc[i].addr = cpu_to_le64(addr);
> > > > > - desc[i].len = cpu_to_le32(len);
> > > > > - desc[i].id = cpu_to_le16(id);
> > > > > -
> > > > > if (unlikely(vq->use_map_api)) {
> > > > > vq->packed.desc_extra[curr].addr = premapped ?
> > > > > DMA_MAPPING_ERROR : addr;
> > > > These flags are updated before the flags of the head descriptor at the
> > > > end of the function, at "vq->packed.vring.desc[head].flags =
> > > > head_flags", so the device should not see these. Because of that, the
> > > > relative order between the rest of the fields of the same descriptor
> > > > or other descriptors' fields, except for the head descriptor's flags,
> > > > should not matter. There is a write memory barrier just before
> > > > updating the head's flags.
> > > The above analysis is absolutely correct. However, one hardware vendor told me
> > > that this driver implementation kinda stops them from reading ahead of
> > > descriptors already posted beyond the available index, ending up with
> > > suboptimal performance that is hard to make up by other means. Would it be a
> > > bad idea to go with this change and add a write barrier in a gentle way for a
> > > small flit in the batch, e.g. commit to memory after every cache line's
> > > worth of descriptors is posted? Would the memory barrier add performance
> > > overhead for backend implementation variants other than a real
> > > hardware PCI device?
> > >
> > > -Siwei
> > this would need a new feature bit, wouldn't it?
> Probably. This is to capture the device's expectation and behavior right?
> the driver change itself is not spec violating...
yes, device can't rely on this without a feature bit.
> >
> > > > Also, I don't get why the cache line matters here. Can you expand? Am
> > > > I missing something?
> > me too.
> >
> Just to avoid extra delay from excessive coherency messages and frequent
> cache thrashing when device reads over the PCI bus contend with host
> writes/updates to descriptors in the same cache line.
>
> -Siwei
This should be infrequent: the whole idea is that there's parallelism, with
the device reading descriptors from X while the host writes other ones to Y.
BTW, I can't say whether it's OK for the device to just issue 2 reads,
or whether it has to receive the first read's result and only then issue
the second read.
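
For reference, the ordering Eugenio describes above (fill in every
descriptor of the chain, issue a write barrier, and only then write the
head descriptor's flags) can be sketched as a small userspace model.
This is not the kernel code: model_desc, model_buf and
model_publish_chain are hypothetical names, the C11 release fence only
stands in for the driver's write barrier, endianness conversion is
omitted, and the chain is assumed not to wrap around the ring.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as defined for the packed ring in the virtio spec. */
#define DESC_F_NEXT   (1u << 0)
#define DESC_F_WRITE  (1u << 1)
#define DESC_F_AVAIL  (1u << 7)
#define DESC_F_USED   (1u << 15)

/* Hypothetical userspace stand-in for a packed ring descriptor. */
struct model_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	_Atomic uint16_t flags;
};

/* Hypothetical description of one buffer in the chain. */
struct model_buf {
	uint64_t addr;
	uint32_t len;
	int device_writable;
};

/*
 * Post a chain of n descriptors starting at ring[head].
 * Assumption: the chain does not wrap around the end of the ring.
 */
static void model_publish_chain(struct model_desc *ring, size_t ring_size,
				size_t head, const struct model_buf *bufs,
				size_t n, uint16_t id,
				uint16_t avail_used_flags)
{
	uint16_t head_flags = 0;

	assert(n > 0 && head + n <= ring_size);

	for (size_t k = 0; k < n; k++) {
		struct model_desc *d = &ring[head + k];
		uint16_t flags = (uint16_t)(avail_used_flags |
			(k + 1 == n ? 0 : DESC_F_NEXT) |
			(bufs[k].device_writable ? DESC_F_WRITE : 0));

		d->addr = bufs[k].addr;
		d->len = bufs[k].len;
		d->id = id;

		if (k == 0)
			head_flags = flags;	/* deferred until after the fence */
		else
			atomic_store_explicit(&d->flags, flags,
					      memory_order_relaxed);
	}

	/* Everything written above becomes visible before the head flags. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ring[head].flags, head_flags,
			      memory_order_relaxed);
}
```

In this model the relative order of addr/len/id/flags within the
non-head descriptors does not matter to a reader that only starts from
a head descriptor it has seen as available, which is the point made in
the quoted analysis.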
--
MST
Thread overview: 13+ messages
2026-06-02 4:31 [PATCH] VIRTIO: Update the desc 'flag' fied last in packed ring yangjiale
2026-06-02 6:04 ` Eugenio Perez Martin
2026-06-02 6:59 ` Michael S. Tsirkin
2026-06-02 9:21 ` Dongli Zhang
2026-06-03 1:58 ` yangjiale133
2026-06-03 2:08 ` Xuan Zhuo
[not found] ` <5a3e06d5.103d.19e8b1863dd.Coremail.yangjiale133@163.com>
2026-06-03 5:10 ` Michael S. Tsirkin
2026-06-05 16:03 ` Si-Wei Liu
2026-06-05 17:43 ` Michael S. Tsirkin
2026-06-05 18:50 ` Si-Wei Liu
2026-06-06 0:11 ` Michael S. Tsirkin [this message]
2026-06-02 6:59 ` Michael S. Tsirkin
2026-06-03 1:09 ` Xuan Zhuo