From: "Michael S. Tsirkin" <mst@redhat.com>
To: Si-Wei Liu <si-wei.liu@oracle.com>
Cc: Eugenio Perez Martin <eperezma@redhat.com>,
yangjiale <yangjiale133@163.com>,
Jason Wang <jasowang@redhat.com>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
Andrew.Boyer@amd.com
Subject: Re: [PATCH] VIRTIO: Update the desc 'flag' fied last in packed ring.
Date: Fri, 5 Jun 2026 20:11:42 -0400
Message-ID: <20260605200908-mutt-send-email-mst@kernel.org>
In-Reply-To: <5e1a10bd-2783-42ba-b443-853f12159756@oracle.com>
On Fri, Jun 05, 2026 at 11:50:36AM -0700, Si-Wei Liu wrote:
>
>
> On 6/5/2026 10:43 AM, Michael S. Tsirkin wrote:
> > On Fri, Jun 05, 2026 at 09:03:36AM -0700, Si-Wei Liu wrote:
> > >
> > > On 6/1/2026 11:04 PM, Eugenio Perez Martin wrote:
> > > > On Tue, Jun 2, 2026 at 6:34 AM yangjiale <yangjiale133@163.com> wrote:
> > > > > When a descriptor list spans across cache lines,
> > > > > updating the flag first can lead to a scenario where the device side
> > > > > perceives the flag as valid, yet the corresponding address and length
> > > > > fields remain unupdated—resulting in invalid values.
> > > > > Therefore, the flag field must be updated last.
> > > > >
> > > > > Signed-off-by: yangjiale <yangjiale133@163.com>
> > > > > ---
> > > > > drivers/virtio/virtio_ring.c | 8 ++++----
> > > > > 1 file changed, 4 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index fbca7ce1c6bf..036b4f90d30f 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -1688,6 +1688,10 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
> > > > > &addr, &len, premapped, attr))
> > > > > goto unmap_release;
> > > > >
> > > > > + desc[i].addr = cpu_to_le64(addr);
> > > > > + desc[i].len = cpu_to_le32(len);
> > > > > + desc[i].id = cpu_to_le16(id);
> > > > > +
> > > > > flags = cpu_to_le16(vq->packed.avail_used_flags |
> > > > > (++c == total_sg ? 0 : VRING_DESC_F_NEXT) |
> > > > > (n < out_sgs ? 0 : VRING_DESC_F_WRITE));
> > > > > @@ -1696,10 +1700,6 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
> > > > > else
> > > > > desc[i].flags = flags;
> > > > >
> > > > > - desc[i].addr = cpu_to_le64(addr);
> > > > > - desc[i].len = cpu_to_le32(len);
> > > > > - desc[i].id = cpu_to_le16(id);
> > > > > -
> > > > > if (unlikely(vq->use_map_api)) {
> > > > > vq->packed.desc_extra[curr].addr = premapped ?
> > > > > DMA_MAPPING_ERROR : addr;
> > > > These flags are updated before the flags of the head descriptor at the
> > > > end of the function, at "vq->packed.vring.desc[head].flags =
> > > > head_flags", so the device should not see these. Because of that, the
> > > > relative order between the rest of the fields of the same descriptor
> > > > or other descriptors' fields, except for the head descriptor's flags,
> > > > should not matter. There is a write memory barrier just before
> > > > updating the head's flags.
> > > The above analysis is absolutely correct. Though one hardware vendor told me
> > > that this driver implementation kinda stops them from reading ahead of
> > > descriptors already posted beyond the available index, ending up with
> > > suboptimal performance that is hard to make up by other means. Would it be a
> > > bad idea to go with this change and add write barrier in a gentle way for a
> > > small flit in the batch, e.g. commit to memory after every cache line size
> > > worth of descriptors are posted? Would the memory barrier have negative
> > > performance overhead to other backend implementation variants than real
> > > hardware PCI device?
> > >
> > > -Siwei
> > this would need a new feature bit, wouldn't it?
> Probably. This is to capture the device's expectation and behavior right?
> the driver change itself is not spec violating...
yes, device can't rely on this without a feature bit.
> >
> > > > Also, I don't get why the cache line matters here. Can you expand? Am
> > > > I missing something?
> > me too.
> >
> Just to avoid extra delay due to excessive coherency messages and frequent
> cache thrashing when device reads over the PCI bus contend with host
> writes/updates to descriptors in the same cache line.
>
> -Siwei
this should be infrequent, the whole idea is that there's parallelism:
device reads descriptors from X while host writes other ones to Y.
btw i can't say whether it's ok for the device to just issue 2 reads,
or whether it has to receive the first read's result and only then issue
the second read.
--
MST
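
Note: the ordering argument quoted above (non-head descriptors may be
filled in any order, because the device only looks at a chain once the
head descriptor's flags are written after a write barrier) can be
sketched in userspace C. This is a simplified illustration, not the
code in drivers/virtio/virtio_ring.c: the names packed_desc,
avail_flags, desc_is_avail and post_chain are hypothetical, the C11
release fence is only a stand-in for the driver's write barrier, and
endianness conversion and ring wrap-around are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits used by the packed ring layout. */
#define DESC_F_NEXT  (1u << 0)
#define DESC_F_WRITE (1u << 1)
#define DESC_F_AVAIL (1u << 7)
#define DESC_F_USED  (1u << 15)

/* 16-byte packed descriptor (hypothetical name; no endian handling). */
struct packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

/* Flags that mark a descriptor available for the given driver wrap counter. */
static uint16_t avail_flags(bool wrap)
{
	return wrap ? DESC_F_AVAIL : DESC_F_USED;
}

/* Device-side check: AVAIL must match the wrap counter, USED must not. */
static bool desc_is_avail(const struct packed_desc *d, bool wrap)
{
	bool avail = (d->flags & DESC_F_AVAIL) != 0;
	bool used = (d->flags & DESC_F_USED) != 0;

	return avail == wrap && used != wrap;
}

/*
 * Post a chain of n descriptors starting at 'head'.
 * Non-head descriptors get flags before addr/len/id here, mirroring the
 * order the patch wants to change; that order is not observable by a
 * device that only follows chains whose head is already available.
 */
static void post_chain(struct packed_desc *ring, unsigned int head,
		       unsigned int n, const uint64_t *addrs,
		       const uint32_t *lens, uint16_t id, bool wrap)
{
	uint16_t head_flags = 0;

	assert(n > 0);
	for (unsigned int i = 0; i < n; i++) {
		struct packed_desc *d = &ring[head + i];
		uint16_t flags = (uint16_t)(avail_flags(wrap) |
					    (i + 1 == n ? 0 : DESC_F_NEXT));

		if (i == 0)
			head_flags = flags;	/* deferred until the end */
		else
			d->flags = flags;

		d->addr = addrs[i];
		d->len = lens[i];
		d->id = id;
	}

	/* Stand-in for the write barrier before publishing the head. */
	atomic_thread_fence(memory_order_release);
	ring[head].flags = head_flags;
}
```

With a zeroed ring and wrap counter 1, calling post_chain() for a
two-descriptor chain leaves the head marked available only after both
descriptors are fully filled in. A device that reads ahead past the
head, as discussed above, would be relying on behavior this ordering
does not promise.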
Thread overview:
2026-06-02 4:31 [PATCH] VIRTIO: Update the desc 'flag' fied last in packed ring yangjiale
2026-06-02 6:04 ` Eugenio Perez Martin
2026-06-02 6:59 ` Michael S. Tsirkin
2026-06-02 9:21 ` Dongli Zhang
2026-06-03 1:58 ` yangjiale133
2026-06-03 2:08 ` Xuan Zhuo
[not found] ` <5a3e06d5.103d.19e8b1863dd.Coremail.yangjiale133@163.com>
2026-06-03 5:10 ` Michael S. Tsirkin
2026-06-05 16:03 ` Si-Wei Liu
2026-06-05 17:43 ` Michael S. Tsirkin
2026-06-05 18:50 ` Si-Wei Liu
2026-06-06 0:11 ` Michael S. Tsirkin [this message]
2026-06-02 6:59 ` Michael S. Tsirkin
2026-06-03 1:09 ` Xuan Zhuo