From: "Michael S. Tsirkin" <mst@redhat.com>
To: Demi Marie Obenour <demiobenour@gmail.com>
Cc: virtio-comment@lists.linux.dev
Subject: Re: [RFC PATCH] ring: Forbid using descriptors in two chains at once
Date: Sun, 17 May 2026 20:59:34 -0400 [thread overview]
Message-ID: <20260517204742-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <adc807b9-2c95-4ddc-8243-0ed2bf5ac651@gmail.com>
On Sun, May 17, 2026 at 04:05:28PM -0400, Demi Marie Obenour wrote:
> On 5/7/26 01:50, Michael S. Tsirkin wrote:
> > On Thu, May 07, 2026 at 12:34:19AM -0400, Demi Marie Obenour wrote:
> >> Using the same descriptor in multiple chains at once creates
> >> difficulties. If the descriptor is device-writeable, this is begging
> >> for data corruption due to conflicting writes from the device. Even if
> >> the descriptor is device-readable, the mere possibility requires devices
> >> to use more sophisticated internal data structures.
> >
> > The natural way to index requests is by using the avail ring head,
> > not descriptors, though? I suspect it's about device memory being
> > limited, etc. But pls make it more explicit.
>
> It's about being able to *not* use scatter-gather I/O, and instead use
> a single I/O request for each descriptor. io_uring makes this very
> cheap. However, it requires that the lookup key be (avail ring head,
> descriptor index) rather than just (avail ring head).
>
> The latter has sufficiently few possibilities one can just use
> a bitmap. The former isn't. This means using a memory pool, heap
> allocation, or associative container.
>
> >> Require descriptors to be used in at most one chain at any given time.
> >> Indirect descriptors can be used if support for huge batches is needed.
> >
> > does this imply indirect are exempt from this rule? the text
> > you propose makes no such exemptions.
>
> It means that indirect descriptors can point to huge numbers of
> descriptors, rather than just what fits in the descriptor ring.
Nope, the spec limits the chain size to ring size.
> >> Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
> >> ---
> >> I have not checked whether this will break existing drivers, so I'm
> >> marking this as RFC for now.
> >> ---
> >> packed-ring.tex | 6 ++++++
> >> split-ring.tex | 6 ++++++
> >> 2 files changed, 12 insertions(+)
> >>
> >> diff --git a/packed-ring.tex b/packed-ring.tex
> >> index 3ee55a18d024d4da1a48d522e98ea563b9809b1c..8fd5a770c7ca4e6f60911b2861e00a273a69ea73 100644
> >> --- a/packed-ring.tex
> >> +++ b/packed-ring.tex
> >> @@ -473,6 +473,12 @@ \subsection{Event Suppression Structure Format}\label{sec:Basic
> >>
> >> This implies that loops in the descriptor list are forbidden!
> >>
> >> +Drivers MUST NOT include a descriptor in a chain if the
> >> +descriptor is already part of a previous chain whose processing
> >> +has not completed. This implies that the total number of
> >> +descriptors submitted to the device, but not yet processed, is
> >> +limited by the queue size.
> >
> > I do not understand what is this trying to say for the packed ring.
> > How do you include a descriptor in a chain twice - the ring has to
> > wrap around, no?
>
> I don't think you can.
>
> > For that matter, I do not think "This implies that loops in the
> > descriptor list are forbidden!" makes sense for packed either -
> > I think we copied it from split by mistake.
>
> Fair.
>
> >> +
> >> The driver MUST place any device-writable descriptor elements after
> >> any device-readable descriptor elements.
> >>
> >> diff --git a/split-ring.tex b/split-ring.tex
> >> index de9403882df1e6d20e16ec983ecf0a46d39efa6b..014e279ef1da7653c7051e3a24c6169fe8d51983 100644
> >> --- a/split-ring.tex
> >> +++ b/split-ring.tex
> >> @@ -223,6 +223,12 @@ \subsection{The Virtqueue Descriptor Table}\label{sec:Basic Facilities of a Virt
> >> Drivers MUST NOT add a descriptor chain longer than $2^{32}$ bytes in total;
> >> this implies that loops in the descriptor chain are forbidden!
> >>
> >> +Drivers MUST NOT include a descriptor in a chain if the
> >> +descriptor is already part of a previous chain whose processing
> >> +has not completed.
> >
> > what specifically is "include a descriptor in a chain"? something to do
> > with making a head available? and what is "whose processing has not
> > completed"? something to do with a head not used?
> >
> >> This implies that the total number of
> >> +descriptors submitted to the device, but not yet processed, is
> >> +limited by the queue size.
> >> +
> >> If VIRTIO_F_IN_ORDER has been negotiated, and when making a
> >> descriptor with VRING_DESC_F_NEXT set in \field{flags} at offset
> >> $x$ in the table available to the device, driver MUST set
> >
> > We can't really introduce new mandatory requirements without
> > a feature bit.
>
> I'm making the assumption that no existing driver actually violates
> this rule. That said, another option would be to force any driver
> that does violate it onto a slow path, perhaps by serializing the
> I/O operations.
>
> As an aside, this is why the spec I am working on for the vhost-user
> device backend has seemingly arbitrary restrictions around MSI-X
> interrupts. It makes life easier for device authors.
>
> Finally, I understand if this isn't viable. It can be worked
> around in devices without *too* much complexity.
> --
> Sincerely,
> Demi Marie Obenour (she/her/hers)
We have two options:
- a new feature bit + if negotiated, driver MUST.
device can decide if it wants to support drivers without a feature bit
- a recommendation (driver SHOULD)
--
MST
next prev parent reply other threads:[~2026-05-18 0:59 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-07 4:34 [RFC PATCH] ring: Forbid using descriptors in two chains at once Demi Marie Obenour
2026-05-07 5:50 ` Michael S. Tsirkin
2026-05-17 20:05 ` Demi Marie Obenour
2026-05-18 0:59 ` Michael S. Tsirkin [this message]
2026-05-18 2:37 ` Demi Marie Obenour
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260517204742-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=demiobenour@gmail.com \
--cc=virtio-comment@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox