From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eugenio Perez Martin <eperezma@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
virtio-comment@lists.oasis-open.org,
Virtio-Dev <virtio-dev@lists.oasis-open.org>,
virtio@lists.oasis-open.org, i.maximets@redhat.com,
Stefano Garzarella <sgarzare@redhat.com>
Subject: [virtio-comment] Re: [virtio-dev] [PATCH RFC] VIRTIO_F_PARTIAL_ORDER for page fault handling
Date: Tue, 29 Mar 2022 10:40:48 -0400 [thread overview]
Message-ID: <20220329103923-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CAJaqyWfQGrsP-9nnFgAYhWdwHHvDYTu8Eg37iC3jFi-2Ho0GuQ@mail.gmail.com>
On Tue, Mar 29, 2022 at 12:30:25PM +0200, Eugenio Perez Martin wrote:
> On Tue, Mar 29, 2022 at 10:33 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Mon, Aug 10, 2020 at 12:15:15PM -0400, Michael S. Tsirkin wrote:
> > > Devices that normally use buffers in order can
> > > benefit from ability to temporarily switch to handle
> > > some buffers out of order.
> > >
> > > As a case in point, a networking device might handle
> > > RX buffers in order normally. However, should
> > > an access to an RX buffer cause a page fault
> > > (e.g. when using PRI), the device could benefit from
> > > ability to temporarily keep using following
> > > buffers in the ring (possibly with higher overhead)
> > > until the fault has been resolved.
> > >
> > > Page faults allow more features such as THP, auto-NUMA,
> > > live migration.
> > >
> > > Out of order is of course already possible, however,
> > > IN_ORDER is currently required for descriptor batching where
> > > device marks a whole batch of buffers used in one go.
> > >
> > > The idea behind this proposal is to relax that requirement,
> > > allowing batching without asking device to be in orde rat all times,
> > > as follows:
> > >
> > > Device uses buffers in any order. Eventually when device detects that it
> > > has used all previously outstanding buffers, it sets a FLUSH flag on the
> > > last buffer used. If it set this flag on the last buffer used
> > > previously, and now uses a batch of descriptors in-order, it can now
> > > signal the last buffer used again setting the FLUSH flag.
> > >
> > > Driver can detect in-order when it sees two FLUSH flags one after
> > > another. In other respects the feature is similar to IN_ORDER
> > > from the driver implementation POV.
> > >
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > ---
> > > content.tex | 9 ++++++++-
> > > packed-ring.tex | 23 +++++++++++++++++++++++
> > > split-ring.tex | 26 ++++++++++++++++++++++++--
> > > 3 files changed, 55 insertions(+), 3 deletions(-)
> >
> > Hi Michael,
> > What is the status of this feature?
> >
> > There is a Google Summer of Code project to implement VIRTIO_F_IN_ORDER
> > in QEMU and Linux
> > (https://wiki.qemu.org/Google_Summer_of_Code_2022#VIRTIO_F_IN_ORDER_support_for_virtio_devices).
> >
> > I wanted to check if you still want to pursue PARTIAL_ORDER. Maybe
> > Stefano and Eugenio (the mentors) will find a way to connect it to the
> > GSoC project.
> >
>
> If the status is close to being merged I'm totally in to implement
> either partial order first or to implement both in the project. In
> some sense it superseded IN_ORDER.
>
> Thanks!
I think it's blocked on lack of implementation actually.
TC was concerned whether there are any correctness issues.
An implementation which can be stress-tested would resolve this.
> > Thanks to Ilya Maximets for mentioning PARTIAL_ORDER!
> >
> > Stefan
> >
> > >
> > > diff --git a/content.tex b/content.tex
> > > index 91735e3..8494eb6 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -296,7 +296,11 @@ \section{Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Virtqueues}
> > >
> > > Some devices always use descriptors in the same order in which
> > > they have been made available. These devices can offer the
> > > -VIRTIO_F_IN_ORDER feature. If negotiated, this knowledge
> > > +VIRTIO_F_IN_ORDER feature. Other devices sometimes use
> > > +descriptors in the same order in which they have been made
> > > +available. These devices can offer the VIRTIO_F_PARTIAL_ORDER
> > > +feature. If one of the features VIRTIO_F_IN_ORDER or
> > > +VIRTIO_F_PARTIAL_ORDER is negotiated, this knowledge
> > > might allow optimizations or simplify driver and/or device code.
> > >
> > > Each virtqueue can consist of up to 3 parts:
> > > @@ -6132,6 +6136,9 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> > > that the driver passes extra data (besides identifying the virtqueue)
> > > in its device notifications.
> > > See \ref{sec:Virtqueues / Driver notifications}~\nameref{sec:Virtqueues / Driver notifications}.
> > > + \item[VIRTIO_F_PARTIAL_ORDER(39)] This feature indicates
> > > + that device has ability to indicate use of (some of) buffers by the device in the same
> > > + order in which they have been made available.
> > > \end{description}
> > >
> > > \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
> > > diff --git a/packed-ring.tex b/packed-ring.tex
> > > index ea92543..a120a19 100644
> > > --- a/packed-ring.tex
> > > +++ b/packed-ring.tex
> > > @@ -284,6 +284,29 @@ \subsection{In-order use of descriptors}
> > > only writing out a single used descriptor with the Buffer ID
> > > corresponding to the last descriptor in the batch.
> > >
> > > +Other devices sometimes use
> > > +descriptors in the same order in which they have been made
> > > +available. These devices can offer the VIRTIO_F_PARTIAL_ORDER
> > > +feature. If negotiated, whenever device has used all buffers
> > > +since the previous used buffer in the same order
> > > +in which they have been made available, device can set the
> > > +VIRTQ_DESC_F_FLUSH flag in the used descriptor.
> > > +\begin{lstlisting}
> > > +#define VIRTQ_DESC_F_FLUSH 8
> > > +\end{lstlisting}
> > > +
> > > +This knowledge allows
> > > +devices to notify the use of a batch of buffers to the driver by
> > > +only writing out a single used descriptor with the Buffer ID
> > > +corresponding to the last descriptor in the batch,
> > > +and VIRTQ_DESC_F_FLUSH set.
> > > +
> > > +Note that device is only allowed to batch buffers in this way
> > > +if the previous used descriptor also has the VIRTQ_DESC_F_FLUSH
> > > +flag set, as a result, considering the group of buffers
> > > +used between two buffers with VIRTQ_DESC_F_FLUSH set,
> > > +either all of them constitute a batch, or none at all.
> > > +
> > > The device then skips forward in the ring according to the size of
> > > the batch. The driver needs to look up the used Buffer ID and
> > > calculate the batch size to be able to advance to where the next
> > > diff --git a/split-ring.tex b/split-ring.tex
> > > index 123ac9f..cf197f8 100644
> > > --- a/split-ring.tex
> > > +++ b/split-ring.tex
> > > @@ -398,10 +398,11 @@ \subsection{The Virtqueue Used Ring}\label{sec:Basic Facilities of a Virtio Devi
> > > le16 avail_event; /* Only if VIRTIO_F_EVENT_IDX */
> > > };
> > >
> > > -/* le32 is used here for ids for padding reasons. */
> > > struct virtq_used_elem {
> > > /* Index of start of used descriptor chain. */
> > > - le32 id;
> > > + le16 id;
> > > +#define VIRTQ_USED_ELEM_F_FLUSH 0x8000
> > > + le16 flags;
> > > /* Total length of the descriptor chain which was used (written to) */
> > > le32 len;
> > > };
> > > @@ -481,6 +482,27 @@ \subsection{In-order use of descriptors}
> > > corresponding to the head entry of the
> > > descriptor chain describing the last buffer in the batch.
> > >
> > > +Other devices sometimes use
> > > +descriptors in the same order in which they have been made
> > > +available. These devices can offer the VIRTIO_F_PARTIAL_ORDER
> > > +feature. If negotiated, whenever device has used all buffers
> > > +since the previous used buffer in the same order
> > > +in which they have been made available, device can set the
> > > +VIRTQ_USED_ELEM_F_FLUSH flag in the used ring entry.
> > > +
> > > +This knowledge allows
> > > +devices to notify the use of a batch of buffers to the driver by
> > > +only writing out single used ring entry with the \field{id}
> > > +corresponding to the head entry of the
> > > +descriptor chain describing the last buffer in the batch,
> > > +and VIRTQ_USED_ELEM_F_FLUSH set.
> > > +
> > > +Note that device is only allowed to batch buffers in this way
> > > +if the previous used ring entry also has the VIRTQ_USED_ELEM_F_FLUSH
> > > +flag set, as a result, considering the group of buffers
> > > +used between two buffers with VIRTQ_USED_ELEM_F_FLUSH set,
> > > +either all of them constitute a batch, or none at all.
> > > +
> > > The device then skips forward in the ring according to the size of
> > > the batch. Accordingly, it increments the used \field{idx} by the
> > > size of the batch.
> > > --
> > > MST
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > >
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
next prev parent reply other threads:[~2022-03-29 14:40 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-10 16:15 [virtio] [PATCH RFC] VIRTIO_F_PARTIAL_ORDER for page fault handling Michael S. Tsirkin
2020-08-10 16:59 ` Cornelia Huck
2020-08-11 8:23 ` Michael S. Tsirkin
2020-08-11 8:39 ` Cornelia Huck
2020-08-11 14:53 ` [virtio-comment] RE: [virtio-dev] " Lars Ganrot
2020-08-11 15:43 ` Lars Ganrot
2020-08-12 12:50 ` [virtio] " Cornelia Huck
2020-08-12 15:55 ` [virtio-comment] " Lars Ganrot
2020-08-13 23:17 ` [virtio] " Michael S. Tsirkin
2020-08-17 8:11 ` Lars Ganrot
2021-09-06 6:33 ` Michael S. Tsirkin
2020-08-13 20:45 ` [virtio] " Michael S. Tsirkin
2022-03-29 8:33 ` [virtio-dev] " Stefan Hajnoczi
2022-03-29 10:30 ` Eugenio Perez Martin
2022-03-29 14:40 ` Michael S. Tsirkin [this message]
2022-03-30 9:03 ` Eugenio Perez Martin
2022-03-29 14:39 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220329103923-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=eperezma@redhat.com \
--cc=i.maximets@redhat.com \
--cc=sgarzare@redhat.com \
--cc=stefanha@redhat.com \
--cc=virtio-comment@lists.oasis-open.org \
--cc=virtio-dev@lists.oasis-open.org \
--cc=virtio@lists.oasis-open.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox