qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Daniel Verkamp <dverkamp@chromium.org>
Cc: virtio-comment@lists.oasis-open.org,
	virtualization@lists.linux-foundation.org, qemu-devel@nongnu.org,
	Alexander Duyck <alexander.h.duyck@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>
Subject: Re: [virtio-comment] virtio queue numbering and optional queues
Date: Tue, 22 Aug 2023 09:40:11 -0400	[thread overview]
Message-ID: <20230822134011.GB727224@fedora> (raw)
In-Reply-To: <CABVzXAke4LRt4=S4FsFTFf_WPrAhe1dukoLZto6t6R13kgjw0Q@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5683 bytes --]

On Mon, Aug 21, 2023 at 03:18:50PM -0700, Daniel Verkamp wrote:
> Hello virtio folks,

Hi Daniel,
I have CCed those involved in the free page hint and page reporting
features.

Stefan

> 
> I noticed a mismatch between the way the specification defines
> device-specific virtqueue indexes and the way device and driver
> implementers have interpreted the specification. As a practical example,
> consider the traditional memory balloon device [1]. The first two queues
> (indexes 0 and 1) are available as part of the baseline device, but the
> rest of the queues are tied to feature bits.
> 
> Section 5.5.2, "Virtqueues", gives a list that appears to be a mapping from
> queue index to queue name/function, defining queue index 3 as free_page_vq
> and index 4 as reporting_vq, and declaring that "free_page_vq only exists
> if VIRTIO_BALLOON_F_FREE_PAGE_HINT is set" and "reporting_vq only exists if
> VIRTIO_BALLOON_F_PAGE_REPORTING is set." This wording is a bit vague, but I
> assume "is set" means "is negotiated" (not just "advertised by the
> device"). Also presumably "exists" means something like "may only be used
> by the driver if the feature bit is negotiated" and "should be ignored by
> the device if the feature bit is not negotiated", although it would be nice
> to have a proper definition in the spec somewhere.
> 
> Section 5.5.3, "Feature bits", gives definitions of the feature bits, with
> similar descriptions of the relationship between the feature bits and
> virtqueue availability, although the wording is slightly different
> ("present" rather than "exists"). No dependency between feature bits is
> defined, so it seems like it should be valid for a device or driver to
> support or accept one of the higher-numbered features while not supporting
> a lower-numbered one.
> 
> 
> Notably, there is no mention of queue index assignments changing based on
> negotiated features in either of these sections. Hence a reader can only
> assume that the queue index assignments are fixed (i.e. stats_vq will
> always be vq index 4 if F_STATS_VQ is negotiated, regardless of any other
> feature bits).
> 
> Now consider a scenario where VIRTIO_BALLOON_F_STATS_VQ and
> VIRTIO_BALLOON_F_PAGE_REPORTING are negotiated but
> VIRTIO_BALLOON_F_FREE_PAGE_HINT is not (perhaps the device supports all of
> the defined features but the driver only wants to use reporting_vq, not
> free_page_vq). In this case, what queue index should be used by the driver
> when enabling reporting_vq? My reading of the specification is that the
> reporting_vq is always queue index 4, independent of whether
> VIRTIO_BALLOON_F_STATS_VQ or VIRTIO_BALLOON_F_FREE_PAGE_HINT are
> negotiated, but this contradicts existing device and driver
> implementations, which will use queue index 3 (the next one after stats_vq
> = 2) as reporting_vq in this case.
> 
> The qemu virtio-ballon device [2] assigns the next-highest unused queue
> index when calling virtio_add_queue(), and in the scenario presented above,
> free_page_vq will not be added since F_STATS_VQ is not negotiated, so
> reporting_vq will be assigned queue index 3, rather than 4. (Additionally,
> qemu always adds the stats_vq regardless of negotiated features, but that's
> irrelevant in this case since we are assuming the STATS_VQ feature is
> negotiated.)
> 
> The Linux virtio driver code originally seemed to use the correct (by my
> reading) indexes, but it was changed to match the layout used by qemu in a
> 2019 commit ("virtio_pci: use queue idx instead of array idx to set up the
> vq") [3] - in other words, it will now also expect queue index 3 to be
> reporting_vq in the scenario laid out above.
> 
> I'm not sure how to resolve the mismatch between the specification and
> actual implementation behavior. The simplest change would probably be to
> rewrite the specification to drop the explicit queue indexes in section
> 5.5.2 and add some wording about how queues are numbered based on
> negotiated feature bits (this would need to be applied to other device
> types that have specified queue indexes as well). However, this would also
> technically be an incompatible change of the specification. On the other
> hand, changing the device and driver implementations to match the
> specification would be even more challenging, since it would be an
> incompatible change in actual practice, not just a change of the spec to
> match consensus implementation behavior.
> 
> 
> Perhaps drivers could add a quirk to detect old versions of the qemu device
> and use the old behavior, while enabling the correct behavior only for
> other device vendors and newer qemu device revisions, and the qemu device
> could add an opt-in feature to enable the correct behavior that users would
> need to enable only when they know they have a sufficiently new driver with
> the fix.
> 
> 
> Or maybe there could be a new feature bit that would opt into following the
> spec-defined queue indexes (VIRTIO_F_VERSION_2?) and some new wording to
> require devices to use the old behavior when that bit is not negotiated,
> but that also feels less than ideal to me.
> 
> Any thoughts on how to proceed with this situation? Is my reading of the
> specification just wrong?
> 
> Thanks,
> 
> -- Daniel
> 
> [1]:
> https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-3160002
> 
> [2]:
> https://github.com/qemu/qemu/blob/f33c74576425fac2cbb0725229895fe096df4261/hw/virtio/virtio-balloon.c#L879-L897
> 
> [3]:
> https://github.com/torvalds/linux/commit/ddbeac07a39a81d82331a312d0578fab94fccbf1

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2023-08-22 13:40 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-21 22:18 virtio queue numbering and optional queues Daniel Verkamp
2023-08-22 13:40 ` Stefan Hajnoczi [this message]
2023-08-22 14:29   ` [virtio-comment] " David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230822134011.GB727224@fedora \
    --to=stefanha@redhat.com \
    --cc=alexander.h.duyck@linux.intel.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=dverkamp@chromium.org \
    --cc=qemu-devel@nongnu.org \
    --cc=virtio-comment@lists.oasis-open.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).