From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eugenio Perez Martin <eperezma@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	Cindy Lu <lulu@redhat.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	qemu-level <qemu-devel@nongnu.org>,
	Laurent Vivier <lvivier@redhat.com>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: Emulating device configuration / max_virtqueue_pairs in vhost-vdpa and vhost-user
Date: Wed, 1 Feb 2023 05:44:45 -0500
Message-ID: <20230201054323-mutt-send-email-mst@kernel.org>
In-Reply-To: <CAJaqyWdxL+9gvjawpFTMg_ut8WpcZErdipAMMCSXYdOTcYK61w@mail.gmail.com>

On Wed, Feb 01, 2023 at 08:49:30AM +0100, Eugenio Perez Martin wrote:
> On Wed, Feb 1, 2023 at 4:29 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Feb 1, 2023 at 3:11 AM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > >
> > > On Tue, Jan 31, 2023 at 8:10 PM Eugenio Perez Martin
> > > <eperezma@redhat.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > The current approach of offering an emulated CVQ to the guest and
> > > > mapping the commands to vhost-user does not scale well:
> > > > * Some devices already offer it, so the transformation is redundant.
> > > > * There is no support for commands with variable length (RSS?)
> > > >
> > > > We can solve both of these by offering it through vhost-user the same
> > > > way vhost-vdpa does. With this approach qemu needs to track the
> > > > commands, for similar reasons as with vhost-vdpa: qemu needs to track
> > > > the device status for live migration. vhost-user should use the same
> > > > SVQ code for this, so we avoid duplication.
> > > >
> > > > One of the challenges here is to know which virtqueue to shadow /
> > > > isolate. The vhost-user device may not have the same number of queues
> > > > as the device frontend:
> > > > * The former depends on the actual vhost-user device, and qemu fetches
> > > > it with VHOST_USER_GET_QUEUE_NUM at the moment.
> > > > * The latter is set by the netdev queues= command-line parameter in qemu.
> > > >
> > > > For the device, the CVQ is the last one it offers, but for the guest
> > > > it is the last one offered in config space.
> > > >
> > > > Creating a new vhost-user command to decrease that maximum number of
> > > > queues may be an option. But we can do it without adding more
> > > > commands, by remapping the CVQ index at virtqueue setup. I think it
> > > > should be doable using (struct vhost_dev).vq_index and maybe a few
> > > > adjustments here and there.
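
(Just to make the remap idea concrete, a minimal untested sketch below; the
stub struct and helper are made-up names, not existing QEMU symbols, and it
assumes the guest-visible CVQ is the frontend's last virtqueue.)

/* Illustrative sketch only, not actual QEMU code. */
struct vhost_dev_stub {            /* stand-in for QEMU's struct vhost_dev */
    unsigned vq_index;             /* first vq index handled by this dev   */
    unsigned nvqs;                 /* number of vqs handled by this dev    */
};

static unsigned remap_cvq_index(const struct vhost_dev_stub *dev,
                                unsigned guest_vq, unsigned backend_nvqs)
{
    /* Assumption: the guest-visible CVQ is the last vq of the frontend. */
    unsigned guest_cvq = dev->vq_index + dev->nvqs - 1;

    /* When the guest sets up its CVQ, point it at the backend's last
     * queue, which is where the backend actually offers its CVQ. */
    return guest_vq == guest_cvq ? backend_nvqs - 1 : guest_vq;
}
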
> > > >
> > > > Thoughts?
> > > >
> > > > Thanks!
> > >
> > >
> > > (Starting a separate thread for the vhost-vdpa related use case)
> > >
> > > This could also work for vhost-vdpa if we ever decide to honor the
> > > netdev queues argument. It is totally ignored now, unlike in the rest
> > > of the backends:
> > > * vhost-kernel, whose tap device has the requested number of queues.
> > > * vhost-user, which errors with ("you are asking more queues than
> > > supported") if the vhost-user parent device has fewer queues than
> > > requested (reported via the vhost-user message VHOST_USER_GET_QUEUE_NUM).
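
(For reference, the check above boils down to something like the sketch
below; illustrative only, with made-up names, not the actual vhost-user
code.)

/* Illustrative only: the queue-count check described above, which
 * vhost-vdpa could mirror from vhost-user.  Not actual QEMU code. */
static int check_backend_queues(unsigned requested, unsigned backend_max)
{
    /* backend_max would come from VHOST_USER_GET_QUEUE_NUM for vhost-user,
     * or from the parent device for vhost-vdpa. */
    if (requested > backend_max) {
        return -1;   /* "you are asking more queues than supported" */
    }
    return 0;
}
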
> > >
> > > One of the reasons for this is that device configuration space is
> > > totally passthrough, with the values for mtu, rss conditions, etc.
> > > This is not ideal, as qemu cannot check source and destination
> > > equivalence, and the values can change under the feet of the guest in
> > > the event of a migration.
> >
> > This looks like the responsibility not of qemu but of the upper layer
> > (to provision the same config/features in src/dst).
> 
> I think both share it. Or, at least, it is inconsistent that QEMU
> is in charge of checking / providing consistency for virtio features,
> but not for virtio-net config space.
> 
> If we follow that to the extreme, we could simply delete the feature
> checks, right?
> 
> >
> > > External tools are needed for this, duplicating
> > > part of the effort.
> > >
> > > Starting to intercept config space accesses and offering an emulated
> > > one to the guest with this kind of adjustment is beneficial, as it
> > > makes vhost-vdpa more similar to the rest of the backends, greatly
> > > reducing the surprise when something changes.
> >
> > This probably needs more thought, since vDPA already provides a kind
> > of emulation in the kernel. My understanding is that it would be
> > sufficient to add checks to make sure the config that guests see is
> > consistent with what the host provisioned?
> >
> 
> By "host provisioned", do you mean with the "vdpa" tool or with qemu?
> Also, we need a way to communicate the guest values to it if those
> checks are added in the kernel.
> 
> The reasoning here is the same as above: QEMU already filters features
> with its own emulated layer, so the operator can specify a feature
> that will never appear to the guest. The layer has other uses
> (abstracting between transports, for example), but feature filtering is
> definitely one of them.
> 
> A feature set to off in a VM (or one that does not exist in that
> particular qemu version) will never appear as on, even in the case of
> migration to a more modern qemu version.
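
(Roughly, that guarantee amounts to something like the sketch below; an
illustrative helper only, not an existing QEMU function.)

#include <stdint.h>

/* Illustrative sketch: the guest can only ever see features that the
 * backend offers, the emulated device supports, and the operator did not
 * disable on the command line. */
static uint64_t guest_visible_features(uint64_t backend_features,
                                       uint64_t device_supported,
                                       uint64_t user_disabled)
{
    return backend_features & device_supported & ~user_disabled;
}
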
> 
> We don't have the equivalent protection for device config space. QEMU
> could ensure a consistent MTU, number of queues, etc. for the guest in
> virtio_net_get_config (and the equivalent for other kinds of devices).
> QEMU already has some transformations there. It shouldn't take a lot
> of code.

I think I agree. It's the easiest way to ensure migration
consistency without trouble.
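
(Something along these lines, I guess; a minimal untested sketch with a
made-up stub struct and helper, not the existing virtio_net_get_config
code.)

#include <stdint.h>

struct virtio_net_cfg_stub {      /* stand-in for struct virtio_net_config */
    uint16_t max_virtqueue_pairs;
    uint16_t mtu;
};

/* Clamp / pin the backend-provided config to the provisioned values so the
 * guest view stays stable across migration. */
static void filter_net_config(struct virtio_net_cfg_stub *cfg,
                              uint16_t cmdline_queue_pairs,
                              uint16_t cmdline_mtu)
{
    /* Never expose more queue pairs than the operator asked for. */
    if (cfg->max_virtqueue_pairs > cmdline_queue_pairs) {
        cfg->max_virtqueue_pairs = cmdline_queue_pairs;
    }
    /* Pin the MTU to the provisioned value so it cannot change under
     * the guest's feet after a migration. */
    cfg->mtu = cmdline_mtu;
}
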

> Having said that:
> * I'm ok with starting just with checks there instead of
> transformations like the queue remap proposed here.
> * If we choose not to implement it, I'm not proposing to actually
> delete the feature checks, as I find them useful :).
> 
> Thanks!


