All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Stefano Garzarella <sgarzare@redhat.com>
Cc: "Jorge E. Moreira" <jemoreira@google.com>,
	hreitz@redhat.com, gmaglione@redhat.com,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Hanna Czenczek <xanclic@gmail.com>,
	Pierrick Bouvier <pierrick.bouvier@linaro.org>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH] vhost-user.rst: Explicitly allow front-end to write to kick FDs
Date: Tue, 21 Apr 2026 16:51:56 -0400	[thread overview]
Message-ID: <20260421205156.GB466778@fedora> (raw)
In-Reply-To: <CAGxU2F7=XQfc4PUEM=k5yc-tytRs-MkrRKFeRXAETVEY5eggZQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 8676 bytes --]

On Mon, Apr 20, 2026 at 05:57:40PM +0200, Stefano Garzarella wrote:
> On Mon, 20 Apr 2026 at 16:49, Stefano Garzarella <sgarzare@redhat.com> wrote:
> >
> > Thanks for starting the discussion here, let me add also Hanna, German,
> > and Stefan in CC that can help us.
> >
> > On Fri, Apr 10, 2026 at 07:12:05PM -0700, Jorge E. Moreira wrote:
> > >Migration of back-end state happens while the device is suspended (i.e
> > >all vrings are stopped). To resume normal operation on the destination,
> > >the vrings need to be started again with a kick (either a write on the
> > >FD or the VHOST_USER_VRING_KICK in-band message if negotiated). While
> >
> > It's true that in the spec we have:
> >    "Each ring is initialized in a stopped and disabled state. The
> >    back-end must start a ring upon receiving a kick (that is, detecting
> >    that file descriptor is readable) on the descriptor specified by
> >    VHOST_USER_SET_VRING_KICK or receiving the in-band message
> >    VHOST_USER_VRING_KICK if negotiated, and stop a ring upon receiving
> >    VHOST_USER_GET_VRING_BASE."
> >
> > But IMO this applies when a driver is not yet loaded.
> > When we are migrating, the driver could be already loaded. So, in the
> > new device running in the destination, IMO we should consider the ring
> > already started or add some messages to tell to the device: "hey, the
> > device was already started, this is a migration and it's completed".
> >
> > Sending a kick from the frontend, seems more an hack here.
> >
> > That said, for example, in subprojects/libvhost-user/libvhost-user.c
> > IIUC the virtqueue is started when the SET_VRING_KICK is handled by
> > vu_set_vring_kick_exec(), but not sure how compliant it is.
> >
> > >these notifications are typically sent by the driver, it has no reason
> > >to send them in the destination if it already sent them in the source as
> > >the driver is unaware that a migration took place. Therefore it should
> > >be the responsibility of the vhost-user front-end to ensure these vrings
> > >are started. This is particularly necessary for queues where data only
> > >flows from device to driver, such as those used by the vsock and input
> > >devices.
> >
> > Exactly, so IMO we should not use the kick, but maybe add something new
> > or clarify what to do after the migration.
> >
> > For example in the "Migrating back-end state" we have:
> >    "Migrating device state involves transferring the state from one
> >    back-end, called the source, to another back-end, called the
> >    destination. After migration, the destination transparently resumes
> >    operation without requiring the driver to re-initialize the device at
> >    the VIRTIO level."
> >
> > So, IMO we can use the VHOST_USER_SET_DEVICE_STATE_FD channel exactly to
> > inform the new device about the state: "there isn't any state to
> > transfer, but I notify you that the device was already initialized, so
> > the vrings can be started".
> >
> > >
> > >This behavior is already used by some qemu vhost-user front-ends (e.g
> > >vhost-user-blk) and by front-ends implemented on other VMMs(e.g CrosVm).
> >
> > I looked at vhost-user-blk frontend, but I don't see it. I mean I see
> > the code around the comment "/* Kick right away to begin processing
> > requests already in vring */" but that one IIUC was introduced more to
> > fix devices violating specs, so not sure it's a good example to follow:
> >
> > commit 110b9463d5c820120c8311db79f55a64c9d81ebe
> > Author: Yongji Xie <elohimes@gmail.com>
> > Date:   Wed Jun 6 21:24:48 2018 +0800
> >
> >      vhost-user-blk: start vhost when guest kicks
> >
> >      Some old guests (before commit 7a11370e5: "virtio_blk: enable VQs early")
> >      kick virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK. This violates
> >      the virtio spec. But virtio 1.0 transitional devices support this behaviour.
> >      So we should start vhost when guest kicks in this case.
> >
> >      Signed-off-by: Yongji Xie <xieyongji@baidu.com>
> >      Signed-off-by: Chai Wen <chaiwen@baidu.com>
> >      Signed-off-by: Ni Xun <nixun@baidu.com>
> >      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> >      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> >      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> >
> > >Adding it to the vhost-user documentation makes it explicit that this
> > >strategy is permitted and suggest it to vhost-user front-end authors.
> > >Explicitly documenting it is necessary because vring kicks appear
> > >designed to originate in the driver, so having some originate in the
> > >front-end can be counterintuitive and cause developers to waste time
> > >looking for other alternatives or face pushback during code review.
> >
> > As I pointed out in our discussion in
> > https://github.com/rust-vmm/vhost-device/pull/936
> > IMO we should use some in-band messages and not relaying on kicks that
> > should be used only by the driver to notify the device about new
> > available buffers.
> >
> > That said, I agree that we need to clarify in the specifications exactly
> > what the backend and frontend should do after a migration to start
> > vrings if there is no need to exchange a state.
> >
> >
> > Any other opinion?
> >
> > Thanks,
> > Stefano
> >
> > >
> > >Signed-off-by: Jorge E. Moreira <jemoreira@google.com>
> > >---
> > > docs/interop/vhost-user.rst | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > >diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > >index 137c9f3669..ad5aba3430 100644
> > >--- a/docs/interop/vhost-user.rst
> > >+++ b/docs/interop/vhost-user.rst
> > >@@ -656,7 +656,10 @@ destination, following the usual protocol for establishing a connection
> > > to a vhost-user back-end: This includes, for example, setting up memory
> > > mappings and kick and call FDs as necessary, negotiating protocol
> > > features, or setting the initial vring base indices (to the same value
> > >-as on the source side, so that operation can resume).
> > >+as on the source side, so that operation can resume). The vhost-user front-end
> > >+may also write to the kick FDs of vrings containing unused buffers or send
> > >+``VHOST_USER_VRING_KICK`` if negotiated to start those vrings in the destination
> > >+since the driver likely already kicked them in the source and won't do it again.
> 
> After discussing this with Hanna, we came to the conclusion that your
> idea of injecting the kick is the least invasive option for now and
> complies with the spec (even though I still don’t think it’s a nice
> thing to do).
> 
> So it’s fine to continue in this direction, but I might add these
> words more in the "Migration" section than here, since we’re talking
> about an optional state migration here. For example after "No further
> update must be done before rings are restarted."
> Or in the "Ring states" section, where we can clarify how to restart a
> ring after a migration. Or in both :-)
> 
> WDYT?
> 
> Then instead of "may" I'd use "should". And I would refer to the fact
> that migration is transparent to the driver, so the front-end should
> kick all initialized vrings to comply with what we described in the
> "Ring states" section.

I don't agree. Here is my thinking about how to solve this:

Device state migration uses the ring state machine to suspend the device
on the source host and start rings on the destination host after loading
state.

Starting rings is not specific to migration, it is covered by the ring
state machine that's also used when pausing and resuming a VM locally,
for example. We need to fix the ring state machine section in the spec
rather than making changes to the migration or device state fd sections.

Given that there are vhost-user back-ends like DPDK that do not rely on
the kick fd (they do not use it when running in poll mode), injecting a
kick cannot be necessary. Adding anything to the spec that requires the
kick fd is at best a no-op and at worst could break those back-ends.

The spec must be updated to say that rings are started by
VHOST_USER_SET_KICK, which is what implementations already do today.

If you want to add a note that front-ends may send a kick after
VHOST_USER_SET_KICK completes to ensure that rings are started according
to the old spec wording, then that is fine, but it should be clear that
kicks are not the primary mechanism for starting rings since it would be
dangerous to rely on that (it breaks poll-mode back-ends).

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2026-04-21 20:52 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-11  2:12 [PATCH] vhost-user.rst: Explicitly allow front-end to write to kick FDs Jorge E. Moreira
2026-04-20 14:49 ` Stefano Garzarella
2026-04-20 15:57   ` Stefano Garzarella
2026-04-21 20:51     ` Stefan Hajnoczi [this message]
2026-04-22  8:25       ` Stefano Garzarella
2026-04-20 18:18   ` Stefan Hajnoczi
2026-04-21  0:48     ` Jorge Moreira
2026-04-21  7:55       ` Stefano Garzarella
2026-04-21 16:06         ` Jorge Moreira
2026-04-21 16:49           ` Stefano Garzarella
2026-04-21 19:57             ` Stefan Hajnoczi
2026-04-21 21:12       ` Stefan Hajnoczi
2026-04-22  1:16         ` Jorge Moreira
2026-04-22  8:32           ` Stefano Garzarella
2026-04-22 19:20             ` Jorge Moreira
2026-04-27 22:45               ` Stefan Hajnoczi
2026-04-27 22:48                 ` Jorge Moreira
2026-04-28 14:33                   ` Stefano Garzarella
2026-04-28 17:19                     ` Jorge Moreira
2026-04-29 14:26                       ` Stefano Garzarella
2026-04-29 16:00                       ` Stefan Hajnoczi
2026-04-29 15:55                     ` Stefan Hajnoczi
2026-04-28  6:59                 ` Michael S. Tsirkin
2026-04-29 15:50                   ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260421205156.GB466778@fedora \
    --to=stefanha@redhat.com \
    --cc=gmaglione@redhat.com \
    --cc=hreitz@redhat.com \
    --cc=jemoreira@google.com \
    --cc=mst@redhat.com \
    --cc=pierrick.bouvier@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=sgarzare@redhat.com \
    --cc=xanclic@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.