From: Stefan Hajnoczi <stefanha@redhat.com>
To: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
Cc: qemu-devel@nongnu.org, "Gonglei (Arei)" <arei.gonglei@huawei.com>,
"Zhenwei Pi" <pizhenwei@bytedance.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Stefano Garzarella" <sgarzare@redhat.com>,
"Raphael Norwitz" <raphael@enfabrica.net>,
"Kevin Wolf" <kwolf@redhat.com>,
"Hanna Reitz" <hreitz@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Fam Zheng" <fam@euphon.net>,
"Alex Bennée" <alex.bennee@linaro.org>,
mzamazal@redhat.com, "Peter Xu" <peterx@redhat.com>,
"Fabiano Rosas" <farosas@suse.de>,
qemu-block@nongnu.org, virtio-fs@lists.linux.dev,
"yc-core@yandex-team.ru" <yc-core@yandex-team.ru>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>
Subject: Re: [PATCH v4 0/5] support inflight migration
Date: Tue, 6 Jan 2026 14:04:22 -0500
Message-ID: <20260106190422.GB123256@fedora>
In-Reply-To: <20251229102107.1291790-1-dtalexundeer@yandex-team.ru>
On Mon, Dec 29, 2025 at 03:21:03PM +0500, Alexandr Moshkov wrote:
> v4:
> While testing inflight migration, I noticed a problem: GET_VRING_BASE is
> needed during migration so that the back-end stops dirtying pages and
> synchronizes the `last_avail` counter with QEMU. As a result, after
> migration, in-flight I/O requests will look as if they were resubmitted
> on the destination VM.
>
> However, with the new logic, we no longer need to wait for in-flight
> requests to complete when handling the GET_VRING_BASE message. So add a
> new parameter `should_drain` to GET_VRING_BASE that allows the back-end to
> stop vrings immediately without waiting for in-flight I/O requests to complete.
>
> Also:
> - modify vhost-user rst
> - refactor vhost-user-blk.c; `should_drain` is now based on the
> device parameter `inflight-migration`
>
> v3:
> - use pre_load_errp instead of pre_load in vhost.c
> - change vhost-user-blk property to
> "skip-get-vring-base-inflight-migration"
> - refactor vhost-user-blk.c, by moving vhost_user_blk_inflight_needed() higher
>
> v2:
> - rewrite migration using VMSD instead of qemufile API
> - add vhost-user-blk parameter instead of migration capability
>
> I don't know if VMSD was used cleanly in the migration implementation, so
> feel free to comment.
>
> Based on Vladimir's work:
> [PATCH v2 00/25] vhost-user-blk: live-backend local migration
> which was based on:
> - [PATCH v4 0/7] chardev: postpone connect
> (which in turn is based on [PATCH 0/2] remove deprecated 'reconnect' options)
> - [PATCH v3 00/23] vhost refactoring and fixes
> - [PATCH v8 14/19] migration: introduce .pre_incoming() vmsd handler
>
> Based-on: <20250924133309.334631-1-vsementsov@yandex-team.ru>
> Based-on: <20251015212051.1156334-1-vsementsov@yandex-team.ru>
> Based-on: <20251015145808.1112843-1-vsementsov@yandex-team.ru>
> Based-on: <20251015132136.1083972-15-vsementsov@yandex-team.ru>
> Based-on: <20251016114104.1384675-1-vsementsov@yandex-team.ru>
>
> ---
>
> Hi!
>
> During inter-host migration, waiting for disk requests to be drained
> in the vhost-user backend can incur significant downtime.
>
> This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
> Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
> then, after migration, they will be executed on another host.
I'm surprised by this statement because cancellation requires
communication with the disk. If in-flight requests are slow to drain,
then I would expect cancellation to be slow too. What kind of storage
are you using?
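
To make the mechanism under discussion concrete, here is a minimal sketch of
how an in-flight table could let the destination resubmit requests that the
source never completed. Everything in it is an assumption for illustration:
`DescState`, `InflightRegion`, and the helper names are hypothetical and do
not reproduce the inflight region layout from the vhost-user spec or the code
in this series.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_SIZE 8

/* Simplified per-descriptor state (hypothetical, not the spec layout). */
typedef struct {
    uint8_t inflight;   /* 1 while a request was fetched but not completed */
    uint64_t counter;   /* submission order, used to replay in order */
} DescState;

typedef struct {
    DescState desc[QUEUE_SIZE];
    uint64_t next_counter;
} InflightRegion;

/* Source side: the back-end fetched descriptor chain 'head' from the vring. */
static void mark_submitted(InflightRegion *r, unsigned head)
{
    assert(head < QUEUE_SIZE);
    r->desc[head].inflight = 1;
    r->desc[head].counter = r->next_counter++;
}

/* Source side: the request completed and was pushed to the used ring. */
static void mark_completed(InflightRegion *r, unsigned head)
{
    assert(head < QUEUE_SIZE);
    r->desc[head].inflight = 0;
}

/*
 * Destination side: collect the heads still marked in-flight, ordered by
 * submission counter (insertion sort, the table is small). Returns the
 * number of heads written to out[].
 */
static unsigned collect_resubmit(const InflightRegion *r,
                                 unsigned out[QUEUE_SIZE])
{
    unsigned n = 0;

    for (unsigned i = 0; i < QUEUE_SIZE; i++) {
        if (!r->desc[i].inflight) {
            continue;
        }
        unsigned j = n++;
        while (j > 0 && r->desc[out[j - 1]].counter > r->desc[i].counter) {
            out[j] = out[j - 1];
            j--;
        }
        out[j] = i;
    }
    return n;
}
```

In this model a request that is cancelled on the source simply keeps its
`inflight` flag set, so the cost I am asking about is the cancellation of the
host I/O itself, not the bookkeeping.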
>
> At first, I tried to implement migration for all vhost-user devices that support inflight at once,
> but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and
> in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).
>
> Therefore, for now I decided to leave this idea for later and
> implement the migration of the inflight region first for vhost-user-blk.
Sounds okay to me.

I'm not sure about the change to GET_VRING_BASE. A new parameter is
added without a feature bit, so there is no way to detect this feature
at runtime. Maybe a VHOST_USER_PROTOCOL_F_GET_VRING_BASE_INFLIGHT
feature bit should be added?

Once a feature bit exists, it may not even be necessary to add the
parameter to GET_VRING_BASE:

When VHOST_USER_PROTOCOL_F_GET_VRING_BASE_INFLIGHT is zero,
GET_VRING_BASE drains in-flight I/O before completing. When
VHOST_USER_PROTOCOL_F_GET_VRING_BASE_INFLIGHT is one, the backend may
leave requests in-flight (but host I/O requests must be cancelled in
order to comply with the "Suspended device state" semantics) when
GET_VRING_BASE completes.

What do you think?
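
A rough sketch of the feature-bit approach from the back-end's point of view,
in case it helps the discussion. The bit number, the `StopAction` enum, and
the helper names are placeholders made up for this example; the feature bit
is only a proposal and has no assigned value in the vhost-user spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit number: no value has been assigned for this proposal. */
#define PROTOCOL_F_GET_VRING_BASE_INFLIGHT_BIT 63

typedef enum {
    STOP_DRAIN,           /* wait for in-flight I/O to complete */
    STOP_CANCEL_AND_KEEP, /* cancel host I/O, leave requests marked in-flight */
} StopAction;

/* A protocol feature is negotiated only if both sides offer it. */
static uint64_t negotiate(uint64_t frontend_features,
                          uint64_t backend_features)
{
    return frontend_features & backend_features;
}

static bool has_feature(uint64_t negotiated, unsigned bit)
{
    assert(bit < 64);
    return (negotiated >> bit) & 1;
}

/*
 * What the back-end would do on GET_VRING_BASE. With the bit negotiated it
 * is allowed (not required) to skip draining; this sketch always takes the
 * non-draining path in that case.
 */
static StopAction get_vring_base_action(uint64_t negotiated)
{
    if (has_feature(negotiated, PROTOCOL_F_GET_VRING_BASE_INFLIGHT_BIT)) {
        return STOP_CANCEL_AND_KEEP;
    }
    return STOP_DRAIN;
}
```

With this shape the message format of GET_VRING_BASE stays unchanged, and an
old front-end or back-end that does not offer the bit keeps today's draining
behaviour.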
>
> Alexandr Moshkov (5):
> vhost-user.rst: specify vhost-user back-end action on GET_VRING_BASE
> vhost-user: introduce should_drain on GET_VRING_BASE
> vmstate: introduce VMSTATE_VBUFFER_UINT64
> vhost: add vmstate for inflight region with inner buffer
> vhost-user-blk: support inter-host inflight migration
>
> backends/cryptodev-vhost.c | 2 +-
> backends/vhost-user.c | 2 +-
> docs/interop/vhost-user.rst | 8 +++-
> hw/block/vhost-user-blk.c | 28 ++++++++++++-
> hw/net/vhost_net.c | 9 ++--
> hw/scsi/vhost-scsi-common.c | 2 +-
> hw/virtio/vdpa-dev.c | 2 +-
> hw/virtio/vhost-user-base.c | 2 +-
> hw/virtio/vhost-user-fs.c | 2 +-
> hw/virtio/vhost-user-scmi.c | 2 +-
> hw/virtio/vhost-vsock-common.c | 2 +-
> hw/virtio/vhost.c | 66 ++++++++++++++++++++++++++----
> include/hw/virtio/vhost-user-blk.h | 1 +
> include/hw/virtio/vhost.h | 13 +++++-
> include/migration/vmstate.h | 10 +++++
> 15 files changed, 125 insertions(+), 26 deletions(-)
>
> --
> 2.34.1
>