* [PATCH v3 1/3] vmstate: introduce VMSTATE_VBUFFER_UINT64
2025-11-10 10:39 [PATCH v3 0/3] vhost-user-blk: support inflight migration Alexandr Moshkov
@ 2025-11-10 10:39 ` Alexandr Moshkov
2025-11-10 10:39 ` [PATCH v3 2/3] vhost: add vmstate for inflight region with inner buffer Alexandr Moshkov
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Alexandr Moshkov @ 2025-11-10 10:39 UTC (permalink / raw)
To: qemu-devel
Cc: Raphael Norwitz, Michael S. Tsirkin, Stefano Garzarella,
Kevin Wolf, Hanna Reitz, Peter Xu, Fabiano Rosas, Eric Blake,
Markus Armbruster, Alexandr Moshkov
This is an analog of the VMSTATE_VBUFFER_UINT32 macro, but for a size field of type uint64_t.
Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
Acked-by: Peter Xu <peterx@redhat.com>
---
include/migration/vmstate.h | 10 ++++++++++
1 file changed, 10 insertions(+)
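For readers less familiar with the VBUFFER family: the flags VMS_VBUFFER | VMS_POINTER describe a field that is a pointer to a buffer whose length is read from another field of the same state struct (located via .size_offset). The standalone sketch below models that idea outside of QEMU. It is only an illustration: struct demo_state, struct demo_field and demo_put_vbuffer() are hypothetical names, not QEMU APIs, and the real vmstate code does considerably more.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device state: a heap buffer plus its 64-bit length. */
struct demo_state {
    uint8_t *buf;   /* plays the role of _field */
    uint64_t len;   /* plays the role of _field_size */
};

/* Simplified stand-in for a field descriptor of a variable-size buffer. */
struct demo_field {
    size_t offset;       /* where the buffer pointer lives in the state */
    size_t size_offset;  /* where the uint64_t length lives in the state */
};

/*
 * Copy the buffer described by 'f' from 'state' into 'out'.
 * Returns the number of bytes written; 0 means the buffer was empty
 * or 'out' was too small.
 */
static uint64_t demo_put_vbuffer(const void *state, const struct demo_field *f,
                                 uint8_t *out, uint64_t out_cap)
{
    const unsigned char *base = state;
    uint8_t *src;
    uint64_t len;

    /* Fetch pointer and length through their offsets in the state. */
    memcpy(&src, base + f->offset, sizeof(src));
    memcpy(&len, base + f->size_offset, sizeof(len));

    if (len == 0 || len > out_cap) {
        return 0;
    }
    memcpy(out, src, (size_t)len);
    return len;
}
```

The only difference from a 32-bit variant in this model would be the type used to read the length, which is the point of the new macro.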
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 7f1f1c166a..4c9e212d58 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -707,6 +707,16 @@ extern const VMStateInfo vmstate_info_qlist;
.offset = offsetof(_state, _field), \
}
+#define VMSTATE_VBUFFER_UINT64(_field, _state, _version, _test, _field_size) { \
+ .name = (stringify(_field)), \
+ .version_id = (_version), \
+ .field_exists = (_test), \
+ .size_offset = vmstate_offset_value(_state, _field_size, uint64_t),\
+ .info = &vmstate_info_buffer, \
+ .flags = VMS_VBUFFER | VMS_POINTER, \
+ .offset = offsetof(_state, _field), \
+}
+
#define VMSTATE_VBUFFER_ALLOC_UINT32(_field, _state, _version, \
_test, _field_size) { \
.name = (stringify(_field)), \
--
2.34.1
* [PATCH v3 2/3] vhost: add vmstate for inflight region with inner buffer
2025-11-10 10:39 [PATCH v3 0/3] vhost-user-blk: support inflight migration Alexandr Moshkov
2025-11-10 10:39 ` [PATCH v3 1/3] vmstate: introduce VMSTATE_VBUFFER_UINT64 Alexandr Moshkov
@ 2025-11-10 10:39 ` Alexandr Moshkov
2025-11-10 10:39 ` [PATCH v3 3/3] vhost-user-blk: support inter-host inflight migration Alexandr Moshkov
2025-11-18 20:24 ` [PATCH v3 0/3] vhost-user-blk: support " Vladimir Sementsov-Ogievskiy
3 siblings, 0 replies; 6+ messages in thread
From: Alexandr Moshkov @ 2025-11-10 10:39 UTC (permalink / raw)
To: qemu-devel
Cc: Raphael Norwitz, Michael S. Tsirkin, Stefano Garzarella,
Kevin Wolf, Hanna Reitz, Peter Xu, Fabiano Rosas, Eric Blake,
Markus Armbruster, Alexandr Moshkov
Prepare for future inflight region migration for vhost-user-blk.
We need to migrate size, queue_size, and the inner buffer.
So first migrate the size and queue_size fields, then allocate memory for
the buffer using the migrated size, and finally migrate the inner buffer itself.
Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
---
hw/virtio/vhost.c | 42 +++++++++++++++++++++++++++++++++++++++
include/hw/virtio/vhost.h | 6 ++++++
2 files changed, 48 insertions(+)
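The load order described above (scalar fields first, then allocation, then the buffer) can be modelled with a small standalone sketch. This is not the QEMU implementation: struct demo_inflight, demo_save() and demo_load() are hypothetical, the wire layout (8-byte size, 2-byte queue_size, payload, all in host byte order) is an assumption made for the example, and plain malloc() stands in for the memfd allocation done in the pre-load hook.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the three migrated pieces; not the QEMU struct. */
struct demo_inflight {
    uint64_t size;
    uint16_t queue_size;
    uint8_t *addr;
};

/* Serialize header then payload; 'out' must hold at least 10 + size bytes. */
static size_t demo_save(const struct demo_inflight *in, uint8_t *out)
{
    size_t pos = 0;

    memcpy(out + pos, &in->size, sizeof(in->size));
    pos += sizeof(in->size);
    memcpy(out + pos, &in->queue_size, sizeof(in->queue_size));
    pos += sizeof(in->queue_size);
    memcpy(out + pos, in->addr, (size_t)in->size);
    pos += (size_t)in->size;
    return pos;
}

/* Returns 0 on success, -1 on truncated input or allocation failure. */
static int demo_load(struct demo_inflight *dst, const uint8_t *in, size_t in_len)
{
    const size_t hdr = sizeof(dst->size) + sizeof(dst->queue_size);
    size_t pos = 0;

    if (in_len < hdr) {
        return -1;
    }

    /* Step 1: load the scalar fields, so the buffer size is known. */
    memcpy(&dst->size, in + pos, sizeof(dst->size));
    pos += sizeof(dst->size);
    memcpy(&dst->queue_size, in + pos, sizeof(dst->queue_size));
    pos += sizeof(dst->queue_size);

    if (dst->size > in_len - pos) {
        return -1;
    }

    /* Step 2: allocate the destination buffer using the migrated size. */
    dst->addr = malloc(dst->size ? (size_t)dst->size : 1);
    if (!dst->addr) {
        return -1;
    }

    /* Step 3: load the buffer contents into the new allocation. */
    memcpy(dst->addr, in + pos, (size_t)dst->size);
    return 0;
}
```

In the patch itself, step 2 corresponds to the subsection's pre-load hook, which runs after the parent section's fields have been loaded.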
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index c46203eb9c..9a746c9861 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2028,6 +2028,48 @@ const VMStateDescription vmstate_backend_transfer_vhost_inflight = {
}
};
+static int vhost_inflight_buffer_pre_load(void *opaque, Error **errp)
+{
+ info_report("vhost_inflight_region_buffer_pre_load");
+ struct vhost_inflight *inflight = opaque;
+
+ int fd = -1;
+ void *addr = qemu_memfd_alloc("vhost-inflight", inflight->size,
+ F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL,
+ &fd, errp);
+ if (!addr) {
+ return -ENOMEM;
+ }
+
+ inflight->offset = 0;
+ inflight->addr = addr;
+ inflight->fd = fd;
+
+ return 0;
+}
+
+const VMStateDescription vmstate_vhost_inflight_region_buffer = {
+ .name = "vhost-inflight-region/buffer",
+ .pre_load_errp = vhost_inflight_buffer_pre_load,
+ .fields = (const VMStateField[]) {
+ VMSTATE_VBUFFER_UINT64(addr, struct vhost_inflight, 0, NULL, size),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
+const VMStateDescription vmstate_vhost_inflight_region = {
+ .name = "vhost-inflight-region",
+ .fields = (const VMStateField[]) {
+ VMSTATE_UINT64(size, struct vhost_inflight),
+ VMSTATE_UINT16(queue_size, struct vhost_inflight),
+ VMSTATE_END_OF_LIST()
+ },
+ .subsections = (const VMStateDescription * const []) {
+ &vmstate_vhost_inflight_region_buffer,
+ NULL
+ }
+};
+
const VMStateDescription vmstate_vhost_virtqueue = {
.name = "vhost-virtqueue",
.fields = (const VMStateField[]) {
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 13ca2c319f..dd552de91f 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -596,6 +596,12 @@ extern const VMStateDescription vmstate_backend_transfer_vhost_inflight;
vmstate_backend_transfer_vhost_inflight, \
struct vhost_inflight)
+extern const VMStateDescription vmstate_vhost_inflight_region;
+#define VMSTATE_VHOST_INFLIGHT_REGION(_field, _state) \
+ VMSTATE_STRUCT_POINTER(_field, _state, \
+ vmstate_vhost_inflight_region, \
+ struct vhost_inflight)
+
extern const VMStateDescription vmstate_vhost_dev;
#define VMSTATE_BACKEND_TRANSFER_VHOST(_field, _state) \
VMSTATE_STRUCT(_field, _state, 0, vmstate_vhost_dev, struct vhost_dev)
--
2.34.1
* [PATCH v3 3/3] vhost-user-blk: support inter-host inflight migration
2025-11-10 10:39 [PATCH v3 0/3] vhost-user-blk: support inflight migration Alexandr Moshkov
2025-11-10 10:39 ` [PATCH v3 1/3] vmstate: introduce VMSTATE_VBUFFER_UINT64 Alexandr Moshkov
2025-11-10 10:39 ` [PATCH v3 2/3] vhost: add vmstate for inflight region with inner buffer Alexandr Moshkov
@ 2025-11-10 10:39 ` Alexandr Moshkov
2025-11-18 20:24 ` [PATCH v3 0/3] vhost-user-blk: support " Vladimir Sementsov-Ogievskiy
3 siblings, 0 replies; 6+ messages in thread
From: Alexandr Moshkov @ 2025-11-10 10:39 UTC (permalink / raw)
To: qemu-devel
Cc: Raphael Norwitz, Michael S. Tsirkin, Stefano Garzarella,
Kevin Wolf, Hanna Reitz, Peter Xu, Fabiano Rosas, Eric Blake,
Markus Armbruster, Alexandr Moshkov
During inter-host migration, waiting for disk requests to be drained
in the vhost-user backend can incur significant downtime.
This can be avoided if QEMU migrates the inflight region in
vhost-user-blk.
Thus, during the QEMU migration, the vhost-user backend can cancel all
inflight requests, and after migration they will be executed on the
other host.
In vhost_user_blk_stop(), set force_stop = true when an inter-host
migration is finishing, so that GET_VRING_BASE is not executed.
Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
---
hw/block/vhost-user-blk.c | 29 +++++++++++++++++++++++++++++
include/hw/virtio/vhost-user-blk.h | 1 +
2 files changed, 30 insertions(+)
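The resulting force_stop decision can be summarised as a small truth function. The sketch below is a standalone model with hypothetical names (struct demo_stop_inputs, demo_inflight_needed(), demo_force_stop()); the booleans stand in for the two device properties, qemu_force_shutdown_requested(), migrate_local_vhost_user_blk() and the RUN_STATE_FINISH_MIGRATE check. It is meant only to make the combination of conditions easy to read, not to replace the code in the diff.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the inputs consulted when stopping the device. */
struct demo_stop_inputs {
    bool skip_on_force_shutdown;    /* skip-get-vring-base-on-force-shutdown */
    bool force_shutdown_requested;  /* models qemu_force_shutdown_requested() */
    bool skip_inflight_migration;   /* skip-get-vring-base-inflight-migration */
    bool local_migration;           /* models migrate_local_vhost_user_blk() */
    bool finishing_migration;       /* models RUN_STATE_FINISH_MIGRATE */
};

/* The inflight subsection is wanted only for non-local (inter-host) migration. */
static bool demo_inflight_needed(const struct demo_stop_inputs *in)
{
    return in->skip_inflight_migration && !in->local_migration;
}

/* True when GET_VRING_BASE should be skipped on stop. */
static bool demo_force_stop(const struct demo_stop_inputs *in)
{
    bool force_stop = in->skip_on_force_shutdown &&
                      in->force_shutdown_requested;

    if (demo_inflight_needed(in) && in->finishing_migration) {
        force_stop = true;
    }
    return force_stop;
}
```

Note that in this model the new condition only ever turns force_stop on; it never clears a force stop requested through the existing shutdown property.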
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index a8fd90480a..2d9b398de6 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -139,6 +139,14 @@ const VhostDevConfigOps blk_ops = {
.vhost_dev_config_notifier = vhost_user_blk_handle_config_change,
};
+static bool vhost_user_blk_inflight_needed(void *opaque)
+{
+ struct VHostUserBlk *s = opaque;
+
+ return s->skip_get_vring_base_inflight_migration &&
+ !migrate_local_vhost_user_blk();
+}
+
static int vhost_user_blk_start(VirtIODevice *vdev, Error **errp)
{
VHostUserBlk *s = VHOST_USER_BLK(vdev);
@@ -242,6 +250,11 @@ static int vhost_user_blk_stop(VirtIODevice *vdev)
force_stop = s->skip_get_vring_base_on_force_shutdown &&
qemu_force_shutdown_requested();
+ if (vhost_user_blk_inflight_needed(s) &&
+ runstate_check(RUN_STATE_FINISH_MIGRATE)) {
+ force_stop = true;
+ }
+
s->dev.backend_transfer = s->dev.backend_transfer ||
(runstate_check(RUN_STATE_FINISH_MIGRATE) &&
migrate_local_vhost_user_blk());
@@ -656,6 +669,16 @@ static struct vhost_dev *vhost_user_blk_get_vhost(VirtIODevice *vdev)
return &s->dev;
}
+static const VMStateDescription vmstate_vhost_user_blk_inflight = {
+ .name = "vhost-user-blk/inflight",
+ .version_id = 1,
+ .needed = vhost_user_blk_inflight_needed,
+ .fields = (const VMStateField[]) {
+ VMSTATE_VHOST_INFLIGHT_REGION(inflight, VHostUserBlk),
+ VMSTATE_END_OF_LIST()
+ },
+};
+
static bool vhost_user_blk_pre_incoming(void *opaque, Error **errp)
{
VHostUserBlk *s = VHOST_USER_BLK(opaque);
@@ -678,6 +701,10 @@ static const VMStateDescription vmstate_vhost_user_blk = {
VMSTATE_VIRTIO_DEVICE,
VMSTATE_END_OF_LIST()
},
+ .subsections = (const VMStateDescription * const []) {
+ &vmstate_vhost_user_blk_inflight,
+ NULL
+ }
};
static bool vhost_user_needed(void *opaque)
@@ -751,6 +778,8 @@ static const Property vhost_user_blk_properties[] = {
VIRTIO_BLK_F_WRITE_ZEROES, true),
DEFINE_PROP_BOOL("skip-get-vring-base-on-force-shutdown", VHostUserBlk,
skip_get_vring_base_on_force_shutdown, false),
+ DEFINE_PROP_BOOL("skip-get-vring-base-inflight-migration", VHostUserBlk,
+ skip_get_vring_base_inflight_migration, false),
};
static void vhost_user_blk_class_init(ObjectClass *klass, const void *data)
diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h
index b06f55fd6f..859fb96956 100644
--- a/include/hw/virtio/vhost-user-blk.h
+++ b/include/hw/virtio/vhost-user-blk.h
@@ -52,6 +52,7 @@ struct VHostUserBlk {
bool started_vu;
bool skip_get_vring_base_on_force_shutdown;
+ bool skip_get_vring_base_inflight_migration;
bool incoming_backend;
};
--
2.34.1
* Re: [PATCH v3 0/3] vhost-user-blk: support inflight migration
2025-11-10 10:39 [PATCH v3 0/3] vhost-user-blk: support inflight migration Alexandr Moshkov
` (2 preceding siblings ...)
2025-11-10 10:39 ` [PATCH v3 3/3] vhost-user-blk: support inter-host inflight migration Alexandr Moshkov
@ 2025-11-18 20:24 ` Vladimir Sementsov-Ogievskiy
2025-11-18 22:05 ` Peter Xu
3 siblings, 1 reply; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-11-18 20:24 UTC (permalink / raw)
To: Alexandr Moshkov, qemu-devel
Cc: Raphael Norwitz, Michael S. Tsirkin, Stefano Garzarella,
Kevin Wolf, Hanna Reitz, Peter Xu, Fabiano Rosas, Eric Blake,
Markus Armbruster, Daniel P. Berrangé
Add Daniel
On 10.11.25 13:39, Alexandr Moshkov wrote:
> v3:
> - use pre_load_errp instead of pre_load in vhost.c
> - change vhost-user-blk property to
> "skip-get-vring-base-inflight-migration"
> - refactor vhost-user-blk.c, by moving vhost_user_blk_inflight_needed() higher
>
> v2:
> - rewrite migration using VMSD instead of qemufile API
> - add vhost-user-blk parameter instead of migration capability
>
> I don't know whether VMSD is used cleanly in the migration implementation, so
> feel free to comment.
>
> Based on Vladimir's work:
> [PATCH v2 00/25] vhost-user-blk: live-backend local migration
> which was based on:
> - [PATCH v4 0/7] chardev: postpone connect
> (which in turn is based on [PATCH 0/2] remove deprecated 'reconnect' options)
> - [PATCH v3 00/23] vhost refactoring and fixes
> - [PATCH v8 14/19] migration: introduce .pre_incoming() vmsd handler
>
Hi!
On my series about backend-transfer migration, the final consensus (or at least,
I hope that it's a consensus:) is that using device properties to control migration
channel content is wrong, and that we should use migration parameters instead.
(discussion here: https://lore.kernel.org/qemu-devel/29aa1d66-9fa7-4e44-b0e3-2ca26e77accf@yandex-team.ru/ )
So the API for backend-transfer features is a migration parameter
backend-transfer = [ list of QOM paths of devices, for which we want to enable backend-transfer ]
and the user doesn't have to change device properties at runtime to set up the following migration.
So I assume a similar practice should be applied here: don't use device
properties to control migration.
So, should it be a parameter like
migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
?
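To make the question concrete, a per-device check against such a list could look roughly like the standalone sketch below. Everything here is hypothetical: neither the migrate-inflight-region parameter nor demo_path_enabled() exists in QEMU, and the list is modelled as a plain array of strings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if 'dev_path' appears in the list of QOM paths for which
 * the (hypothetical) feature was requested.
 */
static bool demo_path_enabled(const char *const *paths, size_t n_paths,
                              const char *dev_path)
{
    for (size_t i = 0; i < n_paths; i++) {
        if (strcmp(paths[i], dev_path) == 0) {
            return true;
        }
    }
    return false;
}
```

A device's .needed callback would then consult the list instead of a device property.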
> Based-on: <20250924133309.334631-1-vsementsov@yandex-team.ru>
> Based-on: <20251015212051.1156334-1-vsementsov@yandex-team.ru>
> Based-on: <20251015145808.1112843-1-vsementsov@yandex-team.ru>
> Based-on: <20251015132136.1083972-15-vsementsov@yandex-team.ru>
> Based-on: <20251016114104.1384675-1-vsementsov@yandex-team.ru>
>
> ---
>
> Hi!
>
> During inter-host migration, waiting for disk requests to be drained
> in the vhost-user backend can incur significant downtime.
>
> This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
> Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
> then, after migration, they will be executed on another host.
>
> At first, I tried to implement migration for all vhost-user devices that support inflight at once,
> but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and
> in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).
>
> Therefore, for now I decided to leave this idea for later and
> implement the migration of the inflight region first for vhost-user-blk.
>
> Alexandr Moshkov (3):
> vmstate: introduce VMSTATE_VBUFFER_UINT64
> vhost: add vmstate for inflight region with inner buffer
> vhost-user-blk: support inter-host inflight migration
>
> hw/block/vhost-user-blk.c | 29 +++++++++++++++++++++
> hw/virtio/vhost.c | 42 ++++++++++++++++++++++++++++++
> include/hw/virtio/vhost-user-blk.h | 1 +
> include/hw/virtio/vhost.h | 6 +++++
> include/migration/vmstate.h | 10 +++++++
> 5 files changed, 88 insertions(+)
>
--
Best regards,
Vladimir
* Re: [PATCH v3 0/3] vhost-user-blk: support inflight migration
2025-11-18 20:24 ` [PATCH v3 0/3] vhost-user-blk: support " Vladimir Sementsov-Ogievskiy
@ 2025-11-18 22:05 ` Peter Xu
0 siblings, 0 replies; 6+ messages in thread
From: Peter Xu @ 2025-11-18 22:05 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: Alexandr Moshkov, qemu-devel, Raphael Norwitz, Michael S. Tsirkin,
Stefano Garzarella, Kevin Wolf, Hanna Reitz, Fabiano Rosas,
Eric Blake, Markus Armbruster, Daniel P. Berrangé
On Tue, Nov 18, 2025 at 11:24:12PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Add Daniel
>
> On 10.11.25 13:39, Alexandr Moshkov wrote:
> > v3:
> > - use pre_load_errp instead of pre_load in vhost.c
> > - change vhost-user-blk property to
> > "skip-get-vring-base-inflight-migration"
> > - refactor vhost-user-blk.c, by moving vhost_user_blk_inflight_needed() higher
> >
> > v2:
> > - rewrite migration using VMSD instead of qemufile API
> > - add vhost-user-blk parameter instead of migration capability
> >
> > I don't know whether VMSD is used cleanly in the migration implementation, so
> > feel free to comment.
> >
> > Based on Vladimir's work:
> > [PATCH v2 00/25] vhost-user-blk: live-backend local migration
> > which was based on:
> > - [PATCH v4 0/7] chardev: postpone connect
> > (which in turn is based on [PATCH 0/2] remove deprecated 'reconnect' options)
> > - [PATCH v3 00/23] vhost refactoring and fixes
> > - [PATCH v8 14/19] migration: introduce .pre_incoming() vmsd handler
> >
>
> Hi!
>
> On my series about backend-transfer migration, the final consensus (or at least,
> I hope that it's a consensus:) is that using device properties to control migration
> channel content is wrong, and that we should use migration parameters instead.
>
> (discussion here: https://lore.kernel.org/qemu-devel/29aa1d66-9fa7-4e44-b0e3-2ca26e77accf@yandex-team.ru/ )
>
> So the API for backend-transfer features is a migration parameter
>
> backend-transfer = [ list of QOM paths of devices, for which we want to enable backend-transfer ]
>
> and the user doesn't have to change device properties at runtime to set up the following migration.
>
> So I assume a similar practice should be applied here: don't use device
> properties to control migration.
>
> So, should it be a parameter like
>
> migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
>
> ?
I have a concern that if we start doing this more, the migration qapi/ will
be completely messed up.
Imagine a world where there'll be tons of lists like:
migrate-dev1-some-feature-1 = [list of devices (almost only dev1 typed)]
migrate-dev2-some-feature-2 = [list of devices (almost only dev2 typed)]
migrate-dev3-some-feature-3 = [list of devices (almost only dev3 typed)]
...
That doesn't look reasonable at all. If some feature is likely only
supported in one device, that should not appear in migration.json but only
in the specific device.
I'm not yet convinced that we can't enable some form of machine-type
properties (with QDEV or not) on backends; I think we should stick with
something like that. I can take a closer look this week, but even if that
doesn't work out, I still think migration shouldn't care about the specific
behavior of a specific device.
If we really want to have some way to probe device features, maybe we
should also think about a generic interface (rather than "one new list
every time"). We also have some recent discussions on a proper interface
to query TAP backend features like USO*. Maybe they share some of the
goals here.
Thanks,
--
Peter Xu