* [PATCH 01/33] vhost: introduce vhost_ops->vhost_set_vring_enable_supported method
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 18:56 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 02/33] vhost: drop backend_features field Vladimir Sementsov-Ogievskiy
` (32 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Remove vhost-user specific hack from generic code.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost-user.c | 8 ++++++++
hw/virtio/vhost.c | 15 ++++++---------
include/hw/virtio/vhost-backend.h | 2 ++
3 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 1e1d6b0d6e..1b2879a90c 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1230,6 +1230,12 @@ static int vhost_user_set_vring_base(struct vhost_dev *dev,
return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring, false);
}
+static bool vhost_user_set_vring_enable_supported(struct vhost_dev *dev)
+{
+ return virtio_has_feature(dev->backend_features,
+ VHOST_USER_F_PROTOCOL_FEATURES);
+}
+
static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
{
int i;
@@ -3032,6 +3038,8 @@ const VhostOps user_ops = {
.vhost_reset_device = vhost_user_reset_device,
.vhost_get_vq_index = vhost_user_get_vq_index,
.vhost_set_vring_enable = vhost_user_set_vring_enable,
+ .vhost_set_vring_enable_supported =
+ vhost_user_set_vring_enable_supported,
.vhost_requires_shm_log = vhost_user_requires_shm_log,
.vhost_migration_done = vhost_user_migration_done,
.vhost_net_set_mtu = vhost_user_net_set_mtu,
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 6557c58d12..c33dad4acd 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1986,15 +1986,12 @@ static int vhost_dev_set_vring_enable(struct vhost_dev *hdev, int enable)
return 0;
}
- /*
- * For vhost-user devices, if VHOST_USER_F_PROTOCOL_FEATURES has not
- * been negotiated, the rings start directly in the enabled state, and
- * .vhost_set_vring_enable callback will fail since
- * VHOST_USER_SET_VRING_ENABLE is not supported.
- */
- if (hdev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER &&
- !virtio_has_feature(hdev->backend_features,
- VHOST_USER_F_PROTOCOL_FEATURES)) {
+ if (hdev->vhost_ops->vhost_set_vring_enable_supported &&
+ !hdev->vhost_ops->vhost_set_vring_enable_supported(hdev)) {
+ /*
+ * This means that rings are always enabled, and the disable/enable
+ * API is not supported.
+ */
return 0;
}
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index d6df209a2f..f65fa26298 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -105,6 +105,7 @@ typedef int (*vhost_reset_device_op)(struct vhost_dev *dev);
typedef int (*vhost_get_vq_index_op)(struct vhost_dev *dev, int idx);
typedef int (*vhost_set_vring_enable_op)(struct vhost_dev *dev,
int enable);
+typedef bool (*vhost_set_vring_enable_supported_op)(struct vhost_dev *dev);
typedef bool (*vhost_requires_shm_log_op)(struct vhost_dev *dev);
typedef int (*vhost_migration_done_op)(struct vhost_dev *dev,
char *mac_addr);
@@ -193,6 +194,7 @@ typedef struct VhostOps {
vhost_reset_device_op vhost_reset_device;
vhost_get_vq_index_op vhost_get_vq_index;
vhost_set_vring_enable_op vhost_set_vring_enable;
+ vhost_set_vring_enable_supported_op vhost_set_vring_enable_supported;
vhost_requires_shm_log_op vhost_requires_shm_log;
vhost_migration_done_op vhost_migration_done;
vhost_vsock_set_guest_cid_op vhost_vsock_set_guest_cid;
--
2.48.1
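The new hook turns the generic code's backend-type special case into an optional capability query: a backend that may start with rings always enabled supplies the callback, and generic code consults it when present. A minimal standalone sketch of the pattern (simplified stand-in types and names, not the real QEMU structures):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the feature bit from the vhost-user headers. */
#define VHOST_USER_F_PROTOCOL_FEATURES 30

struct vhost_dev;

typedef int (*set_vring_enable_op)(struct vhost_dev *dev, int enable);
typedef bool (*set_vring_enable_supported_op)(struct vhost_dev *dev);

struct vhost_ops {
    set_vring_enable_op set_vring_enable;
    set_vring_enable_supported_op set_vring_enable_supported; /* optional */
};

struct vhost_dev {
    uint64_t backend_features;
    const struct vhost_ops *ops;
};

/* vhost-user backend: enable/disable only works once
 * VHOST_USER_F_PROTOCOL_FEATURES has been negotiated. */
static bool user_set_vring_enable_supported(struct vhost_dev *dev)
{
    return dev->backend_features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
}

static int user_set_vring_enable(struct vhost_dev *dev, int enable)
{
    (void)dev;
    (void)enable;
    return 1; /* pretend VHOST_USER_SET_VRING_ENABLE was sent */
}

static const struct vhost_ops user_ops = {
    .set_vring_enable = user_set_vring_enable,
    .set_vring_enable_supported = user_set_vring_enable_supported,
};

static int vhost_dev_set_vring_enable(struct vhost_dev *dev, int enable)
{
    /* Generic code: consult the optional capability hook instead of
     * checking for a specific backend type. */
    if (dev->ops->set_vring_enable_supported &&
        !dev->ops->set_vring_enable_supported(dev)) {
        return 0; /* rings are always enabled; nothing to do */
    }
    return dev->ops->set_vring_enable(dev, enable);
}
```

A backend without the concept (e.g. a kernel backend in this model) simply leaves the callback NULL and the enable path runs unconditionally.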
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 01/33] vhost: introduce vhost_ops->vhost_set_vring_enable_supported method
2025-08-13 16:48 ` [PATCH 01/33] vhost: introduce vhost_ops->vhost_set_vring_enable_supported method Vladimir Sementsov-Ogievskiy
@ 2025-10-09 18:56 ` Raphael Norwitz
2025-10-09 19:25 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 18:56 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On Wed, Aug 13, 2025 at 12:53 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Remove vhost-user specific hack from generic code.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost-user.c | 8 ++++++++
> hw/virtio/vhost.c | 15 ++++++---------
> include/hw/virtio/vhost-backend.h | 2 ++
> 3 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 1e1d6b0d6e..1b2879a90c 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -1230,6 +1230,12 @@ static int vhost_user_set_vring_base(struct vhost_dev *dev,
> return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring, false);
> }
>
> +static bool vhost_user_set_vring_enable_supported(struct vhost_dev *dev)
> +{
> + return virtio_has_feature(dev->backend_features,
> + VHOST_USER_F_PROTOCOL_FEATURES);
> +}
> +
> static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
> {
> int i;
> @@ -3032,6 +3038,8 @@ const VhostOps user_ops = {
> .vhost_reset_device = vhost_user_reset_device,
> .vhost_get_vq_index = vhost_user_get_vq_index,
> .vhost_set_vring_enable = vhost_user_set_vring_enable,
> + .vhost_set_vring_enable_supported =
> + vhost_user_set_vring_enable_supported,
Why not make this a callback like vhost_user_gpu_{set|shared}_socket()
in vhost_backend.h instead?
> .vhost_requires_shm_log = vhost_user_requires_shm_log,
> .vhost_migration_done = vhost_user_migration_done,
> .vhost_net_set_mtu = vhost_user_net_set_mtu,
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 6557c58d12..c33dad4acd 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1986,15 +1986,12 @@ static int vhost_dev_set_vring_enable(struct vhost_dev *hdev, int enable)
> return 0;
> }
>
> - /*
> - * For vhost-user devices, if VHOST_USER_F_PROTOCOL_FEATURES has not
> - * been negotiated, the rings start directly in the enabled state, and
> - * .vhost_set_vring_enable callback will fail since
> - * VHOST_USER_SET_VRING_ENABLE is not supported.
> - */
> - if (hdev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER &&
> - !virtio_has_feature(hdev->backend_features,
> - VHOST_USER_F_PROTOCOL_FEATURES)) {
> + if (hdev->vhost_ops->vhost_set_vring_enable_supported &&
> + !hdev->vhost_ops->vhost_set_vring_enable_supported(hdev)) {
> + /*
> + * This means that rings are always enabled, and the disable/enable
> + * API is not supported.
> + */
> return 0;
> }
>
> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> index d6df209a2f..f65fa26298 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -105,6 +105,7 @@ typedef int (*vhost_reset_device_op)(struct vhost_dev *dev);
> typedef int (*vhost_get_vq_index_op)(struct vhost_dev *dev, int idx);
> typedef int (*vhost_set_vring_enable_op)(struct vhost_dev *dev,
> int enable);
> +typedef bool (*vhost_set_vring_enable_supported_op)(struct vhost_dev *dev);
> typedef bool (*vhost_requires_shm_log_op)(struct vhost_dev *dev);
> typedef int (*vhost_migration_done_op)(struct vhost_dev *dev,
> char *mac_addr);
> @@ -193,6 +194,7 @@ typedef struct VhostOps {
> vhost_reset_device_op vhost_reset_device;
> vhost_get_vq_index_op vhost_get_vq_index;
> vhost_set_vring_enable_op vhost_set_vring_enable;
> + vhost_set_vring_enable_supported_op vhost_set_vring_enable_supported;
> vhost_requires_shm_log_op vhost_requires_shm_log;
> vhost_migration_done_op vhost_migration_done;
> vhost_vsock_set_guest_cid_op vhost_vsock_set_guest_cid;
> --
> 2.48.1
>
>
* Re: [PATCH 01/33] vhost: introduce vhost_ops->vhost_set_vring_enable_supported method
2025-10-09 18:56 ` Raphael Norwitz
@ 2025-10-09 19:25 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 19:25 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 21:56, Raphael Norwitz wrote:
> On Wed, Aug 13, 2025 at 12:53 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> Remove vhost-user specific hack from generic code.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> hw/virtio/vhost-user.c | 8 ++++++++
>> hw/virtio/vhost.c | 15 ++++++---------
>> include/hw/virtio/vhost-backend.h | 2 ++
>> 3 files changed, 16 insertions(+), 9 deletions(-)
>>
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index 1e1d6b0d6e..1b2879a90c 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -1230,6 +1230,12 @@ static int vhost_user_set_vring_base(struct vhost_dev *dev,
>> return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring, false);
>> }
>>
>> +static bool vhost_user_set_vring_enable_supported(struct vhost_dev *dev)
>> +{
>> + return virtio_has_feature(dev->backend_features,
>> + VHOST_USER_F_PROTOCOL_FEATURES);
>> +}
>> +
>> static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
>> {
>> int i;
>> @@ -3032,6 +3038,8 @@ const VhostOps user_ops = {
>> .vhost_reset_device = vhost_user_reset_device,
>> .vhost_get_vq_index = vhost_user_get_vq_index,
>> .vhost_set_vring_enable = vhost_user_set_vring_enable,
>> + .vhost_set_vring_enable_supported =
>> + vhost_user_set_vring_enable_supported,
>
> Why not make this a callback like vhost_user_gpu_{set|shared}_socket()
> in vhost_backend.h instead?
You mean make it just a separate function vhost_user_set_vring_enable_supported()?
But this way we'll have to keep "hdev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER"
below.
Or what do you mean?
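For illustration, a rough sketch of that alternative (hypothetical, simplified types; not actual QEMU code) shows why the backend_type check would have to stay in generic code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum vhost_backend_type {
    VHOST_BACKEND_TYPE_KERNEL,
    VHOST_BACKEND_TYPE_USER,
};

/* Stand-in for the feature bit from the vhost-user headers. */
#define VHOST_USER_F_PROTOCOL_FEATURES 30

struct vhost_dev {
    enum vhost_backend_type backend_type;
    uint64_t backend_features;
};

/* The suggested free-standing helper, exported from vhost-user code. */
static bool vhost_user_set_vring_enable_supported(struct vhost_dev *dev)
{
    return dev->backend_features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
}

/* Generic code would still need to know which backend it is talking to
 * before it may call the vhost-user helper -- the very special case the
 * patch removes by making this a VhostOps callback. */
static bool vring_enable_supported(struct vhost_dev *dev)
{
    if (dev->backend_type == VHOST_BACKEND_TYPE_USER) {
        return vhost_user_set_vring_enable_supported(dev);
    }
    return true;
}
```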
>
>
>> .vhost_requires_shm_log = vhost_user_requires_shm_log,
>> .vhost_migration_done = vhost_user_migration_done,
>> .vhost_net_set_mtu = vhost_user_net_set_mtu,
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index 6557c58d12..c33dad4acd 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -1986,15 +1986,12 @@ static int vhost_dev_set_vring_enable(struct vhost_dev *hdev, int enable)
>> return 0;
>> }
>>
>> - /*
>> - * For vhost-user devices, if VHOST_USER_F_PROTOCOL_FEATURES has not
>> - * been negotiated, the rings start directly in the enabled state, and
>> - * .vhost_set_vring_enable callback will fail since
>> - * VHOST_USER_SET_VRING_ENABLE is not supported.
>> - */
>> - if (hdev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER &&
>> - !virtio_has_feature(hdev->backend_features,
>> - VHOST_USER_F_PROTOCOL_FEATURES)) {
>> + if (hdev->vhost_ops->vhost_set_vring_enable_supported &&
>> + !hdev->vhost_ops->vhost_set_vring_enable_supported(hdev)) {
>> + /*
>> + * This means that rings are always enabled, and the disable/enable
>> + * API is not supported.
>> + */
>> return 0;
>> }
>>
>> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
>> index d6df209a2f..f65fa26298 100644
>> --- a/include/hw/virtio/vhost-backend.h
>> +++ b/include/hw/virtio/vhost-backend.h
>> @@ -105,6 +105,7 @@ typedef int (*vhost_reset_device_op)(struct vhost_dev *dev);
>> typedef int (*vhost_get_vq_index_op)(struct vhost_dev *dev, int idx);
>> typedef int (*vhost_set_vring_enable_op)(struct vhost_dev *dev,
>> int enable);
>> +typedef bool (*vhost_set_vring_enable_supported_op)(struct vhost_dev *dev);
>> typedef bool (*vhost_requires_shm_log_op)(struct vhost_dev *dev);
>> typedef int (*vhost_migration_done_op)(struct vhost_dev *dev,
>> char *mac_addr);
>> @@ -193,6 +194,7 @@ typedef struct VhostOps {
>> vhost_reset_device_op vhost_reset_device;
>> vhost_get_vq_index_op vhost_get_vq_index;
>> vhost_set_vring_enable_op vhost_set_vring_enable;
>> + vhost_set_vring_enable_supported_op vhost_set_vring_enable_supported;
>> vhost_requires_shm_log_op vhost_requires_shm_log;
>> vhost_migration_done_op vhost_migration_done;
>> vhost_vsock_set_guest_cid_op vhost_vsock_set_guest_cid;
>> --
>> 2.48.1
>>
>>
--
Best regards,
Vladimir
* [PATCH 02/33] vhost: drop backend_features field
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
2025-08-13 16:48 ` [PATCH 01/33] vhost: introduce vhost_ops->vhost_set_vring_enable_supported method Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-09-12 14:39 ` Markus Armbruster
2025-10-09 18:57 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 03/33] vhost-user: introduce vhost_user_has_prot() helper Vladimir Sementsov-Ogievskiy
` (31 subsequent siblings)
33 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov, Jason Wang, Fam Zheng
This field is mostly unused and sometimes confusing (we even have
a TODO-like comment to drop it). Let's finally do it.
The field is used to hold VHOST_USER_F_PROTOCOL_FEATURES for vhost-user
and/or VHOST_NET_F_VIRTIO_NET_HDR for vhost-net (which may be
vhost-user-net). But we can simply recalculate these two flags in place
from hdev->features, and from the net client for
VHOST_NET_F_VIRTIO_NET_HDR.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/block/vhost-user-blk.c | 1 -
hw/net/vhost_net.c | 16 ++++++++--------
hw/scsi/vhost-scsi.c | 1 -
hw/scsi/vhost-user-scsi.c | 1 -
hw/virtio/vdpa-dev.c | 1 -
hw/virtio/vhost-user.c | 19 +++++++++----------
hw/virtio/virtio-hmp-cmds.c | 2 --
hw/virtio/virtio-qmp.c | 2 --
include/hw/virtio/vhost.h | 7 -------
qapi/virtio.json | 3 ---
10 files changed, 17 insertions(+), 36 deletions(-)
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index c0cc5f6942..de7a810c93 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -348,7 +348,6 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
s->dev.nvqs = s->num_queues;
s->dev.vqs = s->vhost_vqs;
s->dev.vq_index = 0;
- s->dev.backend_features = 0;
vhost_dev_set_config_notifier(&s->dev, &blk_ops);
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 540492b37d..fcee279f0b 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -53,7 +53,10 @@ int vhost_net_set_config(struct vhost_net *net, const uint8_t *data,
void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
{
- net->dev.acked_features = net->dev.backend_features;
+ net->dev.acked_features =
+ (qemu_has_vnet_hdr(net->nc) ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR))
+ | (net->dev.features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES));
+
vhost_ack_features(&net->dev, net->feature_bits, features);
}
@@ -256,12 +259,9 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
if (r < 0) {
goto fail;
}
- net->dev.backend_features = qemu_has_vnet_hdr(options->net_backend)
- ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR);
net->backend = r;
net->dev.protocol_features = 0;
} else {
- net->dev.backend_features = 0;
net->dev.protocol_features = 0;
net->backend = -1;
@@ -281,10 +281,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
net->dev.features &= ~(1ULL << VIRTIO_NET_F_MRG_RXBUF);
}
- if (~net->dev.features & net->dev.backend_features) {
- fprintf(stderr, "vhost lacks feature mask 0x%" PRIx64
- " for backend\n",
- (uint64_t)(~net->dev.features & net->dev.backend_features));
+ if (!qemu_has_vnet_hdr(options->net_backend) &&
+ (~net->dev.features & (1ULL << VHOST_NET_F_VIRTIO_NET_HDR))) {
+ fprintf(stderr, "vhost lacks feature mask 0x%llx for backend\n",
+ ~net->dev.features & (1ULL << VHOST_NET_F_VIRTIO_NET_HDR));
goto fail;
}
}
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index cdf405b0f8..d694a25fe2 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -276,7 +276,6 @@ static void vhost_scsi_realize(DeviceState *dev, Error **errp)
vqs = g_new0(struct vhost_virtqueue, vsc->dev.nvqs);
vsc->dev.vqs = vqs;
vsc->dev.vq_index = 0;
- vsc->dev.backend_features = 0;
ret = vhost_dev_init(&vsc->dev, (void *)(uintptr_t)vhostfd,
VHOST_BACKEND_TYPE_KERNEL, 0, errp);
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index 25f2d894e7..0c80a271d8 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -159,7 +159,6 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
vsc->dev.nvqs = VIRTIO_SCSI_VQ_NUM_FIXED + vs->conf.num_queues;
vsc->dev.vqs = s->vhost_vqs;
vsc->dev.vq_index = 0;
- vsc->dev.backend_features = 0;
ret = vhost_dev_init(&vsc->dev, &s->vhost_user, VHOST_BACKEND_TYPE_USER, 0,
errp);
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index d1da40afc8..3c0eed3e8e 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -104,7 +104,6 @@ static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
v->dev.vqs = vqs;
v->dev.vq_index = 0;
v->dev.vq_index_end = v->dev.nvqs;
- v->dev.backend_features = 0;
v->started = false;
ret = vhost_vdpa_get_iova_range(v->vhostfd, &iova_range);
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 1b2879a90c..cf6f53801d 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1232,7 +1232,7 @@ static int vhost_user_set_vring_base(struct vhost_dev *dev,
static bool vhost_user_set_vring_enable_supported(struct vhost_dev *dev)
{
- return virtio_has_feature(dev->backend_features,
+ return virtio_has_feature(dev->features,
VHOST_USER_F_PROTOCOL_FEATURES);
}
@@ -1449,14 +1449,15 @@ static int vhost_user_set_features(struct vhost_dev *dev,
int ret;
/*
- * We need to include any extra backend only feature bits that
- * might be needed by our device. Currently this includes the
- * VHOST_USER_F_PROTOCOL_FEATURES bit for enabling protocol
- * features.
+ * Don't lose VHOST_USER_F_PROTOCOL_FEATURES, which is vhost-user
+ * specific.
*/
- ret = vhost_user_set_u64(dev, VHOST_USER_SET_FEATURES,
- features | dev->backend_features,
- log_enabled);
+ if (virtio_has_feature(dev->features, VHOST_USER_F_PROTOCOL_FEATURES)) {
+ features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
+ }
+
+ ret = vhost_user_set_u64(dev, VHOST_USER_SET_FEATURES, features,
+ log_enabled);
if (virtio_has_feature(dev->protocol_features,
VHOST_USER_PROTOCOL_F_STATUS)) {
@@ -2187,8 +2188,6 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
(dev->config_ops && dev->config_ops->vhost_dev_config_notifier);
uint64_t protocol_features;
- dev->backend_features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
-
err = vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES,
&protocol_features);
if (err < 0) {
diff --git a/hw/virtio/virtio-hmp-cmds.c b/hw/virtio/virtio-hmp-cmds.c
index 7d8677bcf0..024904915d 100644
--- a/hw/virtio/virtio-hmp-cmds.c
+++ b/hw/virtio/virtio-hmp-cmds.c
@@ -175,8 +175,6 @@ void hmp_virtio_status(Monitor *mon, const QDict *qdict)
hmp_virtio_dump_features(mon, s->vhost_dev->features);
monitor_printf(mon, " Acked features:\n");
hmp_virtio_dump_features(mon, s->vhost_dev->acked_features);
- monitor_printf(mon, " Backend features:\n");
- hmp_virtio_dump_features(mon, s->vhost_dev->backend_features);
monitor_printf(mon, " Protocol features:\n");
hmp_virtio_dump_protocols(mon, s->vhost_dev->protocol_features);
}
diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
index 3b6377cf0d..e514a4797e 100644
--- a/hw/virtio/virtio-qmp.c
+++ b/hw/virtio/virtio-qmp.c
@@ -788,8 +788,6 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
qmp_decode_features(vdev->device_id, hdev->features);
status->vhost_dev->acked_features =
qmp_decode_features(vdev->device_id, hdev->acked_features);
- status->vhost_dev->backend_features =
- qmp_decode_features(vdev->device_id, hdev->backend_features);
status->vhost_dev->protocol_features =
qmp_decode_protocols(hdev->protocol_features);
status->vhost_dev->max_queues = hdev->max_queues;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 66be6afc88..9f9dd2d46d 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -100,16 +100,9 @@ struct vhost_dev {
*
* @features: available features provided by the backend
* @acked_features: final negotiated features with front-end driver
- *
- * @backend_features: this is used in a couple of places to either
- * store VHOST_USER_F_PROTOCOL_FEATURES to apply to
- * VHOST_USER_SET_FEATURES or VHOST_NET_F_VIRTIO_NET_HDR. Its
- * future use should be discouraged and the variable retired as
- * its easy to confuse with the VirtIO backend_features.
*/
uint64_t features;
uint64_t acked_features;
- uint64_t backend_features;
/**
* @protocol_features: is the vhost-user only feature set by
diff --git a/qapi/virtio.json b/qapi/virtio.json
index 9d652fe4a8..0aae77340d 100644
--- a/qapi/virtio.json
+++ b/qapi/virtio.json
@@ -85,8 +85,6 @@
#
# @acked-features: vhost_dev acked_features
#
-# @backend-features: vhost_dev backend_features
-#
# @protocol-features: vhost_dev protocol_features
#
# @max-queues: vhost_dev max_queues
@@ -106,7 +104,6 @@
'vq-index': 'int',
'features': 'VirtioDeviceFeatures',
'acked-features': 'VirtioDeviceFeatures',
- 'backend-features': 'VirtioDeviceFeatures',
'protocol-features': 'VhostDeviceProtocols',
'max-queues': 'uint64',
'backend-cap': 'uint64',
--
2.48.1
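The recalculation described in the commit message can be modeled in isolation. A simplified sketch (not the actual QEMU code, and assuming the usual bit positions from the vhost headers): the seed for acked_features is VHOST_NET_F_VIRTIO_NET_HDR when the net client cannot handle the vnet header itself, plus VHOST_USER_F_PROTOCOL_FEATURES when the backend offered it in dev->features.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as in the vhost/vhost-user headers (assumed here). */
#define VHOST_NET_F_VIRTIO_NET_HDR     27
#define VHOST_USER_F_PROTOCOL_FEATURES 30

/* Recompute, on demand, the two bits that backend_features used to
 * cache: one derived from the net client, one from dev->features. */
static uint64_t acked_features_seed(uint64_t dev_features, bool has_vnet_hdr)
{
    uint64_t seed = 0;

    if (!has_vnet_hdr) {
        /* Backend must strip/insert the vnet header for us. */
        seed |= 1ULL << VHOST_NET_F_VIRTIO_NET_HDR;
    }
    /* Keep the vhost-user protocol-features bit if it was offered. */
    seed |= dev_features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
    return seed;
}
```

Since both inputs are already available wherever the seed is needed, nothing is lost by dropping the cached field.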
* Re: [PATCH 02/33] vhost: drop backend_features field
2025-08-13 16:48 ` [PATCH 02/33] vhost: drop backend_features field Vladimir Sementsov-Ogievskiy
@ 2025-09-12 14:39 ` Markus Armbruster
2025-10-09 18:57 ` Raphael Norwitz
1 sibling, 0 replies; 108+ messages in thread
From: Markus Armbruster @ 2025-09-12 14:39 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, qemu-devel, qemu-block,
steven.sistare, den-plotnikov, Jason Wang, Fam Zheng, devel
Cc: libvirt
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
> This field is mostly unused and sometimes confusing (we even have
> a TODO-like comment to drop it). Let's finally do it.
>
> The field is used to hold VHOST_USER_F_PROTOCOL_FEATURES for vhost-user
> and/or VHOST_NET_F_VIRTIO_NET_HDR for vhost-net (which may be
> vhost-user-net). But we can simply recalculate these two flags in place
> from hdev->features, and from the net client for
> VHOST_NET_F_VIRTIO_NET_HDR.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[...]
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 66be6afc88..9f9dd2d46d 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -100,16 +100,9 @@ struct vhost_dev {
> *
> * @features: available features provided by the backend
> * @acked_features: final negotiated features with front-end driver
> - *
> - * @backend_features: this is used in a couple of places to either
> - * store VHOST_USER_F_PROTOCOL_FEATURES to apply to
> - * VHOST_USER_SET_FEATURES or VHOST_NET_F_VIRTIO_NET_HDR. Its
> - * future use should be discouraged and the variable retired as
> - * its easy to confuse with the VirtIO backend_features.
I guess this is the TODO-like comment mentioned in the commit message.
> */
> uint64_t features;
> uint64_t acked_features;
> - uint64_t backend_features;
>
> /**
> * @protocol_features: is the vhost-user only feature set by
> diff --git a/qapi/virtio.json b/qapi/virtio.json
> index 9d652fe4a8..0aae77340d 100644
> --- a/qapi/virtio.json
> +++ b/qapi/virtio.json
> @@ -85,8 +85,6 @@
> #
> # @acked-features: vhost_dev acked_features
> #
> -# @backend-features: vhost_dev backend_features
> -#
> # @protocol-features: vhost_dev protocol_features
> #
> # @max-queues: vhost_dev max_queues
> @@ -106,7 +104,6 @@
> 'vq-index': 'int',
> 'features': 'VirtioDeviceFeatures',
> 'acked-features': 'VirtioDeviceFeatures',
> - 'backend-features': 'VirtioDeviceFeatures',
> 'protocol-features': 'VhostDeviceProtocols',
> 'max-queues': 'uint64',
> 'backend-cap': 'uint64',
Incompatible change. We can do this because it's only visible in the
return value of x-query-virtio-status, which is unstable. Recommend to
note this in the commit message.
Acked-by: Markus Armbruster <armbru@redhat.com>
* Re: [PATCH 02/33] vhost: drop backend_features field
2025-08-13 16:48 ` [PATCH 02/33] vhost: drop backend_features field Vladimir Sementsov-Ogievskiy
2025-09-12 14:39 ` Markus Armbruster
@ 2025-10-09 18:57 ` Raphael Norwitz
1 sibling, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 18:57 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov, Jason Wang, Fam Zheng
Acked-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 1:01 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> This field is mostly unused and sometimes confusing (we even have
> a TODO-like comment to drop it). Let's finally do it.
>
> The field is used to hold VHOST_USER_F_PROTOCOL_FEATURES for vhost-user
> and/or VHOST_NET_F_VIRTIO_NET_HDR for vhost-net (which may be
> vhost-user-net). But we can simply recalculate these two flags in place
> from hdev->features, and from the net client for
> VHOST_NET_F_VIRTIO_NET_HDR.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/block/vhost-user-blk.c | 1 -
> hw/net/vhost_net.c | 16 ++++++++--------
> hw/scsi/vhost-scsi.c | 1 -
> hw/scsi/vhost-user-scsi.c | 1 -
> hw/virtio/vdpa-dev.c | 1 -
> hw/virtio/vhost-user.c | 19 +++++++++----------
> hw/virtio/virtio-hmp-cmds.c | 2 --
> hw/virtio/virtio-qmp.c | 2 --
> include/hw/virtio/vhost.h | 7 -------
> qapi/virtio.json | 3 ---
> 10 files changed, 17 insertions(+), 36 deletions(-)
>
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index c0cc5f6942..de7a810c93 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -348,7 +348,6 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
> s->dev.nvqs = s->num_queues;
> s->dev.vqs = s->vhost_vqs;
> s->dev.vq_index = 0;
> - s->dev.backend_features = 0;
>
> vhost_dev_set_config_notifier(&s->dev, &blk_ops);
>
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index 540492b37d..fcee279f0b 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -53,7 +53,10 @@ int vhost_net_set_config(struct vhost_net *net, const uint8_t *data,
>
> void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
> {
> - net->dev.acked_features = net->dev.backend_features;
> + net->dev.acked_features =
> + (qemu_has_vnet_hdr(net->nc) ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR))
> + | (net->dev.features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES));
> +
> vhost_ack_features(&net->dev, net->feature_bits, features);
> }
>
> @@ -256,12 +259,9 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
> if (r < 0) {
> goto fail;
> }
> - net->dev.backend_features = qemu_has_vnet_hdr(options->net_backend)
> - ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR);
> net->backend = r;
> net->dev.protocol_features = 0;
> } else {
> - net->dev.backend_features = 0;
> net->dev.protocol_features = 0;
> net->backend = -1;
>
> @@ -281,10 +281,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
> sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
> net->dev.features &= ~(1ULL << VIRTIO_NET_F_MRG_RXBUF);
> }
> - if (~net->dev.features & net->dev.backend_features) {
> - fprintf(stderr, "vhost lacks feature mask 0x%" PRIx64
> - " for backend\n",
> - (uint64_t)(~net->dev.features & net->dev.backend_features));
> + if (!qemu_has_vnet_hdr(options->net_backend) &&
> + (~net->dev.features & (1ULL << VHOST_NET_F_VIRTIO_NET_HDR))) {
> + fprintf(stderr, "vhost lacks feature mask 0x%llx for backend\n",
> + ~net->dev.features & (1ULL << VHOST_NET_F_VIRTIO_NET_HDR));
> goto fail;
> }
> }
> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> index cdf405b0f8..d694a25fe2 100644
> --- a/hw/scsi/vhost-scsi.c
> +++ b/hw/scsi/vhost-scsi.c
> @@ -276,7 +276,6 @@ static void vhost_scsi_realize(DeviceState *dev, Error **errp)
> vqs = g_new0(struct vhost_virtqueue, vsc->dev.nvqs);
> vsc->dev.vqs = vqs;
> vsc->dev.vq_index = 0;
> - vsc->dev.backend_features = 0;
>
> ret = vhost_dev_init(&vsc->dev, (void *)(uintptr_t)vhostfd,
> VHOST_BACKEND_TYPE_KERNEL, 0, errp);
> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
> index 25f2d894e7..0c80a271d8 100644
> --- a/hw/scsi/vhost-user-scsi.c
> +++ b/hw/scsi/vhost-user-scsi.c
> @@ -159,7 +159,6 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
> vsc->dev.nvqs = VIRTIO_SCSI_VQ_NUM_FIXED + vs->conf.num_queues;
> vsc->dev.vqs = s->vhost_vqs;
> vsc->dev.vq_index = 0;
> - vsc->dev.backend_features = 0;
>
> ret = vhost_dev_init(&vsc->dev, &s->vhost_user, VHOST_BACKEND_TYPE_USER, 0,
> errp);
> diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> index d1da40afc8..3c0eed3e8e 100644
> --- a/hw/virtio/vdpa-dev.c
> +++ b/hw/virtio/vdpa-dev.c
> @@ -104,7 +104,6 @@ static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> v->dev.vqs = vqs;
> v->dev.vq_index = 0;
> v->dev.vq_index_end = v->dev.nvqs;
> - v->dev.backend_features = 0;
> v->started = false;
>
> ret = vhost_vdpa_get_iova_range(v->vhostfd, &iova_range);
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 1b2879a90c..cf6f53801d 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -1232,7 +1232,7 @@ static int vhost_user_set_vring_base(struct vhost_dev *dev,
>
> static bool vhost_user_set_vring_enable_supported(struct vhost_dev *dev)
> {
> - return virtio_has_feature(dev->backend_features,
> + return virtio_has_feature(dev->features,
> VHOST_USER_F_PROTOCOL_FEATURES);
> }
>
> @@ -1449,14 +1449,15 @@ static int vhost_user_set_features(struct vhost_dev *dev,
> int ret;
>
> /*
> - * We need to include any extra backend only feature bits that
> - * might be needed by our device. Currently this includes the
> - * VHOST_USER_F_PROTOCOL_FEATURES bit for enabling protocol
> - * features.
> + * Don't lose VHOST_USER_F_PROTOCOL_FEATURES, which is vhost-user
> + * specific.
> */
> - ret = vhost_user_set_u64(dev, VHOST_USER_SET_FEATURES,
> - features | dev->backend_features,
> - log_enabled);
> + if (virtio_has_feature(dev->features, VHOST_USER_F_PROTOCOL_FEATURES)) {
> + features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
> + }
> +
> + ret = vhost_user_set_u64(dev, VHOST_USER_SET_FEATURES, features,
> + log_enabled);
>
> if (virtio_has_feature(dev->protocol_features,
> VHOST_USER_PROTOCOL_F_STATUS)) {
> @@ -2187,8 +2188,6 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
> (dev->config_ops && dev->config_ops->vhost_dev_config_notifier);
> uint64_t protocol_features;
>
> - dev->backend_features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
> -
> err = vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES,
> &protocol_features);
> if (err < 0) {
> diff --git a/hw/virtio/virtio-hmp-cmds.c b/hw/virtio/virtio-hmp-cmds.c
> index 7d8677bcf0..024904915d 100644
> --- a/hw/virtio/virtio-hmp-cmds.c
> +++ b/hw/virtio/virtio-hmp-cmds.c
> @@ -175,8 +175,6 @@ void hmp_virtio_status(Monitor *mon, const QDict *qdict)
> hmp_virtio_dump_features(mon, s->vhost_dev->features);
> monitor_printf(mon, " Acked features:\n");
> hmp_virtio_dump_features(mon, s->vhost_dev->acked_features);
> - monitor_printf(mon, " Backend features:\n");
> - hmp_virtio_dump_features(mon, s->vhost_dev->backend_features);
> monitor_printf(mon, " Protocol features:\n");
> hmp_virtio_dump_protocols(mon, s->vhost_dev->protocol_features);
> }
> diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
> index 3b6377cf0d..e514a4797e 100644
> --- a/hw/virtio/virtio-qmp.c
> +++ b/hw/virtio/virtio-qmp.c
> @@ -788,8 +788,6 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
> qmp_decode_features(vdev->device_id, hdev->features);
> status->vhost_dev->acked_features =
> qmp_decode_features(vdev->device_id, hdev->acked_features);
> - status->vhost_dev->backend_features =
> - qmp_decode_features(vdev->device_id, hdev->backend_features);
> status->vhost_dev->protocol_features =
> qmp_decode_protocols(hdev->protocol_features);
> status->vhost_dev->max_queues = hdev->max_queues;
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 66be6afc88..9f9dd2d46d 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -100,16 +100,9 @@ struct vhost_dev {
> *
> * @features: available features provided by the backend
> * @acked_features: final negotiated features with front-end driver
> - *
> - * @backend_features: this is used in a couple of places to either
> - * store VHOST_USER_F_PROTOCOL_FEATURES to apply to
> - * VHOST_USER_SET_FEATURES or VHOST_NET_F_VIRTIO_NET_HDR. Its
> - * future use should be discouraged and the variable retired as
> - * its easy to confuse with the VirtIO backend_features.
> */
> uint64_t features;
> uint64_t acked_features;
> - uint64_t backend_features;
>
> /**
> * @protocol_features: is the vhost-user only feature set by
> diff --git a/qapi/virtio.json b/qapi/virtio.json
> index 9d652fe4a8..0aae77340d 100644
> --- a/qapi/virtio.json
> +++ b/qapi/virtio.json
> @@ -85,8 +85,6 @@
> #
> # @acked-features: vhost_dev acked_features
> #
> -# @backend-features: vhost_dev backend_features
> -#
> # @protocol-features: vhost_dev protocol_features
> #
> # @max-queues: vhost_dev max_queues
> @@ -106,7 +104,6 @@
> 'vq-index': 'int',
> 'features': 'VirtioDeviceFeatures',
> 'acked-features': 'VirtioDeviceFeatures',
> - 'backend-features': 'VirtioDeviceFeatures',
> 'protocol-features': 'VhostDeviceProtocols',
> 'max-queues': 'uint64',
> 'backend-cap': 'uint64',
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 108+ messages in thread
* [PATCH 03/33] vhost-user: introduce vhost_user_has_prot() helper
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
2025-08-13 16:48 ` [PATCH 01/33] vhost: introduce vhost_ops->vhost_set_vring_enable_supported method Vladimir Sementsov-Ogievskiy
2025-08-13 16:48 ` [PATCH 02/33] vhost: drop backend_features field Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 18:57 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 04/33] vhost: move protocol_features to vhost_user Vladimir Sementsov-Ogievskiy
` (30 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Perform all protocol feature checks in the same way.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost-user.c | 102 ++++++++++++++++++-----------------------
1 file changed, 44 insertions(+), 58 deletions(-)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index cf6f53801d..6fa5b8a8bd 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -272,6 +272,11 @@ struct scrub_regions {
int fd_idx;
};
+static bool vhost_user_has_prot(struct vhost_dev *dev, uint64_t feature)
+{
+ return virtio_has_feature(dev->protocol_features, feature);
+}
+
static int vhost_user_read_header(struct vhost_dev *dev, VhostUserMsg *msg)
{
struct vhost_user *u = dev->opaque;
@@ -435,8 +440,7 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, uint64_t base,
{
int fds[VHOST_USER_MAX_RAM_SLOTS];
size_t fd_num = 0;
- bool shmfd = virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_LOG_SHMFD);
+ bool shmfd = vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_LOG_SHMFD);
int ret;
VhostUserMsg msg = {
.hdr.request = VHOST_USER_SET_LOG_BASE,
@@ -1006,11 +1010,10 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
int fds[VHOST_MEMORY_BASELINE_NREGIONS];
size_t fd_num = 0;
bool do_postcopy = u->postcopy_listen && u->postcopy_fd.handler;
- bool reply_supported = virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_REPLY_ACK);
+ bool reply_supported =
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK);
bool config_mem_slots =
- virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS);
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS);
int ret;
if (do_postcopy) {
@@ -1058,8 +1061,8 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
static int vhost_user_set_vring_endian(struct vhost_dev *dev,
struct vhost_vring_state *ring)
{
- bool cross_endian = virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_CROSS_ENDIAN);
+ bool cross_endian =
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_CROSS_ENDIAN);
VhostUserMsg msg = {
.hdr.request = VHOST_USER_SET_VRING_ENDIAN,
.hdr.flags = VHOST_USER_VERSION,
@@ -1129,8 +1132,8 @@ static int vhost_user_write_sync(struct vhost_dev *dev, VhostUserMsg *msg,
int ret;
if (wait_for_reply) {
- bool reply_supported = virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_REPLY_ACK);
+ bool reply_supported =
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK);
if (reply_supported) {
msg->hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
}
@@ -1459,8 +1462,7 @@ static int vhost_user_set_features(struct vhost_dev *dev,
ret = vhost_user_set_u64(dev, VHOST_USER_SET_FEATURES, features,
log_enabled);
- if (virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_STATUS)) {
+ if (vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_STATUS)) {
if (!ret) {
return vhost_user_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);
}
@@ -1514,8 +1516,7 @@ static int vhost_user_reset_device(struct vhost_dev *dev)
* Historically, reset was not implemented so only reset devices
* that are expecting it.
*/
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_RESET_DEVICE)) {
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_RESET_DEVICE)) {
return -ENOSYS;
}
@@ -1572,8 +1573,7 @@ static int vhost_user_backend_handle_vring_host_notifier(struct vhost_dev *dev,
void *addr;
char *name;
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) ||
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) ||
vdev == NULL || queue_idx >= virtio_get_num_queues(vdev)) {
return -EINVAL;
}
@@ -1885,13 +1885,12 @@ static int vhost_setup_backend_channel(struct vhost_dev *dev)
};
struct vhost_user *u = dev->opaque;
int sv[2], ret = 0;
- bool reply_supported = virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_REPLY_ACK);
+ bool reply_supported =
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK);
Error *local_err = NULL;
QIOChannel *ioc;
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_BACKEND_REQ)) {
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_BACKEND_REQ)) {
return 0;
}
@@ -2136,8 +2135,7 @@ static int vhost_user_postcopy_notifier(NotifierWithReturn *notifier,
switch (pnd->reason) {
case POSTCOPY_NOTIFY_PROBE:
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_PAGEFAULT)) {
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_PAGEFAULT)) {
/* TODO: Get the device name into this error somehow */
error_setg(errp,
"vhost-user backend not capable of postcopy");
@@ -2228,7 +2226,7 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
}
/* query the max queues we support if backend supports Multiple Queue */
- if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MQ)) {
+ if (vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_MQ)) {
err = vhost_user_get_u64(dev, VHOST_USER_GET_QUEUE_NUM,
&dev->max_queues);
if (err < 0) {
@@ -2246,18 +2244,16 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
}
if (virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM) &&
- !(virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_BACKEND_REQ) &&
- virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_REPLY_ACK))) {
+ !(vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_BACKEND_REQ) &&
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK))) {
error_setg(errp, "IOMMU support requires reply-ack and "
"backend-req protocol features.");
return -EINVAL;
}
/* get max memory regions if backend supports configurable RAM slots */
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS)) {
+ if (!vhost_user_has_prot(dev,
+ VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS)) {
u->user->memory_slots = VHOST_MEMORY_BASELINE_NREGIONS;
} else {
err = vhost_user_get_max_memslots(dev, &ram_slots);
@@ -2279,8 +2275,7 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
}
if (dev->migration_blocker == NULL &&
- !virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_LOG_SHMFD)) {
+ !vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_LOG_SHMFD)) {
error_setg(&dev->migration_blocker,
"Migration disabled: vhost-user backend lacks "
"VHOST_USER_PROTOCOL_F_LOG_SHMFD feature.");
@@ -2349,8 +2344,7 @@ static bool vhost_user_requires_shm_log(struct vhost_dev *dev)
{
assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
- return virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_LOG_SHMFD);
+ return vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_LOG_SHMFD);
}
static int vhost_user_migration_done(struct vhost_dev *dev, char* mac_addr)
@@ -2365,8 +2359,7 @@ static int vhost_user_migration_done(struct vhost_dev *dev, char* mac_addr)
}
/* if backend supports VHOST_USER_PROTOCOL_F_RARP ask it to send the RARP */
- if (virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_RARP)) {
+ if (vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_RARP)) {
msg.hdr.request = VHOST_USER_SEND_RARP;
msg.hdr.flags = VHOST_USER_VERSION;
memcpy((char *)&msg.payload.u64, mac_addr, 6);
@@ -2380,11 +2373,11 @@ static int vhost_user_migration_done(struct vhost_dev *dev, char* mac_addr)
static int vhost_user_net_set_mtu(struct vhost_dev *dev, uint16_t mtu)
{
VhostUserMsg msg;
- bool reply_supported = virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_REPLY_ACK);
+ bool reply_supported =
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK);
int ret;
- if (!(dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))) {
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_NET_MTU)) {
return 0;
}
@@ -2444,8 +2437,7 @@ static int vhost_user_get_config(struct vhost_dev *dev, uint8_t *config,
.hdr.size = VHOST_USER_CONFIG_HDR_SIZE + config_len,
};
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_CONFIG)) {
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_CONFIG)) {
error_setg(errp, "VHOST_USER_PROTOCOL_F_CONFIG not supported");
return -EINVAL;
}
@@ -2488,8 +2480,8 @@ static int vhost_user_set_config(struct vhost_dev *dev, const uint8_t *data,
{
int ret;
uint8_t *p;
- bool reply_supported = virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_REPLY_ACK);
+ bool reply_supported =
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK);
VhostUserMsg msg = {
.hdr.request = VHOST_USER_SET_CONFIG,
@@ -2497,8 +2489,7 @@ static int vhost_user_set_config(struct vhost_dev *dev, const uint8_t *data,
.hdr.size = VHOST_USER_CONFIG_HDR_SIZE + size,
};
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_CONFIG)) {
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_CONFIG)) {
return -ENOTSUP;
}
@@ -2533,8 +2524,8 @@ static int vhost_user_crypto_create_session(struct vhost_dev *dev,
uint64_t *session_id)
{
int ret;
- bool crypto_session = virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_CRYPTO_SESSION);
+ bool crypto_session =
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_CRYPTO_SESSION);
CryptoDevBackendSessionInfo *backend_info = session_info;
VhostUserMsg msg = {
.hdr.request = VHOST_USER_CREATE_CRYPTO_SESSION,
@@ -2635,8 +2626,8 @@ static int
vhost_user_crypto_close_session(struct vhost_dev *dev, uint64_t session_id)
{
int ret;
- bool crypto_session = virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_CRYPTO_SESSION);
+ bool crypto_session =
+ vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_CRYPTO_SESSION);
VhostUserMsg msg = {
.hdr.request = VHOST_USER_CLOSE_CRYPTO_SESSION,
.hdr.flags = VHOST_USER_VERSION,
@@ -2681,8 +2672,7 @@ static int vhost_user_get_inflight_fd(struct vhost_dev *dev,
.hdr.size = sizeof(msg.payload.inflight),
};
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)) {
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)) {
return 0;
}
@@ -2749,8 +2739,7 @@ static int vhost_user_set_inflight_fd(struct vhost_dev *dev,
.hdr.size = sizeof(msg.payload.inflight),
};
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)) {
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)) {
return 0;
}
@@ -2849,8 +2838,7 @@ void vhost_user_async_close(DeviceState *d,
static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
{
- if (!virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_STATUS)) {
+ if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_STATUS)) {
return 0;
}
@@ -2875,16 +2863,14 @@ static void vhost_user_reset_status(struct vhost_dev *dev)
return;
}
- if (virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_STATUS)) {
+ if (vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_STATUS)) {
vhost_user_set_status(dev, 0);
}
}
static bool vhost_user_supports_device_state(struct vhost_dev *dev)
{
- return virtio_has_feature(dev->protocol_features,
- VHOST_USER_PROTOCOL_F_DEVICE_STATE);
+ return vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_DEVICE_STATE);
}
static int vhost_user_set_device_state_fd(struct vhost_dev *dev,
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 03/33] vhost-user: introduce vhost_user_has_prot() helper
2025-08-13 16:48 ` [PATCH 03/33] vhost-user: introduce vhost_user_has_prot() helper Vladimir Sementsov-Ogievskiy
@ 2025-10-09 18:57 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 18:57 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:59 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
^ permalink raw reply [flat|nested] 108+ messages in thread
* [PATCH 04/33] vhost: move protocol_features to vhost_user
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (2 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 03/33] vhost-user: introduce vhost_user_has_prot() helper Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 18:57 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 05/33] vhost-user-gpu: drop code duplication Vladimir Sementsov-Ogievskiy
` (29 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov, Gonglei (Arei), Zhenwei Pi, Jason Wang
As the comment says, it's only used by vhost-user. So, let's move it
to the corresponding vhost backend implementation.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
backends/cryptodev-vhost.c | 1 -
hw/net/vhost_net.c | 2 --
hw/virtio/vhost-user.c | 23 ++++++++++++++++++++---
hw/virtio/virtio-qmp.c | 6 ++++--
include/hw/virtio/vhost-backend.h | 3 +++
include/hw/virtio/vhost.h | 8 --------
6 files changed, 27 insertions(+), 16 deletions(-)
diff --git a/backends/cryptodev-vhost.c b/backends/cryptodev-vhost.c
index 943680a23a..3bcdc494d8 100644
--- a/backends/cryptodev-vhost.c
+++ b/backends/cryptodev-vhost.c
@@ -60,7 +60,6 @@ cryptodev_vhost_init(
crypto->cc = options->cc;
- crypto->dev.protocol_features = 0;
crypto->backend = -1;
/* vhost-user needs vq_index to initiate a specific queue pair */
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index fcee279f0b..ce30b6e197 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -260,9 +260,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
goto fail;
}
net->backend = r;
- net->dev.protocol_features = 0;
} else {
- net->dev.protocol_features = 0;
net->backend = -1;
/* vhost-user needs vq_index to initiate a specific queue pair */
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 6fa5b8a8bd..abdf47ee7b 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -11,6 +11,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "hw/virtio/virtio-dmabuf.h"
+#include "hw/virtio/virtio-qmp.h"
#include "hw/virtio/vhost.h"
#include "hw/virtio/virtio-crypto.h"
#include "hw/virtio/vhost-user.h"
@@ -264,6 +265,14 @@ struct vhost_user {
/* Our current regions */
int num_shadow_regions;
struct vhost_memory_region shadow_regions[VHOST_USER_MAX_RAM_SLOTS];
+
+ /**
+ * @protocol_features: the vhost-user protocol feature set by
+ * VHOST_USER_SET_PROTOCOL_FEATURES. Protocol features are only
+ * negotiated if VHOST_USER_F_PROTOCOL_FEATURES has been offered
+ * by the backend (see @features).
+ */
+ uint64_t protocol_features;
};
struct scrub_regions {
@@ -274,7 +283,8 @@ struct scrub_regions {
static bool vhost_user_has_prot(struct vhost_dev *dev, uint64_t feature)
{
- return virtio_has_feature(dev->protocol_features, feature);
+ struct vhost_user *u = dev->opaque;
+ return virtio_has_feature(u->protocol_features, feature);
}
static int vhost_user_read_header(struct vhost_dev *dev, VhostUserMsg *msg)
@@ -2218,8 +2228,8 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
}
/* final set of protocol features */
- dev->protocol_features = protocol_features;
- err = vhost_user_set_protocol_features(dev, dev->protocol_features);
+ u->protocol_features = protocol_features;
+ err = vhost_user_set_protocol_features(dev, u->protocol_features);
if (err < 0) {
error_setg_errno(errp, EPROTO, "vhost_backend_init failed");
return -EPROTO;
@@ -3001,6 +3011,12 @@ static int vhost_user_check_device_state(struct vhost_dev *dev, Error **errp)
return 0;
}
+static void vhost_user_qmp_status(struct vhost_dev *dev, VhostStatus *status)
+{
+ struct vhost_user *u = dev->opaque;
+ status->protocol_features = qmp_decode_protocols(u->protocol_features);
+}
+
const VhostOps user_ops = {
.backend_type = VHOST_BACKEND_TYPE_USER,
.vhost_backend_init = vhost_user_backend_init,
@@ -3041,4 +3057,5 @@ const VhostOps user_ops = {
.vhost_supports_device_state = vhost_user_supports_device_state,
.vhost_set_device_state_fd = vhost_user_set_device_state_fd,
.vhost_check_device_state = vhost_user_check_device_state,
+ .vhost_qmp_status = vhost_user_qmp_status,
};
diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
index e514a4797e..d55b12f9f3 100644
--- a/hw/virtio/virtio-qmp.c
+++ b/hw/virtio/virtio-qmp.c
@@ -788,12 +788,14 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
qmp_decode_features(vdev->device_id, hdev->features);
status->vhost_dev->acked_features =
qmp_decode_features(vdev->device_id, hdev->acked_features);
- status->vhost_dev->protocol_features =
- qmp_decode_protocols(hdev->protocol_features);
status->vhost_dev->max_queues = hdev->max_queues;
status->vhost_dev->backend_cap = hdev->backend_cap;
status->vhost_dev->log_enabled = hdev->log_enabled;
status->vhost_dev->log_size = hdev->log_size;
+
+ if (hdev->vhost_ops->vhost_qmp_status) {
+ hdev->vhost_ops->vhost_qmp_status(hdev, status->vhost_dev);
+ }
}
return status;
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index f65fa26298..0785fc764d 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -12,6 +12,7 @@
#define VHOST_BACKEND_H
#include "system/memory.h"
+#include "qapi/qapi-commands-virtio.h"
typedef enum VhostBackendType {
VHOST_BACKEND_TYPE_NONE = 0,
@@ -160,6 +161,7 @@ typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
int *reply_fd,
Error **errp);
typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error **errp);
+typedef void (*vhost_qmp_status_op)(struct vhost_dev *dev, VhostStatus *status);
typedef struct VhostOps {
VhostBackendType backend_type;
@@ -216,6 +218,7 @@ typedef struct VhostOps {
vhost_supports_device_state_op vhost_supports_device_state;
vhost_set_device_state_fd_op vhost_set_device_state_fd;
vhost_check_device_state_op vhost_check_device_state;
+ vhost_qmp_status_op vhost_qmp_status;
} VhostOps;
int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 9f9dd2d46d..15bc287a9d 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -104,14 +104,6 @@ struct vhost_dev {
uint64_t features;
uint64_t acked_features;
- /**
- * @protocol_features: is the vhost-user only feature set by
- * VHOST_USER_SET_PROTOCOL_FEATURES. Protocol features are only
- * negotiated if VHOST_USER_F_PROTOCOL_FEATURES has been offered
- * by the backend (see @features).
- */
- uint64_t protocol_features;
-
uint64_t max_queues;
uint64_t backend_cap;
/* @started: is the vhost device started? */
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 04/33] vhost: move protocol_features to vhost_user
2025-08-13 16:48 ` [PATCH 04/33] vhost: move protocol_features to vhost_user Vladimir Sementsov-Ogievskiy
@ 2025-10-09 18:57 ` Raphael Norwitz
2025-10-09 19:35 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 18:57 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov, Gonglei (Arei),
Zhenwei Pi, Jason Wang
On Wed, Aug 13, 2025 at 12:57 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> As comment says: it's only for vhost-user. So, let's move it
> to corresponding vhost backend realization.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> backends/cryptodev-vhost.c | 1 -
> hw/net/vhost_net.c | 2 --
> hw/virtio/vhost-user.c | 23 ++++++++++++++++++++---
> hw/virtio/virtio-qmp.c | 6 ++++--
> include/hw/virtio/vhost-backend.h | 3 +++
> include/hw/virtio/vhost.h | 8 --------
> 6 files changed, 27 insertions(+), 16 deletions(-)
>
> diff --git a/backends/cryptodev-vhost.c b/backends/cryptodev-vhost.c
> index 943680a23a..3bcdc494d8 100644
> --- a/backends/cryptodev-vhost.c
> +++ b/backends/cryptodev-vhost.c
> @@ -60,7 +60,6 @@ cryptodev_vhost_init(
>
> crypto->cc = options->cc;
>
> - crypto->dev.protocol_features = 0;
> crypto->backend = -1;
>
> /* vhost-user needs vq_index to initiate a specific queue pair */
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index fcee279f0b..ce30b6e197 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -260,9 +260,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
> goto fail;
> }
> net->backend = r;
> - net->dev.protocol_features = 0;
> } else {
> - net->dev.protocol_features = 0;
> net->backend = -1;
>
> /* vhost-user needs vq_index to initiate a specific queue pair */
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 6fa5b8a8bd..abdf47ee7b 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -11,6 +11,7 @@
> #include "qemu/osdep.h"
> #include "qapi/error.h"
> #include "hw/virtio/virtio-dmabuf.h"
> +#include "hw/virtio/virtio-qmp.h"
> #include "hw/virtio/vhost.h"
> #include "hw/virtio/virtio-crypto.h"
> #include "hw/virtio/vhost-user.h"
> @@ -264,6 +265,14 @@ struct vhost_user {
> /* Our current regions */
> int num_shadow_regions;
> struct vhost_memory_region shadow_regions[VHOST_USER_MAX_RAM_SLOTS];
> +
> + /**
> + * @protocol_features: the vhost-user protocol feature set by
> + * VHOST_USER_SET_PROTOCOL_FEATURES. Protocol features are only
> + * negotiated if VHOST_USER_F_PROTOCOL_FEATURES has been offered
> + * by the backend (see @features).
> + */
> + uint64_t protocol_features;
> };
>
> struct scrub_regions {
> @@ -274,7 +283,8 @@ struct scrub_regions {
>
> static bool vhost_user_has_prot(struct vhost_dev *dev, uint64_t feature)
> {
> - return virtio_has_feature(dev->protocol_features, feature);
> + struct vhost_user *u = dev->opaque;
> + return virtio_has_feature(u->protocol_features, feature);
> }
>
> static int vhost_user_read_header(struct vhost_dev *dev, VhostUserMsg *msg)
> @@ -2218,8 +2228,8 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
> }
>
> /* final set of protocol features */
> - dev->protocol_features = protocol_features;
> - err = vhost_user_set_protocol_features(dev, dev->protocol_features);
> + u->protocol_features = protocol_features;
> + err = vhost_user_set_protocol_features(dev, u->protocol_features);
> if (err < 0) {
> error_setg_errno(errp, EPROTO, "vhost_backend_init failed");
> return -EPROTO;
> @@ -3001,6 +3011,12 @@ static int vhost_user_check_device_state(struct vhost_dev *dev, Error **errp)
> return 0;
> }
>
> +static void vhost_user_qmp_status(struct vhost_dev *dev, VhostStatus *status)
> +{
> + struct vhost_user *u = dev->opaque;
> + status->protocol_features = qmp_decode_protocols(u->protocol_features);
> +}
> +
> const VhostOps user_ops = {
> .backend_type = VHOST_BACKEND_TYPE_USER,
> .vhost_backend_init = vhost_user_backend_init,
> @@ -3041,4 +3057,5 @@ const VhostOps user_ops = {
> .vhost_supports_device_state = vhost_user_supports_device_state,
> .vhost_set_device_state_fd = vhost_user_set_device_state_fd,
> .vhost_check_device_state = vhost_user_check_device_state,
> + .vhost_qmp_status = vhost_user_qmp_status,
> };
> diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
> index e514a4797e..d55b12f9f3 100644
> --- a/hw/virtio/virtio-qmp.c
> +++ b/hw/virtio/virtio-qmp.c
> @@ -788,12 +788,14 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
> qmp_decode_features(vdev->device_id, hdev->features);
> status->vhost_dev->acked_features =
> qmp_decode_features(vdev->device_id, hdev->acked_features);
> - status->vhost_dev->protocol_features =
> - qmp_decode_protocols(hdev->protocol_features);
> status->vhost_dev->max_queues = hdev->max_queues;
> status->vhost_dev->backend_cap = hdev->backend_cap;
> status->vhost_dev->log_enabled = hdev->log_enabled;
> status->vhost_dev->log_size = hdev->log_size;
> +
> + if (hdev->vhost_ops->vhost_qmp_status) {
> + hdev->vhost_ops->vhost_qmp_status(hdev, status->vhost_dev);
> + }
Same comment as patch 1/33 - why have it in vhost_ops if it is
vhost_user specific, rather than checking the backend type and calling
a helper?
> }
>
> return status;
> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> index f65fa26298..0785fc764d 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -12,6 +12,7 @@
> #define VHOST_BACKEND_H
>
> #include "system/memory.h"
> +#include "qapi/qapi-commands-virtio.h"
>
> typedef enum VhostBackendType {
> VHOST_BACKEND_TYPE_NONE = 0,
> @@ -160,6 +161,7 @@ typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
> int *reply_fd,
> Error **errp);
> typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error **errp);
> +typedef void (*vhost_qmp_status_op)(struct vhost_dev *dev, VhostStatus *status);
>
> typedef struct VhostOps {
> VhostBackendType backend_type;
> @@ -216,6 +218,7 @@ typedef struct VhostOps {
> vhost_supports_device_state_op vhost_supports_device_state;
> vhost_set_device_state_fd_op vhost_set_device_state_fd;
> vhost_check_device_state_op vhost_check_device_state;
> + vhost_qmp_status_op vhost_qmp_status;
> } VhostOps;
>
> int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 9f9dd2d46d..15bc287a9d 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -104,14 +104,6 @@ struct vhost_dev {
> uint64_t features;
> uint64_t acked_features;
>
> - /**
> - * @protocol_features: is the vhost-user only feature set by
> - * VHOST_USER_SET_PROTOCOL_FEATURES. Protocol features are only
> - * negotiated if VHOST_USER_F_PROTOCOL_FEATURES has been offered
> - * by the backend (see @features).
> - */
> - uint64_t protocol_features;
> -
> uint64_t max_queues;
> uint64_t backend_cap;
> /* @started: is the vhost device started? */
> --
> 2.48.1
>
>
* Re: [PATCH 04/33] vhost: move protocol_features to vhost_user
2025-10-09 18:57 ` Raphael Norwitz
@ 2025-10-09 19:35 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:45 ` Raphael Norwitz
0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 19:35 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov, Gonglei (Arei),
Zhenwei Pi, Jason Wang
On 09.10.25 21:57, Raphael Norwitz wrote:
> On Wed, Aug 13, 2025 at 12:57 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> As comment says: it's only for vhost-user. So, let's move it
>> to corresponding vhost backend realization.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> backends/cryptodev-vhost.c | 1 -
>> hw/net/vhost_net.c | 2 --
>> hw/virtio/vhost-user.c | 23 ++++++++++++++++++++---
>> hw/virtio/virtio-qmp.c | 6 ++++--
>> include/hw/virtio/vhost-backend.h | 3 +++
>> include/hw/virtio/vhost.h | 8 --------
>> 6 files changed, 27 insertions(+), 16 deletions(-)
>>
>> diff --git a/backends/cryptodev-vhost.c b/backends/cryptodev-vhost.c
>> index 943680a23a..3bcdc494d8 100644
>> --- a/backends/cryptodev-vhost.c
>> +++ b/backends/cryptodev-vhost.c
>> @@ -60,7 +60,6 @@ cryptodev_vhost_init(
>>
>> crypto->cc = options->cc;
>>
>> - crypto->dev.protocol_features = 0;
>> crypto->backend = -1;
>>
>> /* vhost-user needs vq_index to initiate a specific queue pair */
>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>> index fcee279f0b..ce30b6e197 100644
>> --- a/hw/net/vhost_net.c
>> +++ b/hw/net/vhost_net.c
>> @@ -260,9 +260,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
>> goto fail;
>> }
>> net->backend = r;
>> - net->dev.protocol_features = 0;
>> } else {
>> - net->dev.protocol_features = 0;
>> net->backend = -1;
>>
>> /* vhost-user needs vq_index to initiate a specific queue pair */
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index 6fa5b8a8bd..abdf47ee7b 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -11,6 +11,7 @@
>> #include "qemu/osdep.h"
>> #include "qapi/error.h"
>> #include "hw/virtio/virtio-dmabuf.h"
>> +#include "hw/virtio/virtio-qmp.h"
>> #include "hw/virtio/vhost.h"
>> #include "hw/virtio/virtio-crypto.h"
>> #include "hw/virtio/vhost-user.h"
>> @@ -264,6 +265,14 @@ struct vhost_user {
>> /* Our current regions */
>> int num_shadow_regions;
>> struct vhost_memory_region shadow_regions[VHOST_USER_MAX_RAM_SLOTS];
>> +
>> + /**
>> + * @protocol_features: the vhost-user protocol feature set by
>> + * VHOST_USER_SET_PROTOCOL_FEATURES. Protocol features are only
>> + * negotiated if VHOST_USER_F_PROTOCOL_FEATURES has been offered
>> + * by the backend (see @features).
>> + */
>> + uint64_t protocol_features;
>> };
>>
>> struct scrub_regions {
>> @@ -274,7 +283,8 @@ struct scrub_regions {
>>
>> static bool vhost_user_has_prot(struct vhost_dev *dev, uint64_t feature)
>> {
>> - return virtio_has_feature(dev->protocol_features, feature);
>> + struct vhost_user *u = dev->opaque;
>> + return virtio_has_feature(u->protocol_features, feature);
>> }
>>
>> static int vhost_user_read_header(struct vhost_dev *dev, VhostUserMsg *msg)
>> @@ -2218,8 +2228,8 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
>> }
>>
>> /* final set of protocol features */
>> - dev->protocol_features = protocol_features;
>> - err = vhost_user_set_protocol_features(dev, dev->protocol_features);
>> + u->protocol_features = protocol_features;
>> + err = vhost_user_set_protocol_features(dev, u->protocol_features);
>> if (err < 0) {
>> error_setg_errno(errp, EPROTO, "vhost_backend_init failed");
>> return -EPROTO;
>> @@ -3001,6 +3011,12 @@ static int vhost_user_check_device_state(struct vhost_dev *dev, Error **errp)
>> return 0;
>> }
>>
>> +static void vhost_user_qmp_status(struct vhost_dev *dev, VhostStatus *status)
>> +{
>> + struct vhost_user *u = dev->opaque;
>> + status->protocol_features = qmp_decode_protocols(u->protocol_features);
>> +}
>> +
>> const VhostOps user_ops = {
>> .backend_type = VHOST_BACKEND_TYPE_USER,
>> .vhost_backend_init = vhost_user_backend_init,
>> @@ -3041,4 +3057,5 @@ const VhostOps user_ops = {
>> .vhost_supports_device_state = vhost_user_supports_device_state,
>> .vhost_set_device_state_fd = vhost_user_set_device_state_fd,
>> .vhost_check_device_state = vhost_user_check_device_state,
>> + .vhost_qmp_status = vhost_user_qmp_status,
>> };
>> diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
>> index e514a4797e..d55b12f9f3 100644
>> --- a/hw/virtio/virtio-qmp.c
>> +++ b/hw/virtio/virtio-qmp.c
>> @@ -788,12 +788,14 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
>> qmp_decode_features(vdev->device_id, hdev->features);
>> status->vhost_dev->acked_features =
>> qmp_decode_features(vdev->device_id, hdev->acked_features);
>> - status->vhost_dev->protocol_features =
>> - qmp_decode_protocols(hdev->protocol_features);
>> status->vhost_dev->max_queues = hdev->max_queues;
>> status->vhost_dev->backend_cap = hdev->backend_cap;
>> status->vhost_dev->log_enabled = hdev->log_enabled;
>> status->vhost_dev->log_size = hdev->log_size;
>> +
>> + if (hdev->vhost_ops->vhost_qmp_status) {
>> + hdev->vhost_ops->vhost_qmp_status(hdev, status->vhost_dev);
>> + }
>
> Same comment as patch 1/33 - why have it in vhost_ops if it is
> vhost_user specific, rather than checking the backend type and calling
> a helper?
>
Aha, I think now I understand you correctly in 1/33.
No specific reason, just an attempt to keep generic code backend-agnostic, without
knowledge of specific backends.
Not a problem for me to switch to "if (backend_type == ", if it's preferable
in this case.
>
>> }
>>
>> return status;
>> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
>> index f65fa26298..0785fc764d 100644
>> --- a/include/hw/virtio/vhost-backend.h
>> +++ b/include/hw/virtio/vhost-backend.h
>> @@ -12,6 +12,7 @@
>> #define VHOST_BACKEND_H
>>
>> #include "system/memory.h"
>> +#include "qapi/qapi-commands-virtio.h"
>>
>> typedef enum VhostBackendType {
>> VHOST_BACKEND_TYPE_NONE = 0,
>> @@ -160,6 +161,7 @@ typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
>> int *reply_fd,
>> Error **errp);
>> typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error **errp);
>> +typedef void (*vhost_qmp_status_op)(struct vhost_dev *dev, VhostStatus *status);
>>
>> typedef struct VhostOps {
>> VhostBackendType backend_type;
>> @@ -216,6 +218,7 @@ typedef struct VhostOps {
>> vhost_supports_device_state_op vhost_supports_device_state;
>> vhost_set_device_state_fd_op vhost_set_device_state_fd;
>> vhost_check_device_state_op vhost_check_device_state;
>> + vhost_qmp_status_op vhost_qmp_status;
>> } VhostOps;
>>
>> int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>> index 9f9dd2d46d..15bc287a9d 100644
>> --- a/include/hw/virtio/vhost.h
>> +++ b/include/hw/virtio/vhost.h
>> @@ -104,14 +104,6 @@ struct vhost_dev {
>> uint64_t features;
>> uint64_t acked_features;
>>
>> - /**
>> - * @protocol_features: is the vhost-user only feature set by
>> - * VHOST_USER_SET_PROTOCOL_FEATURES. Protocol features are only
>> - * negotiated if VHOST_USER_F_PROTOCOL_FEATURES has been offered
>> - * by the backend (see @features).
>> - */
>> - uint64_t protocol_features;
>> -
>> uint64_t max_queues;
>> uint64_t backend_cap;
>> /* @started: is the vhost device started? */
>> --
>> 2.48.1
>>
>>
--
Best regards,
Vladimir
* Re: [PATCH 04/33] vhost: move protocol_features to vhost_user
2025-10-09 19:35 ` Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:45 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:45 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov, Gonglei (Arei),
Zhenwei Pi, Jason Wang
On Thu, Oct 9, 2025 at 3:35 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> On 09.10.25 21:57, Raphael Norwitz wrote:
> > On Wed, Aug 13, 2025 at 12:57 PM Vladimir Sementsov-Ogievskiy
> > <vsementsov@yandex-team.ru> wrote:
> >>
> >> As comment says: it's only for vhost-user. So, let's move it
> >> to corresponding vhost backend realization.
> >>
> >> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> >> ---
> >> backends/cryptodev-vhost.c | 1 -
> >> hw/net/vhost_net.c | 2 --
> >> hw/virtio/vhost-user.c | 23 ++++++++++++++++++++---
> >> hw/virtio/virtio-qmp.c | 6 ++++--
> >> include/hw/virtio/vhost-backend.h | 3 +++
> >> include/hw/virtio/vhost.h | 8 --------
> >> 6 files changed, 27 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/backends/cryptodev-vhost.c b/backends/cryptodev-vhost.c
> >> index 943680a23a..3bcdc494d8 100644
> >> --- a/backends/cryptodev-vhost.c
> >> +++ b/backends/cryptodev-vhost.c
> >> @@ -60,7 +60,6 @@ cryptodev_vhost_init(
> >>
> >> crypto->cc = options->cc;
> >>
> >> - crypto->dev.protocol_features = 0;
> >> crypto->backend = -1;
> >>
> >> /* vhost-user needs vq_index to initiate a specific queue pair */
> >> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >> index fcee279f0b..ce30b6e197 100644
> >> --- a/hw/net/vhost_net.c
> >> +++ b/hw/net/vhost_net.c
> >> @@ -260,9 +260,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
> >> goto fail;
> >> }
> >> net->backend = r;
> >> - net->dev.protocol_features = 0;
> >> } else {
> >> - net->dev.protocol_features = 0;
> >> net->backend = -1;
> >>
> >> /* vhost-user needs vq_index to initiate a specific queue pair */
> >> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> >> index 6fa5b8a8bd..abdf47ee7b 100644
> >> --- a/hw/virtio/vhost-user.c
> >> +++ b/hw/virtio/vhost-user.c
> >> @@ -11,6 +11,7 @@
> >> #include "qemu/osdep.h"
> >> #include "qapi/error.h"
> >> #include "hw/virtio/virtio-dmabuf.h"
> >> +#include "hw/virtio/virtio-qmp.h"
> >> #include "hw/virtio/vhost.h"
> >> #include "hw/virtio/virtio-crypto.h"
> >> #include "hw/virtio/vhost-user.h"
> >> @@ -264,6 +265,14 @@ struct vhost_user {
> >> /* Our current regions */
> >> int num_shadow_regions;
> >> struct vhost_memory_region shadow_regions[VHOST_USER_MAX_RAM_SLOTS];
> >> +
> >> + /**
> >> + * @protocol_features: the vhost-user protocol feature set by
> >> + * VHOST_USER_SET_PROTOCOL_FEATURES. Protocol features are only
> >> + * negotiated if VHOST_USER_F_PROTOCOL_FEATURES has been offered
> >> + * by the backend (see @features).
> >> + */
> >> + uint64_t protocol_features;
> >> };
> >>
> >> struct scrub_regions {
> >> @@ -274,7 +283,8 @@ struct scrub_regions {
> >>
> >> static bool vhost_user_has_prot(struct vhost_dev *dev, uint64_t feature)
> >> {
> >> - return virtio_has_feature(dev->protocol_features, feature);
> >> + struct vhost_user *u = dev->opaque;
> >> + return virtio_has_feature(u->protocol_features, feature);
> >> }
> >>
> >> static int vhost_user_read_header(struct vhost_dev *dev, VhostUserMsg *msg)
> >> @@ -2218,8 +2228,8 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
> >> }
> >>
> >> /* final set of protocol features */
> >> - dev->protocol_features = protocol_features;
> >> - err = vhost_user_set_protocol_features(dev, dev->protocol_features);
> >> + u->protocol_features = protocol_features;
> >> + err = vhost_user_set_protocol_features(dev, u->protocol_features);
> >> if (err < 0) {
> >> error_setg_errno(errp, EPROTO, "vhost_backend_init failed");
> >> return -EPROTO;
> >> @@ -3001,6 +3011,12 @@ static int vhost_user_check_device_state(struct vhost_dev *dev, Error **errp)
> >> return 0;
> >> }
> >>
> >> +static void vhost_user_qmp_status(struct vhost_dev *dev, VhostStatus *status)
> >> +{
> >> + struct vhost_user *u = dev->opaque;
> >> + status->protocol_features = qmp_decode_protocols(u->protocol_features);
> >> +}
> >> +
> >> const VhostOps user_ops = {
> >> .backend_type = VHOST_BACKEND_TYPE_USER,
> >> .vhost_backend_init = vhost_user_backend_init,
> >> @@ -3041,4 +3057,5 @@ const VhostOps user_ops = {
> >> .vhost_supports_device_state = vhost_user_supports_device_state,
> >> .vhost_set_device_state_fd = vhost_user_set_device_state_fd,
> >> .vhost_check_device_state = vhost_user_check_device_state,
> >> + .vhost_qmp_status = vhost_user_qmp_status,
> >> };
> >> diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
> >> index e514a4797e..d55b12f9f3 100644
> >> --- a/hw/virtio/virtio-qmp.c
> >> +++ b/hw/virtio/virtio-qmp.c
> >> @@ -788,12 +788,14 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
> >> qmp_decode_features(vdev->device_id, hdev->features);
> >> status->vhost_dev->acked_features =
> >> qmp_decode_features(vdev->device_id, hdev->acked_features);
> >> - status->vhost_dev->protocol_features =
> >> - qmp_decode_protocols(hdev->protocol_features);
> >> status->vhost_dev->max_queues = hdev->max_queues;
> >> status->vhost_dev->backend_cap = hdev->backend_cap;
> >> status->vhost_dev->log_enabled = hdev->log_enabled;
> >> status->vhost_dev->log_size = hdev->log_size;
> >> +
> >> + if (hdev->vhost_ops->vhost_qmp_status) {
> >> + hdev->vhost_ops->vhost_qmp_status(hdev, status->vhost_dev);
> >> + }
> >
> > Same comment as patch 1/33 - why have it in vhost_ops if it is
> > vhost_user specific, rather than checking the backend type and calling
> > a helper?
> >
>
> Aha, I think now, I undestand you correctly in 1/33.
>
> No specific reason, but a try to keep generic code backend-agnostic, without
> knowledge about specific backends.
>
> Not a problem for me to swith to "if (backend_type == ", if it's preferable
> in this case.
>
Yes, that is what I meant. I think it makes the code more readable but
if others disagree we can leave it.
> >
> >> }
> >>
> >> return status;
> >> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> >> index f65fa26298..0785fc764d 100644
> >> --- a/include/hw/virtio/vhost-backend.h
> >> +++ b/include/hw/virtio/vhost-backend.h
> >> @@ -12,6 +12,7 @@
> >> #define VHOST_BACKEND_H
> >>
> >> #include "system/memory.h"
> >> +#include "qapi/qapi-commands-virtio.h"
> >>
> >> typedef enum VhostBackendType {
> >> VHOST_BACKEND_TYPE_NONE = 0,
> >> @@ -160,6 +161,7 @@ typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
> >> int *reply_fd,
> >> Error **errp);
> >> typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error **errp);
> >> +typedef void (*vhost_qmp_status_op)(struct vhost_dev *dev, VhostStatus *status);
> >>
> >> typedef struct VhostOps {
> >> VhostBackendType backend_type;
> >> @@ -216,6 +218,7 @@ typedef struct VhostOps {
> >> vhost_supports_device_state_op vhost_supports_device_state;
> >> vhost_set_device_state_fd_op vhost_set_device_state_fd;
> >> vhost_check_device_state_op vhost_check_device_state;
> >> + vhost_qmp_status_op vhost_qmp_status;
> >> } VhostOps;
> >>
> >> int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
> >> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >> index 9f9dd2d46d..15bc287a9d 100644
> >> --- a/include/hw/virtio/vhost.h
> >> +++ b/include/hw/virtio/vhost.h
> >> @@ -104,14 +104,6 @@ struct vhost_dev {
> >> uint64_t features;
> >> uint64_t acked_features;
> >>
> >> - /**
> >> - * @protocol_features: is the vhost-user only feature set by
> >> - * VHOST_USER_SET_PROTOCOL_FEATURES. Protocol features are only
> >> - * negotiated if VHOST_USER_F_PROTOCOL_FEATURES has been offered
> >> - * by the backend (see @features).
> >> - */
> >> - uint64_t protocol_features;
> >> -
> >> uint64_t max_queues;
> >> uint64_t backend_cap;
> >> /* @started: is the vhost device started? */
> >> --
> >> 2.48.1
> >>
> >>
>
>
> --
> Best regards,
> Vladimir
* [PATCH 05/33] vhost-user-gpu: drop code duplication
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (3 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 04/33] vhost: move protocol_features to vhost_user Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-08-18 6:54 ` Philippe Mathieu-Daudé
2025-10-09 18:58 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 06/33] vhost: make vhost_dev.features private Vladimir Sementsov-Ogievskiy
` (28 subsequent siblings)
33 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Obviously, this duplicated fragment serves no purpose: it repeats the
preceding VIRTIO_GPU_F_RESOURCE_UUID check verbatim.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/display/vhost-user-gpu.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 9fc6bbcd2c..79ea64b12c 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -644,10 +644,6 @@ vhost_user_gpu_device_realize(DeviceState *qdev, Error **errp)
VIRTIO_GPU_F_RESOURCE_UUID)) {
g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_RESOURCE_UUID_ENABLED;
}
- if (virtio_has_feature(g->vhost->dev.features,
- VIRTIO_GPU_F_RESOURCE_UUID)) {
- g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_RESOURCE_UUID_ENABLED;
- }
if (!virtio_gpu_base_device_realize(qdev, NULL, NULL, errp)) {
return;
--
2.48.1
* Re: [PATCH 05/33] vhost-user-gpu: drop code duplication
2025-08-13 16:48 ` [PATCH 05/33] vhost-user-gpu: drop code duplication Vladimir Sementsov-Ogievskiy
@ 2025-08-18 6:54 ` Philippe Mathieu-Daudé
2025-10-09 18:58 ` Raphael Norwitz
1 sibling, 0 replies; 108+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-18 6:54 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov
On 13/8/25 18:48, Vladimir Sementsov-Ogievskiy wrote:
> Obviously, this duplicated fragment doesn't make any sense.
Better safe than sorry.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/display/vhost-user-gpu.c | 4 ----
> 1 file changed, 4 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 05/33] vhost-user-gpu: drop code duplication
2025-08-13 16:48 ` [PATCH 05/33] vhost-user-gpu: drop code duplication Vladimir Sementsov-Ogievskiy
2025-08-18 6:54 ` Philippe Mathieu-Daudé
@ 2025-10-09 18:58 ` Raphael Norwitz
1 sibling, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 18:58 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:54 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Obviously, this duplicated fragment doesn't make any sense.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/display/vhost-user-gpu.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
> index 9fc6bbcd2c..79ea64b12c 100644
> --- a/hw/display/vhost-user-gpu.c
> +++ b/hw/display/vhost-user-gpu.c
> @@ -644,10 +644,6 @@ vhost_user_gpu_device_realize(DeviceState *qdev, Error **errp)
> VIRTIO_GPU_F_RESOURCE_UUID)) {
> g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_RESOURCE_UUID_ENABLED;
> }
> - if (virtio_has_feature(g->vhost->dev.features,
> - VIRTIO_GPU_F_RESOURCE_UUID)) {
> - g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_RESOURCE_UUID_ENABLED;
> - }
>
> if (!virtio_gpu_base_device_realize(qdev, NULL, NULL, errp)) {
> return;
> --
> 2.48.1
>
>
* [PATCH 06/33] vhost: make vhost_dev.features private
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (4 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 05/33] vhost-user-gpu: drop code duplication Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 18:58 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 07/33] virtio: move common part of _set_guest_notifier to generic code Vladimir Sementsov-Ogievskiy
` (27 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov, Jason Wang, Alex Bennée
It's hard to control where and how we use this field. Let's
cover all usages with getters/setters, and keep direct access to the
field only in vhost.c. This will help to control migration of this
field in further commits.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/display/vhost-user-gpu.c | 7 +++----
hw/net/vhost_net.c | 17 +++++++++--------
hw/virtio/vdpa-dev.c | 2 +-
hw/virtio/vhost-user-base.c | 8 ++++++--
hw/virtio/vhost-user.c | 7 +++----
hw/virtio/vhost.c | 21 ++++++++++++++++++---
hw/virtio/virtio-qmp.c | 2 +-
include/hw/virtio/vhost.h | 21 +++++++++++++++++++--
net/vhost-vdpa.c | 7 +++----
9 files changed, 63 insertions(+), 29 deletions(-)
diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 79ea64b12c..146620e0a3 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -631,17 +631,16 @@ vhost_user_gpu_device_realize(DeviceState *qdev, Error **errp)
/* existing backend may send DMABUF, so let's add that requirement */
g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_DMABUF_ENABLED;
- if (virtio_has_feature(g->vhost->dev.features, VIRTIO_GPU_F_VIRGL)) {
+ if (vhost_dev_has_feature(&g->vhost->dev, VIRTIO_GPU_F_VIRGL)) {
g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_VIRGL_ENABLED;
}
- if (virtio_has_feature(g->vhost->dev.features, VIRTIO_GPU_F_EDID)) {
+ if (vhost_dev_has_feature(&g->vhost->dev, VIRTIO_GPU_F_EDID)) {
g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_EDID_ENABLED;
} else {
error_report("EDID requested but the backend doesn't support it.");
g->parent_obj.conf.flags &= ~(1 << VIRTIO_GPU_FLAG_EDID_ENABLED);
}
- if (virtio_has_feature(g->vhost->dev.features,
- VIRTIO_GPU_F_RESOURCE_UUID)) {
+ if (vhost_dev_has_feature(&g->vhost->dev, VIRTIO_GPU_F_RESOURCE_UUID)) {
g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_RESOURCE_UUID_ENABLED;
}
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index ce30b6e197..5269533864 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -55,7 +55,8 @@ void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
{
net->dev.acked_features =
(qemu_has_vnet_hdr(net->nc) ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR))
- | (net->dev.features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES));
+ | (vhost_dev_features(&net->dev) &
+ (1ULL << VHOST_USER_F_PROTOCOL_FEATURES));
vhost_ack_features(&net->dev, net->feature_bits, features);
}
@@ -277,23 +278,23 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
if (backend_kernel) {
if (!qemu_has_vnet_hdr_len(options->net_backend,
sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
- net->dev.features &= ~(1ULL << VIRTIO_NET_F_MRG_RXBUF);
+ vhost_dev_clear_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF);
}
if (!qemu_has_vnet_hdr(options->net_backend) &&
- (~net->dev.features & (1ULL << VHOST_NET_F_VIRTIO_NET_HDR))) {
- fprintf(stderr, "vhost lacks feature mask 0x%llx for backend\n",
- ~net->dev.features & (1ULL << VHOST_NET_F_VIRTIO_NET_HDR));
+ !vhost_dev_has_feature(&net->dev, VHOST_NET_F_VIRTIO_NET_HDR)) {
+ fprintf(stderr, "vhost lacks VHOST_NET_F_VIRTIO_NET_HDR "
+ "feature for backend\n");
goto fail;
}
}
/* Set sane init value. Override when guest acks. */
if (options->get_acked_features) {
+ uint64_t backend_features = vhost_dev_features(&net->dev);
features = options->get_acked_features(net->nc);
- if (~net->dev.features & features) {
+ if (~backend_features & features) {
fprintf(stderr, "vhost lacks feature mask 0x%" PRIx64
- " for backend\n",
- (uint64_t)(~net->dev.features & features));
+ " for backend\n", ~backend_features & features);
goto fail;
}
}
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 3c0eed3e8e..4dfb03aaa7 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -224,7 +224,7 @@ static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
Error **errp)
{
VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
- uint64_t backend_features = s->dev.features;
+ uint64_t backend_features = vhost_dev_features(&s->dev);
if (!virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM)) {
virtio_clear_feature(&backend_features, VIRTIO_F_IOMMU_PLATFORM);
diff --git a/hw/virtio/vhost-user-base.c b/hw/virtio/vhost-user-base.c
index ff67a020b4..cf311c3bfc 100644
--- a/hw/virtio/vhost-user-base.c
+++ b/hw/virtio/vhost-user-base.c
@@ -118,9 +118,13 @@ static uint64_t vub_get_features(VirtIODevice *vdev,
uint64_t requested_features, Error **errp)
{
VHostUserBase *vub = VHOST_USER_BASE(vdev);
+ uint64_t backend_features = vhost_dev_features(&vub->vhost_dev);
+
/* This should be set when the vhost connection initialises */
- g_assert(vub->vhost_dev.features);
- return vub->vhost_dev.features & ~(1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
+ g_assert(backend_features);
+ virtio_clear_feature(&backend_features, VHOST_USER_F_PROTOCOL_FEATURES);
+
+ return backend_features;
}
/*
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index abdf47ee7b..46f09f5988 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1245,15 +1245,14 @@ static int vhost_user_set_vring_base(struct vhost_dev *dev,
static bool vhost_user_set_vring_enable_supported(struct vhost_dev *dev)
{
- return virtio_has_feature(dev->features,
- VHOST_USER_F_PROTOCOL_FEATURES);
+ return vhost_dev_has_feature(dev, VHOST_USER_F_PROTOCOL_FEATURES);
}
static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
{
int i;
- if (!virtio_has_feature(dev->features, VHOST_USER_F_PROTOCOL_FEATURES)) {
+ if (!vhost_dev_has_feature(dev, VHOST_USER_F_PROTOCOL_FEATURES)) {
return -EINVAL;
}
@@ -1465,7 +1464,7 @@ static int vhost_user_set_features(struct vhost_dev *dev,
* Don't lose VHOST_USER_F_PROTOCOL_FEATURES, which is vhost-user
* specific.
*/
- if (virtio_has_feature(dev->features, VHOST_USER_F_PROTOCOL_FEATURES)) {
+ if (vhost_dev_has_feature(dev, VHOST_USER_F_PROTOCOL_FEATURES)) {
features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
}
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index c33dad4acd..2631bbabcf 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -50,6 +50,21 @@ static QLIST_HEAD(, vhost_dev) vhost_log_devs[VHOST_BACKEND_TYPE_MAX];
static QLIST_HEAD(, vhost_dev) vhost_devices =
QLIST_HEAD_INITIALIZER(vhost_devices);
+bool vhost_dev_has_feature(struct vhost_dev *dev, uint64_t feature)
+{
+ return virtio_has_feature(dev->_features, feature);
+}
+
+uint64_t vhost_dev_features(struct vhost_dev *dev)
+{
+ return dev->_features;
+}
+
+void vhost_dev_clear_feature(struct vhost_dev *dev, uint64_t feature)
+{
+ virtio_clear_feature(&dev->_features, feature);
+}
+
unsigned int vhost_get_max_memslots(void)
{
unsigned int max = UINT_MAX;
@@ -1571,7 +1586,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
}
}
- hdev->features = features;
+ hdev->_features = features;
hdev->memory_listener = (MemoryListener) {
.name = "vhost",
@@ -1594,7 +1609,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
};
if (hdev->migration_blocker == NULL) {
- if (!(hdev->features & (0x1ULL << VHOST_F_LOG_ALL))) {
+ if (!vhost_dev_has_feature(hdev, VHOST_F_LOG_ALL)) {
error_setg(&hdev->migration_blocker,
"Migration disabled: vhost lacks VHOST_F_LOG_ALL feature.");
} else if (vhost_dev_log_is_shared(hdev) && !qemu_memfd_alloc_check()) {
@@ -1865,7 +1880,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
const int *bit = feature_bits;
while (*bit != VHOST_INVALID_FEATURE_BIT) {
uint64_t bit_mask = (1ULL << *bit);
- if (!(hdev->features & bit_mask)) {
+ if (!vhost_dev_has_feature(hdev, *bit)) {
features &= ~bit_mask;
}
bit++;
diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
index d55b12f9f3..4bd23c015e 100644
--- a/hw/virtio/virtio-qmp.c
+++ b/hw/virtio/virtio-qmp.c
@@ -785,7 +785,7 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
status->vhost_dev->nvqs = hdev->nvqs;
status->vhost_dev->vq_index = hdev->vq_index;
status->vhost_dev->features =
- qmp_decode_features(vdev->device_id, hdev->features);
+ qmp_decode_features(vdev->device_id, vhost_dev_features(hdev));
status->vhost_dev->acked_features =
qmp_decode_features(vdev->device_id, hdev->acked_features);
status->vhost_dev->max_queues = hdev->max_queues;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 15bc287a9d..8a4c8c3502 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -98,10 +98,11 @@ struct vhost_dev {
* offered by a backend which may be a subset of the total
* features eventually offered to the guest.
*
- * @features: available features provided by the backend
+ * @_features: available features provided by the backend, private,
+ * direct access only in vhost.c
* @acked_features: final negotiated features with front-end driver
*/
- uint64_t features;
+ uint64_t _features;
uint64_t acked_features;
uint64_t max_queues;
@@ -352,6 +353,22 @@ int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
struct vhost_inflight *inflight);
bool vhost_dev_has_iommu(struct vhost_dev *dev);
+/**
+ * vhost_dev_has_feature() - check if vhost device has a specific feature
+ * @dev: common vhost_dev structure
+ * @feature: feature bit to check
+ *
+ * Return: true if the feature is supported, false otherwise
+ */
+bool vhost_dev_has_feature(struct vhost_dev *dev, uint64_t feature);
+
+/**
+ * vhost_dev_features() - simple getter for dev->features
+ */
+uint64_t vhost_dev_features(struct vhost_dev *dev);
+
+void vhost_dev_clear_feature(struct vhost_dev *dev, uint64_t feature);
+
#ifdef CONFIG_VHOST
int vhost_reset_device(struct vhost_dev *hdev);
#else
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 74d26a9497..0af0d3bdd3 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -256,15 +256,14 @@ static bool vhost_vdpa_get_vnet_hash_supported_types(NetClientState *nc,
{
assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
- uint64_t features = s->vhost_vdpa.dev->features;
int fd = s->vhost_vdpa.shared->device_fd;
struct {
struct vhost_vdpa_config hdr;
uint32_t supported_hash_types;
} config;
- if (!virtio_has_feature(features, VIRTIO_NET_F_HASH_REPORT) &&
- !virtio_has_feature(features, VIRTIO_NET_F_RSS)) {
+ if (!vhost_dev_has_feature(s->vhost_vdpa.dev, VIRTIO_NET_F_HASH_REPORT) &&
+ !vhost_dev_has_feature(s->vhost_vdpa.dev, VIRTIO_NET_F_RSS)) {
return false;
}
@@ -585,7 +584,7 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
* If we early return in these cases SVQ will not be enabled. The migration
* will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
*/
- if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
+ if (!vhost_vdpa_net_valid_svq_features(vhost_dev_features(v->dev), NULL)) {
return 0;
}
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 06/33] vhost: make vhost_dev.features private
2025-08-13 16:48 ` [PATCH 06/33] vhost: make vhost_dev.features private Vladimir Sementsov-Ogievskiy
@ 2025-10-09 18:58 ` Raphael Norwitz
2025-10-09 19:40 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 18:58 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov, Jason Wang,
Alex Bennée
On Wed, Aug 13, 2025 at 12:58 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> It's hard to control where and how we use this field. Let's
> cover all usages with getters/setters, and keep direct access to the
> field only in vhost.c. This will help control migration of this
> field in further commits.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/display/vhost-user-gpu.c | 7 +++----
> hw/net/vhost_net.c | 17 +++++++++--------
> hw/virtio/vdpa-dev.c | 2 +-
> hw/virtio/vhost-user-base.c | 8 ++++++--
> hw/virtio/vhost-user.c | 7 +++----
> hw/virtio/vhost.c | 21 ++++++++++++++++++---
> hw/virtio/virtio-qmp.c | 2 +-
> include/hw/virtio/vhost.h | 21 +++++++++++++++++++--
> net/vhost-vdpa.c | 7 +++----
> 9 files changed, 63 insertions(+), 29 deletions(-)
>
> diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
> index 79ea64b12c..146620e0a3 100644
> --- a/hw/display/vhost-user-gpu.c
> +++ b/hw/display/vhost-user-gpu.c
> @@ -631,17 +631,16 @@ vhost_user_gpu_device_realize(DeviceState *qdev, Error **errp)
>
> /* existing backend may send DMABUF, so let's add that requirement */
> g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_DMABUF_ENABLED;
> - if (virtio_has_feature(g->vhost->dev.features, VIRTIO_GPU_F_VIRGL)) {
> + if (vhost_dev_has_feature(&g->vhost->dev, VIRTIO_GPU_F_VIRGL)) {
> g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_VIRGL_ENABLED;
> }
> - if (virtio_has_feature(g->vhost->dev.features, VIRTIO_GPU_F_EDID)) {
> + if (vhost_dev_has_feature(&g->vhost->dev, VIRTIO_GPU_F_EDID)) {
> g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_EDID_ENABLED;
> } else {
> error_report("EDID requested but the backend doesn't support it.");
> g->parent_obj.conf.flags &= ~(1 << VIRTIO_GPU_FLAG_EDID_ENABLED);
> }
> - if (virtio_has_feature(g->vhost->dev.features,
> - VIRTIO_GPU_F_RESOURCE_UUID)) {
> + if (vhost_dev_has_feature(&g->vhost->dev, VIRTIO_GPU_F_RESOURCE_UUID)) {
> g->parent_obj.conf.flags |= 1 << VIRTIO_GPU_FLAG_RESOURCE_UUID_ENABLED;
> }
>
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index ce30b6e197..5269533864 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -55,7 +55,8 @@ void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
> {
> net->dev.acked_features =
> (qemu_has_vnet_hdr(net->nc) ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR))
> - | (net->dev.features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES));
> + | (vhost_dev_features(&net->dev) &
> + (1ULL << VHOST_USER_F_PROTOCOL_FEATURES));
>
> vhost_ack_features(&net->dev, net->feature_bits, features);
> }
> @@ -277,23 +278,23 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
> if (backend_kernel) {
> if (!qemu_has_vnet_hdr_len(options->net_backend,
> sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
> - net->dev.features &= ~(1ULL << VIRTIO_NET_F_MRG_RXBUF);
> + vhost_dev_clear_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF);
> }
> if (!qemu_has_vnet_hdr(options->net_backend) &&
> - (~net->dev.features & (1ULL << VHOST_NET_F_VIRTIO_NET_HDR))) {
> - fprintf(stderr, "vhost lacks feature mask 0x%llx for backend\n",
> - ~net->dev.features & (1ULL << VHOST_NET_F_VIRTIO_NET_HDR));
> + !vhost_dev_has_feature(&net->dev, VHOST_NET_F_VIRTIO_NET_HDR)) {
> + fprintf(stderr, "vhost lacks VHOST_NET_F_VIRTIO_NET_HDR "
> + "feature for backend\n");
> goto fail;
> }
> }
>
> /* Set sane init value. Override when guest acks. */
> if (options->get_acked_features) {
> + uint64_t backend_features = vhost_dev_features(&net->dev);
> features = options->get_acked_features(net->nc);
> - if (~net->dev.features & features) {
> + if (~backend_features & features) {
> fprintf(stderr, "vhost lacks feature mask 0x%" PRIx64
> - " for backend\n",
> - (uint64_t)(~net->dev.features & features));
> + " for backend\n", ~backend_features & features);
> goto fail;
> }
> }
> diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> index 3c0eed3e8e..4dfb03aaa7 100644
> --- a/hw/virtio/vdpa-dev.c
> +++ b/hw/virtio/vdpa-dev.c
> @@ -224,7 +224,7 @@ static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
> Error **errp)
> {
> VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> - uint64_t backend_features = s->dev.features;
> + uint64_t backend_features = vhost_dev_features(&s->dev);
>
> if (!virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM)) {
> virtio_clear_feature(&backend_features, VIRTIO_F_IOMMU_PLATFORM);
> diff --git a/hw/virtio/vhost-user-base.c b/hw/virtio/vhost-user-base.c
> index ff67a020b4..cf311c3bfc 100644
> --- a/hw/virtio/vhost-user-base.c
> +++ b/hw/virtio/vhost-user-base.c
> @@ -118,9 +118,13 @@ static uint64_t vub_get_features(VirtIODevice *vdev,
> uint64_t requested_features, Error **errp)
> {
> VHostUserBase *vub = VHOST_USER_BASE(vdev);
> + uint64_t backend_features = vhost_dev_features(&vub->vhost_dev);
> +
> /* This should be set when the vhost connection initialises */
> - g_assert(vub->vhost_dev.features);
> - return vub->vhost_dev.features & ~(1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
> + g_assert(backend_features);
> + virtio_clear_feature(&backend_features, VHOST_USER_F_PROTOCOL_FEATURES);
> +
> + return backend_features;
> }
>
> /*
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index abdf47ee7b..46f09f5988 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -1245,15 +1245,14 @@ static int vhost_user_set_vring_base(struct vhost_dev *dev,
>
> static bool vhost_user_set_vring_enable_supported(struct vhost_dev *dev)
> {
> - return virtio_has_feature(dev->features,
> - VHOST_USER_F_PROTOCOL_FEATURES);
> + return vhost_dev_has_feature(dev, VHOST_USER_F_PROTOCOL_FEATURES);
> }
>
> static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
> {
> int i;
>
> - if (!virtio_has_feature(dev->features, VHOST_USER_F_PROTOCOL_FEATURES)) {
> + if (!vhost_dev_has_feature(dev, VHOST_USER_F_PROTOCOL_FEATURES)) {
> return -EINVAL;
> }
>
> @@ -1465,7 +1464,7 @@ static int vhost_user_set_features(struct vhost_dev *dev,
> * Don't lose VHOST_USER_F_PROTOCOL_FEATURES, which is vhost-user
> * specific.
> */
> - if (virtio_has_feature(dev->features, VHOST_USER_F_PROTOCOL_FEATURES)) {
> + if (vhost_dev_has_feature(dev, VHOST_USER_F_PROTOCOL_FEATURES)) {
> features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
> }
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index c33dad4acd..2631bbabcf 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -50,6 +50,21 @@ static QLIST_HEAD(, vhost_dev) vhost_log_devs[VHOST_BACKEND_TYPE_MAX];
> static QLIST_HEAD(, vhost_dev) vhost_devices =
> QLIST_HEAD_INITIALIZER(vhost_devices);
>
> +bool vhost_dev_has_feature(struct vhost_dev *dev, uint64_t feature)
> +{
> + return virtio_has_feature(dev->_features, feature);
> +}
> +
> +uint64_t vhost_dev_features(struct vhost_dev *dev)
> +{
> + return dev->_features;
> +}
> +
> +void vhost_dev_clear_feature(struct vhost_dev *dev, uint64_t feature)
> +{
> + virtio_clear_feature(&dev->_features, feature);
> +}
> +
> unsigned int vhost_get_max_memslots(void)
> {
> unsigned int max = UINT_MAX;
> @@ -1571,7 +1586,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> }
> }
>
> - hdev->features = features;
> + hdev->_features = features;
>
> hdev->memory_listener = (MemoryListener) {
> .name = "vhost",
> @@ -1594,7 +1609,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> };
>
> if (hdev->migration_blocker == NULL) {
> - if (!(hdev->features & (0x1ULL << VHOST_F_LOG_ALL))) {
> + if (!vhost_dev_has_feature(hdev, VHOST_F_LOG_ALL)) {
> error_setg(&hdev->migration_blocker,
> "Migration disabled: vhost lacks VHOST_F_LOG_ALL feature.");
> } else if (vhost_dev_log_is_shared(hdev) && !qemu_memfd_alloc_check()) {
> @@ -1865,7 +1880,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
> const int *bit = feature_bits;
> while (*bit != VHOST_INVALID_FEATURE_BIT) {
> uint64_t bit_mask = (1ULL << *bit);
> - if (!(hdev->features & bit_mask)) {
> + if (!vhost_dev_has_feature(hdev, *bit)) {
> features &= ~bit_mask;
> }
> bit++;
> diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
> index d55b12f9f3..4bd23c015e 100644
> --- a/hw/virtio/virtio-qmp.c
> +++ b/hw/virtio/virtio-qmp.c
> @@ -785,7 +785,7 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
> status->vhost_dev->nvqs = hdev->nvqs;
> status->vhost_dev->vq_index = hdev->vq_index;
> status->vhost_dev->features =
> - qmp_decode_features(vdev->device_id, hdev->features);
> + qmp_decode_features(vdev->device_id, vhost_dev_features(hdev));
> status->vhost_dev->acked_features =
> qmp_decode_features(vdev->device_id, hdev->acked_features);
> status->vhost_dev->max_queues = hdev->max_queues;
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 15bc287a9d..8a4c8c3502 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -98,10 +98,11 @@ struct vhost_dev {
> * offered by a backend which may be a subset of the total
> * features eventually offered to the guest.
> *
> - * @features: available features provided by the backend
> + * @_features: available features provided by the backend, private,
> + * direct access only in vhost.c
> * @acked_features: final negotiated features with front-end driver
> */
> - uint64_t features;
> + uint64_t _features;
> uint64_t acked_features;
>
> uint64_t max_queues;
> @@ -352,6 +353,22 @@ int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
> struct vhost_inflight *inflight);
> bool vhost_dev_has_iommu(struct vhost_dev *dev);
>
> +/**
> + * vhost_dev_has_feature() - check if vhost device has a specific feature
> + * @dev: common vhost_dev structure
> + * @feature: feature bit to check
> + *
> + * Return: true if the feature is supported, false otherwise
> + */
> +bool vhost_dev_has_feature(struct vhost_dev *dev, uint64_t feature);
> +
> +/**
> + * vhost_dev_features() - simple getter for dev->features
> + */
> +uint64_t vhost_dev_features(struct vhost_dev *dev);
> +
> +void vhost_dev_clear_feature(struct vhost_dev *dev, uint64_t feature);
Why not define these as static inline helpers in the header file?
> +
> #ifdef CONFIG_VHOST
> int vhost_reset_device(struct vhost_dev *hdev);
> #else
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 74d26a9497..0af0d3bdd3 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -256,15 +256,14 @@ static bool vhost_vdpa_get_vnet_hash_supported_types(NetClientState *nc,
> {
> assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> - uint64_t features = s->vhost_vdpa.dev->features;
> int fd = s->vhost_vdpa.shared->device_fd;
> struct {
> struct vhost_vdpa_config hdr;
> uint32_t supported_hash_types;
> } config;
>
> - if (!virtio_has_feature(features, VIRTIO_NET_F_HASH_REPORT) &&
> - !virtio_has_feature(features, VIRTIO_NET_F_RSS)) {
> + if (!vhost_dev_has_feature(s->vhost_vdpa.dev, VIRTIO_NET_F_HASH_REPORT) &&
> + !vhost_dev_has_feature(s->vhost_vdpa.dev, VIRTIO_NET_F_RSS)) {
> return false;
> }
>
> @@ -585,7 +584,7 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> * If we early return in these cases SVQ will not be enabled. The migration
> * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
> */
> - if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> + if (!vhost_vdpa_net_valid_svq_features(vhost_dev_features(v->dev), NULL)) {
> return 0;
> }
>
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 06/33] vhost: make vhost_dev.features private
2025-10-09 18:58 ` Raphael Norwitz
@ 2025-10-09 19:40 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 19:40 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov, Jason Wang,
Alex Bennée
On 09.10.25 21:58, Raphael Norwitz wrote:
> On Wed, Aug 13, 2025 at 12:58 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> It's hard to control where and how we use this field. Let's
>> cover all usages with getters/setters, and keep direct access to the
>> field only in vhost.c. This will help control migration of this
>> field in further commits.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
[..]
>> +/**
>> + * vhost_dev_has_feature() - check if vhost device has a specific feature
>> + * @dev: common vhost_dev structure
>> + * @feature: feature bit to check
>> + *
>> + * Return: true if the feature is supported, false otherwise
>> + */
>> +bool vhost_dev_has_feature(struct vhost_dev *dev, uint64_t feature);
>> +
>> +/**
>> + * vhost_dev_features() - simple getter for dev->features
>> + */
>> +uint64_t vhost_dev_features(struct vhost_dev *dev);
>> +
>> +void vhost_dev_clear_feature(struct vhost_dev *dev, uint64_t feature);
>
> Why not define these as static inline helpers in the header file?
>
Agree, will do.
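For illustration only (not part of the patch series), a minimal standalone sketch of what the static-inline variant suggested above could look like. The `virtio_has_feature()`/`virtio_clear_feature()` functions here are simplified stand-ins for the corresponding QEMU helpers, and `struct vhost_dev` is reduced to just the `_features` field:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the QEMU structure: only the private field */
struct vhost_dev {
    uint64_t _features;
};

/* Stand-ins approximating QEMU's virtio feature-bit helpers */
static inline bool virtio_has_feature(uint64_t features, unsigned int fbit)
{
    return (features & (1ULL << fbit)) != 0;
}

static inline void virtio_clear_feature(uint64_t *features, unsigned int fbit)
{
    *features &= ~(1ULL << fbit);
}

/* The accessors as static inline helpers, as the review suggests */
static inline bool vhost_dev_has_feature(struct vhost_dev *dev,
                                         uint64_t feature)
{
    return virtio_has_feature(dev->_features, feature);
}

static inline uint64_t vhost_dev_features(struct vhost_dev *dev)
{
    return dev->_features;
}

static inline void vhost_dev_clear_feature(struct vhost_dev *dev,
                                           uint64_t feature)
{
    virtio_clear_feature(&dev->_features, feature);
}
```

Defined this way in the header, the helpers compile away to direct field accesses while still keeping `_features` nominally private by convention.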
>
>> +
>> #ifdef CONFIG_VHOST
>> int vhost_reset_device(struct vhost_dev *hdev);
>> #else
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 108+ messages in thread
* [PATCH 07/33] virtio: move common part of _set_guest_notifier to generic code
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (5 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 06/33] vhost: make vhost_dev.features private Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-08-14 4:53 ` Philippe Mathieu-Daudé
2025-08-13 16:48 ` [PATCH 08/33] virtio: drop *_set_guest_notifier_fd_handler() helpers Vladimir Sementsov-Ogievskiy
` (26 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
virtio-pci and virtio-mmiio handle config notifier equally but
with different code (mmio adds a separate function, while pci
uses a common function). Let's choose the more compact way (pci)
and reuse it for mmio.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/virtio-mmio.c | 41 +++++------------------------
hw/virtio/virtio-pci.c | 34 +++---------------------
hw/virtio/virtio.c | 48 +++++++++++++++++++++++++++++++---
include/hw/virtio/virtio-pci.h | 3 ---
include/hw/virtio/virtio.h | 7 +++--
5 files changed, 58 insertions(+), 75 deletions(-)
diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 532c67107b..18afa1c0d0 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -658,18 +658,11 @@ static int virtio_mmio_set_guest_notifier(DeviceState *d, int n, bool assign,
VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
- VirtQueue *vq = virtio_get_queue(vdev, n);
- EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
+ int r;
- if (assign) {
- int r = event_notifier_init(notifier, 0);
- if (r < 0) {
- return r;
- }
- virtio_queue_set_guest_notifier_fd_handler(vq, true, with_irqfd);
- } else {
- virtio_queue_set_guest_notifier_fd_handler(vq, false, with_irqfd);
- event_notifier_cleanup(notifier);
+ r = virtio_queue_set_guest_notifier(vdev, n, assign, with_irqfd);
+ if (r < 0) {
+ return r;
}
if (vdc->guest_notifier_mask && vdev->use_guest_notifier_mask) {
@@ -678,30 +671,7 @@ static int virtio_mmio_set_guest_notifier(DeviceState *d, int n, bool assign,
return 0;
}
-static int virtio_mmio_set_config_guest_notifier(DeviceState *d, bool assign,
- bool with_irqfd)
-{
- VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
- VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
- VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
- EventNotifier *notifier = virtio_config_get_guest_notifier(vdev);
- int r = 0;
- if (assign) {
- r = event_notifier_init(notifier, 0);
- if (r < 0) {
- return r;
- }
- virtio_config_set_guest_notifier_fd_handler(vdev, assign, with_irqfd);
- } else {
- virtio_config_set_guest_notifier_fd_handler(vdev, assign, with_irqfd);
- event_notifier_cleanup(notifier);
- }
- if (vdc->guest_notifier_mask && vdev->use_guest_notifier_mask) {
- vdc->guest_notifier_mask(vdev, VIRTIO_CONFIG_IRQ_IDX, !assign);
- }
- return r;
-}
static int virtio_mmio_set_guest_notifiers(DeviceState *d, int nvqs,
bool assign)
{
@@ -723,7 +693,8 @@ static int virtio_mmio_set_guest_notifiers(DeviceState *d, int nvqs,
goto assign_error;
}
}
- r = virtio_mmio_set_config_guest_notifier(d, assign, with_irqfd);
+ r = virtio_mmio_set_guest_notifier(d, VIRTIO_CONFIG_IRQ_IDX, assign,
+ with_irqfd);
if (r < 0) {
goto assign_error;
}
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 767216d795..3eca3fe2c8 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1163,43 +1163,17 @@ static void virtio_pci_vector_poll(PCIDevice *dev,
}
}
-void virtio_pci_set_guest_notifier_fd_handler(VirtIODevice *vdev, VirtQueue *vq,
- int n, bool assign,
- bool with_irqfd)
-{
- if (n == VIRTIO_CONFIG_IRQ_IDX) {
- virtio_config_set_guest_notifier_fd_handler(vdev, assign, with_irqfd);
- } else {
- virtio_queue_set_guest_notifier_fd_handler(vq, assign, with_irqfd);
- }
-}
-
static int virtio_pci_set_guest_notifier(DeviceState *d, int n, bool assign,
bool with_irqfd)
{
VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
- VirtQueue *vq = NULL;
- EventNotifier *notifier = NULL;
+ int r;
- if (n == VIRTIO_CONFIG_IRQ_IDX) {
- notifier = virtio_config_get_guest_notifier(vdev);
- } else {
- vq = virtio_get_queue(vdev, n);
- notifier = virtio_queue_get_guest_notifier(vq);
- }
-
- if (assign) {
- int r = event_notifier_init(notifier, 0);
- if (r < 0) {
- return r;
- }
- virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, true, with_irqfd);
- } else {
- virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, false,
- with_irqfd);
- event_notifier_cleanup(notifier);
+ r = virtio_queue_set_guest_notifier(vdev, n, assign, with_irqfd);
+ if (r < 0) {
+ return r;
}
if (!msix_enabled(&proxy->pci_dev) &&
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 9a81ad912e..7880c3bcd9 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3750,8 +3750,10 @@ static void virtio_config_guest_notifier_read(EventNotifier *n)
virtio_notify_config(vdev);
}
}
-void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool assign,
- bool with_irqfd)
+
+static void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq,
+ bool assign,
+ bool with_irqfd)
{
if (assign && !with_irqfd) {
event_notifier_set_handler(&vq->guest_notifier,
@@ -3766,7 +3768,7 @@ void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool assign,
}
}
-void virtio_config_set_guest_notifier_fd_handler(VirtIODevice *vdev,
+static void virtio_config_set_guest_notifier_fd_handler(VirtIODevice *vdev,
bool assign, bool with_irqfd)
{
EventNotifier *n;
@@ -3783,6 +3785,46 @@ void virtio_config_set_guest_notifier_fd_handler(VirtIODevice *vdev,
}
}
+static void virtio_pci_set_guest_notifier_fd_handler(VirtIODevice *vdev,
+ VirtQueue *vq,
+ int n, bool assign,
+ bool with_irqfd)
+{
+ if (n == VIRTIO_CONFIG_IRQ_IDX) {
+ virtio_config_set_guest_notifier_fd_handler(vdev, assign, with_irqfd);
+ } else {
+ virtio_queue_set_guest_notifier_fd_handler(vq, assign, with_irqfd);
+ }
+}
+
+int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
+ bool with_irqfd)
+{
+ VirtQueue *vq = NULL;
+ EventNotifier *notifier = NULL;
+
+ if (n == VIRTIO_CONFIG_IRQ_IDX) {
+ notifier = virtio_config_get_guest_notifier(vdev);
+ } else {
+ vq = virtio_get_queue(vdev, n);
+ notifier = virtio_queue_get_guest_notifier(vq);
+ }
+
+ if (assign) {
+ int r = event_notifier_init(notifier, 0);
+ if (r < 0) {
+ return r;
+ }
+ virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, true, with_irqfd);
+ } else {
+ virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, false,
+ with_irqfd);
+ event_notifier_cleanup(notifier);
+ }
+
+ return 0;
+}
+
EventNotifier *virtio_queue_get_guest_notifier(VirtQueue *vq)
{
return &vq->guest_notifier;
diff --git a/include/hw/virtio/virtio-pci.h b/include/hw/virtio/virtio-pci.h
index eab5394898..02c2dfb3c6 100644
--- a/include/hw/virtio/virtio-pci.h
+++ b/include/hw/virtio/virtio-pci.h
@@ -263,9 +263,6 @@ void virtio_pci_types_register(const VirtioPCIDeviceTypeInfo *t);
* @fixed_queues.
*/
unsigned virtio_pci_optimal_num_queues(unsigned fixed_queues);
-void virtio_pci_set_guest_notifier_fd_handler(VirtIODevice *vdev, VirtQueue *vq,
- int n, bool assign,
- bool with_irqfd);
int virtio_pci_add_shm_cap(VirtIOPCIProxy *proxy, uint8_t bar, uint64_t offset,
uint64_t length, uint8_t id);
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index c594764f23..8b9db08ddf 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -416,8 +416,6 @@ void virtio_queue_update_used_idx(VirtIODevice *vdev, int n);
VirtQueue *virtio_get_queue(VirtIODevice *vdev, int n);
uint16_t virtio_get_queue_index(VirtQueue *vq);
EventNotifier *virtio_queue_get_guest_notifier(VirtQueue *vq);
-void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool assign,
- bool with_irqfd);
int virtio_device_start_ioeventfd(VirtIODevice *vdev);
int virtio_device_grab_ioeventfd(VirtIODevice *vdev);
void virtio_device_release_ioeventfd(VirtIODevice *vdev);
@@ -431,8 +429,9 @@ void virtio_queue_aio_detach_host_notifier(VirtQueue *vq, AioContext *ctx);
VirtQueue *virtio_vector_first_queue(VirtIODevice *vdev, uint16_t vector);
VirtQueue *virtio_vector_next_queue(VirtQueue *vq);
EventNotifier *virtio_config_get_guest_notifier(VirtIODevice *vdev);
-void virtio_config_set_guest_notifier_fd_handler(VirtIODevice *vdev,
- bool assign, bool with_irqfd);
+
+int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
+ bool with_irqfd);
static inline void virtio_add_feature(uint64_t *features, unsigned int fbit)
{
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 07/33] virtio: move common part of _set_guest_notifier to generic code
2025-08-13 16:48 ` [PATCH 07/33] virtio: move common part of _set_guest_notifier to generic code Vladimir Sementsov-Ogievskiy
@ 2025-08-14 4:53 ` Philippe Mathieu-Daudé
2025-08-14 11:15 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-14 4:53 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov
Hi Vladimir,
On 13/8/25 18:48, Vladimir Sementsov-Ogievskiy wrote:
> virtio-pci and virtio-mmiio handle config notifier equally but
Typo virtio-mmio.
> with different code (mmio adds a separate function, when pci
> use common function). Let's chose the more compact way (pci)
> and reuse it for mmio.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/virtio-mmio.c | 41 +++++------------------------
> hw/virtio/virtio-pci.c | 34 +++---------------------
> hw/virtio/virtio.c | 48 +++++++++++++++++++++++++++++++---
> include/hw/virtio/virtio-pci.h | 3 ---
> include/hw/virtio/virtio.h | 7 +++--
> 5 files changed, 58 insertions(+), 75 deletions(-)
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 9a81ad912e..7880c3bcd9 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> +static void virtio_pci_set_guest_notifier_fd_handler(VirtIODevice *vdev,
> + VirtQueue *vq,
> + int n, bool assign,
> + bool with_irqfd)
> +{
> + if (n == VIRTIO_CONFIG_IRQ_IDX) {
> + virtio_config_set_guest_notifier_fd_handler(vdev, assign, with_irqfd);
> + } else {
> + virtio_queue_set_guest_notifier_fd_handler(vq, assign, with_irqfd);
> + }
> +}
> +
> +int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
> + bool with_irqfd)
> +{
> + VirtQueue *vq = NULL;
> + EventNotifier *notifier = NULL;
> +
> + if (n == VIRTIO_CONFIG_IRQ_IDX) {
> + notifier = virtio_config_get_guest_notifier(vdev);
> + } else {
> + vq = virtio_get_queue(vdev, n);
> + notifier = virtio_queue_get_guest_notifier(vq);
> + }
> +
> + if (assign) {
> + int r = event_notifier_init(notifier, 0);
> + if (r < 0) {
> + return r;
> + }
> + virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, true, with_irqfd);
> + } else {
> + virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, false,
> + with_irqfd);
> + event_notifier_cleanup(notifier);
> + }
> +
> + return 0;
> +}
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index c594764f23..8b9db08ddf 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> -void virtio_config_set_guest_notifier_fd_handler(VirtIODevice *vdev,
> - bool assign, bool with_irqfd);
> +
> +int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
> + bool with_irqfd);
Please add a @docstring to document (@n in particular).
Thanks,
Phil.
* Re: [PATCH 07/33] virtio: move common part of _set_guest_notifier to generic code
2025-08-14 4:53 ` Philippe Mathieu-Daudé
@ 2025-08-14 11:15 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-14 11:15 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov
On 14.08.25 07:53, Philippe Mathieu-Daudé wrote:
> Hi Vladimir,
>
> On 13/8/25 18:48, Vladimir Sementsov-Ogievskiy wrote:
>> virtio-pci and virtio-mmiio handle config notifier equally but
>
> Typo virtio-mmio.
>
>> with different code (mmio adds a separate function, when pci
>> use common function). Let's chose the more compact way (pci)
>> and reuse it for mmio.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> hw/virtio/virtio-mmio.c | 41 +++++------------------------
>> hw/virtio/virtio-pci.c | 34 +++---------------------
>> hw/virtio/virtio.c | 48 +++++++++++++++++++++++++++++++---
>> include/hw/virtio/virtio-pci.h | 3 ---
>> include/hw/virtio/virtio.h | 7 +++--
>> 5 files changed, 58 insertions(+), 75 deletions(-)
>
>
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 9a81ad912e..7880c3bcd9 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>
>
>> +static void virtio_pci_set_guest_notifier_fd_handler(VirtIODevice *vdev,
>> + VirtQueue *vq,
>> + int n, bool assign,
>> + bool with_irqfd)
>> +{
>> + if (n == VIRTIO_CONFIG_IRQ_IDX) {
>> + virtio_config_set_guest_notifier_fd_handler(vdev, assign, with_irqfd);
>> + } else {
>> + virtio_queue_set_guest_notifier_fd_handler(vq, assign, with_irqfd);
>> + }
>> +}
>> +
>> +int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
>> + bool with_irqfd)
>> +{
>> + VirtQueue *vq = NULL;
>> + EventNotifier *notifier = NULL;
>> +
>> + if (n == VIRTIO_CONFIG_IRQ_IDX) {
>> + notifier = virtio_config_get_guest_notifier(vdev);
>> + } else {
>> + vq = virtio_get_queue(vdev, n);
>> + notifier = virtio_queue_get_guest_notifier(vq);
>> + }
>> +
>> + if (assign) {
>> + int r = event_notifier_init(notifier, 0);
>> + if (r < 0) {
>> + return r;
>> + }
>> + virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, true, with_irqfd);
>> + } else {
>> + virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, false,
>> + with_irqfd);
>> + event_notifier_cleanup(notifier);
>> + }
>> +
>> + return 0;
>> +}
>
>
>> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
>> index c594764f23..8b9db08ddf 100644
>> --- a/include/hw/virtio/virtio.h
>> +++ b/include/hw/virtio/virtio.h
>
>
>> -void virtio_config_set_guest_notifier_fd_handler(VirtIODevice *vdev,
>> - bool assign, bool with_irqfd);
>> +
>> +int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
>> + bool with_irqfd);
>
> Please add a @docstring to document (@n in particular).
>
> Thanks,
>
> Phil.
Will do. Thanks for reviewing!
--
Best regards,
Vladimir
* [PATCH 08/33] virtio: drop *_set_guest_notifier_fd_handler() helpers
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (6 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 07/33] virtio: move common part of _set_guest_notifier to generic code Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-08-13 16:48 ` [PATCH 09/33] vhost-user: keep QIOChannelSocket for backend channel Vladimir Sementsov-Ogievskiy
` (25 subsequent siblings)
33 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Now they no longer make the code more readable. Let's put the whole
logic into virtio_queue_set_guest_notifier() instead.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/virtio.c | 76 +++++++++++-----------------------------------
1 file changed, 17 insertions(+), 59 deletions(-)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 7880c3bcd9..10891f0e0c 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3751,74 +3751,32 @@ static void virtio_config_guest_notifier_read(EventNotifier *n)
}
}
-static void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq,
- bool assign,
- bool with_irqfd)
-{
- if (assign && !with_irqfd) {
- event_notifier_set_handler(&vq->guest_notifier,
- virtio_queue_guest_notifier_read);
- } else {
- event_notifier_set_handler(&vq->guest_notifier, NULL);
- }
- if (!assign) {
- /* Test and clear notifier before closing it,
- * in case poll callback didn't have time to run. */
- virtio_queue_guest_notifier_read(&vq->guest_notifier);
- }
-}
-
-static void virtio_config_set_guest_notifier_fd_handler(VirtIODevice *vdev,
- bool assign, bool with_irqfd)
-{
- EventNotifier *n;
- n = &vdev->config_notifier;
- if (assign && !with_irqfd) {
- event_notifier_set_handler(n, virtio_config_guest_notifier_read);
- } else {
- event_notifier_set_handler(n, NULL);
- }
- if (!assign) {
- /* Test and clear notifier before closing it,*/
- /* in case poll callback didn't have time to run. */
- virtio_config_guest_notifier_read(n);
- }
-}
-
-static void virtio_pci_set_guest_notifier_fd_handler(VirtIODevice *vdev,
- VirtQueue *vq,
- int n, bool assign,
- bool with_irqfd)
-{
- if (n == VIRTIO_CONFIG_IRQ_IDX) {
- virtio_config_set_guest_notifier_fd_handler(vdev, assign, with_irqfd);
- } else {
- virtio_queue_set_guest_notifier_fd_handler(vq, assign, with_irqfd);
- }
-}
-
int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
bool with_irqfd)
{
- VirtQueue *vq = NULL;
- EventNotifier *notifier = NULL;
-
- if (n == VIRTIO_CONFIG_IRQ_IDX) {
- notifier = virtio_config_get_guest_notifier(vdev);
- } else {
- vq = virtio_get_queue(vdev, n);
- notifier = virtio_queue_get_guest_notifier(vq);
- }
+ bool is_config = n == VIRTIO_CONFIG_IRQ_IDX;
+ VirtQueue *vq = is_config ? NULL : virtio_get_queue(vdev, n);
+ EventNotifier *notifier = is_config ?
+ virtio_config_get_guest_notifier(vdev) :
+ virtio_queue_get_guest_notifier(vq);
+ EventNotifierHandler *read_fn = is_config ?
+ virtio_config_guest_notifier_read :
+ virtio_queue_guest_notifier_read;
if (assign) {
int r = event_notifier_init(notifier, 0);
if (r < 0) {
return r;
}
- virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, true, with_irqfd);
- } else {
- virtio_pci_set_guest_notifier_fd_handler(vdev, vq, n, false,
- with_irqfd);
+ }
+
+ event_notifier_set_handler(notifier,
+ (assign && !with_irqfd) ? read_fn : NULL);
+
+ if (!assign) {
+ /* Test and clear notifier before closing it,*/
+ /* in case poll callback didn't have time to run. */
+ read_fn(notifier);
event_notifier_cleanup(notifier);
}
--
2.48.1
* [PATCH 09/33] vhost-user: keep QIOChannelSocket for backend channel
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (7 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 08/33] virtio: drop *_set_guest_notifier_fd_handler() helpers Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 18:58 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 10/33] vhost: vhost_virtqueue_start(): fix failure path Vladimir Sementsov-Ogievskiy
` (24 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Keep a QIOChannelSocket pointer instead of the more generic
QIOChannel. No real difference for now, but it will make it
simpler to migrate the socket fd in a further commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost-user.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 46f09f5988..fe9d91348d 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -244,7 +244,7 @@ struct vhost_user {
struct vhost_dev *dev;
/* Shared between vhost devs of the same virtio device */
VhostUserState *user;
- QIOChannel *backend_ioc;
+ QIOChannelSocket *backend_sioc;
GSource *backend_src;
NotifierWithReturn postcopy_notifier;
struct PostCopyFD postcopy_fd;
@@ -1789,8 +1789,8 @@ static void close_backend_channel(struct vhost_user *u)
g_source_destroy(u->backend_src);
g_source_unref(u->backend_src);
u->backend_src = NULL;
- object_unref(OBJECT(u->backend_ioc));
- u->backend_ioc = NULL;
+ object_unref(OBJECT(u->backend_sioc));
+ u->backend_sioc = NULL;
}
static gboolean backend_read(QIOChannel *ioc, GIOCondition condition,
@@ -1897,7 +1897,6 @@ static int vhost_setup_backend_channel(struct vhost_dev *dev)
bool reply_supported =
vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK);
Error *local_err = NULL;
- QIOChannel *ioc;
if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_BACKEND_REQ)) {
return 0;
@@ -1909,15 +1908,15 @@ static int vhost_setup_backend_channel(struct vhost_dev *dev)
return -saved_errno;
}
- ioc = QIO_CHANNEL(qio_channel_socket_new_fd(sv[0], &local_err));
- if (!ioc) {
+ u->backend_sioc = qio_channel_socket_new_fd(sv[0], &local_err);
+ if (!u->backend_sioc) {
error_report_err(local_err);
return -ECONNREFUSED;
}
- u->backend_ioc = ioc;
- u->backend_src = qio_channel_add_watch_source(u->backend_ioc,
- G_IO_IN | G_IO_HUP,
- backend_read, dev, NULL, NULL);
+ u->backend_src = qio_channel_add_watch_source(QIO_CHANNEL(u->backend_sioc),
+ G_IO_IN | G_IO_HUP,
+ backend_read, dev,
+ NULL, NULL);
if (reply_supported) {
msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
@@ -2321,7 +2320,7 @@ static int vhost_user_backend_cleanup(struct vhost_dev *dev)
close(u->postcopy_fd.fd);
u->postcopy_fd.handler = NULL;
}
- if (u->backend_ioc) {
+ if (u->backend_sioc) {
close_backend_channel(u);
}
g_free(u->region_rb);
--
2.48.1
* Re: [PATCH 09/33] vhost-user: keep QIOChannelSocket for backend channel
2025-08-13 16:48 ` [PATCH 09/33] vhost-user: keep QIOChannelSocket for backend channel Vladimir Sementsov-Ogievskiy
@ 2025-10-09 18:58 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 18:58 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Acked-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 1:01 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Keep a QIOChannelSocket pointer instead of the more generic
> QIOChannel. No real difference for now, but it will make it
> simpler to migrate the socket fd in a further commit.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost-user.c | 21 ++++++++++-----------
> 1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 46f09f5988..fe9d91348d 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -244,7 +244,7 @@ struct vhost_user {
> struct vhost_dev *dev;
> /* Shared between vhost devs of the same virtio device */
> VhostUserState *user;
> - QIOChannel *backend_ioc;
> + QIOChannelSocket *backend_sioc;
> GSource *backend_src;
> NotifierWithReturn postcopy_notifier;
> struct PostCopyFD postcopy_fd;
> @@ -1789,8 +1789,8 @@ static void close_backend_channel(struct vhost_user *u)
> g_source_destroy(u->backend_src);
> g_source_unref(u->backend_src);
> u->backend_src = NULL;
> - object_unref(OBJECT(u->backend_ioc));
> - u->backend_ioc = NULL;
> + object_unref(OBJECT(u->backend_sioc));
> + u->backend_sioc = NULL;
> }
>
> static gboolean backend_read(QIOChannel *ioc, GIOCondition condition,
> @@ -1897,7 +1897,6 @@ static int vhost_setup_backend_channel(struct vhost_dev *dev)
> bool reply_supported =
> vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_REPLY_ACK);
> Error *local_err = NULL;
> - QIOChannel *ioc;
>
> if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_BACKEND_REQ)) {
> return 0;
> @@ -1909,15 +1908,15 @@ static int vhost_setup_backend_channel(struct vhost_dev *dev)
> return -saved_errno;
> }
>
> - ioc = QIO_CHANNEL(qio_channel_socket_new_fd(sv[0], &local_err));
> - if (!ioc) {
> + u->backend_sioc = qio_channel_socket_new_fd(sv[0], &local_err);
> + if (!u->backend_sioc) {
> error_report_err(local_err);
> return -ECONNREFUSED;
> }
> - u->backend_ioc = ioc;
> - u->backend_src = qio_channel_add_watch_source(u->backend_ioc,
> - G_IO_IN | G_IO_HUP,
> - backend_read, dev, NULL, NULL);
> + u->backend_src = qio_channel_add_watch_source(QIO_CHANNEL(u->backend_sioc),
> + G_IO_IN | G_IO_HUP,
> + backend_read, dev,
> + NULL, NULL);
>
> if (reply_supported) {
> msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
> @@ -2321,7 +2320,7 @@ static int vhost_user_backend_cleanup(struct vhost_dev *dev)
> close(u->postcopy_fd.fd);
> u->postcopy_fd.handler = NULL;
> }
> - if (u->backend_ioc) {
> + if (u->backend_sioc) {
> close_backend_channel(u);
> }
> g_free(u->region_rb);
> --
> 2.48.1
>
>
* [PATCH 10/33] vhost: vhost_virtqueue_start(): fix failure path
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (8 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 09/33] vhost-user: keep QIOChannelSocket for backend channel Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:00 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 11/33] vhost: make vhost_memory_unmap() null-safe Vladimir Sementsov-Ogievskiy
` (23 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
We miss a call to unmap in cases when vhost_memory_map() returns a
length less than requested (we still consider such cases an
error). Let's fix it in vhost_memory_map().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 33 +++++++++++++++++++++------------
1 file changed, 21 insertions(+), 12 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2631bbabcf..1e14987cd5 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -467,10 +467,19 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
}
static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
- hwaddr *plen, bool is_write)
+ hwaddr len, bool is_write)
{
+ hwaddr mapped_len = len;
if (!vhost_dev_has_iommu(dev)) {
- return cpu_physical_memory_map(addr, plen, is_write);
+ void *res = cpu_physical_memory_map(addr, &mapped_len, is_write);
+ if (!res) {
+ return NULL;
+ }
+ if (len != mapped_len) {
+ cpu_physical_memory_unmap(res, mapped_len, 0, 0);
+ return NULL;
+ }
+ return res;
} else {
return (void *)(uintptr_t)addr;
}
@@ -1259,7 +1268,7 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
VirtioBusState *vbus = VIRTIO_BUS(qbus);
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
- hwaddr s, l, a;
+ hwaddr l, a;
int r;
int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
struct vhost_vring_file file = {
@@ -1299,24 +1308,24 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
}
}
- vq->desc_size = s = l = virtio_queue_get_desc_size(vdev, idx);
+ vq->desc_size = l = virtio_queue_get_desc_size(vdev, idx);
vq->desc_phys = a;
- vq->desc = vhost_memory_map(dev, a, &l, false);
- if (!vq->desc || l != s) {
+ vq->desc = vhost_memory_map(dev, a, l, false);
+ if (!vq->desc) {
r = -ENOMEM;
goto fail_alloc_desc;
}
- vq->avail_size = s = l = virtio_queue_get_avail_size(vdev, idx);
+ vq->avail_size = l = virtio_queue_get_avail_size(vdev, idx);
vq->avail_phys = a = virtio_queue_get_avail_addr(vdev, idx);
- vq->avail = vhost_memory_map(dev, a, &l, false);
- if (!vq->avail || l != s) {
+ vq->avail = vhost_memory_map(dev, a, l, false);
+ if (!vq->avail) {
r = -ENOMEM;
goto fail_alloc_avail;
}
- vq->used_size = s = l = virtio_queue_get_used_size(vdev, idx);
+ vq->used_size = l = virtio_queue_get_used_size(vdev, idx);
vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
- vq->used = vhost_memory_map(dev, a, &l, true);
- if (!vq->used || l != s) {
+ vq->used = vhost_memory_map(dev, a, l, true);
+ if (!vq->used) {
r = -ENOMEM;
goto fail_alloc_used;
}
--
2.48.1
* Re: [PATCH 10/33] vhost: vhost_virtqueue_start(): fix failure path
2025-08-13 16:48 ` [PATCH 10/33] vhost: vhost_virtqueue_start(): fix failure path Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:00 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:00 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Nice cleanup.
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> We miss a call to unmap in cases when vhost_memory_map() returns a
> length less than requested (we still consider such cases an
> error). Let's fix it in vhost_memory_map().
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 33 +++++++++++++++++++++------------
> 1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 2631bbabcf..1e14987cd5 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -467,10 +467,19 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
> }
>
> static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
> - hwaddr *plen, bool is_write)
> + hwaddr len, bool is_write)
> {
> + hwaddr mapped_len = len;
> if (!vhost_dev_has_iommu(dev)) {
> - return cpu_physical_memory_map(addr, plen, is_write);
> + void *res = cpu_physical_memory_map(addr, &mapped_len, is_write);
> + if (!res) {
> + return NULL;
> + }
> + if (len != mapped_len) {
> + cpu_physical_memory_unmap(res, mapped_len, 0, 0);
> + return NULL;
> + }
> + return res;
> } else {
> return (void *)(uintptr_t)addr;
> }
> @@ -1259,7 +1268,7 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> VirtioBusState *vbus = VIRTIO_BUS(qbus);
> VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
> - hwaddr s, l, a;
> + hwaddr l, a;
> int r;
> int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
> struct vhost_vring_file file = {
> @@ -1299,24 +1308,24 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> }
> }
>
> - vq->desc_size = s = l = virtio_queue_get_desc_size(vdev, idx);
> + vq->desc_size = l = virtio_queue_get_desc_size(vdev, idx);
> vq->desc_phys = a;
> - vq->desc = vhost_memory_map(dev, a, &l, false);
> - if (!vq->desc || l != s) {
> + vq->desc = vhost_memory_map(dev, a, l, false);
> + if (!vq->desc) {
> r = -ENOMEM;
> goto fail_alloc_desc;
> }
> - vq->avail_size = s = l = virtio_queue_get_avail_size(vdev, idx);
> + vq->avail_size = l = virtio_queue_get_avail_size(vdev, idx);
> vq->avail_phys = a = virtio_queue_get_avail_addr(vdev, idx);
> - vq->avail = vhost_memory_map(dev, a, &l, false);
> - if (!vq->avail || l != s) {
> + vq->avail = vhost_memory_map(dev, a, l, false);
> + if (!vq->avail) {
> r = -ENOMEM;
> goto fail_alloc_avail;
> }
> - vq->used_size = s = l = virtio_queue_get_used_size(vdev, idx);
> + vq->used_size = l = virtio_queue_get_used_size(vdev, idx);
> vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
> - vq->used = vhost_memory_map(dev, a, &l, true);
> - if (!vq->used || l != s) {
> + vq->used = vhost_memory_map(dev, a, l, true);
> + if (!vq->used) {
> r = -ENOMEM;
> goto fail_alloc_used;
> }
> --
> 2.48.1
>
>
* [PATCH 11/33] vhost: make vhost_memory_unmap() null-safe
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (9 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 10/33] vhost: vhost_virtqueue_start(): fix failure path Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:00 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 12/33] vhost: simplify calls to vhost_memory_unmap() Vladimir Sementsov-Ogievskiy
` (22 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
This helps to simplify failure paths of vhost_virtqueue_start()
a lot.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 1e14987cd5..1fdc1937b6 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -489,6 +489,10 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
hwaddr len, int is_write,
hwaddr access_len)
{
+ if (!buffer) {
+ return;
+ }
+
if (!vhost_dev_has_iommu(dev)) {
cpu_physical_memory_unmap(buffer, len, is_write, access_len);
}
@@ -1313,33 +1317,33 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
vq->desc = vhost_memory_map(dev, a, l, false);
if (!vq->desc) {
r = -ENOMEM;
- goto fail_alloc_desc;
+ goto fail;
}
vq->avail_size = l = virtio_queue_get_avail_size(vdev, idx);
vq->avail_phys = a = virtio_queue_get_avail_addr(vdev, idx);
vq->avail = vhost_memory_map(dev, a, l, false);
if (!vq->avail) {
r = -ENOMEM;
- goto fail_alloc_avail;
+ goto fail;
}
vq->used_size = l = virtio_queue_get_used_size(vdev, idx);
vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
vq->used = vhost_memory_map(dev, a, l, true);
if (!vq->used) {
r = -ENOMEM;
- goto fail_alloc_used;
+ goto fail;
}
r = vhost_virtqueue_set_addr(dev, vq, vhost_vq_index, dev->log_enabled);
if (r < 0) {
- goto fail_alloc;
+ goto fail;
}
file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(vvq));
r = dev->vhost_ops->vhost_set_vring_kick(dev, &file);
if (r) {
VHOST_OPS_DEBUG(r, "vhost_set_vring_kick failed");
- goto fail_kick;
+ goto fail;
}
/* Clear and discard previous events if any. */
@@ -1359,24 +1363,19 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
file.fd = -1;
r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
if (r) {
- goto fail_vector;
+ goto fail;
}
}
return 0;
-fail_vector:
-fail_kick:
-fail_alloc:
+fail:
vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
0, 0);
-fail_alloc_used:
vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
0, 0);
-fail_alloc_avail:
vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
0, 0);
-fail_alloc_desc:
return r;
}
--
2.48.1
* Re: [PATCH 11/33] vhost: make vhost_memory_unmap() null-safe
2025-08-13 16:48 ` [PATCH 11/33] vhost: make vhost_memory_unmap() null-safe Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:00 ` Raphael Norwitz
2025-10-09 20:00 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:00 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> This helps to simplify failure paths of vhost_virtqueue_start()
> a lot.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 23 +++++++++++------------
> 1 file changed, 11 insertions(+), 12 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 1e14987cd5..1fdc1937b6 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -489,6 +489,10 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
> hwaddr len, int is_write,
> hwaddr access_len)
> {
> + if (!buffer) {
> + return;
> + }
> +
> if (!vhost_dev_has_iommu(dev)) {
> cpu_physical_memory_unmap(buffer, len, is_write, access_len);
> }
> @@ -1313,33 +1317,33 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> vq->desc = vhost_memory_map(dev, a, l, false);
> if (!vq->desc) {
> r = -ENOMEM;
> - goto fail_alloc_desc;
> + goto fail;
> }
> vq->avail_size = l = virtio_queue_get_avail_size(vdev, idx);
> vq->avail_phys = a = virtio_queue_get_avail_addr(vdev, idx);
> vq->avail = vhost_memory_map(dev, a, l, false);
> if (!vq->avail) {
> r = -ENOMEM;
> - goto fail_alloc_avail;
> + goto fail;
> }
> vq->used_size = l = virtio_queue_get_used_size(vdev, idx);
> vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
> vq->used = vhost_memory_map(dev, a, l, true);
> if (!vq->used) {
> r = -ENOMEM;
> - goto fail_alloc_used;
> + goto fail;
> }
>
> r = vhost_virtqueue_set_addr(dev, vq, vhost_vq_index, dev->log_enabled);
> if (r < 0) {
> - goto fail_alloc;
> + goto fail;
> }
>
> file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(vvq));
> r = dev->vhost_ops->vhost_set_vring_kick(dev, &file);
> if (r) {
> VHOST_OPS_DEBUG(r, "vhost_set_vring_kick failed");
> - goto fail_kick;
> + goto fail;
> }
>
> /* Clear and discard previous events if any. */
> @@ -1359,24 +1363,19 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> file.fd = -1;
> r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
> if (r) {
> - goto fail_vector;
> + goto fail;
> }
> }
>
> return 0;
>
> -fail_vector:
> -fail_kick:
> -fail_alloc:
> +fail:
> vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
> 0, 0);
> -fail_alloc_used:
> vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
> 0, 0);
> -fail_alloc_avail:
> vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
> 0, 0);
> -fail_alloc_desc:
> return r;
This assumes that vq->{used, avail, desc} will be nulled out. I’m not
totally convinced that will be the case when a device is started and
stopped, or at least I don’t see the unmap path doing it.
> }
>
> --
> 2.48.1
>
>
* Re: [PATCH 11/33] vhost: make vhost_memory_unmap() null-safe
2025-10-09 19:00 ` Raphael Norwitz
@ 2025-10-09 20:00 ` Vladimir Sementsov-Ogievskiy
2025-10-11 19:10 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 20:00 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 22:00, Raphael Norwitz wrote:
> On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> This helps to simplify failure paths of vhost_virtqueue_start()
>> a lot.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> hw/virtio/vhost.c | 23 +++++++++++------------
>> 1 file changed, 11 insertions(+), 12 deletions(-)
>>
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index 1e14987cd5..1fdc1937b6 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -489,6 +489,10 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
>> hwaddr len, int is_write,
>> hwaddr access_len)
>> {
>> + if (!buffer) {
>> + return;
>> + }
>> +
>> if (!vhost_dev_has_iommu(dev)) {
>> cpu_physical_memory_unmap(buffer, len, is_write, access_len);
>> }
>> @@ -1313,33 +1317,33 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
>> vq->desc = vhost_memory_map(dev, a, l, false);
>> if (!vq->desc) {
>> r = -ENOMEM;
>> - goto fail_alloc_desc;
>> + goto fail;
>> }
>> vq->avail_size = l = virtio_queue_get_avail_size(vdev, idx);
>> vq->avail_phys = a = virtio_queue_get_avail_addr(vdev, idx);
>> vq->avail = vhost_memory_map(dev, a, l, false);
>> if (!vq->avail) {
>> r = -ENOMEM;
>> - goto fail_alloc_avail;
>> + goto fail;
>> }
>> vq->used_size = l = virtio_queue_get_used_size(vdev, idx);
>> vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
>> vq->used = vhost_memory_map(dev, a, l, true);
>> if (!vq->used) {
>> r = -ENOMEM;
>> - goto fail_alloc_used;
>> + goto fail;
>> }
>>
>> r = vhost_virtqueue_set_addr(dev, vq, vhost_vq_index, dev->log_enabled);
>> if (r < 0) {
>> - goto fail_alloc;
>> + goto fail;
>> }
>>
>> file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(vvq));
>> r = dev->vhost_ops->vhost_set_vring_kick(dev, &file);
>> if (r) {
>> VHOST_OPS_DEBUG(r, "vhost_set_vring_kick failed");
>> - goto fail_kick;
>> + goto fail;
>> }
>>
>> /* Clear and discard previous events if any. */
>> @@ -1359,24 +1363,19 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
>> file.fd = -1;
>> r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
>> if (r) {
>> - goto fail_vector;
>> + goto fail;
>> }
>> }
>>
>> return 0;
>>
>> -fail_vector:
>> -fail_kick:
>> -fail_alloc:
>> +fail:
>> vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
>> 0, 0);
>> -fail_alloc_used:
>> vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
>> 0, 0);
>> -fail_alloc_avail:
>> vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
>> 0, 0);
>> -fail_alloc_desc:
>> return r;
>
> This assumes that vq->{used, avail, desc} will be nulled out. I’m not
> totally convinced that will be the case when a device is started and
> stopped, or at least I don’t see the unmap path doing it.
>
Oh, right, good catch. It seems we never zero these fields, and after do_vhost_virtqueue_stop()
they become invalid. I'll rework it somehow.
I also notice now that we do
vq->used = vhost_memory_map(dev, a, &l, true);
if (!vq->used || l != s) {
r = -ENOMEM;
goto fail_alloc_used;
}
So, theoretically, pre-patch we may leak the mapping in the case where vq->used is not NULL but l != s after vhost_memory_map().
This should be fixed with this commit (of course, provided we also fix the problem you pointed out).
>> }
>
>>
>> --
>> 2.48.1
>>
>>
--
Best regards,
Vladimir
* Re: [PATCH 11/33] vhost: make vhost_memory_unmap() null-safe
2025-10-09 20:00 ` Vladimir Sementsov-Ogievskiy
@ 2025-10-11 19:10 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-11 19:10 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 23:00, Vladimir Sementsov-Ogievskiy wrote:
> On 09.10.25 22:00, Raphael Norwitz wrote:
>> On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
>> <vsementsov@yandex-team.ru> wrote:
>>>
>>> This helps to simplify failure paths of vhost_virtqueue_start()
>>> a lot.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>> ---
>>> hw/virtio/vhost.c | 23 +++++++++++------------
>>> 1 file changed, 11 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>> index 1e14987cd5..1fdc1937b6 100644
>>> --- a/hw/virtio/vhost.c
>>> +++ b/hw/virtio/vhost.c
>>> @@ -489,6 +489,10 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
>>> hwaddr len, int is_write,
>>> hwaddr access_len)
>>> {
>>> + if (!buffer) {
>>> + return;
>>> + }
>>> +
>>> if (!vhost_dev_has_iommu(dev)) {
>>> cpu_physical_memory_unmap(buffer, len, is_write, access_len);
>>> }
>>> @@ -1313,33 +1317,33 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
>>> vq->desc = vhost_memory_map(dev, a, l, false);
>>> if (!vq->desc) {
>>> r = -ENOMEM;
>>> - goto fail_alloc_desc;
>>> + goto fail;
>>> }
>>> vq->avail_size = l = virtio_queue_get_avail_size(vdev, idx);
>>> vq->avail_phys = a = virtio_queue_get_avail_addr(vdev, idx);
>>> vq->avail = vhost_memory_map(dev, a, l, false);
>>> if (!vq->avail) {
>>> r = -ENOMEM;
>>> - goto fail_alloc_avail;
>>> + goto fail;
>>> }
>>> vq->used_size = l = virtio_queue_get_used_size(vdev, idx);
>>> vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
>>> vq->used = vhost_memory_map(dev, a, l, true);
>>> if (!vq->used) {
>>> r = -ENOMEM;
>>> - goto fail_alloc_used;
>>> + goto fail;
>>> }
>>>
>>> r = vhost_virtqueue_set_addr(dev, vq, vhost_vq_index, dev->log_enabled);
>>> if (r < 0) {
>>> - goto fail_alloc;
>>> + goto fail;
>>> }
>>>
>>> file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(vvq));
>>> r = dev->vhost_ops->vhost_set_vring_kick(dev, &file);
>>> if (r) {
>>> VHOST_OPS_DEBUG(r, "vhost_set_vring_kick failed");
>>> - goto fail_kick;
>>> + goto fail;
>>> }
>>>
>>> /* Clear and discard previous events if any. */
>>> @@ -1359,24 +1363,19 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
>>> file.fd = -1;
>>> r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
>>> if (r) {
>>> - goto fail_vector;
>>> + goto fail;
>>> }
>>> }
>>>
>>> return 0;
>>>
>>> -fail_vector:
>>> -fail_kick:
>>> -fail_alloc:
>>> +fail:
>>> vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
>>> 0, 0);
>>> -fail_alloc_used:
>>> vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
>>> 0, 0);
>>> -fail_alloc_avail:
>>> vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
>>> 0, 0);
>>> -fail_alloc_desc:
>>> return r;
>>
>> This assumes that vq->{used, avail, desc} will be nulled out. I’m not
>> totally convinced that will be the case when a device is started and
>> stopped, or at least I don’t see the unmap path doing it.
>>
>
> Oh, right, good catch. It seems we never zero these fields, and after do_vhost_virtqueue_stop()
> they become invalid. I'll rework it somehow.
>
> I also notice now that we do
>
> vq->used = vhost_memory_map(dev, a, &l, true);
> if (!vq->used || l != s) {
> r = -ENOMEM;
> goto fail_alloc_used;
> }
>
> So, theoretically, pre-patch we may leak the mapping in the case where vq->used is not NULL but l != s after vhost_memory_map().
>
> This should be fixed with this commit (of course, provided we also fix the problem you pointed out).
Oh, I forgot that the previous patch already fixes it.
>
>>> }
>>
>>>
>>> --
>>> 2.48.1
>>>
>>>
>
>
--
Best regards,
Vladimir
* [PATCH 12/33] vhost: simplify calls to vhost_memory_unmap()
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (10 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 11/33] vhost: make vhost_memory_unmap() null-safe Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:00 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 13/33] vhost: move vrings mapping to the top of vhost_virtqueue_start() Vladimir Sementsov-Ogievskiy
` (21 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
No reason to calculate the memory size again, as we have a corresponding
variable for each vring.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 18 ++++++------------
1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 1fdc1937b6..bc1821eadd 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1370,12 +1370,9 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
return 0;
fail:
- vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
- 0, 0);
- vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
- 0, 0);
- vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
- 0, 0);
+ vhost_memory_unmap(dev, vq->used, vq->used_size, 0, 0);
+ vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0, 0);
+ vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0, 0);
return r;
}
@@ -1422,12 +1419,9 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
vhost_vq_index);
}
- vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
- 1, virtio_queue_get_used_size(vdev, idx));
- vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
- 0, virtio_queue_get_avail_size(vdev, idx));
- vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
- 0, virtio_queue_get_desc_size(vdev, idx));
+ vhost_memory_unmap(dev, vq->used, vq->used_size, 1, vq->used_size);
+ vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0, vq->avail_size);
+ vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0, vq->desc_size);
return r;
}
--
2.48.1
* Re: [PATCH 12/33] vhost: simplify calls to vhost_memory_unmap()
2025-08-13 16:48 ` [PATCH 12/33] vhost: simplify calls to vhost_memory_unmap() Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:00 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:00 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
I’m happy with this, modulo my comments on patch 11/33.
On Wed, Aug 13, 2025 at 12:52 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> No reason to calculate the memory size again, as we have a corresponding
> variable for each vring.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 18 ++++++------------
> 1 file changed, 6 insertions(+), 12 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 1fdc1937b6..bc1821eadd 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1370,12 +1370,9 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> return 0;
>
> fail:
> - vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
> - 0, 0);
> - vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
> - 0, 0);
> - vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
> - 0, 0);
> + vhost_memory_unmap(dev, vq->used, vq->used_size, 0, 0);
> + vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0, 0);
> + vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0, 0);
> return r;
> }
>
> @@ -1422,12 +1419,9 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
> vhost_vq_index);
> }
>
> - vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
> - 1, virtio_queue_get_used_size(vdev, idx));
> - vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
> - 0, virtio_queue_get_avail_size(vdev, idx));
> - vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
> - 0, virtio_queue_get_desc_size(vdev, idx));
> + vhost_memory_unmap(dev, vq->used, vq->used_size, 1, vq->used_size);
> + vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0, vq->avail_size);
> + vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0, vq->desc_size);
> return r;
> }
>
> --
> 2.48.1
>
>
* [PATCH 13/33] vhost: move vrings mapping to the top of vhost_virtqueue_start()
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (11 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 12/33] vhost: simplify calls to vhost_memory_unmap() Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:01 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 14/33] vhost: vhost_virtqueue_start(): drop extra local variables Vladimir Sementsov-Ogievskiy
` (20 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
This simplifies further refactoring and final introduction
of vhost backend live migration.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 47 +++++++++++++++++++++++------------------------
1 file changed, 23 insertions(+), 24 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index bc1821eadd..97113174b9 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1288,30 +1288,6 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
/* Queue might not be ready for start */
return 0;
}
-
- vq->num = state.num = virtio_queue_get_num(vdev, idx);
- r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
- if (r) {
- VHOST_OPS_DEBUG(r, "vhost_set_vring_num failed");
- return r;
- }
-
- state.num = virtio_queue_get_last_avail_idx(vdev, idx);
- r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
- if (r) {
- VHOST_OPS_DEBUG(r, "vhost_set_vring_base failed");
- return r;
- }
-
- if (vhost_needs_vring_endian(vdev)) {
- r = vhost_virtqueue_set_vring_endian_legacy(dev,
- virtio_is_big_endian(vdev),
- vhost_vq_index);
- if (r) {
- return r;
- }
- }
-
vq->desc_size = l = virtio_queue_get_desc_size(vdev, idx);
vq->desc_phys = a;
vq->desc = vhost_memory_map(dev, a, l, false);
@@ -1334,6 +1310,29 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
goto fail;
}
+ vq->num = state.num = virtio_queue_get_num(vdev, idx);
+ r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
+ if (r) {
+ VHOST_OPS_DEBUG(r, "vhost_set_vring_num failed");
+ goto fail;
+ }
+
+ state.num = virtio_queue_get_last_avail_idx(vdev, idx);
+ r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
+ if (r) {
+ VHOST_OPS_DEBUG(r, "vhost_set_vring_base failed");
+ goto fail;
+ }
+
+ if (vhost_needs_vring_endian(vdev)) {
+ r = vhost_virtqueue_set_vring_endian_legacy(dev,
+ virtio_is_big_endian(vdev),
+ vhost_vq_index);
+ if (r) {
+ goto fail;
+ }
+ }
+
r = vhost_virtqueue_set_addr(dev, vq, vhost_vq_index, dev->log_enabled);
if (r < 0) {
goto fail;
--
2.48.1
* Re: [PATCH 13/33] vhost: move vrings mapping to the top of vhost_virtqueue_start()
2025-08-13 16:48 ` [PATCH 13/33] vhost: move vrings mapping to the top of vhost_virtqueue_start() Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:01 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:01 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:52 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> This simplifies further refactoring and final introduction
> of vhost backend live migration.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 47 +++++++++++++++++++++++------------------------
> 1 file changed, 23 insertions(+), 24 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index bc1821eadd..97113174b9 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1288,30 +1288,6 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> /* Queue might not be ready for start */
> return 0;
> }
> -
> - vq->num = state.num = virtio_queue_get_num(vdev, idx);
> - r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
> - if (r) {
> - VHOST_OPS_DEBUG(r, "vhost_set_vring_num failed");
> - return r;
> - }
> -
> - state.num = virtio_queue_get_last_avail_idx(vdev, idx);
> - r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
> - if (r) {
> - VHOST_OPS_DEBUG(r, "vhost_set_vring_base failed");
> - return r;
> - }
> -
> - if (vhost_needs_vring_endian(vdev)) {
> - r = vhost_virtqueue_set_vring_endian_legacy(dev,
> - virtio_is_big_endian(vdev),
> - vhost_vq_index);
> - if (r) {
> - return r;
> - }
> - }
> -
> vq->desc_size = l = virtio_queue_get_desc_size(vdev, idx);
> vq->desc_phys = a;
> vq->desc = vhost_memory_map(dev, a, l, false);
> @@ -1334,6 +1310,29 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> goto fail;
> }
>
> + vq->num = state.num = virtio_queue_get_num(vdev, idx);
> + r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
> + if (r) {
> + VHOST_OPS_DEBUG(r, "vhost_set_vring_num failed");
> + goto fail;
> + }
> +
> + state.num = virtio_queue_get_last_avail_idx(vdev, idx);
> + r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
> + if (r) {
> + VHOST_OPS_DEBUG(r, "vhost_set_vring_base failed");
> + goto fail;
> + }
> +
> + if (vhost_needs_vring_endian(vdev)) {
> + r = vhost_virtqueue_set_vring_endian_legacy(dev,
> + virtio_is_big_endian(vdev),
> + vhost_vq_index);
> + if (r) {
> + goto fail;
> + }
> + }
> +
> r = vhost_virtqueue_set_addr(dev, vq, vhost_vq_index, dev->log_enabled);
> if (r < 0) {
> goto fail;
> --
> 2.48.1
>
>
* [PATCH 14/33] vhost: vhost_virtqueue_start(): drop extra local variables
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (12 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 13/33] vhost: move vrings mapping to the top of vhost_virtqueue_start() Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:02 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 15/33] vhost: final refactoring of vhost vrings map/unmap Vladimir Sementsov-Ogievskiy
` (19 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
One-letter variable names don't really help code readability, and they
simply duplicate the structure fields.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 97113174b9..c76e2dbb4e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1272,7 +1272,6 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
VirtioBusState *vbus = VIRTIO_BUS(qbus);
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
- hwaddr l, a;
int r;
int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
struct vhost_vring_file file = {
@@ -1283,28 +1282,27 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
};
struct VirtQueue *vvq = virtio_get_queue(vdev, idx);
- a = virtio_queue_get_desc_addr(vdev, idx);
- if (a == 0) {
+ vq->desc_phys = virtio_queue_get_desc_addr(vdev, idx);
+ if (vq->desc_phys == 0) {
/* Queue might not be ready for start */
return 0;
}
- vq->desc_size = l = virtio_queue_get_desc_size(vdev, idx);
- vq->desc_phys = a;
- vq->desc = vhost_memory_map(dev, a, l, false);
+ vq->desc_size = virtio_queue_get_desc_size(vdev, idx);
+ vq->desc = vhost_memory_map(dev, vq->desc_phys, vq->desc_size, false);
if (!vq->desc) {
r = -ENOMEM;
goto fail;
}
- vq->avail_size = l = virtio_queue_get_avail_size(vdev, idx);
- vq->avail_phys = a = virtio_queue_get_avail_addr(vdev, idx);
- vq->avail = vhost_memory_map(dev, a, l, false);
+ vq->avail_size = virtio_queue_get_avail_size(vdev, idx);
+ vq->avail_phys = virtio_queue_get_avail_addr(vdev, idx);
+ vq->avail = vhost_memory_map(dev, vq->avail_phys, vq->avail_size, false);
if (!vq->avail) {
r = -ENOMEM;
goto fail;
}
- vq->used_size = l = virtio_queue_get_used_size(vdev, idx);
- vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
- vq->used = vhost_memory_map(dev, a, l, true);
+ vq->used_size = virtio_queue_get_used_size(vdev, idx);
+ vq->used_phys = virtio_queue_get_used_addr(vdev, idx);
+ vq->used = vhost_memory_map(dev, vq->used_phys, vq->used_size, true);
if (!vq->used) {
r = -ENOMEM;
goto fail;
--
2.48.1
* Re: [PATCH 14/33] vhost: vhost_virtqueue_start(): drop extra local variables
2025-08-13 16:48 ` [PATCH 14/33] vhost: vhost_virtqueue_start(): drop extra local variables Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:02 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:02 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> One-letter variable names don't really help code readability, and they
> simply duplicate the structure fields.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 22 ++++++++++------------
> 1 file changed, 10 insertions(+), 12 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 97113174b9..c76e2dbb4e 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1272,7 +1272,6 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> VirtioBusState *vbus = VIRTIO_BUS(qbus);
> VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
> - hwaddr l, a;
> int r;
> int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
> struct vhost_vring_file file = {
> @@ -1283,28 +1282,27 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> };
> struct VirtQueue *vvq = virtio_get_queue(vdev, idx);
>
> - a = virtio_queue_get_desc_addr(vdev, idx);
> - if (a == 0) {
> + vq->desc_phys = virtio_queue_get_desc_addr(vdev, idx);
> + if (vq->desc_phys == 0) {
> /* Queue might not be ready for start */
> return 0;
> }
> - vq->desc_size = l = virtio_queue_get_desc_size(vdev, idx);
> - vq->desc_phys = a;
> - vq->desc = vhost_memory_map(dev, a, l, false);
> + vq->desc_size = virtio_queue_get_desc_size(vdev, idx);
> + vq->desc = vhost_memory_map(dev, vq->desc_phys, vq->desc_size, false);
> if (!vq->desc) {
> r = -ENOMEM;
> goto fail;
> }
> - vq->avail_size = l = virtio_queue_get_avail_size(vdev, idx);
> - vq->avail_phys = a = virtio_queue_get_avail_addr(vdev, idx);
> - vq->avail = vhost_memory_map(dev, a, l, false);
> + vq->avail_size = virtio_queue_get_avail_size(vdev, idx);
> + vq->avail_phys = virtio_queue_get_avail_addr(vdev, idx);
> + vq->avail = vhost_memory_map(dev, vq->avail_phys, vq->avail_size, false);
> if (!vq->avail) {
> r = -ENOMEM;
> goto fail;
> }
> - vq->used_size = l = virtio_queue_get_used_size(vdev, idx);
> - vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
> - vq->used = vhost_memory_map(dev, a, l, true);
> + vq->used_size = virtio_queue_get_used_size(vdev, idx);
> + vq->used_phys = virtio_queue_get_used_addr(vdev, idx);
> + vq->used = vhost_memory_map(dev, vq->used_phys, vq->used_size, true);
> if (!vq->used) {
> r = -ENOMEM;
> goto fail;
> --
> 2.48.1
>
>
* [PATCH 15/33] vhost: final refactoring of vhost vrings map/unmap
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (13 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 14/33] vhost: vhost_virtqueue_start(): drop extra local variables Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:02 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 16/33] vhost: simplify vhost_dev_init() error-path Vladimir Sementsov-Ogievskiy
` (18 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Introduce helper functions vhost_vrings_map() and
vhost_vrings_unmap() and use them.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 82 ++++++++++++++++++++++++++++++-----------------
1 file changed, 52 insertions(+), 30 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index c76e2dbb4e..f6ee59425f 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -498,6 +498,53 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
}
}
+static void vhost_vrings_unmap(struct vhost_dev *dev,
+ struct vhost_virtqueue *vq, bool touched)
+{
+ vhost_memory_unmap(dev, vq->used, vq->used_size, touched,
+ touched ? vq->used_size : 0);
+ vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0,
+ touched ? vq->avail_size : 0);
+ vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0,
+ touched ? vq->desc_size : 0);
+}
+
+static int vhost_vrings_map(struct vhost_dev *dev,
+ struct VirtIODevice *vdev,
+ struct vhost_virtqueue *vq,
+ unsigned idx)
+{
+ vq->desc_phys = virtio_queue_get_desc_addr(vdev, idx);
+ if (vq->desc_phys == 0) {
+ /* Queue might not be ready for start */
+ return 0;
+ }
+ vq->desc_size = virtio_queue_get_desc_size(vdev, idx);
+ vq->desc = vhost_memory_map(dev, vq->desc_phys, vq->desc_size, false);
+ if (!vq->desc) {
+ return -ENOMEM;
+ }
+
+ vq->avail_size = virtio_queue_get_avail_size(vdev, idx);
+ vq->avail_phys = virtio_queue_get_avail_addr(vdev, idx);
+ vq->avail = vhost_memory_map(dev, vq->avail_phys, vq->avail_size, false);
+ if (!vq->avail) {
+ goto fail;
+ }
+
+ vq->used_size = virtio_queue_get_used_size(vdev, idx);
+ vq->used_phys = virtio_queue_get_used_addr(vdev, idx);
+ vq->used = vhost_memory_map(dev, vq->used_phys, vq->used_size, true);
+ if (!vq->used) {
+ goto fail;
+ }
+
+ return 1;
+fail:
+ vhost_vrings_unmap(dev, vq, false);
+ return -ENOMEM;
+}
+
static int vhost_verify_ring_part_mapping(void *ring_hva,
uint64_t ring_gpa,
uint64_t ring_size,
@@ -1282,30 +1329,9 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
};
struct VirtQueue *vvq = virtio_get_queue(vdev, idx);
- vq->desc_phys = virtio_queue_get_desc_addr(vdev, idx);
- if (vq->desc_phys == 0) {
- /* Queue might not be ready for start */
- return 0;
- }
- vq->desc_size = virtio_queue_get_desc_size(vdev, idx);
- vq->desc = vhost_memory_map(dev, vq->desc_phys, vq->desc_size, false);
- if (!vq->desc) {
- r = -ENOMEM;
- goto fail;
- }
- vq->avail_size = virtio_queue_get_avail_size(vdev, idx);
- vq->avail_phys = virtio_queue_get_avail_addr(vdev, idx);
- vq->avail = vhost_memory_map(dev, vq->avail_phys, vq->avail_size, false);
- if (!vq->avail) {
- r = -ENOMEM;
- goto fail;
- }
- vq->used_size = virtio_queue_get_used_size(vdev, idx);
- vq->used_phys = virtio_queue_get_used_addr(vdev, idx);
- vq->used = vhost_memory_map(dev, vq->used_phys, vq->used_size, true);
- if (!vq->used) {
- r = -ENOMEM;
- goto fail;
+ r = vhost_vrings_map(dev, vdev, vq, idx);
+ if (r <= 0) {
+ return r;
}
vq->num = state.num = virtio_queue_get_num(vdev, idx);
@@ -1367,9 +1393,7 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
return 0;
fail:
- vhost_memory_unmap(dev, vq->used, vq->used_size, 0, 0);
- vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0, 0);
- vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0, 0);
+ vhost_vrings_unmap(dev, vq, false);
return r;
}
@@ -1416,9 +1440,7 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
vhost_vq_index);
}
- vhost_memory_unmap(dev, vq->used, vq->used_size, 1, vq->used_size);
- vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0, vq->avail_size);
- vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0, vq->desc_size);
+ vhost_vrings_unmap(dev, vq, true);
return r;
}
--
2.48.1
* Re: [PATCH 15/33] vhost: final refactoring of vhost vrings map/unmap
2025-08-13 16:48 ` [PATCH 15/33] vhost: final refactoring of vhost vrings map/unmap Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:02 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:02 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
I’m happy with factoring out these helpers. My only concern is the one
around vq->{used, avail, desc} being nulled out in patch 11/33.
On Wed, Aug 13, 2025 at 12:51 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Introduce helper functions vhost_vrings_map() and
> vhost_vrings_unmap() and use them.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 82 ++++++++++++++++++++++++++++++-----------------
> 1 file changed, 52 insertions(+), 30 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index c76e2dbb4e..f6ee59425f 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -498,6 +498,53 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
> }
> }
>
> +static void vhost_vrings_unmap(struct vhost_dev *dev,
> + struct vhost_virtqueue *vq, bool touched)
> +{
> + vhost_memory_unmap(dev, vq->used, vq->used_size, touched,
> + touched ? vq->used_size : 0);
> + vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0,
> + touched ? vq->avail_size : 0);
> + vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0,
> + touched ? vq->desc_size : 0);
> +}
> +
> +static int vhost_vrings_map(struct vhost_dev *dev,
> + struct VirtIODevice *vdev,
> + struct vhost_virtqueue *vq,
> + unsigned idx)
> +{
> + vq->desc_phys = virtio_queue_get_desc_addr(vdev, idx);
> + if (vq->desc_phys == 0) {
> + /* Queue might not be ready for start */
> + return 0;
> + }
> + vq->desc_size = virtio_queue_get_desc_size(vdev, idx);
> + vq->desc = vhost_memory_map(dev, vq->desc_phys, vq->desc_size, false);
> + if (!vq->desc) {
> + return -ENOMEM;
> + }
> +
> + vq->avail_size = virtio_queue_get_avail_size(vdev, idx);
> + vq->avail_phys = virtio_queue_get_avail_addr(vdev, idx);
> + vq->avail = vhost_memory_map(dev, vq->avail_phys, vq->avail_size, false);
> + if (!vq->avail) {
> + goto fail;
> + }
> +
> + vq->used_size = virtio_queue_get_used_size(vdev, idx);
> + vq->used_phys = virtio_queue_get_used_addr(vdev, idx);
> + vq->used = vhost_memory_map(dev, vq->used_phys, vq->used_size, true);
> + if (!vq->used) {
> + goto fail;
> + }
> +
> + return 1;
> +fail:
> + vhost_vrings_unmap(dev, vq, false);
> + return -ENOMEM;
> +}
> +
> static int vhost_verify_ring_part_mapping(void *ring_hva,
> uint64_t ring_gpa,
> uint64_t ring_size,
> @@ -1282,30 +1329,9 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> };
> struct VirtQueue *vvq = virtio_get_queue(vdev, idx);
>
> - vq->desc_phys = virtio_queue_get_desc_addr(vdev, idx);
> - if (vq->desc_phys == 0) {
> - /* Queue might not be ready for start */
> - return 0;
> - }
> - vq->desc_size = virtio_queue_get_desc_size(vdev, idx);
> - vq->desc = vhost_memory_map(dev, vq->desc_phys, vq->desc_size, false);
> - if (!vq->desc) {
> - r = -ENOMEM;
> - goto fail;
> - }
> - vq->avail_size = virtio_queue_get_avail_size(vdev, idx);
> - vq->avail_phys = virtio_queue_get_avail_addr(vdev, idx);
> - vq->avail = vhost_memory_map(dev, vq->avail_phys, vq->avail_size, false);
> - if (!vq->avail) {
> - r = -ENOMEM;
> - goto fail;
> - }
> - vq->used_size = virtio_queue_get_used_size(vdev, idx);
> - vq->used_phys = virtio_queue_get_used_addr(vdev, idx);
> - vq->used = vhost_memory_map(dev, vq->used_phys, vq->used_size, true);
> - if (!vq->used) {
> - r = -ENOMEM;
> - goto fail;
> + r = vhost_vrings_map(dev, vdev, vq, idx);
> + if (r <= 0) {
> + return r;
> }
>
> vq->num = state.num = virtio_queue_get_num(vdev, idx);
> @@ -1367,9 +1393,7 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> return 0;
>
> fail:
> - vhost_memory_unmap(dev, vq->used, vq->used_size, 0, 0);
> - vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0, 0);
> - vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0, 0);
> + vhost_vrings_unmap(dev, vq, false);
> return r;
> }
>
> @@ -1416,9 +1440,7 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
> vhost_vq_index);
> }
>
> - vhost_memory_unmap(dev, vq->used, vq->used_size, 1, vq->used_size);
> - vhost_memory_unmap(dev, vq->avail, vq->avail_size, 0, vq->avail_size);
> - vhost_memory_unmap(dev, vq->desc, vq->desc_size, 0, vq->desc_size);
> + vhost_vrings_unmap(dev, vq, true);
> return r;
> }
>
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 108+ messages in thread
* [PATCH 16/33] vhost: simplify vhost_dev_init() error-path
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (14 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 15/33] vhost: final refactoring of vhost vrings map/unmap Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:04 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 17/33] vhost: move busyloop timeout initialization to vhost_virtqueue_init() Vladimir Sementsov-Ogievskiy
` (17 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
No reason to roll back the busyloop timeout setup on failure.
We don't do such a rollback for other things we set up in the backend.
Also, look at vhost_net_init() in hw/net/vhost_net.c: we may fail
after having successfully called vhost_dev_init(), and in this case
we'll just call vhost_dev_cleanup(), which doesn't roll back the
busyloop timeout.
So, let's keep it simple.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index f6ee59425f..a3620c82d8 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1602,7 +1602,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
busyloop_timeout);
if (r < 0) {
error_setg_errno(errp, -r, "Failed to set busyloop timeout");
- goto fail_busyloop;
+ goto fail;
}
}
}
@@ -1642,7 +1642,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
if (hdev->migration_blocker != NULL) {
r = migrate_add_blocker_normal(&hdev->migration_blocker, errp);
if (r < 0) {
- goto fail_busyloop;
+ goto fail;
}
}
@@ -1674,17 +1674,11 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
" than current number of used (%d) and reserved (%d)"
" memory slots for memory devices.", limit, used, reserved);
r = -EINVAL;
- goto fail_busyloop;
+ goto fail;
}
return 0;
-fail_busyloop:
- if (busyloop_timeout) {
- while (--i >= 0) {
- vhost_virtqueue_set_busyloop_timeout(hdev, hdev->vq_index + i, 0);
- }
- }
fail:
hdev->nvqs = n_initialized_vqs;
vhost_dev_cleanup(hdev);
--
2.48.1
* Re: [PATCH 16/33] vhost: simplify vhost_dev_init() error-path
2025-08-13 16:48 ` [PATCH 16/33] vhost: simplify vhost_dev_init() error-path Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:04 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:04 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Acked-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:54 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> No reason to roll back the busyloop timeout setup on failure.
> We don't do such a rollback for other things we set up in the backend.
> Also, look at vhost_net_init() in hw/net/vhost_net.c: we may fail
> after having successfully called vhost_dev_init(), and in this case
> we'll just call vhost_dev_cleanup(), which doesn't roll back the
> busyloop timeout.
>
> So, let's keep it simple.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 12 +++---------
> 1 file changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index f6ee59425f..a3620c82d8 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1602,7 +1602,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> busyloop_timeout);
> if (r < 0) {
> error_setg_errno(errp, -r, "Failed to set busyloop timeout");
> - goto fail_busyloop;
> + goto fail;
> }
> }
> }
> @@ -1642,7 +1642,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> if (hdev->migration_blocker != NULL) {
> r = migrate_add_blocker_normal(&hdev->migration_blocker, errp);
> if (r < 0) {
> - goto fail_busyloop;
> + goto fail;
> }
> }
>
> @@ -1674,17 +1674,11 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> " than current number of used (%d) and reserved (%d)"
> " memory slots for memory devices.", limit, used, reserved);
> r = -EINVAL;
> - goto fail_busyloop;
> + goto fail;
> }
>
> return 0;
>
> -fail_busyloop:
> - if (busyloop_timeout) {
> - while (--i >= 0) {
> - vhost_virtqueue_set_busyloop_timeout(hdev, hdev->vq_index + i, 0);
> - }
> - }
> fail:
> hdev->nvqs = n_initialized_vqs;
> vhost_dev_cleanup(hdev);
> --
> 2.48.1
>
>
* [PATCH 17/33] vhost: move busyloop timeout initialization to vhost_virtqueue_init()
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (15 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 16/33] vhost: simplify vhost_dev_init() error-path Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:04 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 18/33] vhost: introduce check_memslots() helper Vladimir Sementsov-Ogievskiy
` (16 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Let all per-virtqueue initializations be in one place.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index a3620c82d8..a8f8b85012 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1489,7 +1489,8 @@ static void vhost_virtqueue_error_notifier(EventNotifier *n)
}
static int vhost_virtqueue_init(struct vhost_dev *dev,
- struct vhost_virtqueue *vq, int n)
+ struct vhost_virtqueue *vq, int n,
+ bool busyloop_timeout)
{
int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, n);
struct vhost_vring_file file = {
@@ -1526,6 +1527,14 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
vhost_virtqueue_error_notifier);
}
+ if (busyloop_timeout) {
+ r = vhost_virtqueue_set_busyloop_timeout(dev, n, busyloop_timeout);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "Failed to set busyloop timeout");
+ goto fail_err;
+ }
+ }
+
return 0;
fail_err:
@@ -1589,24 +1598,14 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
}
for (i = 0; i < hdev->nvqs; ++i, ++n_initialized_vqs) {
- r = vhost_virtqueue_init(hdev, hdev->vqs + i, hdev->vq_index + i);
+ r = vhost_virtqueue_init(hdev, hdev->vqs + i, hdev->vq_index + i,
+ busyloop_timeout);
if (r < 0) {
error_setg_errno(errp, -r, "Failed to initialize virtqueue %d", i);
goto fail;
}
}
- if (busyloop_timeout) {
- for (i = 0; i < hdev->nvqs; ++i) {
- r = vhost_virtqueue_set_busyloop_timeout(hdev, hdev->vq_index + i,
- busyloop_timeout);
- if (r < 0) {
- error_setg_errno(errp, -r, "Failed to set busyloop timeout");
- goto fail;
- }
- }
- }
-
hdev->_features = features;
hdev->memory_listener = (MemoryListener) {
--
2.48.1
* Re: [PATCH 17/33] vhost: move busyloop timeout initialization to vhost_virtqueue_init()
2025-08-13 16:48 ` [PATCH 17/33] vhost: move busyloop timeout initialization to vhost_virtqueue_init() Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:04 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:04 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Let all per-virtqueue initializations be in one place.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 25 ++++++++++++-------------
> 1 file changed, 12 insertions(+), 13 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index a3620c82d8..a8f8b85012 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1489,7 +1489,8 @@ static void vhost_virtqueue_error_notifier(EventNotifier *n)
> }
>
> static int vhost_virtqueue_init(struct vhost_dev *dev,
> - struct vhost_virtqueue *vq, int n)
> + struct vhost_virtqueue *vq, int n,
> + bool busyloop_timeout)
> {
> int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, n);
> struct vhost_vring_file file = {
> @@ -1526,6 +1527,14 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
> vhost_virtqueue_error_notifier);
> }
>
> + if (busyloop_timeout) {
> + r = vhost_virtqueue_set_busyloop_timeout(dev, n, busyloop_timeout);
> + if (r < 0) {
> + VHOST_OPS_DEBUG(r, "Failed to set busyloop timeout");
> + goto fail_err;
> + }
> + }
> +
> return 0;
>
> fail_err:
> @@ -1589,24 +1598,14 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> }
>
> for (i = 0; i < hdev->nvqs; ++i, ++n_initialized_vqs) {
> - r = vhost_virtqueue_init(hdev, hdev->vqs + i, hdev->vq_index + i);
> + r = vhost_virtqueue_init(hdev, hdev->vqs + i, hdev->vq_index + i,
> + busyloop_timeout);
> if (r < 0) {
> error_setg_errno(errp, -r, "Failed to initialize virtqueue %d", i);
> goto fail;
> }
> }
>
> - if (busyloop_timeout) {
> - for (i = 0; i < hdev->nvqs; ++i) {
> - r = vhost_virtqueue_set_busyloop_timeout(hdev, hdev->vq_index + i,
> - busyloop_timeout);
> - if (r < 0) {
> - error_setg_errno(errp, -r, "Failed to set busyloop timeout");
> - goto fail;
> - }
> - }
> - }
> -
> hdev->_features = features;
>
> hdev->memory_listener = (MemoryListener) {
> --
> 2.48.1
>
>
* [PATCH 18/33] vhost: introduce check_memslots() helper
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (16 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 17/33] vhost: move busyloop timeout initialization to vhost_virtqueue_init() Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:06 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 19/33] vhost: vhost_dev_init(): drop extra features variable Vladimir Sementsov-Ogievskiy
` (15 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 71 ++++++++++++++++++++++++++---------------------
1 file changed, 40 insertions(+), 31 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index a8f8b85012..f9163ba895 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1553,11 +1553,49 @@ static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
}
}
+static bool check_memslots(struct vhost_dev *hdev, Error **errp)
+{
+ unsigned int used, reserved, limit;
+
+ limit = hdev->vhost_ops->vhost_backend_memslots_limit(hdev);
+ if (limit < MEMORY_DEVICES_SAFE_MAX_MEMSLOTS &&
+ memory_devices_memslot_auto_decision_active()) {
+ error_setg(errp, "some memory device (like virtio-mem)"
+ " decided how many memory slots to use based on the overall"
+ " number of memory slots; this vhost backend would further"
+ " restricts the overall number of memory slots");
+ error_append_hint(errp, "Try plugging this vhost backend before"
+ " plugging such memory devices.\n");
+ return false;
+ }
+
+ /*
+ * The listener we registered properly setup the number of required
+ * memslots in vhost_commit().
+ */
+ used = hdev->mem->nregions;
+
+ /*
+ * We assume that all reserved memslots actually require a real memslot
+ * in our vhost backend. This might not be true, for example, if the
+ * memslot would be ROM. If ever relevant, we can optimize for that --
+ * but we'll need additional information about the reservations.
+ */
+ reserved = memory_devices_get_reserved_memslots();
+ if (used + reserved > limit) {
+ error_setg(errp, "vhost backend memory slots limit (%d) is less"
+ " than current number of used (%d) and reserved (%d)"
+ " memory slots for memory devices.", limit, used, reserved);
+ return false;
+ }
+
+ return true;
+}
+
int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
VhostBackendType backend_type, uint32_t busyloop_timeout,
Error **errp)
{
- unsigned int used, reserved, limit;
uint64_t features;
int i, r, n_initialized_vqs = 0;
@@ -1584,19 +1622,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
goto fail;
}
- limit = hdev->vhost_ops->vhost_backend_memslots_limit(hdev);
- if (limit < MEMORY_DEVICES_SAFE_MAX_MEMSLOTS &&
- memory_devices_memslot_auto_decision_active()) {
- error_setg(errp, "some memory device (like virtio-mem)"
- " decided how many memory slots to use based on the overall"
- " number of memory slots; this vhost backend would further"
- " restricts the overall number of memory slots");
- error_append_hint(errp, "Try plugging this vhost backend before"
- " plugging such memory devices.\n");
- r = -EINVAL;
- goto fail;
- }
-
for (i = 0; i < hdev->nvqs; ++i, ++n_initialized_vqs) {
r = vhost_virtqueue_init(hdev, hdev->vqs + i, hdev->vq_index + i,
busyloop_timeout);
@@ -1655,23 +1680,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
memory_listener_register(&hdev->memory_listener, &address_space_memory);
QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
- /*
- * The listener we registered properly setup the number of required
- * memslots in vhost_commit().
- */
- used = hdev->mem->nregions;
-
- /*
- * We assume that all reserved memslots actually require a real memslot
- * in our vhost backend. This might not be true, for example, if the
- * memslot would be ROM. If ever relevant, we can optimize for that --
- * but we'll need additional information about the reservations.
- */
- reserved = memory_devices_get_reserved_memslots();
- if (used + reserved > limit) {
- error_setg(errp, "vhost backend memory slots limit (%d) is less"
- " than current number of used (%d) and reserved (%d)"
- " memory slots for memory devices.", limit, used, reserved);
+ if (!check_memslots(hdev, errp)) {
r = -EINVAL;
goto fail;
}
--
2.48.1
* Re: [PATCH 18/33] vhost: introduce check_memslots() helper
2025-08-13 16:48 ` [PATCH 18/33] vhost: introduce check_memslots() helper Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:06 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:06 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 71 ++++++++++++++++++++++++++---------------------
> 1 file changed, 40 insertions(+), 31 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index a8f8b85012..f9163ba895 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1553,11 +1553,49 @@ static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
> }
> }
>
> +static bool check_memslots(struct vhost_dev *hdev, Error **errp)
> +{
> + unsigned int used, reserved, limit;
> +
> + limit = hdev->vhost_ops->vhost_backend_memslots_limit(hdev);
> + if (limit < MEMORY_DEVICES_SAFE_MAX_MEMSLOTS &&
> + memory_devices_memslot_auto_decision_active()) {
> + error_setg(errp, "some memory device (like virtio-mem)"
> + " decided how many memory slots to use based on the overall"
> + " number of memory slots; this vhost backend would further"
> + " restricts the overall number of memory slots");
> + error_append_hint(errp, "Try plugging this vhost backend before"
> + " plugging such memory devices.\n");
> + return false;
> + }
> +
> + /*
> + * The listener we registered properly setup the number of required
> + * memslots in vhost_commit().
> + */
> + used = hdev->mem->nregions;
> +
> + /*
> + * We assume that all reserved memslots actually require a real memslot
> + * in our vhost backend. This might not be true, for example, if the
> + * memslot would be ROM. If ever relevant, we can optimize for that --
> + * but we'll need additional information about the reservations.
> + */
> + reserved = memory_devices_get_reserved_memslots();
> + if (used + reserved > limit) {
> + error_setg(errp, "vhost backend memory slots limit (%d) is less"
> + " than current number of used (%d) and reserved (%d)"
> + " memory slots for memory devices.", limit, used, reserved);
> + return false;
> + }
> +
> + return true;
> +}
> +
> int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> VhostBackendType backend_type, uint32_t busyloop_timeout,
> Error **errp)
> {
> - unsigned int used, reserved, limit;
> uint64_t features;
> int i, r, n_initialized_vqs = 0;
>
> @@ -1584,19 +1622,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> goto fail;
> }
>
> - limit = hdev->vhost_ops->vhost_backend_memslots_limit(hdev);
> - if (limit < MEMORY_DEVICES_SAFE_MAX_MEMSLOTS &&
> - memory_devices_memslot_auto_decision_active()) {
> - error_setg(errp, "some memory device (like virtio-mem)"
> - " decided how many memory slots to use based on the overall"
> - " number of memory slots; this vhost backend would further"
> - " restricts the overall number of memory slots");
> - error_append_hint(errp, "Try plugging this vhost backend before"
> - " plugging such memory devices.\n");
> - r = -EINVAL;
> - goto fail;
> - }
> -
> for (i = 0; i < hdev->nvqs; ++i, ++n_initialized_vqs) {
> r = vhost_virtqueue_init(hdev, hdev->vqs + i, hdev->vq_index + i,
> busyloop_timeout);
> @@ -1655,23 +1680,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> memory_listener_register(&hdev->memory_listener, &address_space_memory);
> QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
>
> - /*
> - * The listener we registered properly setup the number of required
> - * memslots in vhost_commit().
> - */
> - used = hdev->mem->nregions;
> -
> - /*
> - * We assume that all reserved memslots actually require a real memslot
> - * in our vhost backend. This might not be true, for example, if the
> - * memslot would be ROM. If ever relevant, we can optimize for that --
> - * but we'll need additional information about the reservations.
> - */
> - reserved = memory_devices_get_reserved_memslots();
> - if (used + reserved > limit) {
> - error_setg(errp, "vhost backend memory slots limit (%d) is less"
> - " than current number of used (%d) and reserved (%d)"
> - " memory slots for memory devices.", limit, used, reserved);
> + if (!check_memslots(hdev, errp)) {
> r = -EINVAL;
> goto fail;
> }
> --
> 2.48.1
>
>
* [PATCH 19/33] vhost: vhost_dev_init(): drop extra features variable
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (17 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 18/33] vhost: introduce check_memslots() helper Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:06 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 20/33] hw/virtio/virtio-bus: refactor virtio_bus_set_host_notifier() Vladimir Sementsov-Ogievskiy
` (14 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index f9163ba895..e796ad347d 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1596,7 +1596,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
VhostBackendType backend_type, uint32_t busyloop_timeout,
Error **errp)
{
- uint64_t features;
int i, r, n_initialized_vqs = 0;
hdev->vdev = NULL;
@@ -1616,7 +1615,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
goto fail;
}
- r = hdev->vhost_ops->vhost_get_features(hdev, &features);
+ r = hdev->vhost_ops->vhost_get_features(hdev, &hdev->_features);
if (r < 0) {
error_setg_errno(errp, -r, "vhost_get_features failed");
goto fail;
@@ -1631,8 +1630,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
}
}
- hdev->_features = features;
-
hdev->memory_listener = (MemoryListener) {
.name = "vhost",
.begin = vhost_begin,
--
2.48.1
* Re: [PATCH 19/33] vhost: vhost_dev_init(): drop extra features variable
2025-08-13 16:48 ` [PATCH 19/33] vhost: vhost_dev_init(): drop extra features variable Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:06 ` Raphael Norwitz
2025-10-09 20:15 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:06 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Looks like this patch no longer applies cleanly, but the same cleanup
to drop the local array may still be fine?
On Wed, Aug 13, 2025 at 12:51 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index f9163ba895..e796ad347d 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1596,7 +1596,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> VhostBackendType backend_type, uint32_t busyloop_timeout,
> Error **errp)
> {
> - uint64_t features;
> int i, r, n_initialized_vqs = 0;
>
> hdev->vdev = NULL;
> @@ -1616,7 +1615,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> goto fail;
> }
>
> - r = hdev->vhost_ops->vhost_get_features(hdev, &features);
> + r = hdev->vhost_ops->vhost_get_features(hdev, &hdev->_features);
> if (r < 0) {
> error_setg_errno(errp, -r, "vhost_get_features failed");
> goto fail;
> @@ -1631,8 +1630,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> }
> }
>
> - hdev->_features = features;
> -
> hdev->memory_listener = (MemoryListener) {
> .name = "vhost",
> .begin = vhost_begin,
> --
> 2.48.1
>
>
* Re: [PATCH 19/33] vhost: vhost_dev_init(): drop extra features variable
2025-10-09 19:06 ` Raphael Norwitz
@ 2025-10-09 20:15 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 20:15 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 22:06, Raphael Norwitz wrote:
> Looks like this patch no longer applies cleanly, but the same cleanup
> to drop the local array may still be fine?
Yes, it seems we can simply do
vhost_dev_get_features(hdev, hdev->features_ex)
without extra copying.
>
> On Wed, Aug 13, 2025 at 12:51 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> hw/virtio/vhost.c | 5 +----
>> 1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index f9163ba895..e796ad347d 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -1596,7 +1596,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>> VhostBackendType backend_type, uint32_t busyloop_timeout,
>> Error **errp)
>> {
>> - uint64_t features;
>> int i, r, n_initialized_vqs = 0;
>>
>> hdev->vdev = NULL;
>> @@ -1616,7 +1615,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>> goto fail;
>> }
>>
>> - r = hdev->vhost_ops->vhost_get_features(hdev, &features);
>> + r = hdev->vhost_ops->vhost_get_features(hdev, &hdev->_features);
>> if (r < 0) {
>> error_setg_errno(errp, -r, "vhost_get_features failed");
>> goto fail;
>> @@ -1631,8 +1630,6 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>> }
>> }
>>
>> - hdev->_features = features;
>> -
>> hdev->memory_listener = (MemoryListener) {
>> .name = "vhost",
>> .begin = vhost_begin,
>> --
>> 2.48.1
>>
>>
--
Best regards,
Vladimir
* [PATCH 20/33] hw/virtio/virtio-bus: refactor virtio_bus_set_host_notifier()
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (18 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 19/33] vhost: vhost_dev_init(): drop extra features variable Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-08-14 6:00 ` Philippe Mathieu-Daudé
2025-10-09 19:07 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 21/33] vhost-user: make trace events more readable Vladimir Sementsov-Ogievskiy
` (13 subsequent siblings)
33 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
The logic kept as is. Reaftor to simplify further changes.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/virtio-bus.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index 11adfbf3ab..c7e3941b1e 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -293,20 +293,18 @@ int virtio_bus_set_host_notifier(VirtioBusState *bus, int n, bool assign)
__func__, strerror(-r), r);
return r;
}
- r = k->ioeventfd_assign(proxy, notifier, n, true);
- if (r < 0) {
- error_report("%s: unable to assign ioeventfd: %d", __func__, r);
- virtio_bus_cleanup_host_notifier(bus, n);
- }
- } else {
- k->ioeventfd_assign(proxy, notifier, n, false);
}
- if (r == 0) {
- virtio_queue_set_host_notifier_enabled(vq, assign);
+ r = k->ioeventfd_assign(proxy, notifier, n, assign);
+ if (r < 0 && assign) {
+ error_report("%s: unable to assign ioeventfd: %d", __func__, r);
+ virtio_bus_cleanup_host_notifier(bus, n);
+ return r;
}
- return r;
+ virtio_queue_set_host_notifier_enabled(vq, assign);
+
+ return 0;
}
void virtio_bus_cleanup_host_notifier(VirtioBusState *bus, int n)
--
2.48.1
* Re: [PATCH 20/33] hw/virtio/virtio-bus: refactor virtio_bus_set_host_notifier()
2025-08-13 16:48 ` [PATCH 20/33] hw/virtio/virtio-bus: refactor virtio_bus_set_host_notifier() Vladimir Sementsov-Ogievskiy
@ 2025-08-14 6:00 ` Philippe Mathieu-Daudé
2025-10-09 19:07 ` Raphael Norwitz
1 sibling, 0 replies; 108+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-14 6:00 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov
On 13/8/25 18:48, Vladimir Sementsov-Ogievskiy wrote:
> The logic kept as is. Reaftor to simplify further changes.
Typo "refactor".
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/virtio-bus.c | 18 ++++++++----------
> 1 file changed, 8 insertions(+), 10 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 20/33] hw/virtio/virtio-bus: refactor virtio_bus_set_host_notifier()
2025-08-13 16:48 ` [PATCH 20/33] hw/virtio/virtio-bus: refactor virtio_bus_set_host_notifier() Vladimir Sementsov-Ogievskiy
2025-08-14 6:00 ` Philippe Mathieu-Daudé
@ 2025-10-09 19:07 ` Raphael Norwitz
1 sibling, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:07 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:52 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> The logic kept as is. Reaftor to simplify further changes.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/virtio-bus.c | 18 ++++++++----------
> 1 file changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index 11adfbf3ab..c7e3941b1e 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -293,20 +293,18 @@ int virtio_bus_set_host_notifier(VirtioBusState *bus, int n, bool assign)
> __func__, strerror(-r), r);
> return r;
> }
> - r = k->ioeventfd_assign(proxy, notifier, n, true);
> - if (r < 0) {
> - error_report("%s: unable to assign ioeventfd: %d", __func__, r);
> - virtio_bus_cleanup_host_notifier(bus, n);
> - }
> - } else {
> - k->ioeventfd_assign(proxy, notifier, n, false);
> }
>
> - if (r == 0) {
> - virtio_queue_set_host_notifier_enabled(vq, assign);
> + r = k->ioeventfd_assign(proxy, notifier, n, assign);
> + if (r < 0 && assign) {
> + error_report("%s: unable to assign ioeventfd: %d", __func__, r);
> + virtio_bus_cleanup_host_notifier(bus, n);
> + return r;
> }
>
> - return r;
> + virtio_queue_set_host_notifier_enabled(vq, assign);
> +
> + return 0;
> }
>
> void virtio_bus_cleanup_host_notifier(VirtioBusState *bus, int n)
> --
> 2.48.1
>
>
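The refactor in patch 20 collapses the separate assign/deassign branches into a single `ioeventfd_assign()` call site, treating a failure as an error only on the assign path. The control flow can be sketched as a standalone program; the stub names (`ioeventfd_assign_stub`, `set_host_notifier`, `fail_assign`) are hypothetical stand-ins, not QEMU APIs:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for k->ioeventfd_assign(): fails on assign when
 * fail_assign is set; deassign is treated as infallible, as in the patch. */
static int fail_assign;

static int ioeventfd_assign_stub(int n, int assign)
{
    (void)n;
    if (assign && fail_assign) {
        return -1;
    }
    return 0;
}

/* Mirrors the refactored flow: one call site for both directions,
 * error handling only when assigning, shared success path at the end. */
static int set_host_notifier(int n, int assign, int *enabled)
{
    int r = ioeventfd_assign_stub(n, assign);

    if (r < 0 && assign) {
        fprintf(stderr, "unable to assign ioeventfd: %d\n", r);
        return r;
    }
    *enabled = assign;
    return 0;
}
```

The point of the refactor is that the success path (`virtio_queue_set_host_notifier_enabled()` in the real code, `*enabled = assign` here) is reached exactly once, whether assigning or deassigning.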
* [PATCH 21/33] vhost-user: make trace events more readable
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (19 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 20/33] hw/virtio/virtio-bus: refactor virtio_bus_set_host_notifier() Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-08-14 5:59 ` Philippe Mathieu-Daudé
2025-10-09 19:07 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 22/33] vhost-user-blk: add some useful trace-points Vladimir Sementsov-Ogievskiy
` (12 subsequent siblings)
33 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/trace-events | 4 +-
hw/virtio/vhost-user.c | 94 +++++++++++++++++++++++++++++++++++++++++-
2 files changed, 94 insertions(+), 4 deletions(-)
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 76f0d458b2..e5142c27f9 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -25,8 +25,8 @@ vhost_user_set_mem_table_withfd(int index, const char *name, uint64_t memory_siz
vhost_user_postcopy_waker(const char *rb, uint64_t rb_offset) "%s + 0x%"PRIx64
vhost_user_postcopy_waker_found(uint64_t client_addr) "0x%"PRIx64
vhost_user_postcopy_waker_nomatch(const char *rb, uint64_t rb_offset) "%s + 0x%"PRIx64
-vhost_user_read(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
-vhost_user_write(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
+vhost_user_read(uint32_t req, const char *req_name, uint32_t flags) "req:%d (%s) flags:0x%"PRIx32""
+vhost_user_write(uint32_t req, const char *req_name, uint32_t flags) "req:%d (%s) flags:0x%"PRIx32""
vhost_user_create_notifier(int idx, void *n) "idx:%d n:%p"
# vhost-vdpa.c
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index fe9d91348d..3979582975 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -119,6 +119,94 @@ typedef enum VhostUserBackendRequest {
VHOST_USER_BACKEND_MAX
} VhostUserBackendRequest;
+static const char *vhost_req_name(VhostUserRequest req)
+{
+ switch (req) {
+ case VHOST_USER_NONE:
+ return "NONE";
+ case VHOST_USER_GET_FEATURES:
+ return "GET_FEATURES";
+ case VHOST_USER_SET_FEATURES:
+ return "SET_FEATURES";
+ case VHOST_USER_SET_OWNER:
+ return "SET_OWNER";
+ case VHOST_USER_RESET_OWNER:
+ return "RESET_OWNER";
+ case VHOST_USER_SET_MEM_TABLE:
+ return "SET_MEM_TABLE";
+ case VHOST_USER_SET_LOG_BASE:
+ return "SET_LOG_BASE";
+ case VHOST_USER_SET_LOG_FD:
+ return "SET_LOG_FD";
+ case VHOST_USER_SET_VRING_NUM:
+ return "SET_VRING_NUM";
+ case VHOST_USER_SET_VRING_ADDR:
+ return "SET_VRING_ADDR";
+ case VHOST_USER_SET_VRING_BASE:
+ return "SET_VRING_BASE";
+ case VHOST_USER_GET_VRING_BASE:
+ return "GET_VRING_BASE";
+ case VHOST_USER_SET_VRING_KICK:
+ return "SET_VRING_KICK";
+ case VHOST_USER_SET_VRING_CALL:
+ return "SET_VRING_CALL";
+ case VHOST_USER_SET_VRING_ERR:
+ return "SET_VRING_ERR";
+ case VHOST_USER_GET_PROTOCOL_FEATURES:
+ return "GET_PROTOCOL_FEATURES";
+ case VHOST_USER_SET_PROTOCOL_FEATURES:
+ return "SET_PROTOCOL_FEATURES";
+ case VHOST_USER_GET_QUEUE_NUM:
+ return "GET_QUEUE_NUM";
+ case VHOST_USER_SET_VRING_ENABLE:
+ return "SET_VRING_ENABLE";
+ case VHOST_USER_SEND_RARP:
+ return "SEND_RARP";
+ case VHOST_USER_NET_SET_MTU:
+ return "NET_SET_MTU";
+ case VHOST_USER_SET_BACKEND_REQ_FD:
+ return "SET_BACKEND_REQ_FD";
+ case VHOST_USER_IOTLB_MSG:
+ return "IOTLB_MSG";
+ case VHOST_USER_SET_VRING_ENDIAN:
+ return "SET_VRING_ENDIAN";
+ case VHOST_USER_GET_CONFIG:
+ return "GET_CONFIG";
+ case VHOST_USER_SET_CONFIG:
+ return "SET_CONFIG";
+ case VHOST_USER_CREATE_CRYPTO_SESSION:
+ return "CREATE_CRYPTO_SESSION";
+ case VHOST_USER_CLOSE_CRYPTO_SESSION:
+ return "CLOSE_CRYPTO_SESSION";
+ case VHOST_USER_POSTCOPY_ADVISE:
+ return "POSTCOPY_ADVISE";
+ case VHOST_USER_POSTCOPY_LISTEN:
+ return "POSTCOPY_LISTEN";
+ case VHOST_USER_POSTCOPY_END:
+ return "POSTCOPY_END";
+ case VHOST_USER_GET_INFLIGHT_FD:
+ return "GET_INFLIGHT_FD";
+ case VHOST_USER_SET_INFLIGHT_FD:
+ return "SET_INFLIGHT_FD";
+ case VHOST_USER_GPU_SET_SOCKET:
+ return "GPU_SET_SOCKET";
+ case VHOST_USER_RESET_DEVICE:
+ return "RESET_DEVICE";
+ case VHOST_USER_GET_MAX_MEM_SLOTS:
+ return "GET_MAX_MEM_SLOTS";
+ case VHOST_USER_ADD_MEM_REG:
+ return "ADD_MEM_REG";
+ case VHOST_USER_REM_MEM_REG:
+ return "REM_MEM_REG";
+ case VHOST_USER_SET_STATUS:
+ return "SET_STATUS";
+ case VHOST_USER_GET_STATUS:
+ return "GET_STATUS";
+ default:
+ return "<unknown>";
+ }
+}
+
typedef struct VhostUserMemoryRegion {
uint64_t guest_phys_addr;
uint64_t memory_size;
@@ -310,7 +398,8 @@ static int vhost_user_read_header(struct vhost_dev *dev, VhostUserMsg *msg)
return -EPROTO;
}
- trace_vhost_user_read(msg->hdr.request, msg->hdr.flags);
+ trace_vhost_user_read(msg->hdr.request,
+ vhost_req_name(msg->hdr.request), msg->hdr.flags);
return 0;
}
@@ -430,7 +519,8 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
return ret < 0 ? -saved_errno : -EIO;
}
- trace_vhost_user_write(msg->hdr.request, msg->hdr.flags);
+ trace_vhost_user_write(msg->hdr.request,
+ vhost_req_name(msg->hdr.request), msg->hdr.flags);
return 0;
}
--
2.48.1
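The `vhost_req_name()` helper added by patch 21 is a plain enum-to-string switch so trace output shows a name next to the raw request number. A minimal sketch of the same pattern, using a hypothetical cut-down enum rather than the real `VhostUserRequest`:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical subset of VhostUserRequest, only to illustrate the
 * number-to-name mapping the trace events gain. */
typedef enum {
    DEMO_VHOST_USER_NONE = 0,
    DEMO_VHOST_USER_GET_FEATURES = 1,
    DEMO_VHOST_USER_SET_FEATURES = 2,
} DemoVhostUserRequest;

static const char *demo_req_name(DemoVhostUserRequest req)
{
    switch (req) {
    case DEMO_VHOST_USER_NONE:
        return "NONE";
    case DEMO_VHOST_USER_GET_FEATURES:
        return "GET_FEATURES";
    case DEMO_VHOST_USER_SET_FEATURES:
        return "SET_FEATURES";
    default:
        /* Unknown values must not crash the tracer, hence a fallback. */
        return "<unknown>";
    }
}
```

Note the `default: return "<unknown>";` arm: the header may carry a request value the build does not know about, so the tracer degrades gracefully instead of indexing out of range, which is why a switch is safer here than a name array indexed by `req`.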
* Re: [PATCH 21/33] vhost-user: make trace events more readable
2025-08-13 16:48 ` [PATCH 21/33] vhost-user: make trace events more readable Vladimir Sementsov-Ogievskiy
@ 2025-08-14 5:59 ` Philippe Mathieu-Daudé
2025-10-09 19:07 ` Raphael Norwitz
1 sibling, 0 replies; 108+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-14 5:59 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov
On 13/8/25 18:48, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/trace-events | 4 +-
> hw/virtio/vhost-user.c | 94 +++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 94 insertions(+), 4 deletions(-)
> @@ -430,7 +519,8 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
> return ret < 0 ? -saved_errno : -EIO;
> }
>
> - trace_vhost_user_write(msg->hdr.request, msg->hdr.flags);
> + trace_vhost_user_write(msg->hdr.request,
> + vhost_req_name(msg->hdr.request), msg->hdr.flags);
Mis-indent ;)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>
> return 0;
> }
* Re: [PATCH 21/33] vhost-user: make trace events more readable
2025-08-13 16:48 ` [PATCH 21/33] vhost-user: make trace events more readable Vladimir Sementsov-Ogievskiy
2025-08-14 5:59 ` Philippe Mathieu-Daudé
@ 2025-10-09 19:07 ` Raphael Norwitz
1 sibling, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:07 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Reviewed-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 12:57 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/trace-events | 4 +-
> hw/virtio/vhost-user.c | 94 +++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 94 insertions(+), 4 deletions(-)
>
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 76f0d458b2..e5142c27f9 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -25,8 +25,8 @@ vhost_user_set_mem_table_withfd(int index, const char *name, uint64_t memory_siz
> vhost_user_postcopy_waker(const char *rb, uint64_t rb_offset) "%s + 0x%"PRIx64
> vhost_user_postcopy_waker_found(uint64_t client_addr) "0x%"PRIx64
> vhost_user_postcopy_waker_nomatch(const char *rb, uint64_t rb_offset) "%s + 0x%"PRIx64
> -vhost_user_read(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
> -vhost_user_write(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
> +vhost_user_read(uint32_t req, const char *req_name, uint32_t flags) "req:%d (%s) flags:0x%"PRIx32""
> +vhost_user_write(uint32_t req, const char *req_name, uint32_t flags) "req:%d (%s) flags:0x%"PRIx32""
> vhost_user_create_notifier(int idx, void *n) "idx:%d n:%p"
>
> # vhost-vdpa.c
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index fe9d91348d..3979582975 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -119,6 +119,94 @@ typedef enum VhostUserBackendRequest {
> VHOST_USER_BACKEND_MAX
> } VhostUserBackendRequest;
>
> +static const char *vhost_req_name(VhostUserRequest req)
> +{
> + switch (req) {
> + case VHOST_USER_NONE:
> + return "NONE";
> + case VHOST_USER_GET_FEATURES:
> + return "GET_FEATURES";
> + case VHOST_USER_SET_FEATURES:
> + return "SET_FEATURES";
> + case VHOST_USER_SET_OWNER:
> + return "SET_OWNER";
> + case VHOST_USER_RESET_OWNER:
> + return "RESET_OWNER";
> + case VHOST_USER_SET_MEM_TABLE:
> + return "SET_MEM_TABLE";
> + case VHOST_USER_SET_LOG_BASE:
> + return "SET_LOG_BASE";
> + case VHOST_USER_SET_LOG_FD:
> + return "SET_LOG_FD";
> + case VHOST_USER_SET_VRING_NUM:
> + return "SET_VRING_NUM";
> + case VHOST_USER_SET_VRING_ADDR:
> + return "SET_VRING_ADDR";
> + case VHOST_USER_SET_VRING_BASE:
> + return "SET_VRING_BASE";
> + case VHOST_USER_GET_VRING_BASE:
> + return "GET_VRING_BASE";
> + case VHOST_USER_SET_VRING_KICK:
> + return "SET_VRING_KICK";
> + case VHOST_USER_SET_VRING_CALL:
> + return "SET_VRING_CALL";
> + case VHOST_USER_SET_VRING_ERR:
> + return "SET_VRING_ERR";
> + case VHOST_USER_GET_PROTOCOL_FEATURES:
> + return "GET_PROTOCOL_FEATURES";
> + case VHOST_USER_SET_PROTOCOL_FEATURES:
> + return "SET_PROTOCOL_FEATURES";
> + case VHOST_USER_GET_QUEUE_NUM:
> + return "GET_QUEUE_NUM";
> + case VHOST_USER_SET_VRING_ENABLE:
> + return "SET_VRING_ENABLE";
> + case VHOST_USER_SEND_RARP:
> + return "SEND_RARP";
> + case VHOST_USER_NET_SET_MTU:
> + return "NET_SET_MTU";
> + case VHOST_USER_SET_BACKEND_REQ_FD:
> + return "SET_BACKEND_REQ_FD";
> + case VHOST_USER_IOTLB_MSG:
> + return "IOTLB_MSG";
> + case VHOST_USER_SET_VRING_ENDIAN:
> + return "SET_VRING_ENDIAN";
> + case VHOST_USER_GET_CONFIG:
> + return "GET_CONFIG";
> + case VHOST_USER_SET_CONFIG:
> + return "SET_CONFIG";
> + case VHOST_USER_CREATE_CRYPTO_SESSION:
> + return "CREATE_CRYPTO_SESSION";
> + case VHOST_USER_CLOSE_CRYPTO_SESSION:
> + return "CLOSE_CRYPTO_SESSION";
> + case VHOST_USER_POSTCOPY_ADVISE:
> + return "POSTCOPY_ADVISE";
> + case VHOST_USER_POSTCOPY_LISTEN:
> + return "POSTCOPY_LISTEN";
> + case VHOST_USER_POSTCOPY_END:
> + return "POSTCOPY_END";
> + case VHOST_USER_GET_INFLIGHT_FD:
> + return "GET_INFLIGHT_FD";
> + case VHOST_USER_SET_INFLIGHT_FD:
> + return "SET_INFLIGHT_FD";
> + case VHOST_USER_GPU_SET_SOCKET:
> + return "GPU_SET_SOCKET";
> + case VHOST_USER_RESET_DEVICE:
> + return "RESET_DEVICE";
> + case VHOST_USER_GET_MAX_MEM_SLOTS:
> + return "GET_MAX_MEM_SLOTS";
> + case VHOST_USER_ADD_MEM_REG:
> + return "ADD_MEM_REG";
> + case VHOST_USER_REM_MEM_REG:
> + return "REM_MEM_REG";
> + case VHOST_USER_SET_STATUS:
> + return "SET_STATUS";
> + case VHOST_USER_GET_STATUS:
> + return "GET_STATUS";
> + default:
> + return "<unknown>";
> + }
> +}
> +
> typedef struct VhostUserMemoryRegion {
> uint64_t guest_phys_addr;
> uint64_t memory_size;
> @@ -310,7 +398,8 @@ static int vhost_user_read_header(struct vhost_dev *dev, VhostUserMsg *msg)
> return -EPROTO;
> }
>
> - trace_vhost_user_read(msg->hdr.request, msg->hdr.flags);
> + trace_vhost_user_read(msg->hdr.request,
> + vhost_req_name(msg->hdr.request), msg->hdr.flags);
>
> return 0;
> }
> @@ -430,7 +519,8 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
> return ret < 0 ? -saved_errno : -EIO;
> }
>
> - trace_vhost_user_write(msg->hdr.request, msg->hdr.flags);
> + trace_vhost_user_write(msg->hdr.request,
> + vhost_req_name(msg->hdr.request), msg->hdr.flags);
>
> return 0;
> }
> --
> 2.48.1
>
>
* [PATCH 22/33] vhost-user-blk: add some useful trace-points
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (20 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 21/33] vhost-user: make trace events more readable Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-08-14 4:58 ` Philippe Mathieu-Daudé
2025-10-09 19:07 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 23/33] vhost: " Vladimir Sementsov-Ogievskiy
` (11 subsequent siblings)
33 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/block/trace-events | 10 ++++++++++
hw/block/vhost-user-blk.c | 15 +++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/hw/block/trace-events b/hw/block/trace-events
index cc9a9f2460..3b5fd2a599 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -58,6 +58,16 @@ virtio_blk_handle_zone_mgmt(void *vdev, void *req, uint8_t op, int64_t sector, i
virtio_blk_handle_zone_reset_all(void *vdev, void *req, int64_t sector, int64_t len) "vdev %p req %p sector 0x%" PRIx64 " cap 0x%" PRIx64 ""
virtio_blk_handle_zone_append(void *vdev, void *req, int64_t sector) "vdev %p req %p, append sector 0x%" PRIx64 ""
+# vhost-user-blk.c
+vhost_user_blk_start(void) ""
+vhost_user_blk_start_finish(void) ""
+vhost_user_blk_stop(void) ""
+vhost_user_blk_stop_finish(void) ""
+vhost_user_blk_connect(void) ""
+vhost_user_blk_connect_finish(void) ""
+vhost_user_blk_device_realize(void) ""
+vhost_user_blk_device_realize_finish(void) ""
+
# hd-geometry.c
hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d"
hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int trans) "blk %p CHS %u %u %u trans %d"
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index de7a810c93..c8bc2c78e6 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -31,6 +31,7 @@
#include "hw/virtio/virtio-access.h"
#include "system/system.h"
#include "system/runstate.h"
+#include "trace.h"
static const int user_feature_bits[] = {
VIRTIO_BLK_F_SIZE_MAX,
@@ -137,6 +138,8 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error **errp)
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
int i, ret;
+ trace_vhost_user_blk_start();
+
if (!k->set_guest_notifiers) {
error_setg(errp, "binding does not support guest notifiers");
return -ENOSYS;
@@ -192,6 +195,8 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error **errp)
}
s->started_vu = true;
+ trace_vhost_user_blk_start_finish();
+
return ret;
err_guest_notifiers:
@@ -212,6 +217,8 @@ static int vhost_user_blk_stop(VirtIODevice *vdev)
int ret;
bool force_stop = false;
+ trace_vhost_user_blk_stop();
+
if (!s->started_vu) {
return 0;
}
@@ -233,6 +240,8 @@ static int vhost_user_blk_stop(VirtIODevice *vdev)
}
vhost_dev_disable_notifiers(&s->dev, vdev);
+
+ trace_vhost_user_blk_stop_finish();
return ret;
}
@@ -340,6 +349,8 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
VHostUserBlk *s = VHOST_USER_BLK(vdev);
int ret = 0;
+ trace_vhost_user_blk_connect();
+
if (s->connected) {
return 0;
}
@@ -365,6 +376,7 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
ret = vhost_user_blk_start(vdev, errp);
}
+ trace_vhost_user_blk_connect_finish();
return ret;
}
@@ -455,6 +467,8 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
int retries;
int i, ret;
+ trace_vhost_user_blk_device_realize();
+
if (!s->chardev.chr) {
error_setg(errp, "chardev is mandatory");
return;
@@ -514,6 +528,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
vhost_user_blk_event, NULL, (void *)dev,
NULL, true);
+ trace_vhost_user_blk_device_realize_finish();
return;
virtio_err:
--
2.48.1
* Re: [PATCH 22/33] vhost-user-blk: add some useful trace-points
2025-08-13 16:48 ` [PATCH 22/33] vhost-user-blk: add some useful trace-points Vladimir Sementsov-Ogievskiy
@ 2025-08-14 4:58 ` Philippe Mathieu-Daudé
2025-08-14 11:14 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:07 ` Raphael Norwitz
1 sibling, 1 reply; 108+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-14 4:58 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov
On 13/8/25 18:48, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/block/trace-events | 10 ++++++++++
> hw/block/vhost-user-blk.c | 15 +++++++++++++++
> 2 files changed, 25 insertions(+)
>
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index cc9a9f2460..3b5fd2a599 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> +# vhost-user-blk.c
> +vhost_user_blk_start(void) ""
> +vhost_user_blk_start_finish(void) ""
> +vhost_user_blk_stop(void) ""
> +vhost_user_blk_stop_finish(void) ""
> +vhost_user_blk_connect(void) ""
> +vhost_user_blk_connect_finish(void) ""
> +vhost_user_blk_device_realize(void) ""
> +vhost_user_blk_device_realize_finish(void) ""
Maybe use _in / _out suffixes? Naming is hard...
* Re: [PATCH 22/33] vhost-user-blk: add some useful trace-points
2025-08-14 4:58 ` Philippe Mathieu-Daudé
@ 2025-08-14 11:14 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-14 11:14 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov
On 14.08.25 07:58, Philippe Mathieu-Daudé wrote:
> On 13/8/25 18:48, Vladimir Sementsov-Ogievskiy wrote:
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> hw/block/trace-events | 10 ++++++++++
>> hw/block/vhost-user-blk.c | 15 +++++++++++++++
>> 2 files changed, 25 insertions(+)
>>
>> diff --git a/hw/block/trace-events b/hw/block/trace-events
>> index cc9a9f2460..3b5fd2a599 100644
>> --- a/hw/block/trace-events
>> +++ b/hw/block/trace-events
>
>
>> +# vhost-user-blk.c
>> +vhost_user_blk_start(void) ""
>> +vhost_user_blk_start_finish(void) ""
>> +vhost_user_blk_stop(void) ""
>> +vhost_user_blk_stop_finish(void) ""
>> +vhost_user_blk_connect(void) ""
>> +vhost_user_blk_connect_finish(void) ""
>> +vhost_user_blk_device_realize(void) ""
>> +vhost_user_blk_device_realize_finish(void) ""
>
> Maybe use _in / _out suffixes? Naming is hard...
Agree, will change.
--
Best regards,
Vladimir
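The `_in`/`_out` suffix scheme suggested above pairs each entry trace with an exit trace fired only on the success path, so a missing `_out` event in a trace log pinpoints which call bailed out early. A sketch of that discipline, with a string log standing in for the `trace_*` helpers QEMU's tracetool would generate (all names here are hypothetical):

```c
#include <assert.h>
#include <string.h>

static char trace_log[256];

/* Stand-ins for generated trace_* functions; they just append markers. */
static void trace_demo_start_in(void)
{
    strcat(trace_log, "demo_start_in;");
}

static void trace_demo_start_out(void)
{
    strcat(trace_log, "demo_start_out;");
}

/* A traced operation: _in fires unconditionally on entry, _out only on
 * the success path, mirroring how the patch places its trace points. */
static int demo_start(int should_fail)
{
    trace_demo_start_in();
    if (should_fail) {
        return -1;
    }
    trace_demo_start_out();
    return 0;
}
```

Running the failing case leaves `"demo_start_in;"` alone in the log, which is exactly the diagnostic value these paired trace points provide.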
* Re: [PATCH 22/33] vhost-user-blk: add some useful trace-points
2025-08-13 16:48 ` [PATCH 22/33] vhost-user-blk: add some useful trace-points Vladimir Sementsov-Ogievskiy
2025-08-14 4:58 ` Philippe Mathieu-Daudé
@ 2025-10-09 19:07 ` Raphael Norwitz
2025-10-09 20:19 ` Vladimir Sementsov-Ogievskiy
1 sibling, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:07 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On Wed, Aug 13, 2025 at 1:01 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/block/trace-events | 10 ++++++++++
> hw/block/vhost-user-blk.c | 15 +++++++++++++++
> 2 files changed, 25 insertions(+)
>
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index cc9a9f2460..3b5fd2a599 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -58,6 +58,16 @@ virtio_blk_handle_zone_mgmt(void *vdev, void *req, uint8_t op, int64_t sector, i
> virtio_blk_handle_zone_reset_all(void *vdev, void *req, int64_t sector, int64_t len) "vdev %p req %p sector 0x%" PRIx64 " cap 0x%" PRIx64 ""
> virtio_blk_handle_zone_append(void *vdev, void *req, int64_t sector) "vdev %p req %p, append sector 0x%" PRIx64 ""
>
> +# vhost-user-blk.c
> +vhost_user_blk_start(void) ""
> +vhost_user_blk_start_finish(void) ""
> +vhost_user_blk_stop(void) ""
> +vhost_user_blk_stop_finish(void) ""
> +vhost_user_blk_connect(void) ""
> +vhost_user_blk_connect_finish(void) ""
> +vhost_user_blk_device_realize(void) ""
> +vhost_user_blk_device_realize_finish(void) ""
Should we also trace the VirtIODevice/vdev pointer like in virtio-blk.c?
> +
> # hd-geometry.c
> hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d"
> hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int trans) "blk %p CHS %u %u %u trans %d"
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index de7a810c93..c8bc2c78e6 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -31,6 +31,7 @@
> #include "hw/virtio/virtio-access.h"
> #include "system/system.h"
> #include "system/runstate.h"
> +#include "trace.h"
>
> static const int user_feature_bits[] = {
> VIRTIO_BLK_F_SIZE_MAX,
> @@ -137,6 +138,8 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error **errp)
> VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> int i, ret;
>
> + trace_vhost_user_blk_start();
> +
> if (!k->set_guest_notifiers) {
> error_setg(errp, "binding does not support guest notifiers");
> return -ENOSYS;
> @@ -192,6 +195,8 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error **errp)
> }
> s->started_vu = true;
>
> + trace_vhost_user_blk_start_finish();
> +
> return ret;
>
> err_guest_notifiers:
> @@ -212,6 +217,8 @@ static int vhost_user_blk_stop(VirtIODevice *vdev)
> int ret;
> bool force_stop = false;
>
> + trace_vhost_user_blk_stop();
> +
> if (!s->started_vu) {
> return 0;
> }
> @@ -233,6 +240,8 @@ static int vhost_user_blk_stop(VirtIODevice *vdev)
> }
>
> vhost_dev_disable_notifiers(&s->dev, vdev);
> +
> + trace_vhost_user_blk_stop_finish();
> return ret;
> }
>
> @@ -340,6 +349,8 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
> VHostUserBlk *s = VHOST_USER_BLK(vdev);
> int ret = 0;
>
> + trace_vhost_user_blk_connect();
> +
> if (s->connected) {
> return 0;
> }
> @@ -365,6 +376,7 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
> ret = vhost_user_blk_start(vdev, errp);
> }
>
> + trace_vhost_user_blk_connect_finish();
> return ret;
> }
>
> @@ -455,6 +467,8 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> int retries;
> int i, ret;
>
> + trace_vhost_user_blk_device_realize();
> +
> if (!s->chardev.chr) {
> error_setg(errp, "chardev is mandatory");
> return;
> @@ -514,6 +528,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
> vhost_user_blk_event, NULL, (void *)dev,
> NULL, true);
> + trace_vhost_user_blk_device_realize_finish();
> return;
>
> virtio_err:
> --
> 2.48.1
>
>
* Re: [PATCH 22/33] vhost-user-blk: add some useful trace-points
2025-10-09 19:07 ` Raphael Norwitz
@ 2025-10-09 20:19 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 20:19 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 22:07, Raphael Norwitz wrote:
> On Wed, Aug 13, 2025 at 1:01 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> hw/block/trace-events | 10 ++++++++++
>> hw/block/vhost-user-blk.c | 15 +++++++++++++++
>> 2 files changed, 25 insertions(+)
>>
>> diff --git a/hw/block/trace-events b/hw/block/trace-events
>> index cc9a9f2460..3b5fd2a599 100644
>> --- a/hw/block/trace-events
>> +++ b/hw/block/trace-events
>> @@ -58,6 +58,16 @@ virtio_blk_handle_zone_mgmt(void *vdev, void *req, uint8_t op, int64_t sector, i
>> virtio_blk_handle_zone_reset_all(void *vdev, void *req, int64_t sector, int64_t len) "vdev %p req %p sector 0x%" PRIx64 " cap 0x%" PRIx64 ""
>> virtio_blk_handle_zone_append(void *vdev, void *req, int64_t sector) "vdev %p req %p, append sector 0x%" PRIx64 ""
>>
>> +# vhost-user-blk.c
>> +vhost_user_blk_start(void) ""
>> +vhost_user_blk_start_finish(void) ""
>> +vhost_user_blk_stop(void) ""
>> +vhost_user_blk_stop_finish(void) ""
>> +vhost_user_blk_connect(void) ""
>> +vhost_user_blk_connect_finish(void) ""
>> +vhost_user_blk_device_realize(void) ""
>> +vhost_user_blk_device_realize_finish(void) ""
>
> Should we also trace the VirtIODevice/vdev pointer like in virtio-blk.c?
>
Agree, it may help to debug a setup with several disks, and it is also consistent with virtio-blk.c
>
>> +
>> # hd-geometry.c
>> hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d"
>> hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int trans) "blk %p CHS %u %u %u trans %d"
>> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>> index de7a810c93..c8bc2c78e6 100644
>> --- a/hw/block/vhost-user-blk.c
>> +++ b/hw/block/vhost-user-blk.c
>> @@ -31,6 +31,7 @@
>> #include "hw/virtio/virtio-access.h"
>> #include "system/system.h"
>> #include "system/runstate.h"
>> +#include "trace.h"
>>
>> static const int user_feature_bits[] = {
>> VIRTIO_BLK_F_SIZE_MAX,
>> @@ -137,6 +138,8 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error **errp)
>> VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
>> int i, ret;
>>
>> + trace_vhost_user_blk_start();
>> +
>> if (!k->set_guest_notifiers) {
>> error_setg(errp, "binding does not support guest notifiers");
>> return -ENOSYS;
>> @@ -192,6 +195,8 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error **errp)
>> }
>> s->started_vu = true;
>>
>> + trace_vhost_user_blk_start_finish();
>> +
>> return ret;
>>
>> err_guest_notifiers:
>> @@ -212,6 +217,8 @@ static int vhost_user_blk_stop(VirtIODevice *vdev)
>> int ret;
>> bool force_stop = false;
>>
>> + trace_vhost_user_blk_stop();
>> +
>> if (!s->started_vu) {
>> return 0;
>> }
>> @@ -233,6 +240,8 @@ static int vhost_user_blk_stop(VirtIODevice *vdev)
>> }
>>
>> vhost_dev_disable_notifiers(&s->dev, vdev);
>> +
>> + trace_vhost_user_blk_stop_finish();
>> return ret;
>> }
>>
>> @@ -340,6 +349,8 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
>> VHostUserBlk *s = VHOST_USER_BLK(vdev);
>> int ret = 0;
>>
>> + trace_vhost_user_blk_connect();
>> +
>> if (s->connected) {
>> return 0;
>> }
>> @@ -365,6 +376,7 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
>> ret = vhost_user_blk_start(vdev, errp);
>> }
>>
>> + trace_vhost_user_blk_connect_finish();
>> return ret;
>> }
>>
>> @@ -455,6 +467,8 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
>> int retries;
>> int i, ret;
>>
>> + trace_vhost_user_blk_device_realize();
>> +
>> if (!s->chardev.chr) {
>> error_setg(errp, "chardev is mandatory");
>> return;
>> @@ -514,6 +528,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
>> qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
>> vhost_user_blk_event, NULL, (void *)dev,
>> NULL, true);
>> + trace_vhost_user_blk_device_realize_finish();
>> return;
>>
>> virtio_err:
>> --
>> 2.48.1
>>
>>
--
Best regards,
Vladimir
* [PATCH 23/33] vhost: add some useful trace-points
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (21 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 22/33] vhost-user-blk: add some useful trace-points Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:08 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 24/33] chardev-add: support local migration Vladimir Sementsov-Ogievskiy
` (10 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/trace-events | 8 ++++++++
hw/virtio/vhost.c | 16 ++++++++++++++++
2 files changed, 24 insertions(+)
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index e5142c27f9..bd595fcd91 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -10,7 +10,15 @@ vhost_reject_section(const char *name, int d) "%s:%d"
vhost_iotlb_miss(void *dev, int step) "%p step %d"
vhost_dev_cleanup(void *dev) "%p"
vhost_dev_start(void *dev, const char *name, bool vrings) "%p:%s vrings:%d"
+vhost_dev_start_finish(const char *name) "%s"
vhost_dev_stop(void *dev, const char *name, bool vrings) "%p:%s vrings:%d"
+vhost_dev_stop_finish(const char *name) "%s"
+vhost_virtque_start(const char *name, int idx) "%s %d"
+vhost_virtque_start_finish(const char *name, int idx) "%s %d"
+vhost_virtque_stop(const char *name, int idx) "%s %d"
+vhost_virtque_stop_finish(const char *name, int idx) "%s %d"
+vhost_dev_init(void) ""
+vhost_dev_init_finish(void) ""
# vhost-user.c
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index e796ad347d..e7c809400b 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1329,6 +1329,8 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
};
struct VirtQueue *vvq = virtio_get_queue(vdev, idx);
+ trace_vhost_virtque_start(vdev->name, idx);
+
r = vhost_vrings_map(dev, vdev, vq, idx);
if (r <= 0) {
return r;
@@ -1390,6 +1392,8 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
}
}
+ trace_vhost_virtque_start_finish(vdev->name, idx);
+
return 0;
fail:
@@ -1408,6 +1412,8 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
};
int r = 0;
+ trace_vhost_virtque_stop(vdev->name, idx);
+
if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
/* Don't stop the virtqueue which might have not been started */
return 0;
@@ -1441,6 +1447,8 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
}
vhost_vrings_unmap(dev, vq, true);
+
+ trace_vhost_virtque_stop_finish(vdev->name, idx);
return r;
}
@@ -1598,6 +1606,8 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
{
int i, r, n_initialized_vqs = 0;
+ trace_vhost_dev_init();
+
hdev->vdev = NULL;
hdev->migration_blocker = NULL;
@@ -1682,6 +1692,8 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
goto fail;
}
+ trace_vhost_dev_init_finish();
+
return 0;
fail:
@@ -2132,6 +2144,8 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
}
}
vhost_start_config_intr(hdev);
+
+ trace_vhost_dev_start_finish(vdev->name);
return 0;
fail_iotlb:
if (vhost_dev_has_iommu(hdev) &&
@@ -2210,6 +2224,8 @@ static int do_vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev,
hdev->started = false;
vdev->vhost_started = false;
hdev->vdev = NULL;
+
+ trace_vhost_dev_stop_finish(vdev->name);
return rc;
}
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
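The `*_start`/`*_finish` trace points added in the patch above come in pairs so that the elapsed time of each vhost phase can be measured from a trace log. As a rough illustration of consuming such output (the timestamp-prefixed log format here is an assumption for the sketch, not actual QEMU `-trace` output, and the pairing helper is ours):

```python
import re

# Pair *_start / *_finish trace lines and compute per-phase durations.
# Assumed line format: "<seconds.fraction> <event_name> ..."
LINE_RE = re.compile(r"^(?P<ts>\d+\.\d+) (?P<event>\w+)\b")

def phase_durations(lines):
    opened = {}   # event name -> timestamp of its most recent start
    out = []      # (base event name, elapsed seconds)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        ts, event = float(m.group("ts")), m.group("event")
        if event.endswith("_finish"):
            base = event[:-len("_finish")]
            if base in opened:
                out.append((base, ts - opened.pop(base)))
        else:
            opened[event] = ts
    return out

log = [
    "1.000 vhost_dev_start",
    "1.250 vhost_dev_start_finish",
]
print(phase_durations(log))  # [('vhost_dev_start', 0.25)]
```

A finish event without a preceding start (e.g. a log truncated mid-phase) is simply dropped, which matches how such ad-hoc trace analysis is usually done.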
* Re: [PATCH 23/33] vhost: add some useful trace-points
2025-08-13 16:48 ` [PATCH 23/33] vhost: " Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:08 ` Raphael Norwitz
2025-10-09 20:20 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:08 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On Wed, Aug 13, 2025 at 12:58 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/trace-events | 8 ++++++++
> hw/virtio/vhost.c | 16 ++++++++++++++++
> 2 files changed, 24 insertions(+)
>
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index e5142c27f9..bd595fcd91 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -10,7 +10,15 @@ vhost_reject_section(const char *name, int d) "%s:%d"
> vhost_iotlb_miss(void *dev, int step) "%p step %d"
> vhost_dev_cleanup(void *dev) "%p"
> vhost_dev_start(void *dev, const char *name, bool vrings) "%p:%s vrings:%d"
> +vhost_dev_start_finish(const char *name) "%s"
> vhost_dev_stop(void *dev, const char *name, bool vrings) "%p:%s vrings:%d"
> +vhost_dev_stop_finish(const char *name) "%s"
> +vhost_virtque_start(const char *name, int idx) "%s %d"
> +vhost_virtque_start_finish(const char *name, int idx) "%s %d"
> +vhost_virtque_stop(const char *name, int idx) "%s %d"
> +vhost_virtque_stop_finish(const char *name, int idx) "%s %d"
> +vhost_dev_init(void) ""
> +vhost_dev_init_finish(void) ""
Ditto here - I would think this should also have the VirtIODevice/vdev pointer.
>
>
> # vhost-user.c
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index e796ad347d..e7c809400b 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1329,6 +1329,8 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> };
> struct VirtQueue *vvq = virtio_get_queue(vdev, idx);
>
> + trace_vhost_virtque_start(vdev->name, idx);
> +
> r = vhost_vrings_map(dev, vdev, vq, idx);
> if (r <= 0) {
> return r;
> @@ -1390,6 +1392,8 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> }
> }
>
> + trace_vhost_virtque_start_finish(vdev->name, idx);
> +
> return 0;
>
> fail:
> @@ -1408,6 +1412,8 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
> };
> int r = 0;
>
> + trace_vhost_virtque_stop(vdev->name, idx);
> +
> if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
> /* Don't stop the virtqueue which might have not been started */
> return 0;
> @@ -1441,6 +1447,8 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
> }
>
> vhost_vrings_unmap(dev, vq, true);
> +
> + trace_vhost_virtque_stop_finish(vdev->name, idx);
> return r;
> }
>
> @@ -1598,6 +1606,8 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> {
> int i, r, n_initialized_vqs = 0;
>
> + trace_vhost_dev_init();
> +
> hdev->vdev = NULL;
> hdev->migration_blocker = NULL;
>
> @@ -1682,6 +1692,8 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> goto fail;
> }
>
> + trace_vhost_dev_init_finish();
> +
> return 0;
>
> fail:
> @@ -2132,6 +2144,8 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> }
> }
> vhost_start_config_intr(hdev);
> +
> + trace_vhost_dev_start_finish(vdev->name);
> return 0;
> fail_iotlb:
> if (vhost_dev_has_iommu(hdev) &&
> @@ -2210,6 +2224,8 @@ static int do_vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev,
> hdev->started = false;
> vdev->vhost_started = false;
> hdev->vdev = NULL;
> +
> + trace_vhost_dev_stop_finish(vdev->name);
> return rc;
> }
>
> --
> 2.48.1
>
>

* Re: [PATCH 23/33] vhost: add some useful trace-points
2025-10-09 19:08 ` Raphael Norwitz
@ 2025-10-09 20:20 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 20:20 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 22:08, Raphael Norwitz wrote:
> On Wed, Aug 13, 2025 at 12:58 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>> Signed-off-by: Vladimir Sementsov-Ogievskiy<vsementsov@yandex-team.ru>
>> ---
>> hw/virtio/trace-events | 8 ++++++++
>> hw/virtio/vhost.c | 16 ++++++++++++++++
>> 2 files changed, 24 insertions(+)
>>
>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
>> index e5142c27f9..bd595fcd91 100644
>> --- a/hw/virtio/trace-events
>> +++ b/hw/virtio/trace-events
>> @@ -10,7 +10,15 @@ vhost_reject_section(const char *name, int d) "%s:%d"
>> vhost_iotlb_miss(void *dev, int step) "%p step %d"
>> vhost_dev_cleanup(void *dev) "%p"
>> vhost_dev_start(void *dev, const char *name, bool vrings) "%p:%s vrings:%d"
>> +vhost_dev_start_finish(const char *name) "%s"
>> vhost_dev_stop(void *dev, const char *name, bool vrings) "%p:%s vrings:%d"
>> +vhost_dev_stop_finish(const char *name) "%s"
>> +vhost_virtque_start(const char *name, int idx) "%s %d"
>> +vhost_virtque_start_finish(const char *name, int idx) "%s %d"
>> +vhost_virtque_stop(const char *name, int idx) "%s %d"
>> +vhost_virtque_stop_finish(const char *name, int idx) "%s %d"
>> +vhost_dev_init(void) ""
>> +vhost_dev_init_finish(void) ""
> Ditto here - I would think this should also have the VirtIODevice/vdev pointer.
Ok
--
Best regards,
Vladimir
* [PATCH 24/33] chardev-add: support local migration
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (22 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 23/33] vhost: " Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-09-12 14:56 ` Markus Armbruster
2025-08-13 16:48 ` [PATCH 25/33] virtio: introduce .skip_vhost_migration_log() handler Vladimir Sementsov-Ogievskiy
` (9 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov, Laurent Vivier
This commit makes it possible to migrate an open chardev socket fd
through the migration channel without reconnecting.
For this, the user should:
- enable the new migration capability local-char-socket
- mark the socket with the option support-local-migration=true
- on the target, add the local-incoming=true option to the socket
Motivation for the API:
1. We don't want to migrate all sockets. For example, the QMP
connection is a bad candidate, as it is separate on source and
target. So we need the @support-local-migration option to mark the
sockets we want to migrate (after this series, we'll want to migrate
the chardev used to connect to the vhost-user server).
2. Still, for remote migration we can't migrate any sockets at all,
so we need a capability to enable/disable the whole feature.
3. And finally, we need a way to tell the socket not to open a
connection on initialization, but to wait for incoming migration. We
can't use the @support-local-migration option for that, as it may be
enabled while we are doing incoming remote migration. Nor can we
rely on the migration capability, as the user is free to set
capabilities before or after chardev creation, and it would be a bad
precedent to create an ordering dependency here.
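What makes local fd migration possible at all is that a Unix-domain migration channel can carry open file descriptors via SCM_RIGHTS ancillary messages; in the patch this is wrapped by qemu_file_put_fd()/qemu_file_get_fd(). A minimal sketch of the underlying kernel mechanism in plain Python (the helper names here are ours, chosen for illustration, and Python 3.9+ on a Unix host is assumed):

```python
import os
import socket

def send_one_fd(chan: socket.socket, fd: int) -> None:
    # send_fds() wraps sendmsg() with an SCM_RIGHTS ancillary message;
    # at least one byte of normal data must accompany it.
    socket.send_fds(chan, [b"x"], [fd])

def recv_one_fd(chan: socket.socket) -> int:
    # recv_fds() returns (data, fds, flags, addr); the kernel installs
    # a duplicate of the sender's descriptor into our fd table.
    _msg, fds, _flags, _addr = socket.recv_fds(chan, 1, 1)
    return fds[0]

if __name__ == "__main__":
    chan_a, chan_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()   # stand-in for a connected chardev fd
    send_one_fd(chan_a, write_end)
    dup_fd = recv_one_fd(chan_b)
    os.write(dup_fd, b"hello")        # write through the received duplicate
    print(os.read(read_end, 5))       # b'hello'
```

The received descriptor refers to the same open file description as the sender's, which is why the migrated chardev socket stays connected with no reconnect and no data loss.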
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
chardev/char-socket.c | 101 +++++++++++++++++++++++++++++++++-
include/chardev/char-socket.h | 3 +
migration/options.c | 7 +++
migration/options.h | 1 +
qapi/char.json | 16 +++++-
qapi/migration.json | 8 ++-
stubs/meson.build | 1 +
stubs/qemu_file.c | 15 +++++
stubs/vmstate.c | 6 ++
tests/qtest/meson.build | 2 +-
tests/unit/meson.build | 4 +-
11 files changed, 158 insertions(+), 6 deletions(-)
create mode 100644 stubs/qemu_file.c
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 1e8313915b..db6616e2f2 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -24,6 +24,7 @@
#include "qemu/osdep.h"
#include "chardev/char.h"
+#include "qapi-types-char.h"
#include "io/channel-socket.h"
#include "io/channel-websock.h"
#include "qemu/error-report.h"
@@ -34,6 +35,10 @@
#include "qapi/qapi-visit-sockets.h"
#include "qemu/yank.h"
#include "trace.h"
+#include "migration/vmstate.h"
+#include "migration/qemu-file.h"
+#include "migration/migration.h"
+#include "hw/vmstate-if.h"
#include "chardev/char-io.h"
#include "chardev/char-socket.h"
@@ -1118,6 +1123,7 @@ static void char_socket_finalize(Object *obj)
object_unref(OBJECT(s->tls_creds));
}
g_free(s->tls_authz);
+ g_free(s->vmstate_name);
if (s->registered_yank) {
/*
* In the chardev-change special-case, we shouldn't unregister the yank
@@ -1276,8 +1282,15 @@ static int qmp_chardev_open_socket_client(Chardev *chr,
{
SocketChardev *s = SOCKET_CHARDEV(chr);
+ s->reconnect_time_ms = reconnect_ms;
+
+ if (s->local_incoming) {
+ /* We'll get the fd at migration load. This field works once */
+ s->local_incoming = false;
+ return 0;
+ }
+
if (reconnect_ms > 0) {
- s->reconnect_time_ms = reconnect_ms;
tcp_chr_connect_client_async(chr);
return 0;
} else {
@@ -1367,6 +1380,52 @@ static bool qmp_chardev_validate_socket(ChardevSocket *sock,
return true;
}
+static int char_socket_save(QEMUFile *f, void *opaque, size_t size,
+ const VMStateField *field, JSONWriter *vmdesc)
+{
+ SocketChardev *s = opaque;
+
+ warn_report("%s", __func__);
+ return qemu_file_put_fd(f, s->sioc->fd);
+}
+
+static int char_socket_load(QEMUFile *f, void *opaque, size_t size,
+ const VMStateField *field)
+{
+ Chardev *chr = opaque;
+
+ int fd = qemu_file_get_fd(f);
+ warn_report("%s %d", __func__, fd);
+ if (fd < 0) {
+ return fd;
+ }
+ return tcp_chr_add_client(chr, fd);
+}
+
+static bool char_socket_needed(void *opaque)
+{
+ SocketChardev *s = opaque;
+
+ return !!s->vmstate_name;
+}
+
+const VMStateDescription vmstate_char_socket = {
+ .name = "char_socket",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .needed = char_socket_needed,
+ .fields = (VMStateField[]) {
+ {
+ .name = "fd",
+ .info = &(const VMStateInfo) {
+ .name = "fd",
+ .get = char_socket_load,
+ .put = char_socket_save,
+ },
+ },
+ VMSTATE_END_OF_LIST()
+ },
+};
static void qmp_chardev_open_socket(Chardev *chr,
ChardevBackend *backend,
@@ -1381,14 +1440,36 @@ static void qmp_chardev_open_socket(Chardev *chr,
bool is_tn3270 = sock->has_tn3270 ? sock->tn3270 : false;
bool is_waitconnect = sock->has_wait ? sock->wait : false;
bool is_websock = sock->has_websocket ? sock->websocket : false;
+ bool support_local_mig = sock->has_support_local_migration
+ ? sock->support_local_migration
+ : false;
+ bool local_incoming = sock->local_incoming;
int64_t reconnect_ms = 0;
SocketAddress *addr;
+ if (support_local_mig && is_listen) {
+ error_setg(errp,
+ "local migration is not supported for listening sockets");
+ return;
+ }
+
+ if (support_local_mig && !(chr->label && chr->label[0])) {
+ error_setg(errp,
+ "local migration is not supported for unnamed chardevs");
+ return;
+ }
+
s->is_listen = is_listen;
s->is_telnet = is_telnet;
s->is_tn3270 = is_tn3270;
s->is_websock = is_websock;
s->do_nodelay = do_nodelay;
+ s->local_incoming = local_incoming;
+
+ if (support_local_mig) {
+ s->vmstate_name = g_strdup_printf("__yc-chardev-%s", chr->label);
+ }
+
if (sock->tls_creds) {
Object *creds;
creds = object_resolve_path_component(
@@ -1448,6 +1529,7 @@ static void qmp_chardev_open_socket(Chardev *chr,
update_disconnected_filename(s);
if (s->is_listen) {
+ assert(!s->vmstate_name);
if (qmp_chardev_open_socket_server(chr, is_telnet || is_tn3270,
is_waitconnect, errp) < 0) {
return;
@@ -1463,6 +1545,8 @@ static void qmp_chardev_open_socket(Chardev *chr,
return;
}
}
+
+ vmstate_register(VMSTATE_IF(s), -1, &vmstate_char_socket, s);
}
static void qemu_chr_parse_socket(QemuOpts *opts, ChardevBackend *backend,
@@ -1581,9 +1665,20 @@ char_socket_get_connected(Object *obj, Error **errp)
return s->state == TCP_CHARDEV_STATE_CONNECTED;
}
+static char *
+char_socket_if_get_id(VMStateIf *obj)
+{
+ SocketChardev *s = SOCKET_CHARDEV(obj);
+
+ return s->vmstate_name;
+}
+
static void char_socket_class_init(ObjectClass *oc, const void *data)
{
ChardevClass *cc = CHARDEV_CLASS(oc);
+ VMStateIfClass *vc = VMSTATE_IF_CLASS(oc);
+
+ vc->get_id = char_socket_if_get_id;
cc->supports_yank = true;
@@ -1613,6 +1708,10 @@ static const TypeInfo char_socket_type_info = {
.instance_size = sizeof(SocketChardev),
.instance_finalize = char_socket_finalize,
.class_init = char_socket_class_init,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_VMSTATE_IF },
+ { }
+ }
};
static void register_types(void)
diff --git a/include/chardev/char-socket.h b/include/chardev/char-socket.h
index d6d13ad37f..69b9609215 100644
--- a/include/chardev/char-socket.h
+++ b/include/chardev/char-socket.h
@@ -78,6 +78,9 @@ struct SocketChardev {
bool connect_err_reported;
QIOTask *connect_task;
+
+ char *vmstate_name;
+ bool local_incoming;
};
typedef struct SocketChardev SocketChardev;
diff --git a/migration/options.c b/migration/options.c
index 4e923a2e07..dffb6910f4 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -262,6 +262,13 @@ bool migrate_mapped_ram(void)
return s->capabilities[MIGRATION_CAPABILITY_MAPPED_RAM];
}
+bool migrate_local_char_socket(void)
+{
+ MigrationState *s = migrate_get_current();
+
+ return s->capabilities[MIGRATION_CAPABILITY_LOCAL_CHAR_SOCKET];
+}
+
bool migrate_ignore_shared(void)
{
MigrationState *s = migrate_get_current();
diff --git a/migration/options.h b/migration/options.h
index 82d839709e..40971f0aa0 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -30,6 +30,7 @@ bool migrate_colo(void);
bool migrate_dirty_bitmaps(void);
bool migrate_events(void);
bool migrate_mapped_ram(void);
+bool migrate_local_char_socket(void);
bool migrate_ignore_shared(void);
bool migrate_late_block_activate(void);
bool migrate_multifd(void);
diff --git a/qapi/char.json b/qapi/char.json
index f0a53f742c..5b535c196a 100644
--- a/qapi/char.json
+++ b/qapi/char.json
@@ -280,11 +280,23 @@
# mutually exclusive with @reconnect.
# (default: 0) (Since: 9.2)
#
+# @support-local-migration: The socket open file descriptor will
+# migrate if this field is true and local-char-socket migration
+# capability enabled (default: false) (Since: 10.2)
+#
+# @local-incoming: Do load open file descriptor for the socket
+# on incoming migration. May be used only if QEMU is started
+# for incoming migration and only together with local-char-socket
+# migration capability (default: false) (Since: 10.2)
+#
# Features:
#
# @deprecated: Member @reconnect is deprecated. Use @reconnect-ms
# instead.
#
+# @unstable: Members @support-local-migration and @local-incoming
+# are experimental
+#
# Since: 1.4
##
{ 'struct': 'ChardevSocket',
@@ -298,7 +310,9 @@
'*tn3270': 'bool',
'*websocket': 'bool',
'*reconnect': { 'type': 'int', 'features': [ 'deprecated' ] },
- '*reconnect-ms': 'int' },
+ '*reconnect-ms': 'int',
+ '*support-local-migration': { 'type': 'bool', 'features': [ 'unstable' ] },
+ '*local-incoming': { 'type': 'bool', 'features': [ 'unstable' ] } },
'base': 'ChardevCommon' }
##
diff --git a/qapi/migration.json b/qapi/migration.json
index 2387c21e9c..4f282d168e 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -517,6 +517,11 @@
# each RAM page. Requires a migration URI that supports seeking,
# such as a file. (since 9.0)
#
+# @local-char-socket: Migrate socket chardevs open file descriptors.
+# Only may be used when migration channel is unix socket. Only
+# involves socket chardevs with "support-local-migration" option
+# enabled. (since 10.2)
+#
# Features:
#
# @unstable: Members @x-colo and @x-ignore-shared are experimental.
@@ -536,7 +541,8 @@
{ 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
'validate-uuid', 'background-snapshot',
'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
- 'dirty-limit', 'mapped-ram'] }
+ 'dirty-limit', 'mapped-ram',
+ { 'name': 'local-char-socket', 'features': [ 'unstable' ] } ] }
##
# @MigrationCapabilityStatus:
diff --git a/stubs/meson.build b/stubs/meson.build
index cef046e685..7855483639 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -50,6 +50,7 @@ if have_block or have_user
stub_ss.add(files('qtest.c'))
stub_ss.add(files('vm-stop.c'))
stub_ss.add(files('vmstate.c'))
+ stub_ss.add(files('qemu_file.c'))
endif
if have_user
diff --git a/stubs/qemu_file.c b/stubs/qemu_file.c
new file mode 100644
index 0000000000..854d4c13cd
--- /dev/null
+++ b/stubs/qemu_file.c
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#include "qemu/osdep.h"
+#include "migration/qemu-file.h"
+
+
+int qemu_file_get_fd(QEMUFile *f)
+{
+ return -1;
+}
+
+int qemu_file_put_fd(QEMUFile *f, int fd)
+{
+ return -1;
+}
diff --git a/stubs/vmstate.c b/stubs/vmstate.c
index c190762d7c..f345edf3c9 100644
--- a/stubs/vmstate.c
+++ b/stubs/vmstate.c
@@ -2,6 +2,7 @@
#include "migration/vmstate.h"
#include "qapi/qapi-types-migration.h"
#include "migration/client-options.h"
+#include "migration/options.h"
int vmstate_register_with_alias_id(VMStateIf *obj,
uint32_t instance_id,
@@ -28,3 +29,8 @@ MigMode migrate_mode(void)
{
return MIG_MODE_NORMAL;
}
+
+bool migrate_local_char_socket(void)
+{
+ return false;
+}
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 669d07c06b..8be0c1dc7c 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -381,7 +381,7 @@ qtests = {
'pxe-test': files('boot-sector.c'),
'pnv-xive2-test': files('pnv-xive2-common.c', 'pnv-xive2-flush-sync.c',
'pnv-xive2-nvpg_bar.c'),
- 'qos-test': [chardev, io, qos_test_ss.apply({}).sources()],
+ 'qos-test': [chardev, io, qos_test_ss.apply({}).sources(), '../../hw/core/vmstate-if.c'],
'tpm-crb-swtpm-test': [io, tpmemu_files],
'tpm-crb-test': [io, tpmemu_files],
'tpm-tis-swtpm-test': [io, tpmemu_files, 'tpm-tis-util.c'],
diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index d5248ae51d..b96f4dcabe 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -139,7 +139,7 @@ if have_system
'test-bufferiszero': [],
'test-smp-parse': [qom, meson.project_source_root() / 'hw/core/machine-smp.c'],
'test-vmstate': [migration, io],
- 'test-yank': ['socket-helpers.c', qom, io, chardev]
+ 'test-yank': ['socket-helpers.c', qom, io, chardev, '../../hw/core/vmstate-if.c']
}
if config_host_data.get('CONFIG_INOTIFY1')
tests += {'test-util-filemonitor': []}
@@ -151,7 +151,7 @@ if have_system
if not get_option('tsan')
if host_os != 'windows'
tests += {
- 'test-char': ['socket-helpers.c', qom, io, chardev]
+ 'test-char': ['socket-helpers.c', qom, io, chardev, '../../hw/core/vmstate-if.c']
}
endif
--
2.48.1
* Re: [PATCH 24/33] chardev-add: support local migration
2025-08-13 16:48 ` [PATCH 24/33] chardev-add: support local migration Vladimir Sementsov-Ogievskiy
@ 2025-09-12 14:56 ` Markus Armbruster
2025-09-12 15:04 ` Vladimir Sementsov-Ogievskiy
2025-09-12 15:24 ` Steven Sistare
0 siblings, 2 replies; 108+ messages in thread
From: Markus Armbruster @ 2025-09-12 14:56 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, qemu-devel, qemu-block,
steven.sistare, den-plotnikov, Laurent Vivier
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
> This commit introduces a possibility to migrate open chardev
> socket fd through migration channel without reconnecting.
>
> For this, user should:
> - enable new migration capability local-char-socket
> - mark the socket by an option support-local-migration=true
> - on target add local-incoming=true option to the socket
>
> Motivation for the API:
>
> 1. We don't want to migrate all sockets. For example, QMP-connection is
> bad candidate, as it is separate on source and target. So, we need
> @support-local-migration option to mark sockets, which we want to
> migrate (after this series, we'll want to migrate chardev used to
> connect with vhost-user-server).
>
> 2. Still, for remote migration, we can't migrate any sockets, so, we
> need a capability, to enable/disable the whole feature.
>
> 3. And finally, we need a sign for the socket to not open a connection
> on initialization, but wait for incoming migration. We can't use
> @support-local-migration option for it, as it may be enabled, but we
> are in incoming-remote migration. Also, we can't rely on the
> migration capability, as user is free to setup capabilities before or
> after chardev creation, and it would be a bad precedent to create
> relations here.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[...]
> diff --git a/qapi/char.json b/qapi/char.json
> index f0a53f742c..5b535c196a 100644
> --- a/qapi/char.json
> +++ b/qapi/char.json
> @@ -280,11 +280,23 @@
> # mutually exclusive with @reconnect.
> # (default: 0) (Since: 9.2)
> #
> +# @support-local-migration: The socket open file descriptor will
> +# migrate if this field is true and local-char-socket migration
> +# capability enabled (default: false) (Since: 10.2)
> +#
> +# @local-incoming: Do load open file descriptor for the socket
> +# on incoming migration. May be used only if QEMU is started
> +# for incoming migration and only together with local-char-socket
> +# migration capability (default: false) (Since: 10.2)
> +#
> # Features:
> #
> # @deprecated: Member @reconnect is deprecated. Use @reconnect-ms
> # instead.
> #
> +# @unstable: Members @support-local-migration and @local-incoming
> +# are experimental
> +#
> # Since: 1.4
> ##
> { 'struct': 'ChardevSocket',
> @@ -298,7 +310,9 @@
> '*tn3270': 'bool',
> '*websocket': 'bool',
> '*reconnect': { 'type': 'int', 'features': [ 'deprecated' ] },
> - '*reconnect-ms': 'int' },
> + '*reconnect-ms': 'int',
> + '*support-local-migration': { 'type': 'bool', 'features': [ 'unstable' ] },
> + '*local-incoming': { 'type': 'bool', 'features': [ 'unstable' ] } },
> 'base': 'ChardevCommon' }
>
> ##
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 2387c21e9c..4f282d168e 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -517,6 +517,11 @@
> # each RAM page. Requires a migration URI that supports seeking,
> # such as a file. (since 9.0)
> #
> +# @local-char-socket: Migrate socket chardevs open file descriptors.
> +# Only may be used when migration channel is unix socket. Only
> +# involves socket chardevs with "support-local-migration" option
> +# enabled. (since 10.2)
> +#
> # Features:
> #
> # @unstable: Members @x-colo and @x-ignore-shared are experimental.
> @@ -536,7 +541,8 @@
> { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
> 'validate-uuid', 'background-snapshot',
> 'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
> - 'dirty-limit', 'mapped-ram'] }
> + 'dirty-limit', 'mapped-ram',
> + { 'name': 'local-char-socket', 'features': [ 'unstable' ] } ] }
>
> ##
> # @MigrationCapabilityStatus:
I understand why we need a knob to enable the feature. A
MigrationCapability looks fine to me. We could perhaps come up with a
better name, but let's leave that for later.
I'm unsure about making users mark the sockets (really: the sockets
wrapped in a character device backend) to be migrated that way.
Which sockets are users supposed to mark, and how would they know?
What happens when a user marks the QMP socket? You called that a "bad
candidate".
Doesn't feel like good user interface design.
Could QEMU decide (in principle) which sockets are suitable for
sending down the migration channel?
If yes, could we make it do the right thing automatically? Or at least
a check that stops the user from doing the wrong thing?
[...]
* Re: [PATCH 24/33] chardev-add: support local migration
2025-09-12 14:56 ` Markus Armbruster
@ 2025-09-12 15:04 ` Vladimir Sementsov-Ogievskiy
2025-09-12 15:24 ` Steven Sistare
1 sibling, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-12 15:04 UTC (permalink / raw)
To: Markus Armbruster
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, qemu-devel, qemu-block,
steven.sistare, den-plotnikov, Laurent Vivier
On 12.09.25 17:56, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
>
>> This commit introduces a possibility to migrate open chardev
>> socket fd through migration channel without reconnecting.
>>
>> For this, user should:
>> - enable new migration capability local-char-socket
>> - mark the socket by an option support-local-migration=true
>> - on target add local-incoming=true option to the socket
>>
>> Motivation for the API:
>>
>> 1. We don't want to migrate all sockets. For example, QMP-connection is
>> bad candidate, as it is separate on source and target. So, we need
>> @support-local-migration option to mark sockets, which we want to
>> migrate (after this series, we'll want to migrate chardev used to
>> connect with vhost-user-server).
>>
>> 2. Still, for remote migration, we can't migrate any sockets, so, we
>> need a capability, to enable/disable the whole feature.
>>
>> 3. And finally, we need a sign for the socket to not open a connection
>> on initialization, but wait for incoming migration. We can't use
>> @support-local-migration option for it, as it may be enabled, but we
>> are in incoming-remote migration. Also, we can't rely on the
>> migration capability, as user is free to setup capabilities before or
>> after chardev creation, and it would be a bad precedent to create
>> relations here.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>
> [...]
>
>> diff --git a/qapi/char.json b/qapi/char.json
>> index f0a53f742c..5b535c196a 100644
>> --- a/qapi/char.json
>> +++ b/qapi/char.json
>> @@ -280,11 +280,23 @@
>> # mutually exclusive with @reconnect.
>> # (default: 0) (Since: 9.2)
>> #
>> +# @support-local-migration: The socket open file descriptor will
>> +# migrate if this field is true and local-char-socket migration
>> +# capability enabled (default: false) (Since: 10.2)
>> +#
>> +# @local-incoming: Do load open file descriptor for the socket
>> +# on incoming migration. May be used only if QEMU is started
>> +# for incoming migration and only together with local-char-socket
>> +# migration capability (default: false) (Since: 10.2)
>> +#
>> # Features:
>> #
>> # @deprecated: Member @reconnect is deprecated. Use @reconnect-ms
>> # instead.
>> #
>> +# @unstable: Members @support-local-migration and @local-incoming
>> +# are experimental
>> +#
>> # Since: 1.4
>> ##
>> { 'struct': 'ChardevSocket',
>> @@ -298,7 +310,9 @@
>> '*tn3270': 'bool',
>> '*websocket': 'bool',
>> '*reconnect': { 'type': 'int', 'features': [ 'deprecated' ] },
>> - '*reconnect-ms': 'int' },
>> + '*reconnect-ms': 'int',
>> + '*support-local-migration': { 'type': 'bool', 'features': [ 'unstable' ] },
>> + '*local-incoming': { 'type': 'bool', 'features': [ 'unstable' ] } },
>> 'base': 'ChardevCommon' }
>>
>> ##
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index 2387c21e9c..4f282d168e 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -517,6 +517,11 @@
>> # each RAM page. Requires a migration URI that supports seeking,
>> # such as a file. (since 9.0)
>> #
>> +# @local-char-socket: Migrate socket chardevs open file descriptors.
>> +# Only may be used when migration channel is unix socket. Only
>> +# involves socket chardevs with "support-local-migration" option
>> +# enabled. (since 10.2)
>> +#
>> # Features:
>> #
>> # @unstable: Members @x-colo and @x-ignore-shared are experimental.
>> @@ -536,7 +541,8 @@
>> { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
>> 'validate-uuid', 'background-snapshot',
>> 'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
>> - 'dirty-limit', 'mapped-ram'] }
>> + 'dirty-limit', 'mapped-ram',
>> + { 'name': 'local-char-socket', 'features': [ 'unstable' ] } ] }
>>
>> ##
>> # @MigrationCapabilityStatus:
>
> I understand why we need a knob to enable the feature. A
> MigrationCapability looks fine to me. We could perhaps come up with a
> better name, but let's leave that for later.
>
> I'm unsure about making users mark the sockets (really: the sockets
> wrapped in a character device backend) to be migrated that way.
>
> Which sockets are users supposed to mark, and how would they know?
>
> What happens when a user marks the QMP socket? You called that a "bad
> candidate".
>
> Doesn't feel like good user interface design.
>
> Could QEMU decide (in principle) which sockets are suitable for
> sending down the migration channel?
>
> If yes, could we make it do the right thing automatically? Or at least
> a check that stops the user from doing the wrong thing?
>
Yes, I'm thinking about this too. In my live-TAP series, I don't migrate the
TAP netdev separately; it's migrated as part of the virtio-net device. I hope
it should be possible to do something similar here.
So the good target interface for me now is a single "migrate-fds" (or better
named) capability, which turns on the whole feature for both virtio-net and
vhost-user-blk devices (and probably future ones), with additional optional
device parameters to be able to disable fd migration per device.
--
Best regards,
Vladimir
* Re: [PATCH 24/33] chardev-add: support local migration
2025-09-12 14:56 ` Markus Armbruster
2025-09-12 15:04 ` Vladimir Sementsov-Ogievskiy
@ 2025-09-12 15:24 ` Steven Sistare
2025-09-15 13:28 ` Vladimir Sementsov-Ogievskiy
1 sibling, 1 reply; 108+ messages in thread
From: Steven Sistare @ 2025-09-12 15:24 UTC (permalink / raw)
To: Markus Armbruster, Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, qemu-devel, qemu-block,
den-plotnikov, Laurent Vivier
On 9/12/2025 10:56 AM, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
>
>> This commit introduces the possibility to migrate an open chardev
>> socket fd through the migration channel without reconnecting.
>>
>> For this, the user should:
>> - enable the new migration capability local-char-socket
>> - mark the socket with the option support-local-migration=true
>> - on the target, add the local-incoming=true option to the socket
>>
>> Motivation for the API:
>>
>> 1. We don't want to migrate all sockets. For example, the QMP connection
>> is a bad candidate, as it is separate on source and target. So we need
>> the @support-local-migration option to mark the sockets we want to
>> migrate (after this series, we'll want to migrate the chardev used to
>> connect with the vhost-user server).
>>
>> 2. Still, for remote migration, we can't migrate any sockets, so we
>> need a capability to enable/disable the whole feature.
>>
>> 3. And finally, we need a sign for the socket not to open a connection
>> on initialization, but to wait for incoming migration. We can't use
>> the @support-local-migration option for it, as it may be enabled while we
>> are in an incoming remote migration. Also, we can't rely on the
>> migration capability, as the user is free to set up capabilities before
>> or after chardev creation, and it would be a bad precedent to create
>> relations here.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>
> [...]
>
>> diff --git a/qapi/char.json b/qapi/char.json
>> index f0a53f742c..5b535c196a 100644
>> --- a/qapi/char.json
>> +++ b/qapi/char.json
>> @@ -280,11 +280,23 @@
>> # mutually exclusive with @reconnect.
>> # (default: 0) (Since: 9.2)
>> #
>> +# @support-local-migration: The socket open file descriptor will
>> +# migrate if this field is true and local-char-socket migration
>> +# capability enabled (default: false) (Since: 10.2)
>> +#
>> +# @local-incoming: Do load open file descriptor for the socket
>> +# on incoming migration. May be used only if QEMU is started
>> +# for incoming migration and only together with local-char-socket
>> +# migration capability (default: false) (Since: 10.2)
>> +#
>> # Features:
>> #
>> # @deprecated: Member @reconnect is deprecated. Use @reconnect-ms
>> # instead.
>> #
>> +# @unstable: Members @support-local-migration and @local-incoming
>> +# are experimental
>> +#
>> # Since: 1.4
>> ##
>> { 'struct': 'ChardevSocket',
>> @@ -298,7 +310,9 @@
>> '*tn3270': 'bool',
>> '*websocket': 'bool',
>> '*reconnect': { 'type': 'int', 'features': [ 'deprecated' ] },
>> - '*reconnect-ms': 'int' },
>> + '*reconnect-ms': 'int',
>> + '*support-local-migration': { 'type': 'bool', 'features': [ 'unstable' ] },
>> + '*local-incoming': { 'type': 'bool', 'features': [ 'unstable' ] } },
>> 'base': 'ChardevCommon' }
>>
>> ##
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index 2387c21e9c..4f282d168e 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -517,6 +517,11 @@
>> # each RAM page. Requires a migration URI that supports seeking,
>> # such as a file. (since 9.0)
>> #
>> +# @local-char-socket: Migrate socket chardevs open file descriptors.
>> +# Only may be used when migration channel is unix socket. Only
>> +# involves socket chardevs with "support-local-migration" option
>> +# enabled. (since 10.2)
>> +#
>> # Features:
>> #
>> # @unstable: Members @x-colo and @x-ignore-shared are experimental.
>> @@ -536,7 +541,8 @@
>> { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
>> 'validate-uuid', 'background-snapshot',
>> 'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
>> - 'dirty-limit', 'mapped-ram'] }
>> + 'dirty-limit', 'mapped-ram',
>> + { 'name': 'local-char-socket', 'features': [ 'unstable' ] } ] }
>>
>> ##
>> # @MigrationCapabilityStatus:
>
> I understand why we need a knob to enable the feature. A
> MigrationCapability looks fine to me. We could perhaps come up with a
> better name, but let's leave that for later.
>
> I'm unsure about making users mark the sockets (really: the sockets
> wrapped in a character device backend) to be migrated that way.
>
> Which sockets are users supposed to mark, and how would they know?
>
> What happens when a user marks the QMP socket? You called that a "bad
> candidate".
>
> Doesn't feel like good user interface design.
>
> Could QEMU decide (in principle) which sockets are suitable for
> sending down the migration channel?
>
> If yes, could we make it do the right thing automatically? Or at least
> a check that stops the user from doing the wrong thing?
>
> [...]
Hi Vladimir, I did not notice this patch before.
I also submitted patches for preserving chardevs, including sockets, here:
https://lore.kernel.org/qemu-devel/1658851843-236870-40-git-send-email-steven.sistare@oracle.com
and have fixed more bugs since then. I have attached my latest unsubmitted version
from my workspace.
My interface for enabling it is here:
https://lore.kernel.org/qemu-devel/1658851843-236870-37-git-send-email-steven.sistare@oracle.com/
I am not wedded to either the interface or my socket patch, but the capability
must be supported for CPR. And an acknowledgement of the prior work would
be nice.
- Steve
[-- Attachment #2: 0001-chardev-cpr-for-sockets.patch --]
[-- Type: text/plain, Size: 6822 bytes --]
From 262eff756f425862c4afbd522128e7bf38d50118 Mon Sep 17 00:00:00 2001
From: Steve Sistare <steven.sistare@oracle.com>
Date: Tue, 20 Feb 2024 06:25:57 -0800
Subject: [PATCH] chardev: cpr for sockets
Save an accepted socket and restore it after cpr. Re-create the
corresponding listen socket, but do not listen again until the accepted
socket is closed.
Save a connected socket and restore it after cpr, and do not re-connect
via qmp_chardev_open_socket_client. Hoist the setting of
s->reconnect_time_ms so we still reconnect when the socket is closed.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
chardev/char-socket.c | 83 +++++++++++++++++++++++++++++++++++++++----
include/chardev/char-socket.h | 1 +
2 files changed, 78 insertions(+), 6 deletions(-)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 1e83139..f83f078 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -26,6 +26,7 @@
#include "chardev/char.h"
#include "io/channel-socket.h"
#include "io/channel-websock.h"
+#include "migration/cpr.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
#include "qemu/option.h"
@@ -360,11 +361,52 @@ static void char_socket_yank_iochannel(void *opaque)
qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
}
+static void cpr_save_socket(Chardev *chr, QIOChannelSocket *sioc)
+{
+ SocketChardev *sockchar = SOCKET_CHARDEV(chr);
+
+ if (sioc && chr->cpr_enabled && !sockchar->cpr_reused) {
+ cpr_save_fd(chr->label, 0, sioc->fd);
+ }
+}
+
+/*
+ * Return 0 if fd is found and restored, -1 if found but an error occurred,
+ * and 1 if no fd found.
+ */
+static int cpr_load_socket(Chardev *chr, Error **errp)
+{
+ ERRP_GUARD();
+ SocketChardev *sockchar = SOCKET_CHARDEV(chr);
+ QIOChannelSocket *sioc;
+ const char *label = chr->label;
+ int fd = cpr_find_fd(label, 0);
+
+ sockchar->cpr_reused = (fd >= 0);
+ if (fd != -1) {
+ sockchar = SOCKET_CHARDEV(chr);
+ sioc = qio_channel_socket_new_fd(fd, errp);
+ if (sioc) {
+ tcp_chr_accept(sockchar->listener, sioc, chr);
+ object_unref(OBJECT(sioc));
+ } else {
+ error_prepend(errp, "could not restore socket for %s", label);
+ return -1;
+ }
+ return 0;
+ }
+ return 1;
+}
+
static void tcp_chr_free_connection(Chardev *chr)
{
SocketChardev *s = SOCKET_CHARDEV(chr);
int i;
+ if (chr->cpr_enabled) {
+ cpr_delete_fd(chr->label, 0);
+ }
+
if (s->read_msgfds_num) {
for (i = 0; i < s->read_msgfds_num; i++) {
close(s->read_msgfds[i]);
@@ -923,6 +965,8 @@ static int tcp_chr_new_client(Chardev *chr, QIOChannelSocket *sioc)
tcp_chr_connect(chr);
}
+ cpr_save_socket(chr, sioc);
+
return 0;
}
@@ -1228,6 +1272,7 @@ static gboolean socket_reconnect_timeout(gpointer opaque)
static int qmp_chardev_open_socket_server(Chardev *chr,
bool is_telnet,
bool is_waitconnect,
+ bool do_listen,
Error **errp)
{
SocketChardev *s = SOCKET_CHARDEV(chr);
@@ -1259,11 +1304,15 @@ skip_listen:
if (is_waitconnect) {
tcp_chr_accept_server_sync(chr);
- } else {
+ } else if (do_listen) {
qio_net_listener_set_client_func_full(s->listener,
tcp_chr_accept,
chr, NULL,
chr->gcontext);
+ } else {
+ qio_net_listener_set_client_func_full(s->listener,
+ NULL, NULL, NULL,
+ chr->gcontext);
}
return 0;
@@ -1271,13 +1320,11 @@ skip_listen:
static int qmp_chardev_open_socket_client(Chardev *chr,
- int64_t reconnect_ms,
Error **errp)
{
SocketChardev *s = SOCKET_CHARDEV(chr);
- if (reconnect_ms > 0) {
- s->reconnect_time_ms = reconnect_ms;
+ if (s->reconnect_time_ms > 0) {
tcp_chr_connect_client_async(chr);
return 0;
} else {
@@ -1381,8 +1428,10 @@ static void qmp_chardev_open_socket(Chardev *chr,
bool is_tn3270 = sock->has_tn3270 ? sock->tn3270 : false;
bool is_waitconnect = sock->has_wait ? sock->wait : false;
bool is_websock = sock->has_websocket ? sock->websocket : false;
+ bool do_listen = is_listen;
int64_t reconnect_ms = 0;
SocketAddress *addr;
+ int ret;
s->is_listen = is_listen;
s->is_telnet = is_telnet;
@@ -1442,14 +1491,31 @@ static void qmp_chardev_open_socket(Chardev *chr,
}
s->registered_yank = true;
+ qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_CPR);
+
/* be isn't opened until we get a connection */
*be_opened = false;
update_disconnected_filename(s);
if (s->is_listen) {
+ /*
+ * If an accepted socket has been preserved, use it. It may be closed
+ * later, so still set up the listen socket so it can accept again,
+ * but do not wait, and do not listen yet.
+ */
+
+ ret = cpr_load_socket(chr, errp);
+ if (ret < 0) {
+ return;
+ } else if (ret == 0) {
+ is_waitconnect = false;
+ do_listen = false;
+ }
+
if (qmp_chardev_open_socket_server(chr, is_telnet || is_tn3270,
- is_waitconnect, errp) < 0) {
+ is_waitconnect, do_listen,
+ errp) < 0) {
return;
}
} else {
@@ -1458,8 +1524,13 @@ static void qmp_chardev_open_socket(Chardev *chr,
} else if (sock->has_reconnect_ms) {
reconnect_ms = sock->reconnect_ms;
}
+ s->reconnect_time_ms = reconnect_ms;
- if (qmp_chardev_open_socket_client(chr, reconnect_ms, errp) < 0) {
+ /* If a connected socket has been preserved, use it, else connect. */
+ if (cpr_load_socket(chr, errp) <= 0) {
+ return;
+ }
+ if (qmp_chardev_open_socket_client(chr, errp) < 0) {
return;
}
}
diff --git a/include/chardev/char-socket.h b/include/chardev/char-socket.h
index d6d13ad..640d033 100644
--- a/include/chardev/char-socket.h
+++ b/include/chardev/char-socket.h
@@ -63,6 +63,7 @@ struct SocketChardev {
int *write_msgfds;
size_t write_msgfds_num;
bool registered_yank;
+ bool cpr_reused;
SocketAddress *addr;
bool is_listen;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 24/33] chardev-add: support local migration
2025-09-12 15:24 ` Steven Sistare
@ 2025-09-15 13:28 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-15 13:28 UTC (permalink / raw)
To: Steven Sistare, Markus Armbruster
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, qemu-devel, qemu-block,
den-plotnikov, Laurent Vivier
On 12.09.25 18:24, Steven Sistare wrote:
> On 9/12/2025 10:56 AM, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
>>
>>> This commit introduces the possibility to migrate an open chardev
>>> socket fd through the migration channel without reconnecting.
>>>
>>> For this, the user should:
>>> - enable the new migration capability local-char-socket
>>> - mark the socket with the option support-local-migration=true
>>> - on the target, add the local-incoming=true option to the socket
>>>
>>> Motivation for the API:
>>>
>>> 1. We don't want to migrate all sockets. For example, the QMP connection
>>> is a bad candidate, as it is separate on source and target. So we need
>>> the @support-local-migration option to mark the sockets we want to
>>> migrate (after this series, we'll want to migrate the chardev used to
>>> connect with the vhost-user server).
>>>
>>> 2. Still, for remote migration, we can't migrate any sockets, so we
>>> need a capability to enable/disable the whole feature.
>>>
>>> 3. And finally, we need a sign for the socket not to open a connection
>>> on initialization, but to wait for incoming migration. We can't use
>>> the @support-local-migration option for it, as it may be enabled while we
>>> are in an incoming remote migration. Also, we can't rely on the
>>> migration capability, as the user is free to set up capabilities before
>>> or after chardev creation, and it would be a bad precedent to create
>>> relations here.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>
>> [...]
>>
>>> diff --git a/qapi/char.json b/qapi/char.json
>>> index f0a53f742c..5b535c196a 100644
>>> --- a/qapi/char.json
>>> +++ b/qapi/char.json
>>> @@ -280,11 +280,23 @@
>>> # mutually exclusive with @reconnect.
>>> # (default: 0) (Since: 9.2)
>>> #
>>> +# @support-local-migration: The socket open file descriptor will
>>> +# migrate if this field is true and local-char-socket migration
>>> +# capability enabled (default: false) (Since: 10.2)
>>> +#
>>> +# @local-incoming: Do load open file descriptor for the socket
>>> +# on incoming migration. May be used only if QEMU is started
>>> +# for incoming migration and only together with local-char-socket
>>> +# migration capability (default: false) (Since: 10.2)
>>> +#
>>> # Features:
>>> #
>>> # @deprecated: Member @reconnect is deprecated. Use @reconnect-ms
>>> # instead.
>>> #
>>> +# @unstable: Members @support-local-migration and @local-incoming
>>> +# are experimental
>>> +#
>>> # Since: 1.4
>>> ##
>>> { 'struct': 'ChardevSocket',
>>> @@ -298,7 +310,9 @@
>>> '*tn3270': 'bool',
>>> '*websocket': 'bool',
>>> '*reconnect': { 'type': 'int', 'features': [ 'deprecated' ] },
>>> - '*reconnect-ms': 'int' },
>>> + '*reconnect-ms': 'int',
>>> + '*support-local-migration': { 'type': 'bool', 'features': [ 'unstable' ] },
>>> + '*local-incoming': { 'type': 'bool', 'features': [ 'unstable' ] } },
>>> 'base': 'ChardevCommon' }
>>> ##
>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>> index 2387c21e9c..4f282d168e 100644
>>> --- a/qapi/migration.json
>>> +++ b/qapi/migration.json
>>> @@ -517,6 +517,11 @@
>>> # each RAM page. Requires a migration URI that supports seeking,
>>> # such as a file. (since 9.0)
>>> #
>>> +# @local-char-socket: Migrate socket chardevs open file descriptors.
>>> +# Only may be used when migration channel is unix socket. Only
>>> +# involves socket chardevs with "support-local-migration" option
>>> +# enabled. (since 10.2)
>>> +#
>>> # Features:
>>> #
>>> # @unstable: Members @x-colo and @x-ignore-shared are experimental.
>>> @@ -536,7 +541,8 @@
>>> { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
>>> 'validate-uuid', 'background-snapshot',
>>> 'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
>>> - 'dirty-limit', 'mapped-ram'] }
>>> + 'dirty-limit', 'mapped-ram',
>>> + { 'name': 'local-char-socket', 'features': [ 'unstable' ] } ] }
>>> ##
>>> # @MigrationCapabilityStatus:
>>
>> I understand why we need a knob to enable the feature. A
>> MigrationCapability looks fine to me. We could perhaps come up with a
>> better name, but let's leave that for later.
>>
>> I'm unsure about making users mark the sockets (really: the sockets
>> wrapped in a character device backend) to be migrated that way.
>>
>> Which sockets are users supposed to mark, and how would they know?
>>
>> What happens when a user marks the QMP socket? You called that a "bad
>> candidate".
>>
>> Doesn't feel like good user interface design.
>>
>> Could QEMU decide (in principle) which sockets are suitable for
>> sending down the migration channel?
>>
>> If yes, could we make it do the right thing automatically? Or at least
>> a check that stops the user from doing the wrong thing?
>>
>> [...]
>
> Hi Vladimir, I did not notice this patch before.
> I also submitted patches for preserving chardevs including sockets, here:
> https://lore.kernel.org/qemu-devel/1658851843-236870-40-git-send-email-steven.sistare@oracle.com
> and have fixed more bugs since then. I have attached my latest unsubmitted version
> from my workspace.
>
> My interface for enabling it is here:
> https://lore.kernel.org/qemu-devel/1658851843-236870-37-git-send-email-steven.sistare@oracle.com/
>
> I am not wedded to either the interface or my socket patch, but the capability
> must be supported for CPR. And an acknowledgement of the prior work would
> be nice.
>
Thanks! I'll consider this when preparing a new version for vhost-user-blk.
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 108+ messages in thread
* [PATCH 25/33] virtio: introduce .skip_vhost_migration_log() handler
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (23 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 24/33] chardev-add: support local migration Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:08 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock() Vladimir Sementsov-Ogievskiy
` (8 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
For vhost-user backend migration, we'll need to disable memory
logging on the device. Let's prepare a corresponding handler for
the device.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 10 ++++++++++
include/hw/virtio/virtio.h | 2 ++
2 files changed, 12 insertions(+)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index e7c809400b..0427fc29b2 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1134,6 +1134,16 @@ static int vhost_migration_log(MemoryListener *listener, bool enable)
struct vhost_dev *dev = container_of(listener, struct vhost_dev,
memory_listener);
int r;
+
+ if (dev->vdev) {
+ VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
+
+ if (vdc->skip_vhost_migration_log &&
+ vdc->skip_vhost_migration_log(dev->vdev)) {
+ return 0;
+ }
+ }
+
if (enable == dev->log_enabled) {
return 0;
}
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 8b9db08ddf..9a4a0a94aa 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -234,6 +234,8 @@ struct VirtioDeviceClass {
/* May be called even when vdev->vhost_started is false */
struct vhost_dev *(*get_vhost)(VirtIODevice *vdev);
void (*toggle_device_iotlb)(VirtIODevice *vdev);
+
+ bool (*skip_vhost_migration_log)(VirtIODevice *vdev);
};
void virtio_instance_init_common(Object *proxy_obj, void *data,
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 25/33] virtio: introduce .skip_vhost_migration_log() handler
2025-08-13 16:48 ` [PATCH 25/33] virtio: introduce .skip_vhost_migration_log() handler Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:08 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:08 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Acked-by: Raphael Norwitz <raphael.s.norwitz@gmail.com>
On Wed, Aug 13, 2025 at 1:00 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> For vhost user backend migration we'll need to disable memory
> logging on the device. Let's prepare a corresponding handler for
> the device.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 10 ++++++++++
> include/hw/virtio/virtio.h | 2 ++
> 2 files changed, 12 insertions(+)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index e7c809400b..0427fc29b2 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1134,6 +1134,16 @@ static int vhost_migration_log(MemoryListener *listener, bool enable)
> struct vhost_dev *dev = container_of(listener, struct vhost_dev,
> memory_listener);
> int r;
> +
> + if (dev->vdev) {
> + VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
> +
> + if (vdc->skip_vhost_migration_log &&
> + vdc->skip_vhost_migration_log(dev->vdev)) {
> + return 0;
> + }
> + }
> +
> if (enable == dev->log_enabled) {
> return 0;
> }
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index 8b9db08ddf..9a4a0a94aa 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -234,6 +234,8 @@ struct VirtioDeviceClass {
> /* May be called even when vdev->vhost_started is false */
> struct vhost_dev *(*get_vhost)(VirtIODevice *vdev);
> void (*toggle_device_iotlb)(VirtIODevice *vdev);
> +
> + bool (*skip_vhost_migration_log)(VirtIODevice *vdev);
> };
>
> void virtio_instance_init_common(Object *proxy_obj, void *data,
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 108+ messages in thread
* [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (24 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 25/33] virtio: introduce .skip_vhost_migration_log() handler Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-08-20 13:27 ` Peter Xu
2025-08-20 13:37 ` Daniel P. Berrangé
2025-08-13 16:48 ` [PATCH 27/33] migration/socket: keep fds non-block Vladimir Sementsov-Ogievskiy
` (7 subsequent siblings)
33 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Add a possibility to keep a socket's non-blocking status when passing
it through a QIO channel. We need this to support migration of open
fds through the migration channel.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
include/io/channel-socket.h | 3 +++
io/channel-socket.c | 16 ++++++++++++----
2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
index a88cf8b3a9..0a4327d745 100644
--- a/include/io/channel-socket.h
+++ b/include/io/channel-socket.h
@@ -49,6 +49,7 @@ struct QIOChannelSocket {
socklen_t remoteAddrLen;
ssize_t zero_copy_queued;
ssize_t zero_copy_sent;
+ bool keep_nonblock;
};
@@ -275,4 +276,6 @@ int qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc,
size_t size,
Error **errp);
+void qio_channel_socket_keep_nonblock(QIOChannel *ioc);
+
#endif /* QIO_CHANNEL_SOCKET_H */
diff --git a/io/channel-socket.c b/io/channel-socket.c
index 3b7ca924ff..cd93d7f180 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -462,9 +462,16 @@ static void qio_channel_socket_finalize(Object *obj)
}
+void qio_channel_socket_keep_nonblock(QIOChannel *ioc)
+{
+ QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
+ sioc->keep_nonblock = true;
+}
+
+
#ifndef WIN32
static void qio_channel_socket_copy_fds(struct msghdr *msg,
- int **fds, size_t *nfds)
+ int **fds, size_t *nfds, bool set_block)
{
struct cmsghdr *cmsg;
@@ -497,8 +504,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg,
continue;
}
- /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
- qemu_socket_set_block(fd);
+ if (set_block) {
+ qemu_socket_set_block(fd);
+ }
#ifndef MSG_CMSG_CLOEXEC
qemu_set_cloexec(fd);
@@ -556,7 +564,7 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
}
if (fds && nfds) {
- qio_channel_socket_copy_fds(&msg, fds, nfds);
+ qio_channel_socket_copy_fds(&msg, fds, nfds, !sioc->keep_nonblock);
}
return ret;
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-13 16:48 ` [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock() Vladimir Sementsov-Ogievskiy
@ 2025-08-20 13:27 ` Peter Xu
2025-08-20 13:43 ` Daniel P. Berrangé
2025-08-20 13:37 ` Daniel P. Berrangé
1 sibling, 1 reply; 108+ messages in thread
From: Peter Xu @ 2025-08-20 13:27 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, farosas, raphael, sgarzare, marcandre.lureau, pbonzini,
kwolf, hreitz, berrange, eblake, armbru, qemu-devel, qemu-block,
steven.sistare, den-plotnikov
On Wed, Aug 13, 2025 at 07:48:47PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Add a possibility to keep socket non-block status when passing
> through qio channel. We need this to support migration of open
> fds through migration channel.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> include/io/channel-socket.h | 3 +++
> io/channel-socket.c | 16 ++++++++++++----
> 2 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
> index a88cf8b3a9..0a4327d745 100644
> --- a/include/io/channel-socket.h
> +++ b/include/io/channel-socket.h
> @@ -49,6 +49,7 @@ struct QIOChannelSocket {
> socklen_t remoteAddrLen;
> ssize_t zero_copy_queued;
> ssize_t zero_copy_sent;
> + bool keep_nonblock;
> };
>
>
> @@ -275,4 +276,6 @@ int qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc,
> size_t size,
> Error **errp);
>
> +void qio_channel_socket_keep_nonblock(QIOChannel *ioc);
> +
> #endif /* QIO_CHANNEL_SOCKET_H */
> diff --git a/io/channel-socket.c b/io/channel-socket.c
> index 3b7ca924ff..cd93d7f180 100644
> --- a/io/channel-socket.c
> +++ b/io/channel-socket.c
> @@ -462,9 +462,16 @@ static void qio_channel_socket_finalize(Object *obj)
> }
>
>
> +void qio_channel_socket_keep_nonblock(QIOChannel *ioc)
> +{
> + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
> + sioc->keep_nonblock = true;
> +}
> +
> +
> #ifndef WIN32
> static void qio_channel_socket_copy_fds(struct msghdr *msg,
> - int **fds, size_t *nfds)
> + int **fds, size_t *nfds, bool set_block)
> {
> struct cmsghdr *cmsg;
>
> @@ -497,8 +504,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg,
> continue;
> }
>
> - /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
> - qemu_socket_set_block(fd);
> + if (set_block) {
> + qemu_socket_set_block(fd);
> + }
"keep_nonblock" as a feature in iochannel is slightly hard to digest. It
can also be read as "keep the fd to be always nonblocking".
Is this feature required, or can this also be done in a get() or
post_load() on the other side to set nonblock to whatever it should be
(that dest QEMU should be aware of)?
>
> #ifndef MSG_CMSG_CLOEXEC
> qemu_set_cloexec(fd);
> @@ -556,7 +564,7 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
> }
>
> if (fds && nfds) {
> - qio_channel_socket_copy_fds(&msg, fds, nfds);
> + qio_channel_socket_copy_fds(&msg, fds, nfds, !sioc->keep_nonblock);
> }
>
> return ret;
> --
> 2.48.1
>
>
--
Peter Xu
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-20 13:27 ` Peter Xu
@ 2025-08-20 13:43 ` Daniel P. Berrangé
2025-08-20 14:37 ` Peter Xu
2025-08-21 12:07 ` Vladimir Sementsov-Ogievskiy
0 siblings, 2 replies; 108+ messages in thread
From: Daniel P. Berrangé @ 2025-08-20 13:43 UTC (permalink / raw)
To: Peter Xu
Cc: Vladimir Sementsov-Ogievskiy, mst, farosas, raphael, sgarzare,
marcandre.lureau, pbonzini, kwolf, hreitz, eblake, armbru,
qemu-devel, qemu-block, steven.sistare, den-plotnikov
On Wed, Aug 20, 2025 at 09:27:09AM -0400, Peter Xu wrote:
> On Wed, Aug 13, 2025 at 07:48:47PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > Add a possibility to keep socket non-block status when passing
> > through qio channel. We need this to support migration of open
> > fds through migration channel.
> >
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> > ---
> > include/io/channel-socket.h | 3 +++
> > io/channel-socket.c | 16 ++++++++++++----
> > 2 files changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
> > index a88cf8b3a9..0a4327d745 100644
> > --- a/include/io/channel-socket.h
> > +++ b/include/io/channel-socket.h
> > @@ -49,6 +49,7 @@ struct QIOChannelSocket {
> > socklen_t remoteAddrLen;
> > ssize_t zero_copy_queued;
> > ssize_t zero_copy_sent;
> > + bool keep_nonblock;
> > };
> >
> >
> > @@ -275,4 +276,6 @@ int qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc,
> > size_t size,
> > Error **errp);
> >
> > +void qio_channel_socket_keep_nonblock(QIOChannel *ioc);
> > +
> > #endif /* QIO_CHANNEL_SOCKET_H */
> > diff --git a/io/channel-socket.c b/io/channel-socket.c
> > index 3b7ca924ff..cd93d7f180 100644
> > --- a/io/channel-socket.c
> > +++ b/io/channel-socket.c
> > @@ -462,9 +462,16 @@ static void qio_channel_socket_finalize(Object *obj)
> > }
> >
> >
> > +void qio_channel_socket_keep_nonblock(QIOChannel *ioc)
> > +{
> > + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
> > + sioc->keep_nonblock = true;
> > +}
> > +
> > +
> > #ifndef WIN32
> > static void qio_channel_socket_copy_fds(struct msghdr *msg,
> > - int **fds, size_t *nfds)
> > + int **fds, size_t *nfds, bool set_block)
> > {
> > struct cmsghdr *cmsg;
> >
> > @@ -497,8 +504,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg,
> > continue;
> > }
> >
> > - /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
> > - qemu_socket_set_block(fd);
> > + if (set_block) {
> > + qemu_socket_set_block(fd);
> > + }
>
> "keep_nonblock" as a feature in iochannel is slightly hard to digest. It
> can also be read as "keep the fd to be always nonblocking".
>
> Is this feature required, or can this also be done in a get() or
> post_load() on the other side to set nonblock to whatever it should be
> (that dest QEMU should be aware of)?
Either we preserve the state of the flag when receiving the FD,
or every QEMU backend that we're receiving FDs on behalf of
needs to reset the flag when migration passes over the FD.
The latter might actually be a more robust scheme. If we're
migrating across QEMU versions, there is not a strict
guarantee that the new QEMU version's backend will want the
O_NONBLOCK flag in the same state as the old QEMU version.
The code might have been re-written to work in a different
way than previously.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-20 13:43 ` Daniel P. Berrangé
@ 2025-08-20 14:37 ` Peter Xu
2025-08-20 14:42 ` Daniel P. Berrangé
2025-08-21 12:07 ` Vladimir Sementsov-Ogievskiy
1 sibling, 1 reply; 108+ messages in thread
From: Peter Xu @ 2025-08-20 14:37 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: Vladimir Sementsov-Ogievskiy, mst, farosas, raphael, sgarzare,
marcandre.lureau, pbonzini, kwolf, hreitz, eblake, armbru,
qemu-devel, qemu-block, steven.sistare, den-plotnikov
On Wed, Aug 20, 2025 at 02:43:54PM +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 20, 2025 at 09:27:09AM -0400, Peter Xu wrote:
> > On Wed, Aug 13, 2025 at 07:48:47PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > Add a possibility to keep socket non-block status when passing
> > > through qio channel. We need this to support migration of open
> > > fds through migration channel.
> > >
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> > > ---
> > > include/io/channel-socket.h | 3 +++
> > > io/channel-socket.c | 16 ++++++++++++----
> > > 2 files changed, 15 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
> > > index a88cf8b3a9..0a4327d745 100644
> > > --- a/include/io/channel-socket.h
> > > +++ b/include/io/channel-socket.h
> > > @@ -49,6 +49,7 @@ struct QIOChannelSocket {
> > > socklen_t remoteAddrLen;
> > > ssize_t zero_copy_queued;
> > > ssize_t zero_copy_sent;
> > > + bool keep_nonblock;
> > > };
> > >
> > >
> > > @@ -275,4 +276,6 @@ int qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc,
> > > size_t size,
> > > Error **errp);
> > >
> > > +void qio_channel_socket_keep_nonblock(QIOChannel *ioc);
> > > +
> > > #endif /* QIO_CHANNEL_SOCKET_H */
> > > diff --git a/io/channel-socket.c b/io/channel-socket.c
> > > index 3b7ca924ff..cd93d7f180 100644
> > > --- a/io/channel-socket.c
> > > +++ b/io/channel-socket.c
> > > @@ -462,9 +462,16 @@ static void qio_channel_socket_finalize(Object *obj)
> > > }
> > >
> > >
> > > +void qio_channel_socket_keep_nonblock(QIOChannel *ioc)
> > > +{
> > > + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
> > > + sioc->keep_nonblock = true;
> > > +}
> > > +
> > > +
> > > #ifndef WIN32
> > > static void qio_channel_socket_copy_fds(struct msghdr *msg,
> > > - int **fds, size_t *nfds)
> > > + int **fds, size_t *nfds, bool set_block)
> > > {
> > > struct cmsghdr *cmsg;
> > >
> > > @@ -497,8 +504,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg,
> > > continue;
> > > }
> > >
> > > - /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
> > > - qemu_socket_set_block(fd);
> > > + if (set_block) {
> > > + qemu_socket_set_block(fd);
> > > + }
> >
> > "keep_nonblock" as a feature in iochannel is slightly hard to digest. It
> > can also be read as "keep the fd to be always nonblocking".
> >
> > Is this feature required, or can this also be done in a get() or
> > post_load() on the other side to set nonblock to whatever it should be
> > (that dest QEMU should be aware of)?
>
> Either we preserve state of the flag when receiving the FD,
> or every QEMU backend that we're receiving FDs on behalf of
> needs to reset the flag when migration passes over the FD.
>
> The latter might actually be a more robust scheme. If we're
> migrating across QEMU versions, there is not a strict
> guarantee that the new QEMU version's backend will want the
> O_NONBLOCK flag in the same state as the old QEMU version.
> The code might have been re-written to work in a different
> way than previously.
Good point.
Do you remember why we reset that in the very initial git commit?
Thanks,
--
Peter Xu
* Re: [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-20 14:37 ` Peter Xu
@ 2025-08-20 14:42 ` Daniel P. Berrangé
0 siblings, 0 replies; 108+ messages in thread
From: Daniel P. Berrangé @ 2025-08-20 14:42 UTC (permalink / raw)
To: Peter Xu
Cc: Vladimir Sementsov-Ogievskiy, mst, farosas, raphael, sgarzare,
marcandre.lureau, pbonzini, kwolf, hreitz, eblake, armbru,
qemu-devel, qemu-block, steven.sistare, den-plotnikov
On Wed, Aug 20, 2025 at 10:37:41AM -0400, Peter Xu wrote:
> On Wed, Aug 20, 2025 at 02:43:54PM +0100, Daniel P. Berrangé wrote:
> > On Wed, Aug 20, 2025 at 09:27:09AM -0400, Peter Xu wrote:
> > > On Wed, Aug 13, 2025 at 07:48:47PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > Add a possibility to keep socket non-block status when passing
> > > > through qio channel. We need this to support migration of open
> > > > fds through migration channel.
> > > >
> > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> > > > ---
> > > > include/io/channel-socket.h | 3 +++
> > > > io/channel-socket.c | 16 ++++++++++++----
> > > > 2 files changed, 15 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
> > > > index a88cf8b3a9..0a4327d745 100644
> > > > --- a/include/io/channel-socket.h
> > > > +++ b/include/io/channel-socket.h
> > > > @@ -49,6 +49,7 @@ struct QIOChannelSocket {
> > > > socklen_t remoteAddrLen;
> > > > ssize_t zero_copy_queued;
> > > > ssize_t zero_copy_sent;
> > > > + bool keep_nonblock;
> > > > };
> > > >
> > > >
> > > > @@ -275,4 +276,6 @@ int qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc,
> > > > size_t size,
> > > > Error **errp);
> > > >
> > > > +void qio_channel_socket_keep_nonblock(QIOChannel *ioc);
> > > > +
> > > > #endif /* QIO_CHANNEL_SOCKET_H */
> > > > diff --git a/io/channel-socket.c b/io/channel-socket.c
> > > > index 3b7ca924ff..cd93d7f180 100644
> > > > --- a/io/channel-socket.c
> > > > +++ b/io/channel-socket.c
> > > > @@ -462,9 +462,16 @@ static void qio_channel_socket_finalize(Object *obj)
> > > > }
> > > >
> > > >
> > > > +void qio_channel_socket_keep_nonblock(QIOChannel *ioc)
> > > > +{
> > > > + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
> > > > + sioc->keep_nonblock = true;
> > > > +}
> > > > +
> > > > +
> > > > #ifndef WIN32
> > > > static void qio_channel_socket_copy_fds(struct msghdr *msg,
> > > > - int **fds, size_t *nfds)
> > > > + int **fds, size_t *nfds, bool set_block)
> > > > {
> > > > struct cmsghdr *cmsg;
> > > >
> > > > @@ -497,8 +504,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg,
> > > > continue;
> > > > }
> > > >
> > > > - /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
> > > > - qemu_socket_set_block(fd);
> > > > + if (set_block) {
> > > > + qemu_socket_set_block(fd);
> > > > + }
> > >
> > > "keep_nonblock" as a feature in iochannel is slightly hard to digest. It
> > > can also be read as "keep the fd to be always nonblocking".
> > >
> > > Is this feature required, or can this also be done in a get() or
> > > post_load() on the other side to set nonblock to whatever it should be
> > > (that dest QEMU should be aware of)?
> >
> > Either we preserve state of the flag when receiving the FD,
> > or every QEMU backend that we're receiving FDs on behalf of
> > needs to reset the flag when migration passes over the FD.
> >
> > The latter might actually be a more robust scheme. If we're
> > migrating across QEMU versions, there is not a strict
> > guarantee that the new QEMU version's backend will want the
> > O_NONBLOCK flag in the same state as the old QEMU version.
> > The code might have been re-written to work in a different
> > way than previously.
>
> Good point.
>
> Do you remember why we reset that in the very initial git commit?
Historically, the need to receive FDs in QEMU has been in relation to external
non-QEMU processes. We reset the blocking state to ensure that all FDs
we receive have a well-defined initial state, and the QEMU backends
consuming the FDs can then alter that as required.
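The reset Daniel describes amounts to clearing O_NONBLOCK on each received fd so it matches the default state of a freshly opened socket. A minimal standalone sketch of that normalization, using plain fcntl() as a stand-in for QEMU's qemu_socket_set_block():

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Reset a received fd to blocking mode, mirroring what
 * qio_channel_socket_copy_fds() does today via qemu_socket_set_block():
 * clear O_NONBLOCK so the fd looks like one QEMU opened itself. */
static void fd_set_blocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    assert(flags >= 0);
    assert(fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == 0);
}

/* Returns 1 if O_NONBLOCK is currently set on fd. */
static int fd_is_nonblocking(int fd)
{
    return (fcntl(fd, F_GETFL) & O_NONBLOCK) != 0;
}
```

Exercised on a pipe: set O_NONBLOCK, call fd_set_blocking(), and the flag is gone — the normalization the current code applies unconditionally to every passed-in fd.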
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-20 13:43 ` Daniel P. Berrangé
2025-08-20 14:37 ` Peter Xu
@ 2025-08-21 12:07 ` Vladimir Sementsov-Ogievskiy
2025-08-21 13:45 ` Peter Xu
1 sibling, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-21 12:07 UTC (permalink / raw)
To: Daniel P. Berrangé, Peter Xu
Cc: mst, farosas, raphael, sgarzare, marcandre.lureau, pbonzini,
kwolf, hreitz, eblake, armbru, qemu-devel, qemu-block,
steven.sistare, den-plotnikov
On 20.08.25 16:43, Daniel P. Berrangé wrote:
> On Wed, Aug 20, 2025 at 09:27:09AM -0400, Peter Xu wrote:
>> On Wed, Aug 13, 2025 at 07:48:47PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>> Add a possibility to keep socket non-block status when passing
>>> through qio channel. We need this to support migration of open
>>> fds through migration channel.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>> ---
>>> include/io/channel-socket.h | 3 +++
>>> io/channel-socket.c | 16 ++++++++++++----
>>> 2 files changed, 15 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
>>> index a88cf8b3a9..0a4327d745 100644
>>> --- a/include/io/channel-socket.h
>>> +++ b/include/io/channel-socket.h
>>> @@ -49,6 +49,7 @@ struct QIOChannelSocket {
>>> socklen_t remoteAddrLen;
>>> ssize_t zero_copy_queued;
>>> ssize_t zero_copy_sent;
>>> + bool keep_nonblock;
>>> };
>>>
>>>
>>> @@ -275,4 +276,6 @@ int qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc,
>>> size_t size,
>>> Error **errp);
>>>
>>> +void qio_channel_socket_keep_nonblock(QIOChannel *ioc);
>>> +
>>> #endif /* QIO_CHANNEL_SOCKET_H */
>>> diff --git a/io/channel-socket.c b/io/channel-socket.c
>>> index 3b7ca924ff..cd93d7f180 100644
>>> --- a/io/channel-socket.c
>>> +++ b/io/channel-socket.c
>>> @@ -462,9 +462,16 @@ static void qio_channel_socket_finalize(Object *obj)
>>> }
>>>
>>>
>>> +void qio_channel_socket_keep_nonblock(QIOChannel *ioc)
>>> +{
>>> + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
>>> + sioc->keep_nonblock = true;
>>> +}
>>> +
>>> +
>>> #ifndef WIN32
>>> static void qio_channel_socket_copy_fds(struct msghdr *msg,
>>> - int **fds, size_t *nfds)
>>> + int **fds, size_t *nfds, bool set_block)
>>> {
>>> struct cmsghdr *cmsg;
>>>
>>> @@ -497,8 +504,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg,
>>> continue;
>>> }
>>>
>>> - /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
>>> - qemu_socket_set_block(fd);
>>> + if (set_block) {
>>> + qemu_socket_set_block(fd);
>>> + }
>>
>> "keep_nonblock" as a feature in iochannel is slightly hard to digest. It
>> can also be read as "keep the fd to be always nonblocking".
>>
>> Is this feature required, or can this also be done in a get() or
>> post_load() on the other side to set nonblock to whatever it should be
>> (that dest QEMU should be aware of)?
>
> Either we preserve state of the flag when receiving the FD,
> or every QEMU backend that we're receiving FDs on behalf of
> needs to reset the flag when migration passes over the FD.
>
> The latter might actually be a more robust scheme. If we're
> migrating across QEMU versions, there is not a strict
> guarantee that the new QEMU version's backend will want the
> O_NONBLOCK flag in the same state as the old QEMU version.
> The code might have been re-written to work in a different
> way than previously.
>
What I dislike about the approach where we always reset to blocking, then set non-blocking again where needed:
1. Extra fcntl calls for nothing (I think, in most cases, for fds passed through migration stream(s) we'll want to keep the fd as is)
2. When we reset to blocking on the target, it's visible on the source and may break things.
In this series it probably doesn't really matter, as by the time we get the descriptor on the target, it should no longer be used on the source.
But, for example, in CPR-transfer, where descriptors are passed in a preliminary stage and the source is still running and using the descriptors, we shouldn't change the non-blocking status of the fd on the target. Probably CPR-transfer for now only works with fds which are blocking, so we don't have a problem.
So I think the better default is to preserve the state of the flag for fds passed through the migration stream. Backends may modify it if needed (though in most cases, I think, they will not need to).
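Point 2 follows from fd-passing semantics: an fd received via SCM_RIGHTS refers to the same open file description as the sender's fd, and file status flags such as O_NONBLOCK live on that description, not on the descriptor. A minimal demonstration, using dup() as a local stand-in for SCM_RIGHTS passing:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Show that clearing O_NONBLOCK on the "target's" copy of an fd is
 * visible through the "source's" fd, because both descriptors share
 * one open file description (dup() here models SCM_RIGHTS transfer). */
static int flag_change_is_shared(void)
{
    int p[2];
    assert(pipe(p) == 0);

    int src = p[0];
    int tgt = dup(src);            /* target's copy of the same description */
    assert(tgt >= 0);

    /* Source runs with a non-blocking fd... */
    assert(fcntl(src, F_SETFL, fcntl(src, F_GETFL) | O_NONBLOCK) == 0);

    /* ...target "resets to blocking", as the current code does... */
    assert(fcntl(tgt, F_SETFL, fcntl(tgt, F_GETFL) & ~O_NONBLOCK) == 0);

    /* ...and the source observes the change. */
    int shared = !(fcntl(src, F_GETFL) & O_NONBLOCK);

    close(src);
    close(tgt);
    close(p[1]);
    return shared;
}
```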
--
Best regards,
Vladimir
* Re: [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-21 12:07 ` Vladimir Sementsov-Ogievskiy
@ 2025-08-21 13:45 ` Peter Xu
2025-08-21 14:11 ` Daniel P. Berrangé
0 siblings, 1 reply; 108+ messages in thread
From: Peter Xu @ 2025-08-21 13:45 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: Daniel P. Berrangé, mst, farosas, raphael, sgarzare,
marcandre.lureau, pbonzini, kwolf, hreitz, eblake, armbru,
qemu-devel, qemu-block, steven.sistare, den-plotnikov
On Thu, Aug 21, 2025 at 03:07:57PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> What I dislike in the way, when we reset to blocking always, and set
> non-blocking again where needed:
>
> 1. Extra fcntl calls for nothing (I think actually, in most cases, for
> fds passed through migration stream(s) we'll want to keep fd as is)
>
> 2. When we reset to blocking on target, it's visible on source and may
> break things.
>
> In this series it probably doesn't really matter, as at the time when
> we get the descriptor on target, it should not be used anymore on source.
>
> But for example, in CPR-transfer, where descriptors are passed in the
> preliminary stage, and source is running and use the descriptors, we
> shouldn't change the non-blocking status of fd on target. Probably,
> CPR-transfer for now only works with fds which are blocking, so we don't
> have a problem.
>
> So, I think, that better default is preserve state of the flag for fds
> passed through migration stream. And backends may modify it if needed (I
> think, in most cases - they will not need).
I agree having that as a default iochannel behavior is questionable.
If any fd-passing protocol was defined to make the nonblocking
status predictable (in this case, fds will always be blocking), IMHO it
should be done in the protocol layer by consistently setting or clearing
the NONBLOCK flag before sending the fds, rather than having it silently
processed in iochannel internals.
Do we know how many existing users are relying on such behavior? I wonder if
we could still push this operation to the protocols that will need it, then
we can avoid this slightly awkward iochannel feature flag, because the
iochannel user will simply always have full control.
--
Peter Xu
* Re: [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-21 13:45 ` Peter Xu
@ 2025-08-21 14:11 ` Daniel P. Berrangé
0 siblings, 0 replies; 108+ messages in thread
From: Daniel P. Berrangé @ 2025-08-21 14:11 UTC (permalink / raw)
To: Peter Xu
Cc: Vladimir Sementsov-Ogievskiy, mst, farosas, raphael, sgarzare,
marcandre.lureau, pbonzini, kwolf, hreitz, eblake, armbru,
qemu-devel, qemu-block, steven.sistare, den-plotnikov
On Thu, Aug 21, 2025 at 09:45:36AM -0400, Peter Xu wrote:
> On Thu, Aug 21, 2025 at 03:07:57PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > What I dislike in the way, when we reset to blocking always, and set
> > non-blocking again where needed:
> >
> > 1. Extra fcntl calls for nothing (I think actually, in most cases, for
> > fds passed through migration stream(s) we'll want to keep fd as is)
> >
> > 2. When we reset to blocking on target, it's visible on source and may
> > break things.
> >
> > In this series it probably doesn't really matter, as at the time when
> > we get the descriptor on target, it should not be used anymore on source.
> >
> > But for example, in CPR-transfer, where descriptors are passed in the
> > preliminary stage, and source is running and use the descriptors, we
> > shouldn't change the non-blocking status of fd on target. Probably,
> > CPR-transfer for now only works with fds which are blocking, so we don't
> > have a problem.
> >
> > So, I think, that better default is preserve state of the flag for fds
> > passed through migration stream. And backends may modify it if needed (I
> > think, in most cases - they will not need).
>
> I agree having that as a default iochannel behavior is questionable.
>
> If it was defined for any fd-passing protocols making sure nonblocking
> status is predictable (in this case, fds will be always blocking), IMHO it
> should be done in the protocol layer either constantly setting or clearing
> NONBLOCK flag before sending the fds, rather than having it silently
> processed in iochannel internals.
We explicitly did not want to rely on the senders to do this, as the
sender generally does not know what the receiving QEMU wants to do with
the FDs.
Clearing the non-blocking flag was chosen to make the default state of
passed-in FDs identical to the default state of FDs that QEMU opens
itself. This removes the possibility of bugs we've seen in the past,
where code assumed an FD's blocking flag was in the default state you
get from 'socket()' and then broke when receiving a pre-opened FD
into QEMU in a different state. Basically anything using SocketAddress
should be getting an FD in the same state whether opened by QEMU or
passed in.
> Do we know how many existing users are relying such behavior? I wonder if
> we could still push this operation to the protocols that will need it, then
> we can avoid this slightly awkward iochannel feature flag, because the
> iochannel user will simply always have full control.
Primarily this is relevant to the monitor QMP/HMP, which in turns
makes it is relevant to UNIX chardevs. There are a number of other
areas in QEMU calling qio_channel_readv_full() with a non-NULL fds
argument though.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-13 16:48 ` [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock() Vladimir Sementsov-Ogievskiy
2025-08-20 13:27 ` Peter Xu
@ 2025-08-20 13:37 ` Daniel P. Berrangé
2025-08-21 12:08 ` Vladimir Sementsov-Ogievskiy
1 sibling, 1 reply; 108+ messages in thread
From: Daniel P. Berrangé @ 2025-08-20 13:37 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, eblake, armbru, qemu-devel, qemu-block,
steven.sistare, den-plotnikov
On Wed, Aug 13, 2025 at 07:48:47PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Add a possibility to keep socket non-block status when passing
> through qio channel. We need this to support migration of open
> fds through migration channel.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> include/io/channel-socket.h | 3 +++
> io/channel-socket.c | 16 ++++++++++++----
> 2 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
> index a88cf8b3a9..0a4327d745 100644
> --- a/include/io/channel-socket.h
> +++ b/include/io/channel-socket.h
> @@ -49,6 +49,7 @@ struct QIOChannelSocket {
> socklen_t remoteAddrLen;
> ssize_t zero_copy_queued;
> ssize_t zero_copy_sent;
> + bool keep_nonblock;
> };
>
>
> @@ -275,4 +276,6 @@ int qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc,
> size_t size,
> Error **errp);
>
> +void qio_channel_socket_keep_nonblock(QIOChannel *ioc);
> +
> #endif /* QIO_CHANNEL_SOCKET_H */
> diff --git a/io/channel-socket.c b/io/channel-socket.c
> index 3b7ca924ff..cd93d7f180 100644
> --- a/io/channel-socket.c
> +++ b/io/channel-socket.c
> @@ -462,9 +462,16 @@ static void qio_channel_socket_finalize(Object *obj)
> }
>
>
> +void qio_channel_socket_keep_nonblock(QIOChannel *ioc)
> +{
> + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
> + sioc->keep_nonblock = true;
> +}
> +
> +
> #ifndef WIN32
> static void qio_channel_socket_copy_fds(struct msghdr *msg,
> - int **fds, size_t *nfds)
> + int **fds, size_t *nfds, bool set_block)
> {
> struct cmsghdr *cmsg;
>
> @@ -497,8 +504,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg,
> continue;
> }
>
> - /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
> - qemu_socket_set_block(fd);
> + if (set_block) {
> + qemu_socket_set_block(fd);
> + }
>
> #ifndef MSG_CMSG_CLOEXEC
> qemu_set_cloexec(fd);
> @@ -556,7 +564,7 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
> }
>
> if (fds && nfds) {
> - qio_channel_socket_copy_fds(&msg, fds, nfds);
> + qio_channel_socket_copy_fds(&msg, fds, nfds, !sioc->keep_nonblock);
> }
>
> return ret;
If this is needed, then it should be done by defining another flag
constant to be passed to qio_channel_read*, not via a new API
QIO_CHANNEL_READ_FLAG_FD_PRESERVE_NONBLOCKING
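A rough sketch of what Daniel's flag-based alternative might look like: the caller of a qio_channel_read*-style function passes a per-call flag instead of setting sticky per-channel state. The names below follow the suggestion but are illustrative, not QEMU's actual API, and the toy copy_fds() stands in for qio_channel_socket_copy_fds():

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical per-call read flag (illustrative value, not QEMU's). */
#define QIO_CHANNEL_READ_FLAG_FD_PRESERVE_NONBLOCKING (1 << 0)

/* Toy model of the fd-copying path: normalize received fds to
 * blocking unless the caller asked to preserve their state. */
static void copy_fds(const int *fds, size_t nfds, int read_flags)
{
    for (size_t i = 0; i < nfds; i++) {
        if (!(read_flags & QIO_CHANNEL_READ_FLAG_FD_PRESERVE_NONBLOCKING)) {
            /* default behavior: reset to blocking, as today */
            int fl = fcntl(fds[i], F_GETFL);
            assert(fcntl(fds[i], F_SETFL, fl & ~O_NONBLOCK) == 0);
        }
    }
}
```

With this shape, migration's main channel would pass the flag on its reads while every other caller keeps today's behavior, and no state is stored on the channel object.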
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock()
2025-08-20 13:37 ` Daniel P. Berrangé
@ 2025-08-21 12:08 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-21 12:08 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, eblake, armbru, qemu-devel, qemu-block,
steven.sistare, den-plotnikov
On 20.08.25 16:37, Daniel P. Berrangé wrote:
> On Wed, Aug 13, 2025 at 07:48:47PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> Add a possibility to keep socket non-block status when passing
>> through qio channel. We need this to support migration of open
>> fds through migration channel.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> include/io/channel-socket.h | 3 +++
>> io/channel-socket.c | 16 ++++++++++++----
>> 2 files changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
>> index a88cf8b3a9..0a4327d745 100644
>> --- a/include/io/channel-socket.h
>> +++ b/include/io/channel-socket.h
>> @@ -49,6 +49,7 @@ struct QIOChannelSocket {
>> socklen_t remoteAddrLen;
>> ssize_t zero_copy_queued;
>> ssize_t zero_copy_sent;
>> + bool keep_nonblock;
>> };
>>
>>
>> @@ -275,4 +276,6 @@ int qio_channel_socket_set_send_buffer(QIOChannelSocket *ioc,
>> size_t size,
>> Error **errp);
>>
>> +void qio_channel_socket_keep_nonblock(QIOChannel *ioc);
>> +
>> #endif /* QIO_CHANNEL_SOCKET_H */
>> diff --git a/io/channel-socket.c b/io/channel-socket.c
>> index 3b7ca924ff..cd93d7f180 100644
>> --- a/io/channel-socket.c
>> +++ b/io/channel-socket.c
>> @@ -462,9 +462,16 @@ static void qio_channel_socket_finalize(Object *obj)
>> }
>>
>>
>> +void qio_channel_socket_keep_nonblock(QIOChannel *ioc)
>> +{
>> + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
>> + sioc->keep_nonblock = true;
>> +}
>> +
>> +
>> #ifndef WIN32
>> static void qio_channel_socket_copy_fds(struct msghdr *msg,
>> - int **fds, size_t *nfds)
>> + int **fds, size_t *nfds, bool set_block)
>> {
>> struct cmsghdr *cmsg;
>>
>> @@ -497,8 +504,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg,
>> continue;
>> }
>>
>> - /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
>> - qemu_socket_set_block(fd);
>> + if (set_block) {
>> + qemu_socket_set_block(fd);
>> + }
>>
>> #ifndef MSG_CMSG_CLOEXEC
>> qemu_set_cloexec(fd);
>> @@ -556,7 +564,7 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
>> }
>>
>> if (fds && nfds) {
>> - qio_channel_socket_copy_fds(&msg, fds, nfds);
>> + qio_channel_socket_copy_fds(&msg, fds, nfds, !sioc->keep_nonblock);
>> }
>>
>> return ret;
>
> If this is needed, then it should be done by defining another flag
> constant to be passed to qio_channel_read*, not via a new API
>
> QIO_CHANNEL_READ_FLAG_FD_PRESERVE_NONBLOCKING
>
Ok, thanks, will do.
--
Best regards,
Vladimir
* [PATCH 27/33] migration/socket: keep fds non-block
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (25 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 26/33] io/channel-socket: introduce qio_channel_socket_keep_nonblock() Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-08-20 13:30 ` Peter Xu
2025-08-13 16:48 ` [PATCH 28/33] vhost: introduce backend migration Vladimir Sementsov-Ogievskiy
` (6 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
For the migration channel, keep the fds' non-blocking property as is.
This is needed for future local migration of fds.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
migration/socket.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/migration/socket.c b/migration/socket.c
index 5ec65b8c03..9f7b6919cf 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -129,6 +129,7 @@ static void socket_accept_incoming_migration(QIONetListener *listener,
}
qio_channel_set_name(QIO_CHANNEL(cioc), "migration-socket-incoming");
+ qio_channel_socket_keep_nonblock(QIO_CHANNEL(cioc));
migration_channel_process_incoming(QIO_CHANNEL(cioc));
}
--
2.48.1
* Re: [PATCH 27/33] migration/socket: keep fds non-block
2025-08-13 16:48 ` [PATCH 27/33] migration/socket: keep fds non-block Vladimir Sementsov-Ogievskiy
@ 2025-08-20 13:30 ` Peter Xu
2025-08-21 12:15 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Peter Xu @ 2025-08-20 13:30 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, farosas, raphael, sgarzare, marcandre.lureau, pbonzini,
kwolf, hreitz, berrange, eblake, armbru, qemu-devel, qemu-block,
steven.sistare, den-plotnikov
On Wed, Aug 13, 2025 at 07:48:48PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> For migration channel keep fds non-blocking property as is.
> It's needed for future local migration of fds.
It is pretty risky. This changes the attribute for all the iochannels that
migration incoming side uses, including multifd / postcopy / ...
I left comment in previous patch as a pure question trying to understand
whether the feature is needed. If it is, here it might still be good to:
- Above the line add a comment explaining why
- Only apply it to whatever channel that matters. In this case, IIUC
only the main channel matters
Thanks,
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> migration/socket.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/migration/socket.c b/migration/socket.c
> index 5ec65b8c03..9f7b6919cf 100644
> --- a/migration/socket.c
> +++ b/migration/socket.c
> @@ -129,6 +129,7 @@ static void socket_accept_incoming_migration(QIONetListener *listener,
> }
>
> qio_channel_set_name(QIO_CHANNEL(cioc), "migration-socket-incoming");
> + qio_channel_socket_keep_nonblock(QIO_CHANNEL(cioc));
> migration_channel_process_incoming(QIO_CHANNEL(cioc));
> }
>
> --
> 2.48.1
>
--
Peter Xu
* Re: [PATCH 27/33] migration/socket: keep fds non-block
2025-08-20 13:30 ` Peter Xu
@ 2025-08-21 12:15 ` Vladimir Sementsov-Ogievskiy
2025-08-21 13:49 ` Peter Xu
0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-21 12:15 UTC (permalink / raw)
To: Peter Xu
Cc: mst, farosas, raphael, sgarzare, marcandre.lureau, pbonzini,
kwolf, hreitz, berrange, eblake, armbru, qemu-devel, qemu-block,
steven.sistare, den-plotnikov
On 20.08.25 16:30, Peter Xu wrote:
> On Wed, Aug 13, 2025 at 07:48:48PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> For migration channel keep fds non-blocking property as is.
>> It's needed for future local migration of fds.
>
> It is pretty risky. This changes the attribute for all the iochannels that
> migration incoming side uses, including multifd / postcopy / ...
But for now nobody (except CPR-transfer) really passes fds through migration,
and for CPR-transfer it's obviously better to preserve the state by default (see
my answer to the previous patch).
So I think we are at a point where we can choose a good default and
document it.
>
> I left comment in previous patch as a pure question trying to understand
> whether the feature is needed. If it is, here it might still be good to:
>
> - Above the line add a comment explaning why
> - Only apply it to whatever channel that matters. In this case, IIUC
> only the main channel matters
>
I still think that preserving the non-blocking flag "as is" is a good default for migration;
please look at my answer to the previous patch. However, in this series I can adapt
to any approach.
>
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> migration/socket.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/migration/socket.c b/migration/socket.c
>> index 5ec65b8c03..9f7b6919cf 100644
>> --- a/migration/socket.c
>> +++ b/migration/socket.c
>> @@ -129,6 +129,7 @@ static void socket_accept_incoming_migration(QIONetListener *listener,
>> }
>>
>> qio_channel_set_name(QIO_CHANNEL(cioc), "migration-socket-incoming");
>> + qio_channel_socket_keep_nonblock(QIO_CHANNEL(cioc));
>> migration_channel_process_incoming(QIO_CHANNEL(cioc));
>> }
>>
>> --
>> 2.48.1
>>
>
--
Best regards,
Vladimir
* Re: [PATCH 27/33] migration/socket: keep fds non-block
2025-08-21 12:15 ` Vladimir Sementsov-Ogievskiy
@ 2025-08-21 13:49 ` Peter Xu
0 siblings, 0 replies; 108+ messages in thread
From: Peter Xu @ 2025-08-21 13:49 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, farosas, raphael, sgarzare, marcandre.lureau, pbonzini,
kwolf, hreitz, berrange, eblake, armbru, qemu-devel, qemu-block,
steven.sistare, den-plotnikov
On Thu, Aug 21, 2025 at 03:15:24PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 20.08.25 16:30, Peter Xu wrote:
> > On Wed, Aug 13, 2025 at 07:48:48PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > For migration channel keep fds non-blocking property as is.
> > > It's needed for future local migration of fds.
> >
> > It is pretty risky. This changes the attribute for all the iochannels that
> > migration incoming side uses, including multifd / postcopy / ...
>
> But for now nobody (except CPR-transfer) really passes fds through migration,
> and for CPR-transfer it's obviously better to preserve the state by default (see
> my answer in previous patch).
>
> So I think we are at a point where we can choose a good default, and
> document it.
>
> >
> > I left comment in previous patch as a pure question trying to understand
> > whether the feature is needed. If it is, here it might still be good to:
> >
> > - Above the line add a comment explaning why
> > - Only apply it to whatever channel that matters. In this case, IIUC
> > only the main channel matters
> >
>
> I still think that preserving non-blocking flag "as is" is good default for migration,
> please look at my answer to the previous patch. However, in this series I can adapt
> to any approach.
I also commented in the previous patch; let's see whether we can make it
not only the default for migration, but the default for iochannels (hence,
any chance to drop the new feature flag completely).
If that won't fly, I think this is fine. In that case, please explicitly
mention that it's intentional to change all iochannels that migration uses
in the commit message, and we can also add a comment inline explaining why
we set this default for all migration channels.
Thanks,
--
Peter Xu
* [PATCH 28/33] vhost: introduce backend migration
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (26 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 27/33] migration/socket: keep fds non-block Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:09 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 29/33] vhost-user: support " Vladimir Sementsov-Ogievskiy
` (5 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Normally on migration we stop and destroy the connection with the
vhost backend (a vhost-user-blk server, or kernel vhost) on the source
and reinitialize it on the target.
With this commit we start to implement vhost backend migration,
i.e. we don't stop the connection and operation of vhost. Instead,
we pass the backend-related state, including open file descriptors,
to the target process. Of course, this is possible only for local
migration, and the migration channel must be a unix socket.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost.c | 184 +++++++++++++++++++++++++-----
include/hw/virtio/vhost-backend.h | 5 +
include/hw/virtio/vhost.h | 6 +
3 files changed, 167 insertions(+), 28 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 0427fc29b2..80371a2653 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -26,8 +26,10 @@
#include "hw/mem/memory-device.h"
#include "migration/blocker.h"
#include "migration/qemu-file-types.h"
+#include "migration/qemu-file.h"
#include "system/dma.h"
#include "trace.h"
+#include <stdint.h>
/* enabled until disconnected backend stabilizes */
#define _VHOST_DEBUG 1
@@ -1321,6 +1323,8 @@ out:
return ret;
}
+static void vhost_virtqueue_error_notifier(EventNotifier *n);
+
int vhost_virtqueue_start(struct vhost_dev *dev,
struct VirtIODevice *vdev,
struct vhost_virtqueue *vq,
@@ -1346,7 +1350,17 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
return r;
}
- vq->num = state.num = virtio_queue_get_num(vdev, idx);
+ vq->num = virtio_queue_get_num(vdev, idx);
+
+ if (dev->migrating_backend) {
+ if (dev->vhost_ops->vhost_set_vring_err) {
+ event_notifier_set_handler(&vq->error_notifier,
+ vhost_virtqueue_error_notifier);
+ }
+ return 0;
+ }
+
+ state.num = vq->num;
r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
if (r) {
VHOST_OPS_DEBUG(r, "vhost_set_vring_num failed");
@@ -1424,6 +1438,10 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
trace_vhost_virtque_stop(vdev->name, idx);
+ if (dev->migrating_backend) {
+ return 0;
+ }
+
if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
/* Don't stop the virtqueue which might have not been started */
return 0;
@@ -1514,7 +1532,15 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
struct vhost_vring_file file = {
.index = vhost_vq_index,
};
- int r = event_notifier_init(&vq->masked_notifier, 0);
+ int r;
+
+ vq->dev = dev;
+
+ if (dev->migrating_backend) {
+ return 0;
+ }
+
+ r = event_notifier_init(&vq->masked_notifier, 0);
if (r < 0) {
return r;
}
@@ -1526,8 +1552,6 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
goto fail_call;
}
- vq->dev = dev;
-
if (dev->vhost_ops->vhost_set_vring_err) {
r = event_notifier_init(&vq->error_notifier, 0);
if (r < 0) {
@@ -1564,10 +1588,14 @@ fail_call:
static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
{
- event_notifier_cleanup(&vq->masked_notifier);
+ if (!vq->dev->migrating_backend) {
+ event_notifier_cleanup(&vq->masked_notifier);
+ }
if (vq->dev->vhost_ops->vhost_set_vring_err) {
event_notifier_set_handler(&vq->error_notifier, NULL);
- event_notifier_cleanup(&vq->error_notifier);
+ if (!vq->dev->migrating_backend) {
+ event_notifier_cleanup(&vq->error_notifier);
+ }
}
}
@@ -1624,21 +1652,30 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
r = vhost_set_backend_type(hdev, backend_type);
assert(r >= 0);
- r = hdev->vhost_ops->vhost_backend_init(hdev, opaque, errp);
- if (r < 0) {
- goto fail;
+ if (hdev->migrating_backend) {
+ /* backend must support detached state */
+ assert(hdev->vhost_ops->vhost_save_backend);
+ assert(hdev->vhost_ops->vhost_load_backend);
+ hdev->_features_wait_incoming = true;
}
- r = hdev->vhost_ops->vhost_set_owner(hdev);
+ r = hdev->vhost_ops->vhost_backend_init(hdev, opaque, errp);
if (r < 0) {
- error_setg_errno(errp, -r, "vhost_set_owner failed");
goto fail;
}
- r = hdev->vhost_ops->vhost_get_features(hdev, &hdev->_features);
- if (r < 0) {
- error_setg_errno(errp, -r, "vhost_get_features failed");
- goto fail;
+ if (!hdev->migrating_backend) {
+ r = hdev->vhost_ops->vhost_set_owner(hdev);
+ if (r < 0) {
+ error_setg_errno(errp, -r, "vhost_set_owner failed");
+ goto fail;
+ }
+
+ r = hdev->vhost_ops->vhost_get_features(hdev, &hdev->_features);
+ if (r < 0) {
+ error_setg_errno(errp, -r, "vhost_get_features failed");
+ goto fail;
+ }
}
for (i = 0; i < hdev->nvqs; ++i, ++n_initialized_vqs) {
@@ -1670,7 +1707,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
.region_del = vhost_iommu_region_del,
};
- if (hdev->migration_blocker == NULL) {
+ if (!hdev->migrating_backend && hdev->migration_blocker == NULL) {
if (!vhost_dev_has_feature(hdev, VHOST_F_LOG_ALL)) {
error_setg(&hdev->migration_blocker,
"Migration disabled: vhost lacks VHOST_F_LOG_ALL feature.");
@@ -1697,7 +1734,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
memory_listener_register(&hdev->memory_listener, &address_space_memory);
QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
- if (!check_memslots(hdev, errp)) {
+ if (!hdev->migrating_backend && !check_memslots(hdev, errp)) {
r = -EINVAL;
goto fail;
}
@@ -1765,8 +1802,11 @@ void vhost_dev_disable_notifiers_nvqs(struct vhost_dev *hdev,
*/
memory_region_transaction_commit();
- for (i = 0; i < nvqs; ++i) {
- virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i);
+ if (!hdev->migrating_backend) {
+ for (i = 0; i < nvqs; ++i) {
+ virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus),
+ hdev->vq_index + i);
+ }
}
virtio_device_release_ioeventfd(vdev);
}
@@ -1920,6 +1960,12 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
uint64_t features)
{
const int *bit = feature_bits;
+
+ if (hdev->_features_wait_incoming) {
+ /* Excessive set is enough for early initialization. */
+ return features;
+ }
+
while (*bit != VHOST_INVALID_FEATURE_BIT) {
uint64_t bit_mask = (1ULL << *bit);
if (!vhost_dev_has_feature(hdev, *bit)) {
@@ -1930,6 +1976,66 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
return features;
}
+void vhost_save_backend(struct vhost_dev *hdev, QEMUFile *f)
+{
+ int i;
+
+ assert(hdev->migrating_backend);
+
+ if (hdev->vhost_ops->vhost_save_backend) {
+ hdev->vhost_ops->vhost_save_backend(hdev, f);
+ }
+
+ qemu_put_be64(f, hdev->_features);
+ qemu_put_be64(f, hdev->max_queues);
+ qemu_put_be64(f, hdev->nvqs);
+
+ for (i = 0; i < hdev->nvqs; i++) {
+ qemu_file_put_fd(f,
+ event_notifier_get_fd(&hdev->vqs[i].error_notifier));
+ qemu_file_put_fd(f,
+ event_notifier_get_fd(&hdev->vqs[i].masked_notifier));
+ }
+}
+
+int vhost_load_backend(struct vhost_dev *hdev, QEMUFile *f)
+{
+ int i;
+ Error *err = NULL;
+ uint64_t nvqs;
+
+ assert(hdev->migrating_backend);
+
+ if (hdev->vhost_ops->vhost_load_backend) {
+ hdev->vhost_ops->vhost_load_backend(hdev, f);
+ }
+
+ qemu_get_be64s(f, &hdev->_features);
+ qemu_get_be64s(f, &hdev->max_queues);
+ qemu_get_be64s(f, &nvqs);
+
+ if (nvqs != hdev->nvqs) {
+ error_report("%s: number of virt queues mismatch", __func__);
+ return -EINVAL;
+ }
+
+ for (i = 0; i < hdev->nvqs; i++) {
+ event_notifier_init_fd(&hdev->vqs[i].error_notifier,
+ qemu_file_get_fd(f));
+ event_notifier_init_fd(&hdev->vqs[i].masked_notifier,
+ qemu_file_get_fd(f));
+ }
+
+ if (!check_memslots(hdev, &err)) {
+ error_report_err(err);
+ return -EINVAL;
+ }
+
+ hdev->_features_wait_incoming = false;
+
+ return 0;
+}
+
void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
uint64_t features)
{
@@ -2075,19 +2181,24 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
hdev->started = true;
hdev->vdev = vdev;
- r = vhost_dev_set_features(hdev, hdev->log_enabled);
- if (r < 0) {
- goto fail_features;
+ if (!hdev->migrating_backend) {
+ r = vhost_dev_set_features(hdev, hdev->log_enabled);
+ if (r < 0) {
+ warn_report("%s %d", __func__, __LINE__);
+ goto fail_features;
+ }
}
if (vhost_dev_has_iommu(hdev)) {
memory_listener_register(&hdev->iommu_listener, vdev->dma_as);
}
- r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
- if (r < 0) {
- VHOST_OPS_DEBUG(r, "vhost_set_mem_table failed");
- goto fail_mem;
+ if (!hdev->migrating_backend) {
+ r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_mem_table failed");
+ goto fail_mem;
+ }
}
for (i = 0; i < hdev->nvqs; ++i) {
r = vhost_virtqueue_start(hdev,
@@ -2127,7 +2238,7 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
}
vhost_dev_elect_mem_logger(hdev, true);
}
- if (vrings) {
+ if (vrings && !hdev->migrating_backend) {
r = vhost_dev_set_vring_enable(hdev, true);
if (r) {
goto fail_log;
@@ -2155,6 +2266,8 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
}
vhost_start_config_intr(hdev);
+ hdev->migrating_backend = false;
+
trace_vhost_dev_start_finish(vdev->name);
return 0;
fail_iotlb:
@@ -2204,14 +2317,29 @@ static int do_vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev,
event_notifier_cleanup(
&hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
+ if (hdev->migrating_backend) {
+ /* backend must support detached state */
+ assert(hdev->vhost_ops->vhost_save_backend);
+ assert(hdev->vhost_ops->vhost_load_backend);
+ }
+
trace_vhost_dev_stop(hdev, vdev->name, vrings);
if (hdev->vhost_ops->vhost_dev_start) {
hdev->vhost_ops->vhost_dev_start(hdev, false);
}
- if (vrings) {
+ if (vrings && !hdev->migrating_backend) {
vhost_dev_set_vring_enable(hdev, false);
}
+
+ if (hdev->migrating_backend) {
+ for (i = 0; i < hdev->nvqs; ++i) {
+ struct vhost_virtqueue *vq = hdev->vqs + i;
+
+ event_notifier_set_handler(&vq->error_notifier, NULL);
+ }
+ }
+
for (i = 0; i < hdev->nvqs; ++i) {
rc |= do_vhost_virtqueue_stop(hdev,
vdev,
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index 0785fc764d..66627c6a56 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -163,6 +163,9 @@ typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error **errp);
typedef void (*vhost_qmp_status_op)(struct vhost_dev *dev, VhostStatus *status);
+typedef void (*vhost_detached_save_op)(struct vhost_dev *dev, QEMUFile *f);
+typedef int (*vhost_detached_load_op)(struct vhost_dev *dev, QEMUFile *f);
+
typedef struct VhostOps {
VhostBackendType backend_type;
vhost_backend_init vhost_backend_init;
@@ -219,6 +222,8 @@ typedef struct VhostOps {
vhost_set_device_state_fd_op vhost_set_device_state_fd;
vhost_check_device_state_op vhost_check_device_state;
vhost_qmp_status_op vhost_qmp_status;
+ vhost_detached_save_op vhost_save_backend;
+ vhost_detached_load_op vhost_load_backend;
} VhostOps;
int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 8a4c8c3502..330374aca2 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -103,6 +103,10 @@ struct vhost_dev {
* @acked_features: final negotiated features with front-end driver
*/
uint64_t _features;
+ bool _features_wait_incoming;
+
+ bool migrating_backend;
+
uint64_t acked_features;
uint64_t max_queues;
@@ -318,6 +322,8 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
*/
uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
uint64_t features);
+void vhost_save_backend(struct vhost_dev *hdev, QEMUFile *f);
+int vhost_load_backend(struct vhost_dev *hdev, QEMUFile *f);
/**
* vhost_ack_features() - set vhost acked_features
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 28/33] vhost: introduce backend migration
2025-08-13 16:48 ` [PATCH 28/33] vhost: introduce backend migration Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:09 ` Raphael Norwitz
2025-10-09 20:51 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:09 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
A few suggestions here. Overall, it looks sane to me.
On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Normally on migration we stop and destroy the connection with the
> vhost backend (a vhost-user-blk server, or kernel vhost) on the source
> and reinitialize it on the target.
>
> With this commit we start to implement vhost backend migration,
> i.e. we don't stop the connection and operation of vhost. Instead,
> we pass the backend-related state, including open file descriptors,
> to the target process. Of course, this is possible only for local
> migration, and the migration channel must be a unix socket.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost.c | 184 +++++++++++++++++++++++++-----
> include/hw/virtio/vhost-backend.h | 5 +
> include/hw/virtio/vhost.h | 6 +
> 3 files changed, 167 insertions(+), 28 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 0427fc29b2..80371a2653 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -26,8 +26,10 @@
> #include "hw/mem/memory-device.h"
> #include "migration/blocker.h"
> #include "migration/qemu-file-types.h"
> +#include "migration/qemu-file.h"
> #include "system/dma.h"
> #include "trace.h"
> +#include <stdint.h>
>
> /* enabled until disconnected backend stabilizes */
> #define _VHOST_DEBUG 1
> @@ -1321,6 +1323,8 @@ out:
> return ret;
> }
>
> +static void vhost_virtqueue_error_notifier(EventNotifier *n);
> +
> int vhost_virtqueue_start(struct vhost_dev *dev,
> struct VirtIODevice *vdev,
> struct vhost_virtqueue *vq,
> @@ -1346,7 +1350,17 @@ int vhost_virtqueue_start(struct vhost_dev *dev,
> return r;
> }
>
> - vq->num = state.num = virtio_queue_get_num(vdev, idx);
> + vq->num = virtio_queue_get_num(vdev, idx);
> +
> + if (dev->migrating_backend) {
> + if (dev->vhost_ops->vhost_set_vring_err) {
> + event_notifier_set_handler(&vq->error_notifier,
> + vhost_virtqueue_error_notifier);
> + }
> + return 0;
> + }
> +
> + state.num = vq->num;
> r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
> if (r) {
> VHOST_OPS_DEBUG(r, "vhost_set_vring_num failed");
> @@ -1424,6 +1438,10 @@ static int do_vhost_virtqueue_stop(struct vhost_dev *dev,
>
> trace_vhost_virtque_stop(vdev->name, idx);
>
> + if (dev->migrating_backend) {
> + return 0;
> + }
> +
> if (virtio_queue_get_desc_addr(vdev, idx) == 0) {
> /* Don't stop the virtqueue which might have not been started */
> return 0;
> @@ -1514,7 +1532,15 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
> struct vhost_vring_file file = {
> .index = vhost_vq_index,
> };
> - int r = event_notifier_init(&vq->masked_notifier, 0);
> + int r;
> +
> + vq->dev = dev;
> +
> + if (dev->migrating_backend) {
> + return 0;
> + }
> +
> + r = event_notifier_init(&vq->masked_notifier, 0);
> if (r < 0) {
> return r;
> }
> @@ -1526,8 +1552,6 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
> goto fail_call;
> }
>
> - vq->dev = dev;
> -
> if (dev->vhost_ops->vhost_set_vring_err) {
> r = event_notifier_init(&vq->error_notifier, 0);
> if (r < 0) {
> @@ -1564,10 +1588,14 @@ fail_call:
>
> static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
> {
> - event_notifier_cleanup(&vq->masked_notifier);
> + if (!vq->dev->migrating_backend) {
> + event_notifier_cleanup(&vq->masked_notifier);
> + }
> if (vq->dev->vhost_ops->vhost_set_vring_err) {
> event_notifier_set_handler(&vq->error_notifier, NULL);
> - event_notifier_cleanup(&vq->error_notifier);
> + if (!vq->dev->migrating_backend) {
> + event_notifier_cleanup(&vq->error_notifier);
> + }
> }
> }
>
> @@ -1624,21 +1652,30 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> r = vhost_set_backend_type(hdev, backend_type);
> assert(r >= 0);
>
> - r = hdev->vhost_ops->vhost_backend_init(hdev, opaque, errp);
> - if (r < 0) {
> - goto fail;
> + if (hdev->migrating_backend) {
> + /* backend must support detached state */
Probably better to error_report() or something other than a raw assert?
> + assert(hdev->vhost_ops->vhost_save_backend);
> + assert(hdev->vhost_ops->vhost_load_backend);
> + hdev->_features_wait_incoming = true;
> }
>
> - r = hdev->vhost_ops->vhost_set_owner(hdev);
> + r = hdev->vhost_ops->vhost_backend_init(hdev, opaque, errp);
> if (r < 0) {
> - error_setg_errno(errp, -r, "vhost_set_owner failed");
> goto fail;
> }
>
> - r = hdev->vhost_ops->vhost_get_features(hdev, &hdev->_features);
> - if (r < 0) {
> - error_setg_errno(errp, -r, "vhost_get_features failed");
> - goto fail;
> + if (!hdev->migrating_backend) {
> + r = hdev->vhost_ops->vhost_set_owner(hdev);
> + if (r < 0) {
> + error_setg_errno(errp, -r, "vhost_set_owner failed");
> + goto fail;
> + }
> +
> + r = hdev->vhost_ops->vhost_get_features(hdev, &hdev->_features);
> + if (r < 0) {
> + error_setg_errno(errp, -r, "vhost_get_features failed");
> + goto fail;
> + }
> }
>
> for (i = 0; i < hdev->nvqs; ++i, ++n_initialized_vqs) {
> @@ -1670,7 +1707,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> .region_del = vhost_iommu_region_del,
> };
>
> - if (hdev->migration_blocker == NULL) {
> + if (!hdev->migrating_backend && hdev->migration_blocker == NULL) {
> if (!vhost_dev_has_feature(hdev, VHOST_F_LOG_ALL)) {
> error_setg(&hdev->migration_blocker,
> "Migration disabled: vhost lacks VHOST_F_LOG_ALL feature.");
> @@ -1697,7 +1734,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> memory_listener_register(&hdev->memory_listener, &address_space_memory);
> QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
>
> - if (!check_memslots(hdev, errp)) {
> + if (!hdev->migrating_backend && !check_memslots(hdev, errp)) {
> r = -EINVAL;
> goto fail;
> }
> @@ -1765,8 +1802,11 @@ void vhost_dev_disable_notifiers_nvqs(struct vhost_dev *hdev,
> */
> memory_region_transaction_commit();
>
> - for (i = 0; i < nvqs; ++i) {
> - virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i);
> + if (!hdev->migrating_backend) {
> + for (i = 0; i < nvqs; ++i) {
> + virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus),
> + hdev->vq_index + i);
> + }
> }
> virtio_device_release_ioeventfd(vdev);
> }
> @@ -1920,6 +1960,12 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
> uint64_t features)
> {
> const int *bit = feature_bits;
> +
Should this be
if (hdev->_features_wait_incoming && hdev->migrating_backend) {
to not impact existing flows?
> + if (hdev->_features_wait_incoming) {
> + /* Excessive set is enough for early initialization. */
> + return features;
> + }
> +
> while (*bit != VHOST_INVALID_FEATURE_BIT) {
> uint64_t bit_mask = (1ULL << *bit);
> if (!vhost_dev_has_feature(hdev, *bit)) {
> @@ -1930,6 +1976,66 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
> return features;
> }
>
> +void vhost_save_backend(struct vhost_dev *hdev, QEMUFile *f)
> +{
> + int i;
> +
> + assert(hdev->migrating_backend);
> +
> + if (hdev->vhost_ops->vhost_save_backend) {
> + hdev->vhost_ops->vhost_save_backend(hdev, f);
> + }
> +
> + qemu_put_be64(f, hdev->_features);
> + qemu_put_be64(f, hdev->max_queues);
> + qemu_put_be64(f, hdev->nvqs);
> +
> + for (i = 0; i < hdev->nvqs; i++) {
> + qemu_file_put_fd(f,
> + event_notifier_get_fd(&hdev->vqs[i].error_notifier));
> + qemu_file_put_fd(f,
> + event_notifier_get_fd(&hdev->vqs[i].masked_notifier));
> + }
> +}
> +
> +int vhost_load_backend(struct vhost_dev *hdev, QEMUFile *f)
> +{
> + int i;
> + Error *err = NULL;
> + uint64_t nvqs;
> +
> + assert(hdev->migrating_backend);
> +
> + if (hdev->vhost_ops->vhost_load_backend) {
> + hdev->vhost_ops->vhost_load_backend(hdev, f);
> + }
> +
> + qemu_get_be64s(f, &hdev->_features);
> + qemu_get_be64s(f, &hdev->max_queues);
> + qemu_get_be64s(f, &nvqs);
> +
> + if (nvqs != hdev->nvqs) {
> + error_report("%s: number of virt queues mismatch", __func__);
> + return -EINVAL;
> + }
> +
> + for (i = 0; i < hdev->nvqs; i++) {
> + event_notifier_init_fd(&hdev->vqs[i].error_notifier,
> + qemu_file_get_fd(f));
> + event_notifier_init_fd(&hdev->vqs[i].masked_notifier,
> + qemu_file_get_fd(f));
> + }
> +
> + if (!check_memslots(hdev, &err)) {
> + error_report_err(err);
> + return -EINVAL;
> + }
> +
> + hdev->_features_wait_incoming = false;
> +
> + return 0;
> +}
> +
> void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
> uint64_t features)
> {
> @@ -2075,19 +2181,24 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> hdev->started = true;
> hdev->vdev = vdev;
>
> - r = vhost_dev_set_features(hdev, hdev->log_enabled);
> - if (r < 0) {
> - goto fail_features;
> + if (!hdev->migrating_backend) {
> + r = vhost_dev_set_features(hdev, hdev->log_enabled);
> + if (r < 0) {
> + warn_report("%s %d", __func__, __LINE__);
> + goto fail_features;
> + }
> }
>
> if (vhost_dev_has_iommu(hdev)) {
> memory_listener_register(&hdev->iommu_listener, vdev->dma_as);
> }
>
> - r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
> - if (r < 0) {
> - VHOST_OPS_DEBUG(r, "vhost_set_mem_table failed");
> - goto fail_mem;
> + if (!hdev->migrating_backend) {
> + r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
> + if (r < 0) {
> + VHOST_OPS_DEBUG(r, "vhost_set_mem_table failed");
> + goto fail_mem;
> + }
> }
> for (i = 0; i < hdev->nvqs; ++i) {
> r = vhost_virtqueue_start(hdev,
> @@ -2127,7 +2238,7 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> }
> vhost_dev_elect_mem_logger(hdev, true);
> }
> - if (vrings) {
> + if (vrings && !hdev->migrating_backend) {
> r = vhost_dev_set_vring_enable(hdev, true);
> if (r) {
> goto fail_log;
> @@ -2155,6 +2266,8 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> }
> vhost_start_config_intr(hdev);
>
> + hdev->migrating_backend = false;
> +
> trace_vhost_dev_start_finish(vdev->name);
> return 0;
> fail_iotlb:
> @@ -2204,14 +2317,29 @@ static int do_vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev,
> event_notifier_cleanup(
> &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
>
> + if (hdev->migrating_backend) {
Ditto - no raw assert()?
> + /* backend must support detached state */
> + assert(hdev->vhost_ops->vhost_save_backend);
> + assert(hdev->vhost_ops->vhost_load_backend);
> + }
> +
> trace_vhost_dev_stop(hdev, vdev->name, vrings);
>
> if (hdev->vhost_ops->vhost_dev_start) {
> hdev->vhost_ops->vhost_dev_start(hdev, false);
> }
> - if (vrings) {
> + if (vrings && !hdev->migrating_backend) {
> vhost_dev_set_vring_enable(hdev, false);
> }
> +
> + if (hdev->migrating_backend) {
> + for (i = 0; i < hdev->nvqs; ++i) {
> + struct vhost_virtqueue *vq = hdev->vqs + i;
> +
> + event_notifier_set_handler(&vq->error_notifier, NULL);
> + }
> + }
> +
> for (i = 0; i < hdev->nvqs; ++i) {
> rc |= do_vhost_virtqueue_stop(hdev,
> vdev,
> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> index 0785fc764d..66627c6a56 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -163,6 +163,9 @@ typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
> typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error **errp);
> typedef void (*vhost_qmp_status_op)(struct vhost_dev *dev, VhostStatus *status);
>
> +typedef void (*vhost_detached_save_op)(struct vhost_dev *dev, QEMUFile *f);
> +typedef int (*vhost_detached_load_op)(struct vhost_dev *dev, QEMUFile *f);
> +
> typedef struct VhostOps {
> VhostBackendType backend_type;
> vhost_backend_init vhost_backend_init;
> @@ -219,6 +222,8 @@ typedef struct VhostOps {
> vhost_set_device_state_fd_op vhost_set_device_state_fd;
> vhost_check_device_state_op vhost_check_device_state;
> vhost_qmp_status_op vhost_qmp_status;
> + vhost_detached_save_op vhost_save_backend;
> + vhost_detached_load_op vhost_load_backend;
> } VhostOps;
>
> int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 8a4c8c3502..330374aca2 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -103,6 +103,10 @@ struct vhost_dev {
> * @acked_features: final negotiated features with front-end driver
> */
> uint64_t _features;
> + bool _features_wait_incoming;
> +
> + bool migrating_backend;
> +
> uint64_t acked_features;
>
> uint64_t max_queues;
> @@ -318,6 +322,8 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
> */
> uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
> uint64_t features);
> +void vhost_save_backend(struct vhost_dev *hdev, QEMUFile *f);
> +int vhost_load_backend(struct vhost_dev *hdev, QEMUFile *f);
>
> /**
> * vhost_ack_features() - set vhost acked_features
> --
> 2.48.1
>
>
* Re: [PATCH 28/33] vhost: introduce backend migration
2025-10-09 19:09 ` Raphael Norwitz
@ 2025-10-09 20:51 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 20:51 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 22:09, Raphael Norwitz wrote:
> A few suggestions here. Overall, it looks sane to me.
First, my apologies, I should have said this earlier:
I'm preparing a v2, and starting from this patch it's significantly
reworked (the big refactoring part before it (01-23, 25) is still
relevant).
I have a parallel series for similar migration of virtio-net/tap
(TAP device fds are migrated through a UNIX socket); there were
a lot of discussions, and the ideas apply to the vhost-user-blk series
as well.
The main change in v2 is a significantly simplified interface:
the whole feature is enabled/disabled by one migration parameter,
with no need for per-device options. But this requires additional
changes in the code, as we have to postpone backend initialization
(chardev opening and the initial communication with the vhost server)
until the point in time when we know whether we are going to get the
fds from the migration channel or not.
Next, the migration part was reworked into VMSD structures instead of
.save()/.load() handlers.
Now my job is to go through the comments and understand how
much they apply to the upcoming v2.
>
> On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
[..]
>>
>> @@ -1624,21 +1652,30 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>> r = vhost_set_backend_type(hdev, backend_type);
>> assert(r >= 0);
>>
>> - r = hdev->vhost_ops->vhost_backend_init(hdev, opaque, errp);
>> - if (r < 0) {
>> - goto fail;
>> + if (hdev->migrating_backend) {
>> + /* backend must support detached state */
>
> Probably better to error_report() or something other than a raw assert?
An assert is better, as this is not possible. Anyway, there are no such handlers in v2.
>
>> + assert(hdev->vhost_ops->vhost_save_backend);
>> + assert(hdev->vhost_ops->vhost_load_backend);
>> + hdev->_features_wait_incoming = true;
>> }
>>
>> - r = hdev->vhost_ops->vhost_set_owner(hdev);
>> + r = hdev->vhost_ops->vhost_backend_init(hdev, opaque, errp);
>> if (r < 0) {
>> - error_setg_errno(errp, -r, "vhost_set_owner failed");
[..]
>> @@ -1920,6 +1960,12 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>> uint64_t features)
>> {
>> const int *bit = feature_bits;
>> +
>
> Should this be
>
> if (hdev->_features_wait_incoming && hdev->migrating_backend) {
>
> to not impact existing flows?
This code is still in v2.
But _features_wait_incoming is a new field introduced with this commit,
so there are no existing flows with it.
And in v2, _features_wait_incoming and migrating_backend are less
connected. The initialization code in v2 doesn't rely on .migrating_backend
(as we just don't know yet :), while the stop()/start() code will still
rely on .migrating_backend.
>
>> + if (hdev->_features_wait_incoming) {
>> + /* Excessive set is enough for early initialization. */
>> + return features;
>> + }
>> +
>> while (*bit != VHOST_INVALID_FEATURE_BIT) {
>> uint64_t bit_mask = (1ULL << *bit);
>> if (!vhost_dev_has_feature(hdev, *bit)) {
>> @@ -1930,6 +1976,66 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>> return features;
>> }
>>
[..]
>> @@ -2204,14 +2317,29 @@ static int do_vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev,
>> event_notifier_cleanup(
>> &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
>>
>> + if (hdev->migrating_backend) {
>
> Ditto - no raw assert()?
No handlers, no problems (in v2 :). Still, I'm sure the assert was good here, as we never
set migrating_backend for devices which don't support it.
>
>
>> + /* backend must support detached state */
>> + assert(hdev->vhost_ops->vhost_save_backend);
>> + assert(hdev->vhost_ops->vhost_load_backend);
>> + }
>> +
>> trace_vhost_dev_stop(hdev, vdev->name, vrings);
>>
>> if (hdev->vhost_ops->vhost_dev_start) {
>> hdev->vhost_ops->vhost_dev_start(hdev, false);
>> }
>> - if (vrings) {
>> + if (vrings && !hdev->migrating_backend) {
>> vhost_dev_set_vring_enable(hdev, false);
>> }
>> +
>> + if (hdev->migrating_backend) {
>> + for (i = 0; i < hdev->nvqs; ++i) {
>> + struct vhost_virtqueue *vq = hdev->vqs + i;
>> +
>> + event_notifier_set_handler(&vq->error_notifier, NULL);
>> + }
>> + }
>> +
>> for (i = 0; i < hdev->nvqs; ++i) {
>> rc |= do_vhost_virtqueue_stop(hdev,
>> vdev,
--
Best regards,
Vladimir
* [PATCH 29/33] vhost-user: support backend migration
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (27 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 28/33] vhost: introduce backend migration Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:09 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 30/33] virtio: support vhost " Vladimir Sementsov-Ogievskiy
` (4 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
In case of local backend migration, skip the backend-related
initialization and instead get the state from the migration
channel (including the secondary channel's file descriptor).
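The state saved and restored here is framed as fixed-width big-endian fields
(qemu_put_be64()/qemu_put_be32() and their get counterparts), with the fds
carried out-of-band over the unix socket. A tiny sketch of that fixed-width
encoding, purely for illustration (not QEMU's QEMUFile code):

```c
#include <assert.h>
#include <stdint.h>

/* Encode a 64-bit value big-endian, most significant byte first,
 * as qemu_put_be64() does on the migration stream. */
static void put_be64(uint8_t *buf, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        buf[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* Decode the matching big-endian representation. */
static uint64_t get_be64(const uint8_t *buf)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | buf[i];
    }
    return v;
}
```

Because both ends agree on field widths and byte order, the loader below can
read protocol_features, memory_slots, and the backend-channel flag in the exact
order the saver wrote them.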
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/vhost-user.c | 62 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 3979582975..f220af270e 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -28,6 +28,8 @@
#include "system/runstate.h"
#include "system/cryptodev.h"
#include "migration/postcopy-ram.h"
+#include "migration/qemu-file-types.h"
+#include "migration/qemu-file.h"
#include "trace.h"
#include "system/ramblock.h"
@@ -2273,6 +2275,10 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
u->dev = dev;
dev->opaque = u;
+ if (dev->migrating_backend) {
+ goto out;
+ }
+
err = vhost_user_get_features(dev, &features);
if (err < 0) {
error_setg_errno(errp, -err, "vhost_backend_init failed");
@@ -2387,6 +2393,7 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
}
}
+out:
u->postcopy_notifier.notify = vhost_user_postcopy_notifier;
postcopy_add_notifier(&u->postcopy_notifier);
@@ -2936,6 +2943,10 @@ void vhost_user_async_close(DeviceState *d,
static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
{
+ if (dev->migrating_backend) {
+ return 0;
+ }
+
if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_STATUS)) {
return 0;
}
@@ -3105,6 +3116,55 @@ static void vhost_user_qmp_status(struct vhost_dev *dev, VhostStatus *status)
status->protocol_features = qmp_decode_protocols(u->protocol_features);
}
+static void vhost_user_save(struct vhost_dev *dev, QEMUFile *f)
+{
+ struct vhost_user *u = dev->opaque;
+ bool has_backend_channel = !!u->backend_sioc;
+ qemu_put_be64(f, u->protocol_features);
+ qemu_put_be32(f, u->user->memory_slots);
+
+ qemu_put_byte(f, has_backend_channel);
+ if (u->backend_sioc) {
+ qemu_file_put_fd(f, u->backend_sioc->fd);
+ }
+}
+
+static int vhost_user_load(struct vhost_dev *dev, QEMUFile *f)
+{
+ struct vhost_user *u = dev->opaque;
+ uint8_t has_backend_channel;
+ uint32_t memory_slots;
+
+ qemu_get_be64s(f, &u->protocol_features);
+ qemu_get_be32s(f, &memory_slots);
+ qemu_get_8s(f, &has_backend_channel);
+
+ u->user->memory_slots = memory_slots;
+
+ if (has_backend_channel) {
+ int fd = qemu_file_get_fd(f);
+ Error *local_err = NULL;
+
+ u->backend_sioc = qio_channel_socket_new_fd(fd, &local_err);
+ if (!u->backend_sioc) {
+ error_report_err(local_err);
+ return -ECONNREFUSED;
+ }
+ u->backend_src = qio_channel_add_watch_source(
+ QIO_CHANNEL(u->backend_sioc), G_IO_IN | G_IO_HUP,
+ backend_read, dev, NULL, NULL);
+ }
+
+ if (dev->migration_blocker == NULL &&
+ !vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_LOG_SHMFD)) {
+ error_setg(&dev->migration_blocker,
+ "Migration disabled: vhost-user backend lacks "
+ "VHOST_USER_PROTOCOL_F_LOG_SHMFD feature.");
+ }
+
+ return 0;
+}
+
const VhostOps user_ops = {
.backend_type = VHOST_BACKEND_TYPE_USER,
.vhost_backend_init = vhost_user_backend_init,
@@ -3146,4 +3206,6 @@ const VhostOps user_ops = {
.vhost_set_device_state_fd = vhost_user_set_device_state_fd,
.vhost_check_device_state = vhost_user_check_device_state,
.vhost_qmp_status = vhost_user_qmp_status,
+ .vhost_save_backend = vhost_user_save,
+ .vhost_load_backend = vhost_user_load,
};
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 29/33] vhost-user: support backend migration
2025-08-13 16:48 ` [PATCH 29/33] vhost-user: support " Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:09 ` Raphael Norwitz
2025-10-09 20:54 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:09 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Just a naming nit.
On Wed, Aug 13, 2025 at 12:54 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> In the case of local backend migration, skip backend-related
> initialization and instead get the state from the migration
> channel (including the secondary channel file descriptor).
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/vhost-user.c | 62 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 62 insertions(+)
>
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 3979582975..f220af270e 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -28,6 +28,8 @@
> #include "system/runstate.h"
> #include "system/cryptodev.h"
> #include "migration/postcopy-ram.h"
> +#include "migration/qemu-file-types.h"
> +#include "migration/qemu-file.h"
> #include "trace.h"
> #include "system/ramblock.h"
>
> @@ -2273,6 +2275,10 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
> u->dev = dev;
> dev->opaque = u;
>
> + if (dev->migrating_backend) {
> + goto out;
> + }
> +
> err = vhost_user_get_features(dev, &features);
> if (err < 0) {
> error_setg_errno(errp, -err, "vhost_backend_init failed");
> @@ -2387,6 +2393,7 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
> }
> }
>
Maybe call the goto target migrating_backend_out or something else to
indicate what it's for.
> +out:
> u->postcopy_notifier.notify = vhost_user_postcopy_notifier;
> postcopy_add_notifier(&u->postcopy_notifier);
>
> @@ -2936,6 +2943,10 @@ void vhost_user_async_close(DeviceState *d,
>
> static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
> {
> + if (dev->migrating_backend) {
> + return 0;
> + }
> +
> if (!vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_STATUS)) {
> return 0;
> }
> @@ -3105,6 +3116,55 @@ static void vhost_user_qmp_status(struct vhost_dev *dev, VhostStatus *status)
> status->protocol_features = qmp_decode_protocols(u->protocol_features);
> }
>
> +static void vhost_user_save(struct vhost_dev *dev, QEMUFile *f)
> +{
> + struct vhost_user *u = dev->opaque;
> + bool has_backend_channel = !!u->backend_sioc;
> + qemu_put_be64(f, u->protocol_features);
> + qemu_put_be32(f, u->user->memory_slots);
> +
> + qemu_put_byte(f, has_backend_channel);
> + if (u->backend_sioc) {
> + qemu_file_put_fd(f, u->backend_sioc->fd);
> + }
> +}
> +
> +static int vhost_user_load(struct vhost_dev *dev, QEMUFile *f)
> +{
> + struct vhost_user *u = dev->opaque;
> + uint8_t has_backend_channel;
> + uint32_t memory_slots;
> +
> + qemu_get_be64s(f, &u->protocol_features);
> + qemu_get_be32s(f, &memory_slots);
> + qemu_get_8s(f, &has_backend_channel);
> +
> + u->user->memory_slots = memory_slots;
> +
> + if (has_backend_channel) {
> + int fd = qemu_file_get_fd(f);
> + Error *local_err = NULL;
> +
> + u->backend_sioc = qio_channel_socket_new_fd(fd, &local_err);
> + if (!u->backend_sioc) {
> + error_report_err(local_err);
> + return -ECONNREFUSED;
> + }
> + u->backend_src = qio_channel_add_watch_source(
> + QIO_CHANNEL(u->backend_sioc), G_IO_IN | G_IO_HUP,
> + backend_read, dev, NULL, NULL);
> + }
> +
> + if (dev->migration_blocker == NULL &&
> + !vhost_user_has_prot(dev, VHOST_USER_PROTOCOL_F_LOG_SHMFD)) {
> + error_setg(&dev->migration_blocker,
> + "Migration disabled: vhost-user backend lacks "
> + "VHOST_USER_PROTOCOL_F_LOG_SHMFD feature.");
> + }
> +
> + return 0;
> +}
> +
> const VhostOps user_ops = {
> .backend_type = VHOST_BACKEND_TYPE_USER,
> .vhost_backend_init = vhost_user_backend_init,
> @@ -3146,4 +3206,6 @@ const VhostOps user_ops = {
> .vhost_set_device_state_fd = vhost_user_set_device_state_fd,
> .vhost_check_device_state = vhost_user_check_device_state,
> .vhost_qmp_status = vhost_user_qmp_status,
> + .vhost_save_backend = vhost_user_save,
> + .vhost_load_backend = vhost_user_load,
> };
> --
> 2.48.1
>
>
* Re: [PATCH 29/33] vhost-user: support backend migration
2025-10-09 19:09 ` Raphael Norwitz
@ 2025-10-09 20:54 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 20:54 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 22:09, Raphael Norwitz wrote:
> Just a naming nit.
>
> On Wed, Aug 13, 2025 at 12:54 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> In the case of local backend migration, skip backend-related
>> initialization and instead get the state from the migration
>> channel (including the secondary channel file descriptor).
>>
[..]
>> @@ -2273,6 +2275,10 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
>> u->dev = dev;
>> dev->opaque = u;
>>
>> + if (dev->migrating_backend) {
>> + goto out;
>> + }
>> +
>> err = vhost_user_get_features(dev, &features);
>> if (err < 0) {
>> error_setg_errno(errp, -err, "vhost_backend_init failed");
>> @@ -2387,6 +2393,7 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
>> }
>> }
>>
>
> Maybe call the goto target migrating_backend_out or something else to
> indicate what it's for.
>
I've gotten rid of this goto in the upcoming v2.
>
>> +out:
>> u->postcopy_notifier.notify = vhost_user_postcopy_notifier;
>> postcopy_add_notifier(&u->postcopy_notifier);
>>
>> @@ -2936,6 +2943,10 @@ void vhost_user_async_close(DeviceState *d,
>>
>> static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
>> {
[..]
--
Best regards,
Vladimir
* [PATCH 30/33] virtio: support vhost backend migration
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (28 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 29/33] vhost-user: support " Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:09 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 31/33] vhost-user-blk: " Vladimir Sementsov-Ogievskiy
` (3 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Add logic to transfer the virtio notifiers through the migration
channel for the vhost backend migration case.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/virtio/virtio-bus.c | 2 +-
hw/virtio/virtio.c | 74 ++++++++++++++++++++++++++++++++++++--
include/hw/virtio/virtio.h | 2 ++
3 files changed, 75 insertions(+), 3 deletions(-)
diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index c7e3941b1e..c1848144a2 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -286,7 +286,7 @@ int virtio_bus_set_host_notifier(VirtioBusState *bus, int n, bool assign)
return -ENOSYS;
}
- if (assign) {
+ if (assign && !virtio_is_vhost_migrating_backend(vdev)) {
r = event_notifier_init(notifier, 1);
if (r < 0) {
error_report("%s: unable to init event notifier: %s (%d)",
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 10891f0e0c..87c243edad 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -26,6 +26,7 @@
#include "hw/virtio/virtio.h"
#include "hw/virtio/vhost.h"
#include "migration/qemu-file-types.h"
+#include "migration/qemu-file.h"
#include "qemu/atomic.h"
#include "hw/virtio/virtio-bus.h"
#include "hw/qdev-properties.h"
@@ -2992,6 +2993,7 @@ int virtio_save(VirtIODevice *vdev, QEMUFile *f)
VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
uint32_t guest_features_lo = (vdev->guest_features & 0xffffffff);
int i;
+ bool migrating_backend = virtio_is_vhost_migrating_backend(vdev);
if (k->save_config) {
k->save_config(qbus->parent, f);
@@ -3025,11 +3027,23 @@ int virtio_save(VirtIODevice *vdev, QEMUFile *f)
*/
qemu_put_be64(f, vdev->vq[i].vring.desc);
qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
+
+ if (migrating_backend) {
+ qemu_file_put_fd(f,
+ event_notifier_get_fd(&vdev->vq[i].host_notifier));
+ qemu_file_put_fd(
+ f, event_notifier_get_fd(&vdev->vq[i].guest_notifier));
+ }
+
if (k->save_queue) {
k->save_queue(qbus->parent, i, f);
}
}
+ if (migrating_backend) {
+ qemu_file_put_fd(f, event_notifier_get_fd(&vdev->config_notifier));
+ }
+
if (vdc->save != NULL) {
vdc->save(vdev, f);
}
@@ -3235,6 +3249,8 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+ Error *local_err = NULL;
+ bool migrating_backend = virtio_is_vhost_migrating_backend(vdev);
/*
* We poison the endianness to ensure it does not get used before
@@ -3304,6 +3320,13 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
vdev->vq[i].signalled_used_valid = false;
vdev->vq[i].notification = true;
+ if (migrating_backend) {
+ event_notifier_init_fd(&vdev->vq[i].host_notifier,
+ qemu_file_get_fd(f));
+ event_notifier_init_fd(&vdev->vq[i].guest_notifier,
+ qemu_file_get_fd(f));
+ }
+
if (!vdev->vq[i].vring.desc && vdev->vq[i].last_avail_idx) {
error_report("VQ %d address 0x0 "
"inconsistent with Host index 0x%x",
@@ -3317,6 +3340,10 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
}
}
+ if (migrating_backend) {
+ event_notifier_init_fd(&vdev->config_notifier , qemu_file_get_fd(f));
+ }
+
virtio_notify_vector(vdev, VIRTIO_NO_VECTOR);
if (vdc->load != NULL) {
@@ -3333,6 +3360,19 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
}
}
+ if (migrating_backend) {
+ /*
+ * On vhost backend migration, the device loads backend state from
+ * the migration stream, so host_features must be recomputed here.
+ */
+ vdev->host_features = vdc->get_features(vdev, vdev->host_features,
+ &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ return -EINVAL;
+ }
+ }
+
/* Subsections */
ret = vmstate_load_state(f, &vmstate_virtio, vdev, 1);
if (ret) {
@@ -3394,6 +3434,18 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
continue;
}
+ if (migrating_backend) {
+ /*
+ * Indices are not synced prior backend migration (as we don't
+ * stop vrings by GET_VRING_BASE). No reason to sync them now,
+ * and do any checks.
+ */
+ vdev->vq[i].used_idx = 0;
+ vdev->vq[i].shadow_avail_idx = 0;
+ vdev->vq[i].inuse = 0;
+ continue;
+ }
+
nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
/* Check it isn't doing strange things with descriptor numbers. */
if (nheads > vdev->vq[i].vring.num) {
@@ -3762,8 +3814,9 @@ int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
EventNotifierHandler *read_fn = is_config ?
virtio_config_guest_notifier_read :
virtio_queue_guest_notifier_read;
+ bool migrating_backend = virtio_is_vhost_migrating_backend(vdev);
- if (assign) {
+ if (assign && !migrating_backend) {
int r = event_notifier_init(notifier, 0);
if (r < 0) {
return r;
@@ -3773,7 +3826,7 @@ int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
event_notifier_set_handler(notifier,
(assign && !with_irqfd) ? read_fn : NULL);
- if (!assign) {
+ if (!assign && !migrating_backend) {
/* Test and clear notifier before closing it,*/
/* in case poll callback didn't have time to run. */
read_fn(notifier);
@@ -4392,6 +4445,23 @@ done:
return element;
}
+bool virtio_is_vhost_migrating_backend(VirtIODevice *vdev)
+{
+ VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+ struct vhost_dev *hdev;
+
+ if (!vdc->get_vhost) {
+ return false;
+ }
+
+ hdev = vdc->get_vhost(vdev);
+ if (!hdev) {
+ return false;
+ }
+
+ return hdev->migrating_backend;
+}
+
static const TypeInfo virtio_device_info = {
.name = TYPE_VIRTIO_DEVICE,
.parent = TYPE_DEVICE,
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 9a4a0a94aa..f94a7e5895 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -238,6 +238,8 @@ struct VirtioDeviceClass {
bool (*skip_vhost_migration_log)(VirtIODevice *vdev);
};
+bool virtio_is_vhost_migrating_backend(VirtIODevice *vdev);
+
void virtio_instance_init_common(Object *proxy_obj, void *data,
size_t vdev_size, const char *vdev_name);
--
2.48.1
* Re: [PATCH 30/33] virtio: support vhost backend migration
2025-08-13 16:48 ` [PATCH 30/33] virtio: support vhost " Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:09 ` Raphael Norwitz
2025-10-09 20:59 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:09 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
A few nits. Looks like there has been some churn here so this patch
will need to be rebased.
On Wed, Aug 13, 2025 at 1:01 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Add logic to transfer the virtio notifiers through the migration
> channel for the vhost backend migration case.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/virtio/virtio-bus.c | 2 +-
> hw/virtio/virtio.c | 74 ++++++++++++++++++++++++++++++++++++--
> include/hw/virtio/virtio.h | 2 ++
> 3 files changed, 75 insertions(+), 3 deletions(-)
>
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index c7e3941b1e..c1848144a2 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -286,7 +286,7 @@ int virtio_bus_set_host_notifier(VirtioBusState *bus, int n, bool assign)
> return -ENOSYS;
> }
>
> - if (assign) {
> + if (assign && !virtio_is_vhost_migrating_backend(vdev)) {
> r = event_notifier_init(notifier, 1);
> if (r < 0) {
> error_report("%s: unable to init event notifier: %s (%d)",
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 10891f0e0c..87c243edad 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -26,6 +26,7 @@
> #include "hw/virtio/virtio.h"
> #include "hw/virtio/vhost.h"
> #include "migration/qemu-file-types.h"
> +#include "migration/qemu-file.h"
> #include "qemu/atomic.h"
> #include "hw/virtio/virtio-bus.h"
> #include "hw/qdev-properties.h"
> @@ -2992,6 +2993,7 @@ int virtio_save(VirtIODevice *vdev, QEMUFile *f)
> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> uint32_t guest_features_lo = (vdev->guest_features & 0xffffffff);
> int i;
> + bool migrating_backend = virtio_is_vhost_migrating_backend(vdev);
>
> if (k->save_config) {
> k->save_config(qbus->parent, f);
> @@ -3025,11 +3027,23 @@ int virtio_save(VirtIODevice *vdev, QEMUFile *f)
> */
> qemu_put_be64(f, vdev->vq[i].vring.desc);
> qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
> +
> + if (migrating_backend) {
> + qemu_file_put_fd(f,
> + event_notifier_get_fd(&vdev->vq[i].host_notifier));
> + qemu_file_put_fd(
> + f, event_notifier_get_fd(&vdev->vq[i].guest_notifier));
> + }
> +
> if (k->save_queue) {
> k->save_queue(qbus->parent, i, f);
> }
> }
>
> + if (migrating_backend) {
> + qemu_file_put_fd(f, event_notifier_get_fd(&vdev->config_notifier));
> + }
> +
> if (vdc->save != NULL) {
> vdc->save(vdev, f);
> }
> @@ -3235,6 +3249,8 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
> BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
> VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> + Error *local_err = NULL;
> + bool migrating_backend = virtio_is_vhost_migrating_backend(vdev);
>
> /*
> * We poison the endianness to ensure it does not get used before
> @@ -3304,6 +3320,13 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
> vdev->vq[i].signalled_used_valid = false;
> vdev->vq[i].notification = true;
>
> + if (migrating_backend) {
> + event_notifier_init_fd(&vdev->vq[i].host_notifier,
> + qemu_file_get_fd(f));
> + event_notifier_init_fd(&vdev->vq[i].guest_notifier,
> + qemu_file_get_fd(f));
> + }
> +
> if (!vdev->vq[i].vring.desc && vdev->vq[i].last_avail_idx) {
> error_report("VQ %d address 0x0 "
> "inconsistent with Host index 0x%x",
> @@ -3317,6 +3340,10 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
> }
> }
>
> + if (migrating_backend) {
nit: spurious spaces after &vdev->config_notifier
> + event_notifier_init_fd(&vdev->config_notifier , qemu_file_get_fd(f));
> + }
> +
> virtio_notify_vector(vdev, VIRTIO_NO_VECTOR);
>
> if (vdc->load != NULL) {
> @@ -3333,6 +3360,19 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
> }
> }
>
> + if (migrating_backend) {
> + /*
> + * On vhost backend migration, the device loads backend state from
> + * the migration stream, so host_features must be recomputed here.
> + */
> + vdev->host_features = vdc->get_features(vdev, vdev->host_features,
> + &local_err);
> + if (local_err) {
> + error_report_err(local_err);
> + return -EINVAL;
> + }
> + }
> +
> /* Subsections */
> ret = vmstate_load_state(f, &vmstate_virtio, vdev, 1);
> if (ret) {
> @@ -3394,6 +3434,18 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
> continue;
> }
>
> + if (migrating_backend) {
> + /*
"prior to backend migration"?
> + * Indices are not synced prior backend migration (as we don't
> + * stop vrings by GET_VRING_BASE). No reason to sync them now,
> + * and do any checks.
> + */
> + vdev->vq[i].used_idx = 0;
> + vdev->vq[i].shadow_avail_idx = 0;
> + vdev->vq[i].inuse = 0;
> + continue;
> + }
> +
> nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
> /* Check it isn't doing strange things with descriptor numbers. */
> if (nheads > vdev->vq[i].vring.num) {
> @@ -3762,8 +3814,9 @@ int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
> EventNotifierHandler *read_fn = is_config ?
> virtio_config_guest_notifier_read :
> virtio_queue_guest_notifier_read;
> + bool migrating_backend = virtio_is_vhost_migrating_backend(vdev);
>
> - if (assign) {
> + if (assign && !migrating_backend) {
> int r = event_notifier_init(notifier, 0);
> if (r < 0) {
> return r;
> @@ -3773,7 +3826,7 @@ int virtio_queue_set_guest_notifier(VirtIODevice *vdev, int n, bool assign,
> event_notifier_set_handler(notifier,
> (assign && !with_irqfd) ? read_fn : NULL);
>
> - if (!assign) {
> + if (!assign && !migrating_backend) {
> /* Test and clear notifier before closing it,*/
> /* in case poll callback didn't have time to run. */
> read_fn(notifier);
> @@ -4392,6 +4445,23 @@ done:
> return element;
> }
>
> +bool virtio_is_vhost_migrating_backend(VirtIODevice *vdev)
> +{
> + VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> + struct vhost_dev *hdev;
> +
> + if (!vdc->get_vhost) {
> + return false;
> + }
> +
> + hdev = vdc->get_vhost(vdev);
> + if (!hdev) {
> + return false;
> + }
> +
> + return hdev->migrating_backend;
> +}
> +
> static const TypeInfo virtio_device_info = {
> .name = TYPE_VIRTIO_DEVICE,
> .parent = TYPE_DEVICE,
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index 9a4a0a94aa..f94a7e5895 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -238,6 +238,8 @@ struct VirtioDeviceClass {
> bool (*skip_vhost_migration_log)(VirtIODevice *vdev);
> };
>
> +bool virtio_is_vhost_migrating_backend(VirtIODevice *vdev);
> +
> void virtio_instance_init_common(Object *proxy_obj, void *data,
> size_t vdev_size, const char *vdev_name);
>
> --
> 2.48.1
>
>
* Re: [PATCH 30/33] virtio: support vhost backend migration
2025-10-09 19:09 ` Raphael Norwitz
@ 2025-10-09 20:59 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 20:59 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 22:09, Raphael Norwitz wrote:
> A few nits. Looks like there has been some churn here so this patch
> will need to be rebased.
>
> On Wed, Aug 13, 2025 at 1:01 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> Add logic to transfer the virtio notifiers through the migration
>> channel for the vhost backend migration case.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
[..]
>> @@ -3317,6 +3340,10 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
>> }
>> }
>>
>> + if (migrating_backend) {
>
> nit: spurious spaces after &vdev->config_notifier
Oh, kept in v2) Will fix.
>
>> + event_notifier_init_fd(&vdev->config_notifier , qemu_file_get_fd(f));
>> + }
>> +
>> virtio_notify_vector(vdev, VIRTIO_NO_VECTOR);
>>
>> if (vdc->load != NULL) {
[..]
>> @@ -3394,6 +3434,18 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
>> continue;
>> }
>>
>> + if (migrating_backend) {
>> + /*
>
> "prior to backend migration"?
>
Right, will fix.
>
>
>> + * Indices are not synced prior backend migration (as we don't
>> + * stop vrings by GET_VRING_BASE). No reason to sync them now,
>> + * and do any checks.
>> + */
>> + vdev->vq[i].used_idx = 0;
>> + vdev->vq[i].shadow_avail_idx = 0;
>> + vdev->vq[i].inuse = 0;
>> + continue;
>> + }
[..]
--
Best regards,
Vladimir
* [PATCH 31/33] vhost-user-blk: support vhost backend migration
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (29 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 30/33] virtio: support vhost " Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:09 ` Raphael Norwitz
2025-08-13 16:48 ` [PATCH 32/33] test/functional: exec_command_and_wait_for_pattern: add vm arg Vladimir Sementsov-Ogievskiy
` (2 subsequent siblings)
33 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Opt out of the backend initialization code and instead get the
state from the migration channel (including the inflight region).
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/block/vhost-user-blk.c | 185 +++++++++++++++++++++++------
include/hw/virtio/vhost-user-blk.h | 2 +
migration/options.c | 7 ++
migration/options.h | 1 +
qapi/migration.json | 15 ++-
5 files changed, 169 insertions(+), 41 deletions(-)
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index c8bc2c78e6..2e6ef6477e 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -17,6 +17,7 @@
*/
#include "qemu/osdep.h"
+#include "qapi-types-run-state.h"
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "qemu/cutils.h"
@@ -32,6 +33,11 @@
#include "system/system.h"
#include "system/runstate.h"
#include "trace.h"
+#include "migration/qemu-file.h"
+#include "migration/migration.h"
+#include "migration/options.h"
+#include "qemu/event_notifier.h"
+#include <sys/mman.h>
static const int user_feature_bits[] = {
VIRTIO_BLK_F_SIZE_MAX,
@@ -159,32 +165,35 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error **errp)
s->dev.acked_features = vdev->guest_features;
- ret = vhost_dev_prepare_inflight(&s->dev, vdev);
- if (ret < 0) {
- error_setg_errno(errp, -ret, "Error setting inflight format");
- goto err_guest_notifiers;
- }
-
- if (!s->inflight->addr) {
- ret = vhost_dev_get_inflight(&s->dev, s->queue_size, s->inflight);
+ if (!s->dev.migrating_backend) {
+ ret = vhost_dev_prepare_inflight(&s->dev, vdev);
if (ret < 0) {
- error_setg_errno(errp, -ret, "Error getting inflight");
+ error_setg_errno(errp, -ret, "Error setting inflight format");
goto err_guest_notifiers;
}
- }
- ret = vhost_dev_set_inflight(&s->dev, s->inflight);
- if (ret < 0) {
- error_setg_errno(errp, -ret, "Error setting inflight");
- goto err_guest_notifiers;
- }
+ if (!s->inflight->addr) {
+ ret = vhost_dev_get_inflight(&s->dev, s->queue_size, s->inflight);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "Error getting inflight");
+ goto err_guest_notifiers;
+ }
+ }
- /* guest_notifier_mask/pending not used yet, so just unmask
- * everything here. virtio-pci will do the right thing by
- * enabling/disabling irqfd.
- */
- for (i = 0; i < s->dev.nvqs; i++) {
- vhost_virtqueue_mask(&s->dev, vdev, i, false);
+ ret = vhost_dev_set_inflight(&s->dev, s->inflight);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "Error setting inflight");
+ goto err_guest_notifiers;
+ }
+
+ /*
+ * guest_notifier_mask/pending not used yet, so just unmask
+ * everything here. virtio-pci will do the right thing by
+ * enabling/disabling irqfd.
+ */
+ for (i = 0; i < s->dev.nvqs; i++) {
+ vhost_virtqueue_mask(&s->dev, vdev, i, false);
+ }
}
s->dev.vq_index_end = s->dev.nvqs;
@@ -231,6 +240,10 @@ static int vhost_user_blk_stop(VirtIODevice *vdev)
force_stop = s->skip_get_vring_base_on_force_shutdown &&
qemu_force_shutdown_requested();
+ s->dev.migrating_backend = s->dev.migrating_backend ||
+ (runstate_check(RUN_STATE_FINISH_MIGRATE) &&
+ migrate_local_vhost_user_blk());
+
ret = force_stop ? vhost_dev_force_stop(&s->dev, vdev, true) :
vhost_dev_stop(&s->dev, vdev, true);
@@ -343,7 +356,9 @@ static void vhost_user_blk_reset(VirtIODevice *vdev)
vhost_dev_free_inflight(s->inflight);
}
-static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
+static int vhost_user_blk_connect(DeviceState *dev,
+ bool migrating_backend,
+ Error **errp)
{
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
VHostUserBlk *s = VHOST_USER_BLK(vdev);
@@ -359,6 +374,7 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
s->dev.nvqs = s->num_queues;
s->dev.vqs = s->vhost_vqs;
s->dev.vq_index = 0;
+ s->dev.migrating_backend = migrating_backend;
vhost_dev_set_config_notifier(&s->dev, &blk_ops);
@@ -409,7 +425,7 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
switch (event) {
case CHR_EVENT_OPENED:
- if (vhost_user_blk_connect(dev, &local_err) < 0) {
+ if (vhost_user_blk_connect(dev, false, &local_err) < 0) {
error_report_err(local_err);
qemu_chr_fe_disconnect(&s->chardev);
return;
@@ -428,31 +444,37 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
}
}
-static int vhost_user_blk_realize_connect(VHostUserBlk *s, Error **errp)
+static int vhost_user_blk_realize_connect(VHostUserBlk *s,
+ bool migrating_backend,
+ Error **errp)
{
DeviceState *dev = DEVICE(s);
int ret;
s->connected = false;
- ret = qemu_chr_fe_wait_connected(&s->chardev, errp);
- if (ret < 0) {
- return ret;
+ if (!migrating_backend) {
+ ret = qemu_chr_fe_wait_connected(&s->chardev, errp);
+ if (ret < 0) {
+ return ret;
+ }
}
- ret = vhost_user_blk_connect(dev, errp);
+ ret = vhost_user_blk_connect(dev, migrating_backend, errp);
if (ret < 0) {
qemu_chr_fe_disconnect(&s->chardev);
return ret;
}
assert(s->connected);
- ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
- VIRTIO_DEVICE(s)->config_len, errp);
- if (ret < 0) {
- qemu_chr_fe_disconnect(&s->chardev);
- vhost_dev_cleanup(&s->dev);
- return ret;
+ if (!migrating_backend) {
+ ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
+ VIRTIO_DEVICE(s)->config_len, errp);
+ if (ret < 0) {
+ qemu_chr_fe_disconnect(&s->chardev);
+ vhost_dev_cleanup(&s->dev);
+ return ret;
+ }
}
return 0;
@@ -469,6 +491,11 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
trace_vhost_user_blk_device_realize();
+ if (s->incoming_backend && !runstate_check(RUN_STATE_INMIGRATE)) {
+ error_setg(errp, "local-incoming can be used "
+ "only for incoming migration");
+ }
+
if (!s->chardev.chr) {
error_setg(errp, "chardev is mandatory");
return;
@@ -517,7 +544,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
error_report_err(*errp);
*errp = NULL;
}
- ret = vhost_user_blk_realize_connect(s, errp);
+ ret = vhost_user_blk_realize_connect(s, s->incoming_backend, errp);
} while (ret < 0 && retries--);
if (ret < 0) {
@@ -525,9 +552,12 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
}
/* we're fully initialized, now we can operate, so add the handler */
- qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
- vhost_user_blk_event, NULL, (void *)dev,
- NULL, true);
+ if (!s->incoming_backend) {
+ qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
+ vhost_user_blk_event, NULL, (void *)dev,
+ NULL, true);
+ }
+
trace_vhost_user_blk_device_realize_finish();
return;
@@ -592,6 +622,79 @@ static const VMStateDescription vmstate_vhost_user_blk = {
},
};
+static void vhost_user_blk_save(VirtIODevice *vdev, QEMUFile *f)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ struct vhost_dev *hdev = vhost_user_blk_get_vhost(vdev);
+
+ if (!hdev->migrating_backend) {
+ return;
+ }
+
+ qemu_file_put_fd(f, s->inflight->fd);
+ qemu_put_be64(f, s->inflight->size);
+ qemu_put_be64(f, s->inflight->offset);
+ qemu_put_be16(f, s->inflight->queue_size);
+
+ vhost_save_backend(hdev, f);
+}
+
+static int vhost_user_blk_load(VirtIODevice *vdev, QEMUFile *f,
+ int version_id)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ struct vhost_dev *hdev = vhost_user_blk_get_vhost(vdev);
+
+ if (!hdev->migrating_backend) {
+ return 0;
+ }
+
+ s->inflight->fd = qemu_file_get_fd(f);
+ qemu_get_be64s(f, &s->inflight->size);
+ qemu_get_be64s(f, &s->inflight->offset);
+ qemu_get_be16s(f, &s->inflight->queue_size);
+
+ s->inflight->addr = mmap(0, s->inflight->size, PROT_READ | PROT_WRITE,
+ MAP_SHARED, s->inflight->fd, s->inflight->offset);
+ if (s->inflight->addr == MAP_FAILED) {
+ return -EINVAL;
+ }
+
+ vhost_load_backend(hdev, f);
+
+ return 0;
+}
+
+static int vhost_user_blk_post_load(VirtIODevice *vdev)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ struct vhost_dev *hdev = vhost_user_blk_get_vhost(vdev);
+ DeviceState *dev = &s->parent_obj.parent_obj;
+
+ if (!hdev->migrating_backend) {
+ return 0;
+ }
+
+ memcpy(&s->blkcfg, vdev->config, vdev->config_len);
+
+ /* we're fully initialized, now we can operate, so add the handler */
+ qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
+ vhost_user_blk_event, NULL, (void *)dev,
+ NULL, true);
+
+ return 0;
+}
+
+static bool vhost_user_blk_skip_migration_log(VirtIODevice *vdev)
+{
+ /*
+ * Note that hdev->migrating_backend is false at this moment,
+ * as logging is set up during the outgoing migration setup stage,
+ * which happens well before the VM is stopped.
+ */
+ return migrate_local_vhost_user_blk();
+}
+
static const Property vhost_user_blk_properties[] = {
DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
DEFINE_PROP_UINT16("num-queues", VHostUserBlk, num_queues,
@@ -605,6 +708,8 @@ static const Property vhost_user_blk_properties[] = {
VIRTIO_BLK_F_WRITE_ZEROES, true),
DEFINE_PROP_BOOL("skip-get-vring-base-on-force-shutdown", VHostUserBlk,
skip_get_vring_base_on_force_shutdown, false),
+ DEFINE_PROP_BOOL("local-incoming", VHostUserBlk,
+ incoming_backend, false),
};
static void vhost_user_blk_class_init(ObjectClass *klass, const void *data)
@@ -624,6 +729,10 @@ static void vhost_user_blk_class_init(ObjectClass *klass, const void *data)
vdc->set_status = vhost_user_blk_set_status;
vdc->reset = vhost_user_blk_reset;
vdc->get_vhost = vhost_user_blk_get_vhost;
+ vdc->save = vhost_user_blk_save;
+ vdc->load = vhost_user_blk_load;
+ vdc->post_load = vhost_user_blk_post_load;
+ vdc->skip_vhost_migration_log = vhost_user_blk_skip_migration_log;
}
static const TypeInfo vhost_user_blk_info = {
diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h
index a10f785672..b06f55fd6f 100644
--- a/include/hw/virtio/vhost-user-blk.h
+++ b/include/hw/virtio/vhost-user-blk.h
@@ -52,6 +52,8 @@ struct VHostUserBlk {
bool started_vu;
bool skip_get_vring_base_on_force_shutdown;
+
+ bool incoming_backend;
};
#endif
diff --git a/migration/options.c b/migration/options.c
index dffb6910f4..11b719c81b 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -269,6 +269,13 @@ bool migrate_local_char_socket(void)
return s->capabilities[MIGRATION_CAPABILITY_LOCAL_CHAR_SOCKET];
}
+bool migrate_local_vhost_user_blk(void)
+{
+ MigrationState *s = migrate_get_current();
+
+ return s->capabilities[MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK];
+}
+
bool migrate_ignore_shared(void)
{
MigrationState *s = migrate_get_current();
diff --git a/migration/options.h b/migration/options.h
index 40971f0aa0..5a40ac073d 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -31,6 +31,7 @@ bool migrate_dirty_bitmaps(void);
bool migrate_events(void);
bool migrate_mapped_ram(void);
bool migrate_local_char_socket(void);
+bool migrate_local_vhost_user_blk(void);
bool migrate_ignore_shared(void);
bool migrate_late_block_activate(void);
bool migrate_multifd(void);
diff --git a/qapi/migration.json b/qapi/migration.json
index 4f282d168e..ead7f4d17c 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -520,11 +520,19 @@
# @local-char-socket: Migrate socket chardevs open file descriptors.
# Only may be used when migration channel is unix socket. Only
# involves socket chardevs with "support-local-migration" option
-# enabled. (since 10.2)
+# enabled. The target device must also have the @local-incoming
+# option specified. (since 10.2)
+#
+# @local-vhost-user-blk: Migrate vhost-user-blk locally, keeping the
+# backend alive. Open file descriptors and backend-related state are
+# migrated. May only be used when the migration channel is a unix
+# socket. The target device must also have the @local-incoming option
+# specified. (since 10.2)
#
# Features:
#
-# @unstable: Members @x-colo and @x-ignore-shared are experimental.
+# @unstable: Members @x-colo, @x-ignore-shared, @local-char-socket,
+# and @local-vhost-user-blk are experimental.
# @deprecated: Member @zero-blocks is deprecated as being part of
# block migration which was already removed.
#
@@ -542,7 +550,8 @@
'validate-uuid', 'background-snapshot',
'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
'dirty-limit', 'mapped-ram',
- { 'name': 'local-char-socket', 'features': [ 'unstable' ] } ] }
+ { 'name': 'local-char-socket', 'features': [ 'unstable' ] },
+ { 'name': 'local-vhost-user-blk', 'features': [ 'unstable' ] } ] }
##
# @MigrationCapabilityStatus:
--
2.48.1
^ permalink raw reply related [flat|nested] 108+ messages in thread
* Re: [PATCH 31/33] vhost-user-blk: support vhost backend migration
2025-08-13 16:48 ` [PATCH 31/33] vhost-user-blk: " Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:09 ` Raphael Norwitz
2025-10-09 21:14 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:09 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
A small question here, but I will review more thoroughly pending feedback
on my overall comments.
On Wed, Aug 13, 2025 at 12:53 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Opt-out backend initialization code, and instead get the state
> from migration channel (including inflight region).
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> hw/block/vhost-user-blk.c | 185 +++++++++++++++++++++++------
> include/hw/virtio/vhost-user-blk.h | 2 +
> migration/options.c | 7 ++
> migration/options.h | 1 +
> qapi/migration.json | 15 ++-
> 5 files changed, 169 insertions(+), 41 deletions(-)
>
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index c8bc2c78e6..2e6ef6477e 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -17,6 +17,7 @@
> */
>
> #include "qemu/osdep.h"
> +#include "qapi-types-run-state.h"
> #include "qapi/error.h"
> #include "qemu/error-report.h"
> #include "qemu/cutils.h"
> @@ -32,6 +33,11 @@
> #include "system/system.h"
> #include "system/runstate.h"
> #include "trace.h"
> +#include "migration/qemu-file.h"
> +#include "migration/migration.h"
> +#include "migration/options.h"
> +#include "qemu/event_notifier.h"
> +#include <sys/mman.h>
>
> static const int user_feature_bits[] = {
> VIRTIO_BLK_F_SIZE_MAX,
> @@ -159,32 +165,35 @@ static int vhost_user_blk_start(VirtIODevice *vdev, Error **errp)
>
> s->dev.acked_features = vdev->guest_features;
>
> - ret = vhost_dev_prepare_inflight(&s->dev, vdev);
> - if (ret < 0) {
> - error_setg_errno(errp, -ret, "Error setting inflight format");
> - goto err_guest_notifiers;
> - }
> -
> - if (!s->inflight->addr) {
> - ret = vhost_dev_get_inflight(&s->dev, s->queue_size, s->inflight);
> + if (!s->dev.migrating_backend) {
> + ret = vhost_dev_prepare_inflight(&s->dev, vdev);
> if (ret < 0) {
> - error_setg_errno(errp, -ret, "Error getting inflight");
> + error_setg_errno(errp, -ret, "Error setting inflight format");
> goto err_guest_notifiers;
> }
> - }
>
> - ret = vhost_dev_set_inflight(&s->dev, s->inflight);
> - if (ret < 0) {
> - error_setg_errno(errp, -ret, "Error setting inflight");
> - goto err_guest_notifiers;
> - }
> + if (!s->inflight->addr) {
> + ret = vhost_dev_get_inflight(&s->dev, s->queue_size, s->inflight);
> + if (ret < 0) {
> + error_setg_errno(errp, -ret, "Error getting inflight");
> + goto err_guest_notifiers;
> + }
> + }
>
> - /* guest_notifier_mask/pending not used yet, so just unmask
> - * everything here. virtio-pci will do the right thing by
> - * enabling/disabling irqfd.
> - */
> - for (i = 0; i < s->dev.nvqs; i++) {
> - vhost_virtqueue_mask(&s->dev, vdev, i, false);
> + ret = vhost_dev_set_inflight(&s->dev, s->inflight);
> + if (ret < 0) {
> + error_setg_errno(errp, -ret, "Error setting inflight");
> + goto err_guest_notifiers;
> + }
> +
> + /*
> + * guest_notifier_mask/pending not used yet, so just unmask
> + * everything here. virtio-pci will do the right thing by
> + * enabling/disabling irqfd.
> + */
> + for (i = 0; i < s->dev.nvqs; i++) {
> + vhost_virtqueue_mask(&s->dev, vdev, i, false);
> + }
> }
>
> s->dev.vq_index_end = s->dev.nvqs;
> @@ -231,6 +240,10 @@ static int vhost_user_blk_stop(VirtIODevice *vdev)
> force_stop = s->skip_get_vring_base_on_force_shutdown &&
> qemu_force_shutdown_requested();
>
> + s->dev.migrating_backend = s->dev.migrating_backend ||
> + (runstate_check(RUN_STATE_FINISH_MIGRATE) &&
> + migrate_local_vhost_user_blk());
> +
> ret = force_stop ? vhost_dev_force_stop(&s->dev, vdev, true) :
> vhost_dev_stop(&s->dev, vdev, true);
>
> @@ -343,7 +356,9 @@ static void vhost_user_blk_reset(VirtIODevice *vdev)
> vhost_dev_free_inflight(s->inflight);
> }
>
> -static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
> +static int vhost_user_blk_connect(DeviceState *dev,
> + bool migrating_backend,
> + Error **errp)
> {
> VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> VHostUserBlk *s = VHOST_USER_BLK(vdev);
> @@ -359,6 +374,7 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
> s->dev.nvqs = s->num_queues;
> s->dev.vqs = s->vhost_vqs;
> s->dev.vq_index = 0;
> + s->dev.migrating_backend = migrating_backend;
>
> vhost_dev_set_config_notifier(&s->dev, &blk_ops);
>
> @@ -409,7 +425,7 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>
> switch (event) {
> case CHR_EVENT_OPENED:
> - if (vhost_user_blk_connect(dev, &local_err) < 0) {
> + if (vhost_user_blk_connect(dev, false, &local_err) < 0) {
> error_report_err(local_err);
> qemu_chr_fe_disconnect(&s->chardev);
> return;
> @@ -428,31 +444,37 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
> }
> }
>
> -static int vhost_user_blk_realize_connect(VHostUserBlk *s, Error **errp)
> +static int vhost_user_blk_realize_connect(VHostUserBlk *s,
> + bool migrating_backend,
> + Error **errp)
> {
> DeviceState *dev = DEVICE(s);
> int ret;
>
> s->connected = false;
>
> - ret = qemu_chr_fe_wait_connected(&s->chardev, errp);
> - if (ret < 0) {
> - return ret;
> + if (!migrating_backend) {
> + ret = qemu_chr_fe_wait_connected(&s->chardev, errp);
> + if (ret < 0) {
> + return ret;
> + }
> }
>
> - ret = vhost_user_blk_connect(dev, errp);
> + ret = vhost_user_blk_connect(dev, migrating_backend, errp);
> if (ret < 0) {
> qemu_chr_fe_disconnect(&s->chardev);
> return ret;
> }
> assert(s->connected);
>
> - ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
> - VIRTIO_DEVICE(s)->config_len, errp);
> - if (ret < 0) {
> - qemu_chr_fe_disconnect(&s->chardev);
> - vhost_dev_cleanup(&s->dev);
> - return ret;
> + if (!migrating_backend) {
> + ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
> + VIRTIO_DEVICE(s)->config_len, errp);
> + if (ret < 0) {
> + qemu_chr_fe_disconnect(&s->chardev);
> + vhost_dev_cleanup(&s->dev);
> + return ret;
> + }
> }
>
> return 0;
> @@ -469,6 +491,11 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
>
> trace_vhost_user_blk_device_realize();
>
> + if (s->incoming_backend && !runstate_check(RUN_STATE_INMIGRATE)) {
> + error_setg(errp, "__yc_local-incoming can be used "
> + "only for incoming migration");
> + }
> +
> if (!s->chardev.chr) {
> error_setg(errp, "chardev is mandatory");
> return;
> @@ -517,7 +544,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> error_report_err(*errp);
> *errp = NULL;
> }
> - ret = vhost_user_blk_realize_connect(s, errp);
> + ret = vhost_user_blk_realize_connect(s, s->incoming_backend, errp);
> } while (ret < 0 && retries--);
>
> if (ret < 0) {
> @@ -525,9 +552,12 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> }
>
> /* we're fully initialized, now we can operate, so add the handler */
> - qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
> - vhost_user_blk_event, NULL, (void *)dev,
> - NULL, true);
> + if (!s->incoming_backend) {
> + qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
> + vhost_user_blk_event, NULL, (void *)dev,
> + NULL, true);
> + }
> +
> trace_vhost_user_blk_device_realize_finish();
> return;
>
> @@ -592,6 +622,79 @@ static const VMStateDescription vmstate_vhost_user_blk = {
> },
> };
>
> +static void vhost_user_blk_save(VirtIODevice *vdev, QEMUFile *f)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + struct vhost_dev *hdev = vhost_user_blk_get_vhost(vdev);
> +
> + if (!hdev->migrating_backend) {
> + return;
> + }
> +
> + qemu_file_put_fd(f, s->inflight->fd);
> + qemu_put_be64(f, s->inflight->size);
> + qemu_put_be64(f, s->inflight->offset);
> + qemu_put_be16(f, s->inflight->queue_size);
> +
> + vhost_save_backend(hdev, f);
> +}
> +
> +static int vhost_user_blk_load(VirtIODevice *vdev, QEMUFile *f,
> + int version_id)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + struct vhost_dev *hdev = vhost_user_blk_get_vhost(vdev);
> +
> + if (!hdev->migrating_backend) {
> + return 0;
> + }
> +
> + s->inflight->fd = qemu_file_get_fd(f);
> + qemu_get_be64s(f, &s->inflight->size);
> + qemu_get_be64s(f, &s->inflight->offset);
> + qemu_get_be16s(f, &s->inflight->queue_size);
> +
> + s->inflight->addr = mmap(0, s->inflight->size, PROT_READ | PROT_WRITE,
> + MAP_SHARED, s->inflight->fd, s->inflight->offset);
> + if (s->inflight->addr == MAP_FAILED) {
> + return -EINVAL;
> + }
> +
> + vhost_load_backend(hdev, f);
> +
> + return 0;
> +}
> +
> +static int vhost_user_blk_post_load(VirtIODevice *vdev)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + struct vhost_dev *hdev = vhost_user_blk_get_vhost(vdev);
> + DeviceState *dev = &s->parent_obj.parent_obj;
> +
> + if (!hdev->migrating_backend) {
> + return 0;
> + }
> +
> + memcpy(&s->blkcfg, vdev->config, vdev->config_len);
> +
> + /* we're fully initialized, now we can operate, so add the handler */
> + qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
> + vhost_user_blk_event, NULL, (void *)dev,
> + NULL, true);
> +
> + return 0;
> +}
> +
> +static bool vhost_user_blk_skip_migration_log(VirtIODevice *vdev)
> +{
> + /*
> + * Note that hdev->migrating_backend is false at this moment,
> + * as logging is being setup during outging migration setup stage,
> + * which is far before vm stop.
> + */
> + return migrate_local_vhost_user_blk();
> +}
> +
> static const Property vhost_user_blk_properties[] = {
> DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> DEFINE_PROP_UINT16("num-queues", VHostUserBlk, num_queues,
> @@ -605,6 +708,8 @@ static const Property vhost_user_blk_properties[] = {
> VIRTIO_BLK_F_WRITE_ZEROES, true),
> DEFINE_PROP_BOOL("skip-get-vring-base-on-force-shutdown", VHostUserBlk,
> skip_get_vring_base_on_force_shutdown, false),
> + DEFINE_PROP_BOOL("local-incoming", VHostUserBlk,
> + incoming_backend, false),
> };
>
> static void vhost_user_blk_class_init(ObjectClass *klass, const void *data)
> @@ -624,6 +729,10 @@ static void vhost_user_blk_class_init(ObjectClass *klass, const void *data)
> vdc->set_status = vhost_user_blk_set_status;
> vdc->reset = vhost_user_blk_reset;
> vdc->get_vhost = vhost_user_blk_get_vhost;
> + vdc->save = vhost_user_blk_save;
> + vdc->load = vhost_user_blk_load;
> + vdc->post_load = vhost_user_blk_post_load,
> + vdc->skip_vhost_migration_log = vhost_user_blk_skip_migration_log;
> }
>
> static const TypeInfo vhost_user_blk_info = {
> diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h
> index a10f785672..b06f55fd6f 100644
> --- a/include/hw/virtio/vhost-user-blk.h
> +++ b/include/hw/virtio/vhost-user-blk.h
> @@ -52,6 +52,8 @@ struct VHostUserBlk {
> bool started_vu;
>
> bool skip_get_vring_base_on_force_shutdown;
> +
> + bool incoming_backend;
> };
>
> #endif
> diff --git a/migration/options.c b/migration/options.c
> index dffb6910f4..11b719c81b 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -269,6 +269,13 @@ bool migrate_local_char_socket(void)
> return s->capabilities[MIGRATION_CAPABILITY_LOCAL_CHAR_SOCKET];
> }
>
> +bool migrate_local_vhost_user_blk(void)
> +{
> + MigrationState *s = migrate_get_current();
> +
Where was MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK added/defined?
> + return s->capabilities[MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK];
> +}
> +
> bool migrate_ignore_shared(void)
> {
> MigrationState *s = migrate_get_current();
> diff --git a/migration/options.h b/migration/options.h
> index 40971f0aa0..5a40ac073d 100644
> --- a/migration/options.h
> +++ b/migration/options.h
> @@ -31,6 +31,7 @@ bool migrate_dirty_bitmaps(void);
> bool migrate_events(void);
> bool migrate_mapped_ram(void);
> bool migrate_local_char_socket(void);
> +bool migrate_local_vhost_user_blk(void);
> bool migrate_ignore_shared(void);
> bool migrate_late_block_activate(void);
> bool migrate_multifd(void);
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 4f282d168e..ead7f4d17c 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -520,11 +520,19 @@
> # @local-char-socket: Migrate socket chardevs open file descriptors.
> # Only may be used when migration channel is unix socket. Only
> # involves socket chardevs with "support-local-migration" option
> -# enabled. (since 10.2)
> +# enabled. For target device also @local-incoming option must
> +# be specified (since 10.2)
> +#
> +# @local-vhost-user-blk: Migrate vhost-user-blk locally, keeping
> +# backend alive. Open file descriptors and backend-related state are
> +# migrated. Only may be used when migration channel is unix socket.
> +# For target device also @local-incoming option must be specified
> +# (since 10.2)
> #
> # Features:
> #
> -# @unstable: Members @x-colo and @x-ignore-shared are experimental.
> +# @unstable: Members @x-colo, @x-ignore-shared, @local-char-socket,
> +# @local-vhost-user-blk are experimental.
> # @deprecated: Member @zero-blocks is deprecated as being part of
> # block migration which was already removed.
> #
> @@ -542,7 +550,8 @@
> 'validate-uuid', 'background-snapshot',
> 'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
> 'dirty-limit', 'mapped-ram',
> - { 'name': 'local-char-socket', 'features': [ 'unstable' ] } ] }
> + { 'name': 'local-char-socket', 'features': [ 'unstable' ] },
> + { 'name': 'local-vhost-user-blk', 'features': [ 'unstable' ] } ] }
>
> ##
> # @MigrationCapabilityStatus:
> --
> 2.48.1
>
>
* Re: [PATCH 31/33] vhost-user-blk: support vhost backend migration
2025-10-09 19:09 ` Raphael Norwitz
@ 2025-10-09 21:14 ` Vladimir Sementsov-Ogievskiy
2025-10-09 23:43 ` Raphael Norwitz
0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 21:14 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 22:09, Raphael Norwitz wrote:
> A small question here but will review more thoroughly pending feedback
> on my overall comments.
>
I really hope you didn't spend much time on these 28-31 patches :/
> On Wed, Aug 13, 2025 at 12:53 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
[..]
>> --- a/migration/options.c
>> +++ b/migration/options.c
>> @@ -269,6 +269,13 @@ bool migrate_local_char_socket(void)
>> return s->capabilities[MIGRATION_CAPABILITY_LOCAL_CHAR_SOCKET];
>> }
>>
>> +bool migrate_local_vhost_user_blk(void)
>> +{
>> + MigrationState *s = migrate_get_current();
>> +
>
> Where was MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK added/defined?
It is generated by the QAPI code generator.
Specifically, it's defined by the 'local-vhost-user-blk' member inside 'MigrationCapability':
{ 'enum': 'MigrationCapability',
'data': ['xbzrle', 'rdma-pin-all', 'auto-converge',
...
{ 'name': 'local-vhost-user-blk', 'features': [ 'unstable' ] } ] }
and after the build, the generated code is in build/qapi/qapi-types-migration.h, as an enum:
typedef enum MigrationCapability {
MIGRATION_CAPABILITY_XBZRLE,
...
MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK,
MIGRATION_CAPABILITY__MAX,
} MigrationCapability;
In v2, I'll follow the interface of the virtio-net series; see
https://patchew.org/QEMU/20250923100110.70862-1-vsementsov@yandex-team.ru/20250923100110.70862-17-vsementsov@yandex-team.ru/
so, it would be migration parameter instead of capability, like
QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk"] }
and to enable both vhost-user-blk and virtio-net-tap together:
QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk", "virtio-net-tap"] }
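
Spelled out as full QMP commands, that would be roughly the following (assuming the v2 parameter name, which may still change during review):

```json
{ "execute": "migrate-set-parameters",
  "arguments": { "backend-transfer": [ "vhost-user-blk" ] } }

{ "execute": "migrate-set-parameters",
  "arguments": { "backend-transfer": [ "vhost-user-blk", "virtio-net-tap" ] } }
```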
>
>
>> + return s->capabilities[MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK];
>> +}
>> +
[..]
--
Best regards,
Vladimir
* Re: [PATCH 31/33] vhost-user-blk: support vhost backend migration
2025-10-09 21:14 ` Vladimir Sementsov-Ogievskiy
@ 2025-10-09 23:43 ` Raphael Norwitz
2025-10-10 6:27 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 23:43 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On Thu, Oct 9, 2025 at 5:14 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> On 09.10.25 22:09, Raphael Norwitz wrote:
> > A small question here but will review more thoroughly pending feedback
> > on my overall comments.
> >
>
> I really hope you didn't spent much time on these 28-31 patches :/
>
I spent much more time on the cleanups :)
> > On Wed, Aug 13, 2025 at 12:53 PM Vladimir Sementsov-Ogievskiy
> > <vsementsov@yandex-team.ru> wrote:
> >>
>
> [..]
>
> >> --- a/migration/options.c
> >> +++ b/migration/options.c
> >> @@ -269,6 +269,13 @@ bool migrate_local_char_socket(void)
> >> return s->capabilities[MIGRATION_CAPABILITY_LOCAL_CHAR_SOCKET];
> >> }
> >>
> >> +bool migrate_local_vhost_user_blk(void)
> >> +{
> >> + MigrationState *s = migrate_get_current();
> >> +
> >
> > Where was MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK added/defined?
>
> It is generated by QAPI code generator.
>
> Exactly, it's defined by 'local-vhost-user-blk' member inside 'MigrationCapability':
>
> { 'enum': 'MigrationCapability',
> 'data': ['xbzrle', 'rdma-pin-all', 'auto-converge',
>
> ...
>
> { 'name': 'local-vhost-user-blk', 'features': [ 'unstable' ] } ] }
>
>
> and after build, the generated code is in build/qapi/qapi-types-migration.h, as a enum:
>
> typedef enum MigrationCapability {
> MIGRATION_CAPABILITY_XBZRLE,
>
> ,,,
>
> MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK,
> MIGRATION_CAPABILITY__MAX,
> } MigrationCapability;
>
>
> In v2, I'll follow the interface of virtio-net series, look at
>
> https://patchew.org/QEMU/20250923100110.70862-1-vsementsov@yandex-team.ru/20250923100110.70862-17-vsementsov@yandex-team.ru/
>
> so, it would be migration parameter instead of capability, like
>
> QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk"] }
>
> and to enable both vhost-user-blk and virtio-net-tap together:
>
> QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk", "virtio-net-tap"] }
>
Why do we need two separate migration parameters for vhost-user-blk
and virtio-net-tap? Why not have a single parameter for virtio local
migrations and, if it is set, all backends types which support local
migration can advertise and take advantage of it?
> >
> >
> >> + return s->capabilities[MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK];
> >> +}
> >> +
>
> [..]
>
>
> --
> Best regards,
> Vladimir
* Re: [PATCH 31/33] vhost-user-blk: support vhost backend migration
2025-10-09 23:43 ` Raphael Norwitz
@ 2025-10-10 6:27 ` Vladimir Sementsov-Ogievskiy
2025-10-13 21:50 ` Raphael Norwitz
0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-10 6:27 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 10.10.25 02:43, Raphael Norwitz wrote:
> On Thu, Oct 9, 2025 at 5:14 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> On 09.10.25 22:09, Raphael Norwitz wrote:
>>> A small question here but will review more thoroughly pending feedback
>>> on my overall comments.
>>>
>>
>> I really hope you didn't spent much time on these 28-31 patches :/
>>
>
> I spent much more time on the cleanups :)
>
>>> On Wed, Aug 13, 2025 at 12:53 PM Vladimir Sementsov-Ogievskiy
>>> <vsementsov@yandex-team.ru> wrote:
>>>>
>>
>> [..]
>>
>>>> --- a/migration/options.c
>>>> +++ b/migration/options.c
>>>> @@ -269,6 +269,13 @@ bool migrate_local_char_socket(void)
>>>> return s->capabilities[MIGRATION_CAPABILITY_LOCAL_CHAR_SOCKET];
>>>> }
>>>>
>>>> +bool migrate_local_vhost_user_blk(void)
>>>> +{
>>>> + MigrationState *s = migrate_get_current();
>>>> +
>>>
>>> Where was MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK added/defined?
>>
>> It is generated by QAPI code generator.
>>
>> Exactly, it's defined by 'local-vhost-user-blk' member inside 'MigrationCapability':
>>
>> { 'enum': 'MigrationCapability',
>> 'data': ['xbzrle', 'rdma-pin-all', 'auto-converge',
>>
>> ...
>>
>> { 'name': 'local-vhost-user-blk', 'features': [ 'unstable' ] } ] }
>>
>>
>> and after build, the generated code is in build/qapi/qapi-types-migration.h, as a enum:
>>
>> typedef enum MigrationCapability {
>> MIGRATION_CAPABILITY_XBZRLE,
>>
>> ,,,
>>
>> MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK,
>> MIGRATION_CAPABILITY__MAX,
>> } MigrationCapability;
>>
>>
>> In v2, I'll follow the interface of virtio-net series, look at
>>
>> https://patchew.org/QEMU/20250923100110.70862-1-vsementsov@yandex-team.ru/20250923100110.70862-17-vsementsov@yandex-team.ru/
>>
>> so, it would be migration parameter instead of capability, like
>>
>> QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk"] }
>>
>> and to enable both vhost-user-blk and virtio-net-tap together:
>>
>> QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk", "virtio-net-tap"] }
>>
>
> Why do we need two separate migration parameters for vhost-user-blk
> and virtio-net-tap? Why not have a single parameter for virtio local
> migrations and, if it is set, all backends types which support local
> migration can advertise and take advantage of it?
As I describe in the commit message https://patchew.org/QEMU/20250923100110.70862-1-vsementsov@yandex-team.ru/20250923100110.70862-17-vsementsov@yandex-team.ru/ :
Why not a simple boolean? To simplify migration to future versions,
when more devices will support backend-transfer migration.
Alternatively, we could add a per-device option to disable backend-transfer
migration, but still:
1. It's more convenient to set the same capabilities/parameters on both
source and target QEMU than to configure each device individually.
2. To preserve the design principle that machine type + device options +
migration capabilities and parameters fully define the resulting
migration stream. We would break this if we later added backend-transfer
support to more devices under the same backend-transfer=true
parameter.
>
>>>
>>>
>>>> + return s->capabilities[MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK];
>>>> +}
>>>> +
>>
>> [..]
>>
>>
>> --
>> Best regards,
>> Vladimir
--
Best regards,
Vladimir
* Re: [PATCH 31/33] vhost-user-blk: support vhost backend migration
2025-10-10 6:27 ` Vladimir Sementsov-Ogievskiy
@ 2025-10-13 21:50 ` Raphael Norwitz
2025-10-14 11:59 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-13 21:50 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On Fri, Oct 10, 2025 at 2:27 AM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> On 10.10.25 02:43, Raphael Norwitz wrote:
> > On Thu, Oct 9, 2025 at 5:14 PM Vladimir Sementsov-Ogievskiy
> > <vsementsov@yandex-team.ru> wrote:
> >>
> >> On 09.10.25 22:09, Raphael Norwitz wrote:
> >>> A small question here but will review more thoroughly pending feedback
> >>> on my overall comments.
> >>>
> >>
> >> I really hope you didn't spent much time on these 28-31 patches :/
> >>
> >
> > I spent much more time on the cleanups :)
> >
> >>> On Wed, Aug 13, 2025 at 12:53 PM Vladimir Sementsov-Ogievskiy
> >>> <vsementsov@yandex-team.ru> wrote:
> >>>>
> >>
> >> [..]
> >>
> >>>> --- a/migration/options.c
> >>>> +++ b/migration/options.c
> >>>> @@ -269,6 +269,13 @@ bool migrate_local_char_socket(void)
> >>>> return s->capabilities[MIGRATION_CAPABILITY_LOCAL_CHAR_SOCKET];
> >>>> }
> >>>>
> >>>> +bool migrate_local_vhost_user_blk(void)
> >>>> +{
> >>>> + MigrationState *s = migrate_get_current();
> >>>> +
> >>>
> >>> Where was MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK added/defined?
> >>
> >> It is generated by QAPI code generator.
> >>
> >> Exactly, it's defined by 'local-vhost-user-blk' member inside 'MigrationCapability':
> >>
> >> { 'enum': 'MigrationCapability',
> >> 'data': ['xbzrle', 'rdma-pin-all', 'auto-converge',
> >>
> >> ...
> >>
> >> { 'name': 'local-vhost-user-blk', 'features': [ 'unstable' ] } ] }
> >>
> >>
> >> and after build, the generated code is in build/qapi/qapi-types-migration.h, as a enum:
> >>
> >> typedef enum MigrationCapability {
> >> MIGRATION_CAPABILITY_XBZRLE,
> >>
> >> ,,,
> >>
> >> MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK,
> >> MIGRATION_CAPABILITY__MAX,
> >> } MigrationCapability;
> >>
> >>
> >> In v2, I'll follow the interface of virtio-net series, look at
> >>
> >> https://patchew.org/QEMU/20250923100110.70862-1-vsementsov@yandex-team.ru/20250923100110.70862-17-vsementsov@yandex-team.ru/
> >>
> >> so, it would be migration parameter instead of capability, like
> >>
> >> QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk"] }
> >>
> >> and to enable both vhost-user-blk and virtio-net-tap together:
> >>
> >> QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk", "virtio-net-tap"] }
> >>
> >
> > Why do we need two separate migration parameters for vhost-user-blk
> > and virtio-net-tap? Why not have a single parameter for virtio local
> > migrations and, if it is set, all backends types which support local
> > migration can advertise and take advantage of it?
>
> As I describe in the commit message https://patchew.org/QEMU/20250923100110.70862-1-vsementsov@yandex-team.ru/20250923100110.70862-17-vsementsov@yandex-team.ru/ :
>
>
> Why not simple boolean? To simplify migration to further versions,
> when more devices will support backend-transfer migration.
>
> Alternatively, we may add per-device option to disable backend-transfer
> migration, but still:
>
> 1. It's more comfortable to set same capabilities/parameters on both
> source and target QEMU, than care about each device.
>
> 2. To not break the design, that machine-type + device options +
> migration capabilities and parameters are fully define the resulting
> migration stream. We'll break this if add in future more
> backend-transfer support in devices under same backend-transfer=true
> parameter.
ACK on needing a separate migration parameter. Thanks for the references.
I would suggest having the incoming_backend field in the struct
vhost_user (or maybe even in struct vhost_dev if the tap device
migration is similar enough) rather than in struct VHostUserBlk, so
that device-specific code can be kept as similar as possible.
>
>
> >
> >>>
> >>>
> >>>> + return s->capabilities[MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK];
> >>>> +}
> >>>> +
> >>
> >> [..]
> >>
> >>
> >> --
> >> Best regards,
> >> Vladimir
>
>
> --
> Best regards,
> Vladimir
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 31/33] vhost-user-blk: support vhost backend migration
2025-10-13 21:50 ` Raphael Norwitz
@ 2025-10-14 11:59 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-14 11:59 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 14.10.25 00:50, Raphael Norwitz wrote:
> On Fri, Oct 10, 2025 at 2:27 AM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> On 10.10.25 02:43, Raphael Norwitz wrote:
>>> On Thu, Oct 9, 2025 at 5:14 PM Vladimir Sementsov-Ogievskiy
>>> <vsementsov@yandex-team.ru> wrote:
>>>>
>>>> On 09.10.25 22:09, Raphael Norwitz wrote:
>>>>> A small question here but will review more thoroughly pending feedback
>>>>> on my overall comments.
>>>>>
>>>>
>>>> I really hope you didn't spent much time on these 28-31 patches :/
>>>>
>>>
>>> I spent much more time on the cleanups :)
>>>
>>>>> On Wed, Aug 13, 2025 at 12:53 PM Vladimir Sementsov-Ogievskiy
>>>>> <vsementsov@yandex-team.ru> wrote:
>>>>>>
>>>>
>>>> [..]
>>>>
>>>>>> --- a/migration/options.c
>>>>>> +++ b/migration/options.c
>>>>>> @@ -269,6 +269,13 @@ bool migrate_local_char_socket(void)
>>>>>> return s->capabilities[MIGRATION_CAPABILITY_LOCAL_CHAR_SOCKET];
>>>>>> }
>>>>>>
>>>>>> +bool migrate_local_vhost_user_blk(void)
>>>>>> +{
>>>>>> + MigrationState *s = migrate_get_current();
>>>>>> +
>>>>>
>>>>> Where was MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK added/defined?
>>>>
>>>> It is generated by QAPI code generator.
>>>>
>>>> Exactly, it's defined by 'local-vhost-user-blk' member inside 'MigrationCapability':
>>>>
>>>> { 'enum': 'MigrationCapability',
>>>> 'data': ['xbzrle', 'rdma-pin-all', 'auto-converge',
>>>>
>>>> ...
>>>>
>>>> { 'name': 'local-vhost-user-blk', 'features': [ 'unstable' ] } ] }
>>>>
>>>>
>>>> and after build, the generated code is in build/qapi/qapi-types-migration.h, as a enum:
>>>>
>>>> typedef enum MigrationCapability {
>>>> MIGRATION_CAPABILITY_XBZRLE,
>>>>
>>>> ,,,
>>>>
>>>> MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK,
>>>> MIGRATION_CAPABILITY__MAX,
>>>> } MigrationCapability;
>>>>
>>>>
>>>> In v2, I'll follow the interface of virtio-net series, look at
>>>>
>>>> https://patchew.org/QEMU/20250923100110.70862-1-vsementsov@yandex-team.ru/20250923100110.70862-17-vsementsov@yandex-team.ru/
>>>>
>>>> so, it would be migration parameter instead of capability, like
>>>>
>>>> QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk"] }
>>>>
>>>> and to enable both vhost-user-blk and virtio-net-tap together:
>>>>
>>>> QMP migrate-set-parameters {... backend-transfer = ["vhost-user-blk", "virtio-net-tap"] }
>>>>
>>>
>>> Why do we need two separate migration parameters for vhost-user-blk
>>> and virtio-net-tap? Why not have a single parameter for virtio local
>>> migrations and, if it is set, all backends types which support local
>>> migration can advertise and take advantage of it?
>>
>> As I describe in the commit message https://patchew.org/QEMU/20250923100110.70862-1-vsementsov@yandex-team.ru/20250923100110.70862-17-vsementsov@yandex-team.ru/ :
>>
>>
>> Why not simple boolean? To simplify migration to further versions,
>> when more devices will support backend-transfer migration.
>>
>> Alternatively, we may add per-device option to disable backend-transfer
>> migration, but still:
>>
>> 1. It's more comfortable to set same capabilities/parameters on both
>> source and target QEMU, than care about each device.
>>
>> 2. To not break the design, that machine-type + device options +
>> migration capabilities and parameters are fully define the resulting
>> migration stream. We'll break this if add in future more
>> backend-transfer support in devices under same backend-transfer=true
>> parameter.
>
> ACK on needing a separate migration parameter. Thanks for the references.
>
> I would suggest having the incoming_backend field in the struct
> vhost_user (or maybe even in struct vhost_dev if the tap device
> migration is similar enough) rather than in struct VHostUserBlk, so
> that device-specific code can be kept as similar as possible.
In v2 it will be "backend_transfer" field in struct vhost_dev.
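A minimal sketch of how that might fit together with the proposed backend-transfer parameter (hypothetical names and structure, not the actual v2 code):

```python
# Hypothetical model: the migration parameter is a list of device types,
# and each vhost device derives a backend_transfer flag from it, stored
# in the generic vhost_dev state rather than in per-device structs such
# as VHostUserBlk.
class VhostDev:
    def __init__(self, device_type, migration_params):
        # Absent parameter means backend transfer is disabled.
        self.backend_transfer = (
            device_type in migration_params.get("backend-transfer", [])
        )

params = {"backend-transfer": ["vhost-user-blk"]}
blk = VhostDev("vhost-user-blk", params)
net = VhostDev("virtio-net-tap", params)
assert blk.backend_transfer
assert not net.backend_transfer
```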
>
>>
>>
>>>
>>>>>
>>>>>
>>>>>> + return s->capabilities[MIGRATION_CAPABILITY_LOCAL_VHOST_USER_BLK];
>>>>>> +}
>>>>>> +
>>>>
>>>> [..]
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Vladimir
>>
>>
>> --
>> Best regards,
>> Vladimir
--
Best regards,
Vladimir
* [PATCH 32/33] test/functional: exec_command_and_wait_for_pattern: add vm arg
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (30 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 31/33] vhost-user-blk: " Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-08-14 5:01 ` Philippe Mathieu-Daudé
2025-08-18 6:55 ` Thomas Huth
2025-08-13 16:48 ` [PATCH 33/33] tests/functional: add test_x86_64_vhost_user_blk_fd_migration.py Vladimir Sementsov-Ogievskiy
2025-10-09 19:16 ` [PATCH 00/33] vhost-user-blk: live-backend local migration Raphael Norwitz
33 siblings, 2 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov, Thomas Huth,
Philippe Mathieu-Daudé
Allow to specify non default vm for the command.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
tests/functional/qemu_test/cmd.py | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tests/functional/qemu_test/cmd.py b/tests/functional/qemu_test/cmd.py
index dc5f422b77..28b36a3a54 100644
--- a/tests/functional/qemu_test/cmd.py
+++ b/tests/functional/qemu_test/cmd.py
@@ -172,7 +172,8 @@ def exec_command(test, command):
_console_interaction(test, None, None, command + '\r')
def exec_command_and_wait_for_pattern(test, command,
- success_message, failure_message=None):
+ success_message, failure_message=None,
+ vm=None):
"""
Send a command to a console (appending CRLF characters), then wait
for success_message to appear on the console, while logging the.
@@ -184,9 +185,11 @@ def exec_command_and_wait_for_pattern(test, command,
:param command: the command to send
:param success_message: if this message appears, test succeeds
:param failure_message: if this message appears, test fails
+ :param vm: the VM to use (defaults to test.vm if None)
"""
assert success_message
- _console_interaction(test, success_message, failure_message, command + '\r')
+ _console_interaction(test, success_message, failure_message, command + '\r',
+ vm=vm)
def get_qemu_img(test):
test.log.debug('Looking for and selecting a qemu-img binary')
--
2.48.1
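The pattern the diff adds can be sketched outside the test framework (simplified, hypothetical stand-ins for the real helpers):

```python
# Simplified sketch of the optional-vm pattern: callers may pass an
# explicit VM; otherwise the helper falls back to the test's default VM.
class FakeTest:
    vm = "default-vm"   # stand-in for the test's primary VM

def exec_on_console(test, command, vm=None):
    if vm is None:
        vm = test.vm    # same fallback the patch introduces
    return (vm, command + "\r")

t = FakeTest()
assert exec_on_console(t, "ls") == ("default-vm", "ls\r")
assert exec_on_console(t, "ls", vm="target-vm") == ("target-vm", "ls\r")
```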
* Re: [PATCH 32/33] test/functional: exec_command_and_wait_for_pattern: add vm arg
2025-08-13 16:48 ` [PATCH 32/33] test/functional: exec_command_and_wait_for_pattern: add vm arg Vladimir Sementsov-Ogievskiy
@ 2025-08-14 5:01 ` Philippe Mathieu-Daudé
2025-08-18 6:55 ` Thomas Huth
1 sibling, 0 replies; 108+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-14 5:01 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, Thomas Huth
On 13/8/25 18:48, Vladimir Sementsov-Ogievskiy wrote:
> Allow to specify non default vm for the command.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> tests/functional/qemu_test/cmd.py | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 32/33] test/functional: exec_command_and_wait_for_pattern: add vm arg
2025-08-13 16:48 ` [PATCH 32/33] test/functional: exec_command_and_wait_for_pattern: add vm arg Vladimir Sementsov-Ogievskiy
2025-08-14 5:01 ` Philippe Mathieu-Daudé
@ 2025-08-18 6:55 ` Thomas Huth
1 sibling, 0 replies; 108+ messages in thread
From: Thomas Huth @ 2025-08-18 6:55 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, Philippe Mathieu-Daudé
On 13/08/2025 18.48, Vladimir Sementsov-Ogievskiy wrote:
> Allow to specify non default vm for the command.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> tests/functional/qemu_test/cmd.py | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/tests/functional/qemu_test/cmd.py b/tests/functional/qemu_test/cmd.py
> index dc5f422b77..28b36a3a54 100644
> --- a/tests/functional/qemu_test/cmd.py
> +++ b/tests/functional/qemu_test/cmd.py
> @@ -172,7 +172,8 @@ def exec_command(test, command):
> _console_interaction(test, None, None, command + '\r')
>
> def exec_command_and_wait_for_pattern(test, command,
> - success_message, failure_message=None):
> + success_message, failure_message=None,
> + vm=None):
> """
> Send a command to a console (appending CRLF characters), then wait
> for success_message to appear on the console, while logging the.
> @@ -184,9 +185,11 @@ def exec_command_and_wait_for_pattern(test, command,
> :param command: the command to send
> :param success_message: if this message appears, test succeeds
> :param failure_message: if this message appears, test fails
> + :param vm: the VM to use (defaults to test.vm if None)
> """
> assert success_message
> - _console_interaction(test, success_message, failure_message, command + '\r')
> + _console_interaction(test, success_message, failure_message, command + '\r',
> + vm=vm)
>
> def get_qemu_img(test):
> test.log.debug('Looking for and selecting a qemu-img binary')
Reviewed-by: Thomas Huth <thuth@redhat.com>
* [PATCH 33/33] tests/functional: add test_x86_64_vhost_user_blk_fd_migration.py
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (31 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 32/33] test/functional: exec_command_and_wait_for_pattern: add vm arg Vladimir Sementsov-Ogievskiy
@ 2025-08-13 16:48 ` Vladimir Sementsov-Ogievskiy
2025-10-09 19:16 ` [PATCH 00/33] vhost-user-blk: live-backend local migration Raphael Norwitz
33 siblings, 0 replies; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-08-13 16:48 UTC (permalink / raw)
To: mst, peterx, farosas, raphael
Cc: sgarzare, marcandre.lureau, pbonzini, kwolf, hreitz, berrange,
eblake, armbru, qemu-devel, qemu-block, steven.sistare,
den-plotnikov, vsementsov
Introduce a simple test to check that local migration of vhost-user-blk
device with passing open fds through unix socket works, and the disk
is still working on target.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
...test_x86_64_vhost_user_blk_fd_migration.py | 279 ++++++++++++++++++
1 file changed, 279 insertions(+)
create mode 100644 tests/functional/test_x86_64_vhost_user_blk_fd_migration.py
diff --git a/tests/functional/test_x86_64_vhost_user_blk_fd_migration.py b/tests/functional/test_x86_64_vhost_user_blk_fd_migration.py
new file mode 100644
index 0000000000..7ab3f61a5b
--- /dev/null
+++ b/tests/functional/test_x86_64_vhost_user_blk_fd_migration.py
@@ -0,0 +1,279 @@
+#!/usr/bin/env python3
+#
+# Functional test that tests vhost-user-blk local migration
+# with fd passing
+#
+# Copyright (c) Yandex
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+import os
+import time
+import subprocess
+
+from qemu_test import (
+ LinuxKernelTest,
+ Asset,
+ exec_command_and_wait_for_pattern,
+)
+
+
+def wait_migration_finish(source_vm, target_vm):
+ migr_events = (
+ ("MIGRATION", {"data": {"status": "completed"}}),
+ ("MIGRATION", {"data": {"status": "failed"}}),
+ )
+
+ source_e = source_vm.events_wait(migr_events)["data"]
+ target_e = target_vm.events_wait(migr_events)["data"]
+
+ source_s = source_vm.cmd("query-status")["status"]
+ target_s = target_vm.cmd("query-status")["status"]
+
+ assert (
+ source_e["status"] == "completed"
+ and target_e["status"] == "completed"
+ and source_s == "postmigrate"
+ and target_s == "paused"
+ ), f"""Migration failed:
+ SRC status: {source_s}
+ SRC event: {source_e}
+ TGT status: {target_s}
+ TGT event:{target_e}"""
+
+
+class VhostUserBlkFdMigration(LinuxKernelTest):
+
+ ASSET_KERNEL = Asset(
+ (
+ "https://archives.fedoraproject.org/pub/archive/fedora/linux/releases"
+ "/31/Server/x86_64/os/images/pxeboot/vmlinuz"
+ ),
+ "d4738d03dbbe083ca610d0821d0a8f1488bebbdccef54ce33e3adb35fda00129",
+ )
+
+ ASSET_INITRD = Asset(
+ (
+ "https://archives.fedoraproject.org/pub/archive/fedora/linux/releases"
+ "/31/Server/x86_64/os/images/pxeboot/initrd.img"
+ ),
+ "277cd6c7adf77c7e63d73bbb2cded8ef9e2d3a2f100000e92ff1f8396513cd8b",
+ )
+
+ DATA1 = "TEST_DATA_BEFORE_MIGRATION_12345"
+ DATA2 = "TEST_DATA_AFTER_MIGRATION_54321"
+
+ def write_data(self, data, vm) -> None:
+ exec_command_and_wait_for_pattern(
+ self,
+ f'echo "{data}" | ' "dd of=/dev/vda bs=512 count=1 oflag=direct",
+ "# ",
+ vm=vm,
+ )
+
+ def read_data(self, data, vm) -> None:
+ exec_command_and_wait_for_pattern(
+ self,
+ "dd if=/dev/vda bs=512 count=1 iflag=direct 2>/dev/null",
+ data,
+ vm=vm,
+ )
+
+ def setUp(self):
+ super().setUp()
+ self.vhost_proc = None
+
+ def tearDown(self):
+ # Cleanup vhost-user server process
+ if self.vhost_proc:
+ try:
+ self.vhost_proc.terminate()
+ self.vhost_proc.wait(timeout=5)
+ except subprocess.TimeoutExpired:
+ self.vhost_proc.kill()
+ self.vhost_proc.wait()
+ except:
+ pass
+
+ super().tearDown()
+
+ def create_test_image(self):
+ """Create a temporary test image for vhost-user-blk"""
+ img_path = self.scratch_file("disk.img")
+
+ # Create 64MB image
+ with open(img_path, "wb") as f:
+ f.write(b"\0" * (64 * 1024 * 1024))
+
+ return img_path
+
+ def start_vhost_user_server(self, socket_path, img_path):
+ """Start vhost-user-blk server using contrib/vhost-user-blk"""
+ # Find vhost-user-blk binary
+ vub_binary = self.build_file(
+ "contrib", "vhost-user-blk", "vhost-user-blk"
+ )
+
+ if not os.path.isfile(vub_binary) or not os.access(vub_binary, os.X_OK):
+ self.skipTest("vhost-user-blk binary not found")
+
+ # assert that our further waiting would be correct
+ self.assertFalse(os.path.exists(socket_path))
+
+ cmd = [vub_binary, "-s", socket_path, "-b", img_path]
+ self.log.info(f'Starting vhost-user server: {" ".join(cmd)}')
+ self.vhost_proc = subprocess.Popen(
+ cmd, stderr=subprocess.PIPE, text=True, preexec_fn=os.setsid
+ )
+
+ # Wait for socket to be created
+ for _ in range(100): # 10 seconds timeout
+ time.sleep(0.1)
+
+ # Check if process is still running
+ if self.vhost_proc.poll() is not None:
+ self.fail(f"vhost-user server failed: {self.vhost_proc.stderr}")
+
+ if os.path.exists(socket_path):
+ return
+
+ self.fail(f"vhost-user socket {socket_path} was not created")
+
+ def setup_shared_memory(self):
+ shm_path = f"/dev/shm/qemu_test_{os.getpid()}"
+
+ try:
+ with open(shm_path, "wb") as f:
+ f.write(b"\0" * (1024 * 1024 * 1024)) # 1GB
+ except Exception as e:
+ self.fail(f"Failed to create shared memory file: {e}")
+
+ return shm_path
+
+ def prepare_and_launch_vm(
+ self, shm_path, vhost_socket, incoming=False, vm=None
+ ):
+ if not vm:
+ vm = self.vm
+
+ vm.add_args("-accel", "kvm")
+ vm.add_args("-device", "pcie-pci-bridge,id=pci.1,bus=pcie.0")
+ vm.add_args("-m", "1G")
+ vm.add_args("-append", "console=ttyS0 rd.rescue")
+
+ vm.add_args(
+ "-object",
+ f"memory-backend-file,id=ram0,size=1G,mem-path={shm_path},share=on",
+ )
+ vm.add_args("-machine", "memory-backend=ram0")
+
+ vm.add_args("-kernel", self.ASSET_KERNEL.fetch())
+ vm.add_args("-initrd", self.ASSET_INITRD.fetch())
+
+ vm.add_args("-S")
+
+ if incoming:
+ vm.add_args("-incoming", "defer")
+
+ vm.set_console()
+
+ vm_s = "target" if incoming else "source"
+ self.log.info(f"Launching {vm_s} VM")
+ vm.launch()
+
+ self.set_migration_capabilities(vm)
+ self.add_vhost_user_blk_device(vm, vhost_socket, incoming)
+
+ def add_vhost_user_blk_device(self, vm, socket_path, incoming=False):
+ # Add chardev
+ chardev_params = {
+ "id": "chardev-virtio-disk0",
+ "backend": {
+ "type": "socket",
+ "data": {
+ "addr": {"type": "unix", "data": {"path": socket_path}},
+ "server": False,
+ "reconnect-ms": 20,
+ "support-local-migration": True,
+ },
+ },
+ }
+
+ if incoming:
+ chardev_params["backend"]["data"]["local-incoming"] = True
+
+ vm.cmd("chardev-add", chardev_params)
+
+ # Add device
+ device_params = {
+ "id": "virtio-disk0",
+ "driver": "vhost-user-blk-pci",
+ "chardev": "chardev-virtio-disk0",
+ "num-queues": 1,
+ "bus": "pci.1",
+ "config-wce": False,
+ "bootindex": 1,
+ "disable-legacy": "off",
+ }
+
+ if incoming:
+ device_params["local-incoming"] = True
+
+ vm.cmd("device_add", device_params)
+
+ def set_migration_capabilities(self, vm):
+ capabilities = [
+ {"capability": "events", "state": True},
+ {"capability": "x-ignore-shared", "state": True},
+ {"capability": "local-vhost-user-blk", "state": True},
+ {"capability": "local-char-socket", "state": True},
+ ]
+ vm.cmd("migrate-set-capabilities", {"capabilities": capabilities})
+
+ def test_vhost_user_blk_fd_migration(self):
+ self.require_accelerator("kvm")
+ self.set_machine("q35")
+
+ socket_dir = self.socket_dir()
+ vhost_socket = os.path.join(socket_dir.name, "vhost-user-blk.sock")
+ migration_socket = os.path.join(socket_dir.name, "migration.sock")
+
+ img_path = self.create_test_image()
+ shm_path = self.setup_shared_memory()
+
+ self.start_vhost_user_server(vhost_socket, img_path)
+
+ self.prepare_and_launch_vm(shm_path, vhost_socket)
+ self.vm.cmd("cont")
+ self.wait_for_console_pattern("Entering emergency mode.")
+ self.wait_for_console_pattern("# ")
+
+ self.write_data(self.DATA1, self.vm)
+ self.read_data(self.DATA1, self.vm)
+
+ target_vm = self.get_vm(name="target")
+ self.prepare_and_launch_vm(
+ shm_path, vhost_socket, incoming=True, vm=target_vm
+ )
+
+ target_vm.cmd("migrate-incoming", {"uri": f"unix:{migration_socket}"})
+
+ self.log.info("Starting migration")
+ self.vm.cmd("migrate", {"uri": f"unix:{migration_socket}"})
+
+ self.log.info("Waiting for migration completion")
+ wait_migration_finish(self.vm, target_vm)
+
+ target_vm.cmd("cont")
+ self.vm.shutdown()
+
+ self.log.info("Verifying disk on target VM after migration")
+ self.read_data(self.DATA1, target_vm)
+ self.write_data(self.DATA2, target_vm)
+ self.read_data(self.DATA2, target_vm)
+
+ target_vm.shutdown()
+
+
+if __name__ == "__main__":
+ LinuxKernelTest.main()
--
2.48.1
* Re: [PATCH 00/33] vhost-user-blk: live-backend local migration
2025-08-13 16:48 [PATCH 00/33] vhost-user-blk: live-backend local migration Vladimir Sementsov-Ogievskiy
` (32 preceding siblings ...)
2025-08-13 16:48 ` [PATCH 33/33] tests/functional: add test_x86_64_vhost_user_blk_fd_migration.py Vladimir Sementsov-Ogievskiy
@ 2025-10-09 19:16 ` Raphael Norwitz
2025-10-09 22:43 ` Vladimir Sementsov-Ogievskiy
33 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 19:16 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
My apologies for the late review here. I appreciate the need to work
around these issues but I do feel the approach complicates Qemu
significantly and it may be possible to achieve similar results
managing state inside the backend. More comments inline.
I like a lot of the cleanups here - maybe consider breaking out a
series with some of the cleanups?
On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> Hi all!
>
> Local migration of vhost-user-blk requires non-trivial actions
> from management layer, it should provide a new connection for new
> QEMU process and handle disk operation movement from one connection
> to another.
>
> Such switching, including reinitialization of vhost-user connection,
> draining disk requests, etc, adds significant value to local migration
> downtime.
I see how draining IO requests adds downtime and is impactful. That
said, we need to start-stop the device anyways so I'm not convinced
that setting up mappings and sending messages back and forth are
impactful enough to warrant adding a whole new migration mode. Am I
missing anything here?
>
> This all leads to an idea: why not to just pass all we need from
> old QEMU process to the new one (including open file descriptors),
> and don't touch the backend at all? This way, the vhost user backend
> server will not even know, that QEMU process is changed, as live
> vhost-user connection is migrated.
Alternatively, if it really is about avoiding IO draining, what if
Qemu advertised a new vhost-user protocol feature which would query
whether the backend already has state for the device? Then, if the
backend indicates that it does, Qemu and the backend can take a
different path in vhost-user, exchanging relevant information,
including the descriptor indexes for the VQs such that draining can be
avoided. I expect that could be implemented to cut down a lot of the
other vhost-user overhead anyways (i.e. you could skip setting the
memory table). If nothing else it would probably help other device
types take advantage of this without adding more options to Qemu.
Thoughts?
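As a rough model of that proposal (the feature bit and names here are made up for illustration, not part of the vhost-user protocol):

```python
# Made-up protocol feature bit: the backend advertises that it still
# holds live state for the device, letting QEMU skip full re-setup.
HAS_DEVICE_STATE = 1 << 0

def plan_restore(qemu_features, backend_features):
    if qemu_features & backend_features & HAS_DEVICE_STATE:
        # Both sides agree: exchange VQ descriptor indexes, skip the
        # memory table, and avoid draining in-flight requests.
        return "resume-from-backend-state"
    return "full-reinit"

assert plan_restore(HAS_DEVICE_STATE, HAS_DEVICE_STATE) == "resume-from-backend-state"
assert plan_restore(HAS_DEVICE_STATE, 0) == "full-reinit"
```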
>
> So this series realize the idea. No requests are done to backend
> during migration, instead all backend-related state and all related
> file descriptors (vhost-user connection, guest/host notifiers,
> inflight region) are passed to new process. Of course, migration
> should go through unix socket.
>
> The most of the series are refactoring patches. The core feature is
> spread between 24, 28-31 patches.
>
> Why not CPR-transfer?
>
> 1. In the new mode of local migration we need to pass not only
> file descriptors, but additional parts of backend-related state,
> which we don't want (or even can't) reinitialize in target process.
> And it's a lot simpler to add new fields to common migration stream.
> And why not to pass fds in the same stream?
>
> 2. No benefit of vhost-user connection fd passed to target in early
> stage before device creation: we can't use it together with source
> QEMU process anyway. So, we need a moment, when source qemu stops using
> the fd, and target start doing it. And native place for this moment is
> usual save/load of the device in migration process. And yes, we have to
> deeply update initialization/starting of the device to not reinitialize
> the backend, but just continue to work with it in a new QEMU process.
>
> 3. So, if we can't actually use fd, passed early before device creation,
> no reason to care about:
> - non-working QMP connection on target until "migrate" command on source
> - additional migration channel
> - implementing code to pass additional non-fd fields together with fds in CPR
>
> However, the series doesn't conflict with CPR-transfer, as it's actually
> a usual migration with some additional capabilities. The only
> requirement is that main migration channel should be a unix socket.
>
> Vladimir Sementsov-Ogievskiy (33):
> vhost: introduce vhost_ops->vhost_set_vring_enable_supported method
> vhost: drop backend_features field
> vhost-user: introduce vhost_user_has_prot() helper
> vhost: move protocol_features to vhost_user
> vhost-user-gpu: drop code duplication
> vhost: make vhost_dev.features private
> virtio: move common part of _set_guest_notifier to generic code
> virtio: drop *_set_guest_notifier_fd_handler() helpers
> vhost-user: keep QIOChannelSocket for backend channel
> vhost: vhost_virtqueue_start(): fix failure path
> vhost: make vhost_memory_unmap() null-safe
> vhost: simplify calls to vhost_memory_unmap()
> vhost: move vrings mapping to the top of vhost_virtqueue_start()
> vhost: vhost_virtqueue_start(): drop extra local variables
> vhost: final refactoring of vhost vrings map/unmap
> vhost: simplify vhost_dev_init() error-path
> vhost: move busyloop timeout initialization to vhost_virtqueue_init()
> vhost: introduce check_memslots() helper
> vhost: vhost_dev_init(): drop extra features variable
> hw/virtio/virtio-bus: refactor virtio_bus_set_host_notifier()
> vhost-user: make trace events more readable
> vhost-user-blk: add some useful trace-points
> vhost: add some useful trace-points
> chardev-add: support local migration
> virtio: introduce .skip_vhost_migration_log() handler
> io/channel-socket: introduce qio_channel_socket_keep_nonblock()
> migration/socket: keep fds non-block
> vhost: introduce backend migration
> vhost-user: support backend migration
> virtio: support vhost backend migration
> vhost-user-blk: support vhost backend migration
> test/functional: exec_command_and_wait_for_pattern: add vm arg
> tests/functional: add test_x86_64_vhost_user_blk_fd_migration.py
>
> backends/cryptodev-vhost.c | 1 -
> chardev/char-socket.c | 101 +++-
> hw/block/trace-events | 10 +
> hw/block/vhost-user-blk.c | 201 ++++++--
> hw/display/vhost-user-gpu.c | 11 +-
> hw/net/vhost_net.c | 27 +-
> hw/scsi/vhost-scsi.c | 1 -
> hw/scsi/vhost-user-scsi.c | 1 -
> hw/virtio/trace-events | 12 +-
> hw/virtio/vdpa-dev.c | 3 +-
> hw/virtio/vhost-user-base.c | 8 +-
> hw/virtio/vhost-user.c | 326 +++++++++---
> hw/virtio/vhost.c | 474 ++++++++++++------
> hw/virtio/virtio-bus.c | 20 +-
> hw/virtio/virtio-hmp-cmds.c | 2 -
> hw/virtio/virtio-mmio.c | 41 +-
> hw/virtio/virtio-pci.c | 34 +-
> hw/virtio/virtio-qmp.c | 10 +-
> hw/virtio/virtio.c | 120 ++++-
> include/chardev/char-socket.h | 3 +
> include/hw/virtio/vhost-backend.h | 10 +
> include/hw/virtio/vhost-user-blk.h | 2 +
> include/hw/virtio/vhost.h | 42 +-
> include/hw/virtio/virtio-pci.h | 3 -
> include/hw/virtio/virtio.h | 11 +-
> include/io/channel-socket.h | 3 +
> io/channel-socket.c | 16 +-
> migration/options.c | 14 +
> migration/options.h | 2 +
> migration/socket.c | 1 +
> net/vhost-vdpa.c | 7 +-
> qapi/char.json | 16 +-
> qapi/migration.json | 19 +-
> qapi/virtio.json | 3 -
> stubs/meson.build | 1 +
> stubs/qemu_file.c | 15 +
> stubs/vmstate.c | 6 +
> tests/functional/qemu_test/cmd.py | 7 +-
> ...test_x86_64_vhost_user_blk_fd_migration.py | 279 +++++++++++
> tests/qtest/meson.build | 2 +-
> tests/unit/meson.build | 4 +-
> 41 files changed, 1420 insertions(+), 449 deletions(-)
> create mode 100644 stubs/qemu_file.c
> create mode 100644 tests/functional/test_x86_64_vhost_user_blk_fd_migration.py
>
> --
> 2.48.1
>
>
* Re: [PATCH 00/33] vhost-user-blk: live-backend local migration
2025-10-09 19:16 ` [PATCH 00/33] vhost-user-blk: live-backend local migration Raphael Norwitz
@ 2025-10-09 22:43 ` Vladimir Sementsov-Ogievskiy
2025-10-09 23:28 ` Raphael Norwitz
0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-09 22:43 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 09.10.25 22:16, Raphael Norwitz wrote:
> My apologies for the late review here. I appreciate the need to work
> around these issues but I do feel the approach complicates Qemu
> significantly and it may be possible to achieve similar results
> managing state inside the backend. More comments inline.
>
> I like a lot of the cleanups here - maybe consider breaking out a
> series with some of the cleanups?
Of course, I thought about that too.
>
> On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
>>
>> Hi all!
>>
>> Local migration of vhost-user-blk requires non-trivial actions
>> from management layer, it should provide a new connection for new
>> QEMU process and handle disk operation movement from one connection
>> to another.
>>
>> Such switching, including reinitialization of vhost-user connection,
>> draining disk requests, etc, adds significant value to local migration
>> downtime.
>
> I see how draining IO requests adds downtime and is impactful. That
> said, we need to start-stop the device anyways
No, with this series and the new feature enabled we don't have this drain,
see

    if (dev->backend_transfer) {
        return 0;
    }

at the start of do_vhost_virtqueue_stop().
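In other words (a simplified Python model, not the actual C code), the stop path becomes a no-op when the backend travels with the fds:

```python
# Simplified model: with backend_transfer set, virtqueue stop returns
# immediately, so no GET_VRING_BASE round trip or request drain occurs.
class VhostDev:
    def __init__(self, backend_transfer):
        self.backend_transfer = backend_transfer
        self.drained = False

def do_vhost_virtqueue_stop(dev):
    if dev.backend_transfer:
        return 0          # live backend is handed to the target as-is
    dev.drained = True    # stand-in for the normal drain/stop path
    return 0

dev = VhostDev(backend_transfer=True)
do_vhost_virtqueue_stop(dev)
assert not dev.drained

dev = VhostDev(backend_transfer=False)
do_vhost_virtqueue_stop(dev)
assert dev.drained
```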
> so I'm not convinced
> that setting up mappings and sending messages back and forth are
> impactful enough to warrant adding a whole new migration mode. Am I
> missing anything here?
In the management layer we have to manage two endpoints for the remote
disk, and orchestrate a safe switch from one to the other. That's a
complicated and often long procedure, which contributes an
average delay of 0.6 seconds, and (which is worse) ~2.4 seconds
at p99.
Of course, you may say "just rewrite your management layer to
work better" :) But that's not simple, and we came to the idea that
we can do the whole local migration on the QEMU side, not touching
the backend at all.
The main benefit: fewer participants. We don't rely on the management
layer and the vhost-user server to do the proper things for migration.
The backend doesn't even know that QEMU was updated. This makes the
whole process simpler and therefore safer.
The disk service may also be temporarily down at some point, which of course
has a bad effect on live migration and its freeze-time. We avoid this
issue with my series (as we don't communicate with the backend in
any way during migration, and the disk service doesn't need to manage
any endpoint switching).
Note also that my series is not without precedent in QEMU, and is not a
totally new mode.
Steve Sistare has worked on the idea of passing backends through a UNIX
socket; it is now merged as the cpr-transfer and cpr-exec migration modes,
and supports VFIO devices.
So, my work extends this existing concept to vhost-user-blk and virtio-net,
and may be used as part of cpr-transfer / cpr-exec, or on its own.
>
>>
>> This all leads to an idea: why not to just pass all we need from
>> old QEMU process to the new one (including open file descriptors),
>> and don't touch the backend at all? This way, the vhost user backend
>> server will not even know, that QEMU process is changed, as live
>> vhost-user connection is migrated.
>
> Alternatively, if it really is about avoiding IO draining, what if
> Qemu advertised a new vhost-user protocol feature which would query
> whether the backend already has state for the device? Then, if the
> backend indicates that it does, Qemu and the backend can take a
> different path in vhost-user, exchanging relevant information,
> including the descriptor indexes for the VQs such that draining can be
> avoided. I expect that could be implemented to cut down a lot of the
> other vhost-user overhead anyways (i.e. you could skip setting the
> memory table). If nothing else it would probably help other device
> types take advantage of this without adding more options to Qemu.
>
Hmm, if we're talking only about draining, then as I understand it, the
only thing we need is support for migrating the "inflight region". This
is done in the series, and we are also preparing a separate feature to
support migrating the inflight region for remote migration.
But for local migration we want more: remove the disk service from
the process entirely, to have a guaranteed small downtime for live updates,
independent of any problems which may occur on the disk service side.
Why is freeze time more sensitive for live updates than for remote
migration? Because we have to run a lot of live-update operations:
simply updating all the VMs in the cloud to a new version. Remote
migration happens much less frequently: when we need to move all
VMs off a physical server to reboot it (or repair it, service it, etc.).
So, I still believe that migrating backend state through the QEMU migration
stream makes sense in general, and for vhost-user-blk it works well too.
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 00/33] vhost-user-blk: live-backend local migration
2025-10-09 22:43 ` Vladimir Sementsov-Ogievskiy
@ 2025-10-09 23:28 ` Raphael Norwitz
2025-10-10 8:47 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-09 23:28 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Thanks for the detailed response here, it does clear up the intent.
I agree it's much better to keep the management layer from having to
make API calls back and forth to the backend so that the migration
looks like a reconnect from the backend's perspective. I'm not totally
clear on the fundamental reason why the management layer would have to
call out to the backend, as opposed to having the vhost-user code in
the backend figure out that it's a local migration when the new
destination QEMU tries to connect and respond accordingly.
That said, I haven't followed the work here all that closely. If MST
or other maintainers have blessed this as the right way I'm ok with
it.
On Thu, Oct 9, 2025 at 6:43 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> On 09.10.25 22:16, Raphael Norwitz wrote:
> > My apologies for the late review here. I appreciate the need to work
> > around these issues but I do feel the approach complicates Qemu
> > significantly and it may be possible to achieve similar results
> > managing state inside the backend. More comments inline.
> >
> > I like a lot of the cleanups here - maybe consider breaking out a
> > series with some of the cleanups?
>
> Of course, I thought about that too.
>
> >
> > On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
> > <vsementsov@yandex-team.ru> wrote:
> >>
> >> Hi all!
> >>
> >> Local migration of vhost-user-blk requires non-trivial actions
> >> from the management layer: it should provide a new connection for the
> >> new QEMU process and handle moving disk operations from one connection
> >> to another.
> >>
> >> Such switching, including reinitialization of the vhost-user connection,
> >> draining disk requests, etc., adds significantly to local migration
> >> downtime.
> >
> > I see how draining IO requests adds downtime and is impactful. That
> > said, we need to start-stop the device anyways
>
> No, with this series and the new feature enabled we don't have this drain;
> see
>
>     if (dev->backend_transfer) {
>         return 0;
>     }
>
> at start of do_vhost_virtqueue_stop().
>
> > so I'm not convinced
> > that setting up mappings and sending messages back and forth are
> > impactful enough to warrant adding a whole new migration mode. Am I
> > missing anything here?
>
> In the management layer we have to manage two endpoints for the remote
> disk, and orchestrate a safe switch from one to the other. That's a
> complicated and often long procedure, which contributes an
> average delay of 0.6 seconds, and (which is worse) ~2.4 seconds
> at p99.
>
> Of course, you may say "just rewrite your management layer to
> work better" :) But that's not simple, and we came to the idea that
> we can do the whole local migration on the QEMU side, without
> touching the backend at all.
>
> The main benefit: fewer participants. We don't rely on the management layer
> and the vhost-user server to do the right things for migration. The backend
> doesn't even know that QEMU has been updated. This makes the whole process
> simpler and therefore safer.
>
> The disk service may also be temporarily down, which of course has
> a bad effect on live migration and its freeze time. My series avoids
> this issue, as we don't communicate with the backend in any way
> during migration, and the disk service doesn't have to manage any
> endpoint switching.
>
> Note also that my series is not unprecedented in QEMU, nor a totally new
> mode.
>
> Steve Sistare has been working on the idea of passing backends through a
> UNIX socket; it is now merged as the cpr-transfer and cpr-exec migration
> modes and supports VFIO devices.
>
> So my work applies this existing concept to vhost-user-blk and virtio-net,
> and may be used as part of cpr-transfer / cpr-exec, or on its own.
>
> >
> >>
> >> This all leads to an idea: why not just pass everything we need from
> >> the old QEMU process to the new one (including open file descriptors),
> >> and not touch the backend at all? This way, the vhost-user backend
> >> server will not even know that the QEMU process has changed, as the
> >> live vhost-user connection is migrated.
> >
> > Alternatively, if it really is about avoiding IO draining, what if
> > Qemu advertised a new vhost-user protocol feature which would query
> > whether the backend already has state for the device? Then, if the
> > backend indicates that it does, Qemu and the backend can take a
> > different path in vhost-user, exchanging relevant information,
> > including the descriptor indexes for the VQs such that draining can be
> > avoided. I expect that could be implemented to cut down a lot of the
> > other vhost-user overhead anyways (i.e. you could skip setting the
> > memory table). If nothing else it would probably help other device
> > types take advantage of this without adding more options to Qemu.
> >
>
> Hmm, if we're talking only about draining, then as I understand it, the
> only thing we need is support for migrating the "inflight region". This
> is done in the series, and we are also preparing a separate feature to
> support migrating the inflight region for remote migration.
>
> But for local migration we want more: remove the disk service from
> the process entirely, to have a guaranteed small downtime for live updates,
> independent of any problems which may occur on the disk service side.
>
> Why is freeze time more sensitive for live updates than for remote
> migration? Because we have to run a lot of live-update operations:
> simply updating all the VMs in the cloud to a new version. Remote
> migration happens much less frequently: when we need to move all
> VMs off a physical server to reboot it (or repair it, service it, etc.).
>
> So, I still believe that migrating backend state through the QEMU migration
> stream makes sense in general, and for vhost-user-blk it works well too.
>
>
> --
> Best regards,
> Vladimir
* Re: [PATCH 00/33] vhost-user-blk: live-backend local migration
2025-10-09 23:28 ` Raphael Norwitz
@ 2025-10-10 8:47 ` Vladimir Sementsov-Ogievskiy
2025-10-13 21:41 ` Raphael Norwitz
0 siblings, 1 reply; 108+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-10 8:47 UTC (permalink / raw)
To: Raphael Norwitz
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
On 10.10.25 02:28, Raphael Norwitz wrote:
> Thanks for the detailed response here, it does clear up the intent.
>
> I agree it's much better to keep the management layer from having to
> make API calls back and forth to the backend so that the migration
> looks like a reconnect from the backend's perspective. I'm not totally
> clear on the fundamental reason why the management layer would have to
> call out to the backend, as opposed to having the vhost-user code in
> the backend figure out that it's a local migration when the new
> destination QEMU tries to connect and respond accordingly.
>
Handling this in vhost-user-server without the management layer would
actually mean handling two connections in parallel. This doesn't seem
to fit well into the vhost-user protocol.
However, we already have this support (as we have live update for VMs
with vhost-user-blk) in the disk service by accepting a new connection
on an additional Unix socket servicing the same disk but in readonly
mode until the initial connection terminates. The problem isn't with
the separate socket itself, but with safely switching the disk backend
from one connection to another. We would have to perform this switch
regardless, even if we managed both connections within the context of a
single server or a single Unix socket. The only difference is that this
way, we might avoid communication from the management layer to the disk
service. Instead of saying, "Hey, disk service, we're going to migrate
this QEMU - prepare for an endpoint switch," we'd just proceed with the
migration, and the disk service would detect it when it sees a second
connection to the Unix socket.
But this extra communication isn't the real issue. The real challenge
is that we still have to switch between connections on the backend
side. And we have to account for the possible temporary unavailability
of the disk service (the migration freeze time would just include this
period of unavailability).
With this series, we're saying: "Hold on. We already have everything
working and set up—the backend is ready, the dataplane is out of QEMU,
and the control plane isn't doing anything. And we're migrating to the
same host. Why not just keep everything as is? Just pass the file
descriptors to the new QEMU process and continue execution."
This way, we make the QEMU live-update operation independent of the
disk service's lifecycle, which improves reliability. And we maintain
only one connection instead of two, making the model simpler.
This doesn't even account for the extra time spent reconfiguring the
connection. Setting up mappings isn't free and becomes more costly for
large VMs (with significant RAM), when using hugetlbfs, or when the
system is under memory pressure.
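As a rough, non-QEMU illustration of that cost: after a new memory table is
received (VHOST_USER_SET_MEM_TABLE), the backend must mmap() every guest
RAM region from the fds it gets, and faulting those pages in again takes
time that grows with guest size. A small Python sketch of the page-fault
cost on a fresh anonymous mapping (sizes here are arbitrary):

```python
# Rough illustration (not QEMU or backend code): touching every page of
# a freshly created anonymous mapping forces page faults, and the time
# grows with the mapping size. A vhost-user backend does comparable work
# when it maps the guest memory regions of a new memory table.
import mmap
import time

REGION_SIZE = 64 * 1024 * 1024   # pretend guest RAM region, 64 MiB
PAGE = 4096

region = mmap.mmap(-1, REGION_SIZE)   # fresh anonymous private mapping

start = time.monotonic()
for off in range(0, REGION_SIZE, PAGE):
    region[off] = 1                   # write fault: allocate the page
elapsed = time.monotonic() - start

print(f"faulted in {REGION_SIZE >> 20} MiB in {elapsed:.4f} s")
region.close()
```

With hugetlbfs the per-page work is larger, and under memory pressure each fault may additionally require reclaim, which is exactly the overhead that keeping the existing mappings alive avoids.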
> That said, I haven't followed the work here all that closely. If MST
> or other maintainers have blessed this as the right way I'm ok with
> it.
>
--
Best regards,
Vladimir
* Re: [PATCH 00/33] vhost-user-blk: live-backend local migration
2025-10-10 8:47 ` Vladimir Sementsov-Ogievskiy
@ 2025-10-13 21:41 ` Raphael Norwitz
0 siblings, 0 replies; 108+ messages in thread
From: Raphael Norwitz @ 2025-10-13 21:41 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: mst, peterx, farosas, raphael, sgarzare, marcandre.lureau,
pbonzini, kwolf, hreitz, berrange, eblake, armbru, qemu-devel,
qemu-block, steven.sistare, den-plotnikov
Thanks for the extensive follow up here. I was hoping there would be
some way to move more of the logic into all vhost-user generic code
both to help other backends support local migration more easily and
have fewer "if backend is doing a local migration" checks in
vhost-user-blk code. As a straw man design, I would think there could
be some way of having the backend coordinate a handoff by signaling
the source Qemu and then the source Qemu could stop the device and ACK
with a message before the destination Qemu is allowed to start the
device.
Anyways, it seems like other maintainers have blessed this approach so
I'll leave it at that.
On Fri, Oct 10, 2025 at 4:47 AM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> On 10.10.25 02:28, Raphael Norwitz wrote:
> > Thanks for the detailed response here, it does clear up the intent.
> >
> > I agree it's much better to keep the management layer from having to
> > make API calls back and forth to the backend so that the migration
> > looks like a reconnect from the backend's perspective. I'm not totally
> > clear on the fundamental reason why the management layer would have to
> > call out to the backend, as opposed to having the vhost-user code in
> > the backend figure out that it's a local migration when the new
> > destination QEMU tries to connect and respond accordingly.
> >
>
> Handling this in vhost-user-server without the management layer would
> actually mean handling two connections in parallel. This doesn't seem
> to fit well into the vhost-user protocol.
>
> However, we already have this support (as we have live update for VMs
> with vhost-user-blk) in the disk service by accepting a new connection
> on an additional Unix socket servicing the same disk but in readonly
> mode until the initial connection terminates. The problem isn't with
> the separate socket itself, but with safely switching the disk backend
> from one connection to another. We would have to perform this switch
> regardless, even if we managed both connections within the context of a
> single server or a single Unix socket. The only difference is that this
> way, we might avoid communication from the management layer to the disk
> service. Instead of saying, "Hey, disk service, we're going to migrate
> this QEMU - prepare for an endpoint switch," we'd just proceed with the
> migration, and the disk service would detect it when it sees a second
> connection to the Unix socket.
>
> But this extra communication isn't the real issue. The real challenge
> is that we still have to switch between connections on the backend
> side. And we have to account for the possible temporary unavailability
> of the disk service (the migration freeze time would just include this
> period of unavailability).
>
> With this series, we're saying: "Hold on. We already have everything
> working and set up—the backend is ready, the dataplane is out of QEMU,
> and the control plane isn't doing anything. And we're migrating to the
> same host. Why not just keep everything as is? Just pass the file
> descriptors to the new QEMU process and continue execution."
>
> This way, we make the QEMU live-update operation independent of the
> disk service's lifecycle, which improves reliability. And we maintain
> only one connection instead of two, making the model simpler.
>
> This doesn't even account for the extra time spent reconfiguring the
> connection. Setting up mappings isn't free and becomes more costly for
> large VMs (with significant RAM), when using hugetlbfs, or when the
> system is under memory pressure.
>
>
> > That said, I haven't followed the work here all that closely. If MST
> > or other maintainers have blessed this as the right way I'm ok with
> > it.
> >
>
>
>
> --
> Best regards,
> Vladimir