* [PATCH 0/6] Add queue ready message to VDUSE
@ 2026-01-28 12:45 Eugenio Pérez
2026-01-28 12:45 ` [PATCH 1/6] vduse: ensure vq->ready access is smp safe Eugenio Pérez
` (5 more replies)
0 siblings, 6 replies; 33+ messages in thread
From: Eugenio Pérez @ 2026-01-28 12:45 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Jason Wang, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
Eugenio Pérez, virtualization
This series introduces a new VDUSE message for VDUSE userland instance to
detect when a VirtQueue (VQ) is enabled, replacing the polling.
VirtIO net devices' dataplane is started after the control virtqueue so QEMU
can apply the configuration in the destination of a Live Migration. Without
this feature, the VDUSE instance must poll the VQs to check when (and if) a VQ
has been enabled.
This series also implements VDUSE feature flags allowing the VDUSE devices to
opt-in to the VQ ready message. Devices that opt-in to this feature will
receive explicit notifications when a VQ is ready. Devices that do not set this
flag remain unaffected, ensuring backward compatibility without indefinitely
incrementing API versions.
The VDUSE features is a 64 bit bitmap for simplicity, the same way as vhost and
vhost-net started. It can be extended as a flexible array of bits when we
reach so many features, but it seems unlikely at this point.
This series depends on
https://lore.kernel.org/lkml/20260119143306.1818855-1-eperezma@redhat.com/
Eugenio Pérez (6):
vduse: ensure vq->ready access is smp safe
vduse: store control device pointer
vduse: Add API v2 definition
vduse: add VDUSE_GET_FEATURES ioctl
vduse: add F_QUEUE_READY feature
vduse: advertise API V2 support
drivers/vdpa/vdpa_user/vduse_dev.c | 100 +++++++++++++++++++++++++----
include/uapi/linux/vduse.h | 30 ++++++++-
2 files changed, 115 insertions(+), 15 deletions(-)
--
2.52.0
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-01-28 12:45 [PATCH 0/6] Add queue ready message to VDUSE Eugenio Pérez
@ 2026-01-28 12:45 ` Eugenio Pérez
2026-01-29 1:16 ` Jason Wang
2026-01-28 12:45 ` [PATCH 2/6] vduse: store control device pointer Eugenio Pérez
` (4 subsequent siblings)
5 siblings, 1 reply; 33+ messages in thread
From: Eugenio Pérez @ 2026-01-28 12:45 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Jason Wang, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
Eugenio Pérez, virtualization
The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
well after initial setup, and the device can read it afterwards.
Ensure that reads and writes to vq->ready are SMP safe so that the
caller can trust that virtqueue kicks and calls behave as expected
immediately after the operation returns.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
1 file changed, 26 insertions(+), 8 deletions(-)
diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index 73d1d517dc6c..a4963aaf9332 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
return mask;
}
+static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
+{
+ /*
+ * Paired with vduse_vq_set_ready smp_store, as the driver may modify
+ * it while the VDUSE instance is reading it.
+ */
+ return smp_load_acquire(&vq->ready);
+}
+
+static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
+{
+ /*
+ * Paired with vduse_vq_get_ready smp_load, as the driver may modify
+ * it while the VDUSE instance is reading it.
+ */
+ smp_store_release(&vq->ready, ready);
+}
+
static void vduse_dev_reset(struct vduse_dev *dev)
{
int i;
@@ -486,7 +504,7 @@ static void vduse_dev_reset(struct vduse_dev *dev)
for (i = 0; i < dev->vq_num; i++) {
struct vduse_virtqueue *vq = dev->vqs[i];
- vq->ready = false;
+ vduse_vq_set_ready(vq, false);
vq->desc_addr = 0;
vq->driver_addr = 0;
vq->device_addr = 0;
@@ -529,7 +547,7 @@ static int vduse_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 idx,
static void vduse_vq_kick(struct vduse_virtqueue *vq)
{
spin_lock(&vq->kick_lock);
- if (!vq->ready)
+ if (!vduse_vq_get_ready(vq))
goto unlock;
if (vq->kickfd)
@@ -598,7 +616,7 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
struct vduse_virtqueue *vq = dev->vqs[idx];
- vq->ready = ready;
+ vduse_vq_set_ready(vq, ready);
}
static bool vduse_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 idx)
@@ -606,7 +624,7 @@ static bool vduse_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 idx)
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
struct vduse_virtqueue *vq = dev->vqs[idx];
- return vq->ready;
+ return vduse_vq_get_ready(vq);
}
static int vduse_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 idx,
@@ -1097,7 +1115,7 @@ static int vduse_kickfd_setup(struct vduse_dev *dev,
if (vq->kickfd)
eventfd_ctx_put(vq->kickfd);
vq->kickfd = ctx;
- if (vq->ready && vq->kicked && vq->kickfd) {
+ if (vduse_vq_get_ready(vq) && vq->kicked && vq->kickfd) {
eventfd_signal(vq->kickfd);
vq->kicked = false;
}
@@ -1133,7 +1151,7 @@ static void vduse_vq_irq_inject(struct work_struct *work)
struct vduse_virtqueue, inject);
spin_lock_bh(&vq->irq_lock);
- if (vq->ready && vq->cb.callback)
+ if (vduse_vq_get_ready(vq) && vq->cb.callback)
vq->cb.callback(vq->cb.private);
spin_unlock_bh(&vq->irq_lock);
}
@@ -1146,7 +1164,7 @@ static bool vduse_vq_signal_irqfd(struct vduse_virtqueue *vq)
return false;
spin_lock_irq(&vq->irq_lock);
- if (vq->ready && vq->cb.trigger) {
+ if (vduse_vq_get_ready(vq) && vq->cb.trigger) {
eventfd_signal(vq->cb.trigger);
signal = true;
}
@@ -1500,7 +1518,7 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
vq_info.split.avail_index =
vq->state.split.avail_index;
- vq_info.ready = vq->ready;
+ vq_info.ready = vduse_vq_get_ready(vq);
ret = -EFAULT;
if (copy_to_user(argp, &vq_info, sizeof(vq_info)))
--
2.52.0
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 2/6] vduse: store control device pointer
2026-01-28 12:45 [PATCH 0/6] Add queue ready message to VDUSE Eugenio Pérez
2026-01-28 12:45 ` [PATCH 1/6] vduse: ensure vq->ready access is smp safe Eugenio Pérez
@ 2026-01-28 12:45 ` Eugenio Pérez
2026-01-28 12:45 ` [PATCH 3/6] vduse: Add API v2 definition Eugenio Pérez
` (3 subsequent siblings)
5 siblings, 0 replies; 33+ messages in thread
From: Eugenio Pérez @ 2026-01-28 12:45 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Jason Wang, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
Eugenio Pérez, virtualization
This helps log the errors in next patches. The alternative is to
perform a linear search for it with class_find_device_by_devt(class, devt),
as device_destroy do for cleaning.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
drivers/vdpa/vdpa_user/vduse_dev.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index a4963aaf9332..551ccde0b856 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -164,6 +164,7 @@ static DEFINE_IDR(vduse_idr);
static dev_t vduse_major;
static struct cdev vduse_ctrl_cdev;
+static const struct device *vduse_ctrl_dev;
static struct cdev vduse_cdev;
static struct workqueue_struct *vduse_irq_wq;
static struct workqueue_struct *vduse_irq_bound_wq;
@@ -2426,7 +2427,6 @@ static void vduse_mgmtdev_exit(void)
static int vduse_init(void)
{
int ret;
- struct device *dev;
ret = class_register(&vduse_class);
if (ret)
@@ -2443,9 +2443,9 @@ static int vduse_init(void)
if (ret)
goto err_ctrl_cdev;
- dev = device_create(&vduse_class, NULL, vduse_major, NULL, "control");
- if (IS_ERR(dev)) {
- ret = PTR_ERR(dev);
+ vduse_ctrl_dev = device_create(&vduse_class, NULL, vduse_major, NULL, "control");
+ if (IS_ERR(vduse_ctrl_dev)) {
+ ret = PTR_ERR(vduse_ctrl_dev);
goto err_device;
}
--
2.52.0
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 3/6] vduse: Add API v2 definition
2026-01-28 12:45 [PATCH 0/6] Add queue ready message to VDUSE Eugenio Pérez
2026-01-28 12:45 ` [PATCH 1/6] vduse: ensure vq->ready access is smp safe Eugenio Pérez
2026-01-28 12:45 ` [PATCH 2/6] vduse: store control device pointer Eugenio Pérez
@ 2026-01-28 12:45 ` Eugenio Pérez
2026-01-29 2:00 ` Jason Wang
2026-01-28 12:45 ` [PATCH 4/6] vduse: add VDUSE_GET_FEATURES ioctl Eugenio Pérez
` (2 subsequent siblings)
5 siblings, 1 reply; 33+ messages in thread
From: Eugenio Pérez @ 2026-01-28 12:45 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Jason Wang, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
Eugenio Pérez, virtualization
Introduce the definition for VDUSE API V2. This version serves as a
gateway for feature negotiation.
The kernel uses this version to determine if the userspace device
supports feature flags. Devices that do not explicitly negotiate API V2
will be blocked from querying available VDUSE features, ensuring
backward compatibility.
The next patches implement the new feature incrementally, only enabling
the VDUSE device to set the V2 API version by the end of the series.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
include/uapi/linux/vduse.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
index 68b4287f9fac..dea89ed281a7 100644
--- a/include/uapi/linux/vduse.h
+++ b/include/uapi/linux/vduse.h
@@ -14,6 +14,10 @@
#define VDUSE_API_VERSION_1 1
+/* Features support */
+
+#define VDUSE_API_VERSION_2 2
+
/*
* Get the version of VDUSE API that kernel supported (VDUSE_API_VERSION).
* This is used for future extension.
--
2.52.0
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 4/6] vduse: add VDUSE_GET_FEATURES ioctl
2026-01-28 12:45 [PATCH 0/6] Add queue ready message to VDUSE Eugenio Pérez
` (2 preceding siblings ...)
2026-01-28 12:45 ` [PATCH 3/6] vduse: Add API v2 definition Eugenio Pérez
@ 2026-01-28 12:45 ` Eugenio Pérez
2026-01-29 2:10 ` Jason Wang
2026-01-28 12:45 ` [PATCH 5/6] vduse: add F_QUEUE_READY feature Eugenio Pérez
2026-01-28 12:45 ` [PATCH 6/6] vduse: advertise API V2 support Eugenio Pérez
5 siblings, 1 reply; 33+ messages in thread
From: Eugenio Pérez @ 2026-01-28 12:45 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Jason Wang, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
Eugenio Pérez, virtualization
Add an ioctl to allow VDUSE instances to query the available features
supported by the kernel module.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
A simple u64 bitmap is used for feature flags. While a flexible array
could support indefinite expansion, 64 bits is sufficient for the
foreseeable future and simplifies the implementation.
Also, dev_dbg is used for logging rather than dev_err as these can be
triggered from userspace.
---
drivers/vdpa/vdpa_user/vduse_dev.c | 28 ++++++++++++++++++++++++++++
include/uapi/linux/vduse.h | 7 ++++++-
2 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index 551ccde0b856..e7da69c2ad71 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -52,6 +52,9 @@
#define IRQ_UNBOUND -1
+/* Supported VDUSE features */
+static const uint64_t vduse_features;
+
/*
* VDUSE instance have not asked the vduse API version, so assume 0.
*
@@ -1977,6 +1980,19 @@ static bool vduse_validate_config(struct vduse_dev_config *config,
sizeof(config->reserved)))
return false;
+ if (api_version < VDUSE_API_VERSION_2) {
+ if (config->vduse_features) {
+ dev_dbg(vduse_ctrl_dev,
+ "config->vduse_features with version %llu",
+ api_version);
+ return false;
+ }
+ } else {
+ if (config->vduse_features & ~vduse_features)
+ return false;
+ }
+
+
if (api_version < VDUSE_API_VERSION_1 &&
(config->ngroups || config->nas))
return false;
@@ -2237,6 +2253,18 @@ static long vduse_ioctl(struct file *file, unsigned int cmd,
ret = vduse_destroy_dev(name);
break;
}
+ case VDUSE_GET_FEATURES:
+ if (control->api_version < VDUSE_API_VERSION_2) {
+ dev_dbg(vduse_ctrl_dev,
+ "VDUSE_GET_FEATURES ioctl with version %llu",
+ control->api_version);
+ ret = -ENOIOCTLCMD;
+ break;
+ }
+
+ ret = put_user(vduse_features, (u64 __user *)argp);
+ break;
+
default:
ret = -EINVAL;
break;
diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
index dea89ed281a7..1f68e556cbf2 100644
--- a/include/uapi/linux/vduse.h
+++ b/include/uapi/linux/vduse.h
@@ -37,6 +37,7 @@
* @vq_align: the allocation alignment of virtqueue's metadata
* @ngroups: number of vq groups that VDUSE device declares
* @nas: number of address spaces that VDUSE device declares
+ * @vduse_features: VDUSE features
* @reserved: for future use, needs to be initialized to zero
* @config_size: the size of the configuration space
* @config: the buffer of the configuration space
@@ -53,7 +54,8 @@ struct vduse_dev_config {
__u32 vq_align;
__u32 ngroups; /* if VDUSE_API_VERSION >= 1 */
__u32 nas; /* if VDUSE_API_VERSION >= 1 */
- __u32 reserved[11];
+ __u64 vduse_features;
+ __u32 reserved[9];
__u32 config_size;
__u8 config[];
};
@@ -67,6 +69,9 @@ struct vduse_dev_config {
*/
#define VDUSE_DESTROY_DEV _IOW(VDUSE_BASE, 0x03, char[VDUSE_NAME_MAX])
+/* Get the VDUSE supported features */
+#define VDUSE_GET_FEATURES _IOR(VDUSE_BASE, 0x04, __u64)
+
/* The ioctls for VDUSE device (/dev/vduse/$NAME) */
/**
--
2.52.0
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-01-28 12:45 [PATCH 0/6] Add queue ready message to VDUSE Eugenio Pérez
` (3 preceding siblings ...)
2026-01-28 12:45 ` [PATCH 4/6] vduse: add VDUSE_GET_FEATURES ioctl Eugenio Pérez
@ 2026-01-28 12:45 ` Eugenio Pérez
2026-01-29 2:12 ` Jason Wang
2026-01-28 12:45 ` [PATCH 6/6] vduse: advertise API V2 support Eugenio Pérez
5 siblings, 1 reply; 33+ messages in thread
From: Eugenio Pérez @ 2026-01-28 12:45 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Jason Wang, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
Eugenio Pérez, virtualization
Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
to explicitly signal userspace when a specific virtqueue has been
enabled.
In scenarios like Live Migration of VirtIO net devices, the dataplane
starts after the control virtqueue allowing QEMU to apply configuration
in the destination device.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
include/uapi/linux/vduse.h | 19 +++++++++++++++++++
2 files changed, 46 insertions(+), 1 deletion(-)
diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index e7da69c2ad71..1d93b540db4d 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -9,6 +9,7 @@
*/
#include "linux/virtio_net.h"
+#include <linux/bits.h>
#include <linux/cleanup.h>
#include <linux/init.h>
#include <linux/module.h>
@@ -53,7 +54,7 @@
#define IRQ_UNBOUND -1
/* Supported VDUSE features */
-static const uint64_t vduse_features;
+static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
/*
* VDUSE instance have not asked the vduse API version, so assume 0.
@@ -120,6 +121,7 @@ struct vduse_dev {
char *name;
struct mutex lock;
spinlock_t msg_lock;
+ u64 vduse_features;
u64 msg_unique;
u32 msg_timeout;
wait_queue_head_t waitq;
@@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
{
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
struct vduse_virtqueue *vq = dev->vqs[idx];
+ struct vduse_dev_msg msg = { 0 };
+ int r;
+
+ if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
+ goto out;
+
+ msg.req.type = VDUSE_SET_VQ_READY;
+ msg.req.vq_ready.num = idx;
+ msg.req.vq_ready.ready = !!ready;
+
+ r = vduse_dev_msg_sync(dev, &msg);
+ if (r < 0) {
+ dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
+ idx, ready);
+
+ /* We can't do better than break the device in this case */
+ spin_lock(&dev->msg_lock);
+ vduse_dev_broken(dev);
+ spin_unlock(&dev->msg_lock);
+ return;
+ }
+
+out:
vduse_vq_set_ready(vq, ready);
}
@@ -2121,6 +2146,7 @@ static int vduse_create_dev(struct vduse_dev_config *config,
dev->device_features = config->features;
dev->device_id = config->device_id;
dev->vendor_id = config->vendor_id;
+ dev->vduse_features = config->vduse_features;
dev->nas = (dev->api_version < VDUSE_API_VERSION_1) ? 1 : config->nas;
dev->as = kcalloc(dev->nas, sizeof(dev->as[0]), GFP_KERNEL);
diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
index 1f68e556cbf2..1e27395692ea 100644
--- a/include/uapi/linux/vduse.h
+++ b/include/uapi/linux/vduse.h
@@ -18,6 +18,9 @@
#define VDUSE_API_VERSION_2 2
+/* The VDUSE instance expects a request for vq ready */
+#define VDUSE_F_QUEUE_READY 0
+
/*
* Get the version of VDUSE API that kernel supported (VDUSE_API_VERSION).
* This is used for future extension.
@@ -330,6 +333,7 @@ enum vduse_req_type {
VDUSE_SET_STATUS,
VDUSE_UPDATE_IOTLB,
VDUSE_SET_VQ_GROUP_ASID,
+ VDUSE_SET_VQ_READY,
};
/**
@@ -376,6 +380,15 @@ struct vduse_iova_range_v2 {
__u32 asid;
};
+/**
+ * struct vduse_vq_ready - Virtqueue ready request message
+ * @num: Virtqueue number
+ */
+struct vduse_vq_ready {
+ __u32 num;
+ __u32 ready;
+};
+
/**
* struct vduse_dev_request - control request
* @type: request type
@@ -386,6 +399,7 @@ struct vduse_iova_range_v2 {
* @iova: IOVA range for updating
* @iova_v2: IOVA range for updating if API_VERSION >= 1
* @vq_group_asid: ASID of a virtqueue group
+ * @vq_ready: Virtqueue ready request
* @padding: padding
*
* Structure used by read(2) on /dev/vduse/$NAME.
@@ -403,6 +417,11 @@ struct vduse_dev_request {
*/
struct vduse_iova_range_v2 iova_v2;
struct vduse_vq_group_asid vq_group_asid;
+ /*
+ * Following members but padding exist only if vduse api
+ * version >= 2
+ */
+ struct vduse_vq_ready vq_ready;
__u32 padding[32];
};
};
--
2.52.0
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 6/6] vduse: advertise API V2 support
2026-01-28 12:45 [PATCH 0/6] Add queue ready message to VDUSE Eugenio Pérez
` (4 preceding siblings ...)
2026-01-28 12:45 ` [PATCH 5/6] vduse: add F_QUEUE_READY feature Eugenio Pérez
@ 2026-01-28 12:45 ` Eugenio Pérez
5 siblings, 0 replies; 33+ messages in thread
From: Eugenio Pérez @ 2026-01-28 12:45 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Jason Wang, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
Eugenio Pérez, virtualization
Advertise VDUSE API V2 support to userspace.
With the necessary infrastructure and feature flags in place, VDUSE
devices can now opt-in to the new API and utilize the
VDUSE_F_QUEUE_READY feature.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
drivers/vdpa/vdpa_user/vduse_dev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index 1d93b540db4d..7b5cae065f8b 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -2224,7 +2224,7 @@ static long vduse_ioctl(struct file *file, unsigned int cmd,
switch (cmd) {
case VDUSE_GET_API_VERSION:
if (control->api_version == VDUSE_API_VERSION_NOT_ASKED)
- control->api_version = VDUSE_API_VERSION_1;
+ control->api_version = VDUSE_API_VERSION_2;
ret = put_user(control->api_version, (u64 __user *)argp);
break;
case VDUSE_SET_API_VERSION: {
@@ -2235,7 +2235,7 @@ static long vduse_ioctl(struct file *file, unsigned int cmd,
break;
ret = -EINVAL;
- if (api_version > VDUSE_API_VERSION_1)
+ if (api_version > VDUSE_API_VERSION_2)
break;
ret = 0;
--
2.52.0
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-01-28 12:45 ` [PATCH 1/6] vduse: ensure vq->ready access is smp safe Eugenio Pérez
@ 2026-01-29 1:16 ` Jason Wang
2026-01-29 6:20 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-01-29 1:16 UTC (permalink / raw)
To: Eugenio Pérez
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> well after initial setup, and the device can read it afterwards.
>
> Ensure that reads and writes to vq->ready are SMP safe so that the
> caller can trust that virtqueue kicks and calls behave as expected
> immediately after the operation returns.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> 1 file changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> index 73d1d517dc6c..a4963aaf9332 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> return mask;
> }
>
> +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> +{
> + /*
> + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> + * it while the VDUSE instance is reading it.
> + */
> + return smp_load_acquire(&vq->ready);
> +}
> +
> +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> +{
> + /*
> + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> + * it while the VDUSE instance is reading it.
> + */
> + smp_store_release(&vq->ready, ready);
Assuming this is not used in the datapath, I wonder if we can simply
use vq_lock mutex.
Thanks
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 3/6] vduse: Add API v2 definition
2026-01-28 12:45 ` [PATCH 3/6] vduse: Add API v2 definition Eugenio Pérez
@ 2026-01-29 2:00 ` Jason Wang
2026-01-29 8:07 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-01-29 2:00 UTC (permalink / raw)
To: Eugenio Pérez
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Introduce the definition for VDUSE API V2. This version serves as a
> gateway for feature negotiation.
>
> The kernel uses this version to determine if the userspace device
> supports feature flags. Devices that do not explicitly negotiate API V2
> will be blocked from querying available VDUSE features, ensuring
> backward compatibility.
>
> The next patches implement the new feature incrementally, only enabling
> the VDUSE device to set the V2 API version by the end of the series.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> include/uapi/linux/vduse.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
> index 68b4287f9fac..dea89ed281a7 100644
> --- a/include/uapi/linux/vduse.h
> +++ b/include/uapi/linux/vduse.h
> @@ -14,6 +14,10 @@
>
> #define VDUSE_API_VERSION_1 1
>
> +/* Features support */
> +
> +#define VDUSE_API_VERSION_2 2
If we can catch the next release cycle, I would perfer not bumping
VDUSE version twice in the same release.
> +
> /*
> * Get the version of VDUSE API that kernel supported (VDUSE_API_VERSION).
> * This is used for future extension.
> --
> 2.52.0
>
Thanks
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 4/6] vduse: add VDUSE_GET_FEATURES ioctl
2026-01-28 12:45 ` [PATCH 4/6] vduse: add VDUSE_GET_FEATURES ioctl Eugenio Pérez
@ 2026-01-29 2:10 ` Jason Wang
2026-01-29 8:03 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-01-29 2:10 UTC (permalink / raw)
To: Eugenio Pérez
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Add an ioctl to allow VDUSE instances to query the available features
> supported by the kernel module.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> A simple u64 bitmap is used for feature flags. While a flexible array
> could support indefinite expansion, 64 bits is sufficient for the
> foreseeable future and simplifies the implementation.
>
> Also, dev_dbg is used for logging rather than dev_err as these can be
> triggered from userspace.
> ---
> drivers/vdpa/vdpa_user/vduse_dev.c | 28 ++++++++++++++++++++++++++++
> include/uapi/linux/vduse.h | 7 ++++++-
> 2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> index 551ccde0b856..e7da69c2ad71 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -52,6 +52,9 @@
>
> #define IRQ_UNBOUND -1
>
> +/* Supported VDUSE features */
> +static const uint64_t vduse_features;
> +
> /*
> * VDUSE instance have not asked the vduse API version, so assume 0.
> *
> @@ -1977,6 +1980,19 @@ static bool vduse_validate_config(struct vduse_dev_config *config,
> sizeof(config->reserved)))
> return false;
>
> + if (api_version < VDUSE_API_VERSION_2) {
> + if (config->vduse_features) {
> + dev_dbg(vduse_ctrl_dev,
> + "config->vduse_features with version %llu",
> + api_version);
> + return false;
> + }
> + } else {
> + if (config->vduse_features & ~vduse_features)
> + return false;
> + }
> +
> +
> if (api_version < VDUSE_API_VERSION_1 &&
> (config->ngroups || config->nas))
> return false;
> @@ -2237,6 +2253,18 @@ static long vduse_ioctl(struct file *file, unsigned int cmd,
> ret = vduse_destroy_dev(name);
> break;
> }
> + case VDUSE_GET_FEATURES:
> + if (control->api_version < VDUSE_API_VERSION_2) {
> + dev_dbg(vduse_ctrl_dev,
> + "VDUSE_GET_FEATURES ioctl with version %llu",
> + control->api_version);
> + ret = -ENOIOCTLCMD;
Should we stick to -EINVAL (see below)?
> + break;
> + }
> +
> + ret = put_user(vduse_features, (u64 __user *)argp);
> + break;
> +
> default:
> ret = -EINVAL;
> break;
> diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
> index dea89ed281a7..1f68e556cbf2 100644
> --- a/include/uapi/linux/vduse.h
> +++ b/include/uapi/linux/vduse.h
> @@ -37,6 +37,7 @@
> * @vq_align: the allocation alignment of virtqueue's metadata
> * @ngroups: number of vq groups that VDUSE device declares
> * @nas: number of address spaces that VDUSE device declares
> + * @vduse_features: VDUSE features
> * @reserved: for future use, needs to be initialized to zero
> * @config_size: the size of the configuration space
> * @config: the buffer of the configuration space
> @@ -53,7 +54,8 @@ struct vduse_dev_config {
> __u32 vq_align;
> __u32 ngroups; /* if VDUSE_API_VERSION >= 1 */
> __u32 nas; /* if VDUSE_API_VERSION >= 1 */
> - __u32 reserved[11];
> + __u64 vduse_features;
> + __u32 reserved[9];
> __u32 config_size;
> __u8 config[];
> };
> @@ -67,6 +69,9 @@ struct vduse_dev_config {
> */
> #define VDUSE_DESTROY_DEV _IOW(VDUSE_BASE, 0x03, char[VDUSE_NAME_MAX])
>
> +/* Get the VDUSE supported features */
> +#define VDUSE_GET_FEATURES _IOR(VDUSE_BASE, 0x04, __u64)
> +
> /* The ioctls for VDUSE device (/dev/vduse/$NAME) */
>
> /**
> --
> 2.52.0
>
Thanks
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-01-28 12:45 ` [PATCH 5/6] vduse: add F_QUEUE_READY feature Eugenio Pérez
@ 2026-01-29 2:12 ` Jason Wang
2026-01-29 6:26 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-01-29 2:12 UTC (permalink / raw)
To: Eugenio Pérez
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> to explicitly signal userspace when a specific virtqueue has been
> enabled.
>
> In scenarios like Live Migration of VirtIO net devices, the dataplane
> starts after the control virtqueue allowing QEMU to apply configuration
> in the destination device.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> 2 files changed, 46 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> index e7da69c2ad71..1d93b540db4d 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -9,6 +9,7 @@
> */
>
> #include "linux/virtio_net.h"
> +#include <linux/bits.h>
> #include <linux/cleanup.h>
> #include <linux/init.h>
> #include <linux/module.h>
> @@ -53,7 +54,7 @@
> #define IRQ_UNBOUND -1
>
> /* Supported VDUSE features */
> -static const uint64_t vduse_features;
> +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
>
> /*
> * VDUSE instance have not asked the vduse API version, so assume 0.
> @@ -120,6 +121,7 @@ struct vduse_dev {
> char *name;
> struct mutex lock;
> spinlock_t msg_lock;
> + u64 vduse_features;
> u64 msg_unique;
> u32 msg_timeout;
> wait_queue_head_t waitq;
> @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> {
> struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> struct vduse_virtqueue *vq = dev->vqs[idx];
> + struct vduse_dev_msg msg = { 0 };
> + int r;
> +
> + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> + goto out;
> +
> + msg.req.type = VDUSE_SET_VQ_READY;
> + msg.req.vq_ready.num = idx;
> + msg.req.vq_ready.ready = !!ready;
> +
> + r = vduse_dev_msg_sync(dev, &msg);
>
> + if (r < 0) {
> + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> + idx, ready);
> +
> + /* We can't do better than break the device in this case */
> + spin_lock(&dev->msg_lock);
> + vduse_dev_broken(dev);
This has been done by vduse_dev_msg_sync().
Thanks
> + spin_unlock(&dev->msg_lock);
> + return;
> + }
> +
> +out:
> vduse_vq_set_ready(vq, ready);
> }
>
> @@ -2121,6 +2146,7 @@ static int vduse_create_dev(struct vduse_dev_config *config,
> dev->device_features = config->features;
> dev->device_id = config->device_id;
> dev->vendor_id = config->vendor_id;
> + dev->vduse_features = config->vduse_features;
>
> dev->nas = (dev->api_version < VDUSE_API_VERSION_1) ? 1 : config->nas;
> dev->as = kcalloc(dev->nas, sizeof(dev->as[0]), GFP_KERNEL);
> diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
> index 1f68e556cbf2..1e27395692ea 100644
> --- a/include/uapi/linux/vduse.h
> +++ b/include/uapi/linux/vduse.h
> @@ -18,6 +18,9 @@
>
> #define VDUSE_API_VERSION_2 2
>
> +/* The VDUSE instance expects a request for vq ready */
> +#define VDUSE_F_QUEUE_READY 0
> +
> /*
> * Get the version of VDUSE API that kernel supported (VDUSE_API_VERSION).
> * This is used for future extension.
> @@ -330,6 +333,7 @@ enum vduse_req_type {
> VDUSE_SET_STATUS,
> VDUSE_UPDATE_IOTLB,
> VDUSE_SET_VQ_GROUP_ASID,
> + VDUSE_SET_VQ_READY,
> };
>
> /**
> @@ -376,6 +380,15 @@ struct vduse_iova_range_v2 {
> __u32 asid;
> };
>
> +/**
> + * struct vduse_vq_ready - Virtqueue ready request message
> + * @num: Virtqueue number
> + */
> +struct vduse_vq_ready {
> + __u32 num;
> + __u32 ready;
> +};
> +
> /**
> * struct vduse_dev_request - control request
> * @type: request type
> @@ -386,6 +399,7 @@ struct vduse_iova_range_v2 {
> * @iova: IOVA range for updating
> * @iova_v2: IOVA range for updating if API_VERSION >= 1
> * @vq_group_asid: ASID of a virtqueue group
> + * @vq_ready: Virtqueue ready request
> * @padding: padding
> *
> * Structure used by read(2) on /dev/vduse/$NAME.
> @@ -403,6 +417,11 @@ struct vduse_dev_request {
> */
> struct vduse_iova_range_v2 iova_v2;
> struct vduse_vq_group_asid vq_group_asid;
> + /*
> + * Following members but padding exist only if vduse api
> + * version >= 2
> + */
> + struct vduse_vq_ready vq_ready;
> __u32 padding[32];
> };
> };
> --
> 2.52.0
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-01-29 1:16 ` Jason Wang
@ 2026-01-29 6:20 ` Eugenio Perez Martin
2026-01-30 2:18 ` Jason Wang
0 siblings, 1 reply; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-01-29 6:20 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Thu, Jan 29, 2026 at 2:17 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> > well after initial setup, and the device can read it afterwards.
> >
> > Ensure that reads and writes to vq->ready are SMP safe so that the
> > caller can trust that virtqueue kicks and calls behave as expected
> > immediately after the operation returns.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> > 1 file changed, 26 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > index 73d1d517dc6c..a4963aaf9332 100644
> > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> > return mask;
> > }
> >
> > +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> > +{
> > + /*
> > + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> > + * it while the VDUSE instance is reading it.
> > + */
> > + return smp_load_acquire(&vq->ready);
> > +}
> > +
> > +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> > +{
> > + /*
> > + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> > + * it while the VDUSE instance is reading it.
> > + */
> > + smp_store_release(&vq->ready, ready);
>
> Assuming this is not used in the datapath, I wonder if we can simply
> use vq_lock mutex.
>
The function vduse_vq_set/get_ready are not in the datapath, but
vduse_vq_kick and vduse_vq_signal_irqfd are. I'm ok if you want to
switch to vq_mutex if you want though, maybe it's even comparable with
the cost of the ioctls or eventfd signaling.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-01-29 2:12 ` Jason Wang
@ 2026-01-29 6:26 ` Eugenio Perez Martin
2026-01-30 2:17 ` Jason Wang
0 siblings, 1 reply; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-01-29 6:26 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Thu, Jan 29, 2026 at 3:12 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> > to explicitly signal userspace when a specific virtqueue has been
> > enabled.
> >
> > In scenarios like Live Migration of VirtIO net devices, the dataplane
> > starts after the control virtqueue allowing QEMU to apply configuration
> > in the destination device.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> > include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> > 2 files changed, 46 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > index e7da69c2ad71..1d93b540db4d 100644
> > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > @@ -9,6 +9,7 @@
> > */
> >
> > #include "linux/virtio_net.h"
> > +#include <linux/bits.h>
> > #include <linux/cleanup.h>
> > #include <linux/init.h>
> > #include <linux/module.h>
> > @@ -53,7 +54,7 @@
> > #define IRQ_UNBOUND -1
> >
> > /* Supported VDUSE features */
> > -static const uint64_t vduse_features;
> > +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
> >
> > /*
> > * VDUSE instance have not asked the vduse API version, so assume 0.
> > @@ -120,6 +121,7 @@ struct vduse_dev {
> > char *name;
> > struct mutex lock;
> > spinlock_t msg_lock;
> > + u64 vduse_features;
> > u64 msg_unique;
> > u32 msg_timeout;
> > wait_queue_head_t waitq;
> > @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > {
> > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > struct vduse_virtqueue *vq = dev->vqs[idx];
> > + struct vduse_dev_msg msg = { 0 };
> > + int r;
> > +
> > + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> > + goto out;
> > +
> > + msg.req.type = VDUSE_SET_VQ_READY;
> > + msg.req.vq_ready.num = idx;
> > + msg.req.vq_ready.ready = !!ready;
> > +
> > + r = vduse_dev_msg_sync(dev, &msg);
> >
> > + if (r < 0) {
> > + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> > + idx, ready);
> > +
> > + /* We can't do better than break the device in this case */
> > + spin_lock(&dev->msg_lock);
> > + vduse_dev_broken(dev);
>
> This has been done by vduse_dev_msg_sync().
>
This is done by msg_sync() when userland does not reply in a
timeframe, but not when userland replies with VDUSE_REQ_RESULT_FAILED.
Should I add a comment?
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 4/6] vduse: add VDUSE_GET_FEATURES ioctl
2026-01-29 2:10 ` Jason Wang
@ 2026-01-29 8:03 ` Eugenio Perez Martin
0 siblings, 0 replies; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-01-29 8:03 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Thu, Jan 29, 2026 at 3:10 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > Add an ioctl to allow VDUSE instances to query the available features
> > supported by the kernel module.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > A simple u64 bitmap is used for feature flags. While a flexible array
> > could support indefinite expansion, 64 bits is sufficient for the
> > foreseeable future and simplifies the implementation.
> >
> > Also, dev_dbg is used for logging rather than dev_err as these can be
> > triggered from userspace.
> > ---
> > drivers/vdpa/vdpa_user/vduse_dev.c | 28 ++++++++++++++++++++++++++++
> > include/uapi/linux/vduse.h | 7 ++++++-
> > 2 files changed, 34 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > index 551ccde0b856..e7da69c2ad71 100644
> > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > @@ -52,6 +52,9 @@
> >
> > #define IRQ_UNBOUND -1
> >
> > +/* Supported VDUSE features */
> > +static const uint64_t vduse_features;
> > +
> > /*
> > * VDUSE instance have not asked the vduse API version, so assume 0.
> > *
> > @@ -1977,6 +1980,19 @@ static bool vduse_validate_config(struct vduse_dev_config *config,
> > sizeof(config->reserved)))
> > return false;
> >
> > + if (api_version < VDUSE_API_VERSION_2) {
> > + if (config->vduse_features) {
> > + dev_dbg(vduse_ctrl_dev,
> > + "config->vduse_features with version %llu",
> > + api_version);
> > + return false;
> > + }
> > + } else {
> > + if (config->vduse_features & ~vduse_features)
> > + return false;
> > + }
> > +
> > +
> > if (api_version < VDUSE_API_VERSION_1 &&
> > (config->ngroups || config->nas))
> > return false;
> > @@ -2237,6 +2253,18 @@ static long vduse_ioctl(struct file *file, unsigned int cmd,
> > ret = vduse_destroy_dev(name);
> > break;
> > }
> > + case VDUSE_GET_FEATURES:
> > + if (control->api_version < VDUSE_API_VERSION_2) {
> > + dev_dbg(vduse_ctrl_dev,
> > + "VDUSE_GET_FEATURES ioctl with version %llu",
> > + control->api_version);
> > + ret = -ENOIOCTLCMD;
>
> Should we stick to -EINVAL (see below)?
>
Good point! I got fooled by vduse_dev_ioctl that uses ENOIOCTLCMD as
default. I guess it's too late to change the mismatch. Changing for
V2.
Thanks!
>
> > + break;
> > + }
> > +
> > + ret = put_user(vduse_features, (u64 __user *)argp);
> > + break;
> > +
> > default:
> > ret = -EINVAL;
> > break;
> > diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
> > index dea89ed281a7..1f68e556cbf2 100644
> > --- a/include/uapi/linux/vduse.h
> > +++ b/include/uapi/linux/vduse.h
> > @@ -37,6 +37,7 @@
> > * @vq_align: the allocation alignment of virtqueue's metadata
> > * @ngroups: number of vq groups that VDUSE device declares
> > * @nas: number of address spaces that VDUSE device declares
> > + * @vduse_features: VDUSE features
> > * @reserved: for future use, needs to be initialized to zero
> > * @config_size: the size of the configuration space
> > * @config: the buffer of the configuration space
> > @@ -53,7 +54,8 @@ struct vduse_dev_config {
> > __u32 vq_align;
> > __u32 ngroups; /* if VDUSE_API_VERSION >= 1 */
> > __u32 nas; /* if VDUSE_API_VERSION >= 1 */
> > - __u32 reserved[11];
> > + __u64 vduse_features;
> > + __u32 reserved[9];
> > __u32 config_size;
> > __u8 config[];
> > };
> > @@ -67,6 +69,9 @@ struct vduse_dev_config {
> > */
> > #define VDUSE_DESTROY_DEV _IOW(VDUSE_BASE, 0x03, char[VDUSE_NAME_MAX])
> >
> > +/* Get the VDUSE supported features */
> > +#define VDUSE_GET_FEATURES _IOR(VDUSE_BASE, 0x04, __u64)
> > +
> > /* The ioctls for VDUSE device (/dev/vduse/$NAME) */
> >
> > /**
> > --
> > 2.52.0
> >
>
> Thanks
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 3/6] vduse: Add API v2 definition
2026-01-29 2:00 ` Jason Wang
@ 2026-01-29 8:07 ` Eugenio Perez Martin
2026-01-30 2:17 ` Jason Wang
0 siblings, 1 reply; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-01-29 8:07 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Thu, Jan 29, 2026 at 3:00 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > Introduce the definition for VDUSE API V2. This version serves as a
> > gateway for feature negotiation.
> >
> > The kernel uses this version to determine if the userspace device
> > supports feature flags. Devices that do not explicitly negotiate API V2
> > will be blocked from querying available VDUSE features, ensuring
> > backward compatibility.
> >
> > The next patches implement the new feature incrementally, only enabling
> > the VDUSE device to set the V2 API version by the end of the series.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > include/uapi/linux/vduse.h | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
> > index 68b4287f9fac..dea89ed281a7 100644
> > --- a/include/uapi/linux/vduse.h
> > +++ b/include/uapi/linux/vduse.h
> > @@ -14,6 +14,10 @@
> >
> > #define VDUSE_API_VERSION_1 1
> >
> > +/* Features support */
> > +
> > +#define VDUSE_API_VERSION_2 2
>
> If we can catch the next release cycle, I would perfer not bumping
> VDUSE version twice in the same release.
>
Following the number of versions that ASID required I don't expect
this one to be ready for the next release cycle :).
But if you ack the patches related with the vduse features handling,
MST is ok with that, and we're still in time to reach linux-next and
the next Linux, I can prepare a series where I add the features bits
to V1 API. Then, all ASID and VQ_ENABLE will be handled by feature
bits. Would that work for both of you?
If we're out of time, I'd prefer ASID to reach the next release cycle
as the series is already big.
Thanks!
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 3/6] vduse: Add API v2 definition
2026-01-29 8:07 ` Eugenio Perez Martin
@ 2026-01-30 2:17 ` Jason Wang
2026-01-30 8:12 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-01-30 2:17 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Thu, Jan 29, 2026 at 4:08 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Thu, Jan 29, 2026 at 3:00 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >
> > > Introduce the definition for VDUSE API V2. This version serves as a
> > > gateway for feature negotiation.
> > >
> > > The kernel uses this version to determine if the userspace device
> > > supports feature flags. Devices that do not explicitly negotiate API V2
> > > will be blocked from querying available VDUSE features, ensuring
> > > backward compatibility.
> > >
> > > The next patches implement the new feature incrementally, only enabling
> > > the VDUSE device to set the V2 API version by the end of the series.
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > > include/uapi/linux/vduse.h | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
> > > index 68b4287f9fac..dea89ed281a7 100644
> > > --- a/include/uapi/linux/vduse.h
> > > +++ b/include/uapi/linux/vduse.h
> > > @@ -14,6 +14,10 @@
> > >
> > > #define VDUSE_API_VERSION_1 1
> > >
> > > +/* Features support */
> > > +
> > > +#define VDUSE_API_VERSION_2 2
> >
> > If we can catch the next release cycle, I would perfer not bumping
> > VDUSE version twice in the same release.
> >
>
> Following the number of versions that ASID required I don't expect
> this one to be ready for the next release cycle :).
Then it's probably fine.
>
> But if you ack the patches related with the vduse features handling,
> MST is ok with that, and we're still in time to reach linux-next and
> the next Linux, I can prepare a series where I add the features bits
> to V1 API. Then, all ASID and VQ_ENABLE will be handled by feature
> bits. Would that work for both of you?
Maybe you can start with V1 and if we can't catch this release, a
respin will be needed anyhow.
>
> If we're out of time, I'd prefer ASID to reach the next release cycle
> as the series is already big.
>
> Thanks!
Thanks
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-01-29 6:26 ` Eugenio Perez Martin
@ 2026-01-30 2:17 ` Jason Wang
2026-01-30 8:14 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-01-30 2:17 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Thu, Jan 29, 2026 at 2:26 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Thu, Jan 29, 2026 at 3:12 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >
> > > Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> > > to explicitly signal userspace when a specific virtqueue has been
> > > enabled.
> > >
> > > In scenarios like Live Migration of VirtIO net devices, the dataplane
> > > starts after the control virtqueue allowing QEMU to apply configuration
> > > in the destination device.
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > > drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> > > include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> > > 2 files changed, 46 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > index e7da69c2ad71..1d93b540db4d 100644
> > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > @@ -9,6 +9,7 @@
> > > */
> > >
> > > #include "linux/virtio_net.h"
> > > +#include <linux/bits.h>
> > > #include <linux/cleanup.h>
> > > #include <linux/init.h>
> > > #include <linux/module.h>
> > > @@ -53,7 +54,7 @@
> > > #define IRQ_UNBOUND -1
> > >
> > > /* Supported VDUSE features */
> > > -static const uint64_t vduse_features;
> > > +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
> > >
> > > /*
> > > * VDUSE instance have not asked the vduse API version, so assume 0.
> > > @@ -120,6 +121,7 @@ struct vduse_dev {
> > > char *name;
> > > struct mutex lock;
> > > spinlock_t msg_lock;
> > > + u64 vduse_features;
> > > u64 msg_unique;
> > > u32 msg_timeout;
> > > wait_queue_head_t waitq;
> > > @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > > {
> > > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > struct vduse_virtqueue *vq = dev->vqs[idx];
> > > + struct vduse_dev_msg msg = { 0 };
> > > + int r;
> > > +
> > > + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> > > + goto out;
> > > +
> > > + msg.req.type = VDUSE_SET_VQ_READY;
> > > + msg.req.vq_ready.num = idx;
> > > + msg.req.vq_ready.ready = !!ready;
> > > +
> > > + r = vduse_dev_msg_sync(dev, &msg);
> > >
> > > + if (r < 0) {
> > > + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> > > + idx, ready);
> > > +
> > > + /* We can't do better than break the device in this case */
> > > + spin_lock(&dev->msg_lock);
> > > + vduse_dev_broken(dev);
> >
> > This has been done by vduse_dev_msg_sync().
> >
>
> This is done by msg_sync() when userland does not reply in a
> timeframe, but not when userland replies with VDUSE_REQ_RESULT_FAILED.
> Should I add a comment?
If this is not specific to Q_READY, I think we need to move it to
msg_sync() as well.
Thanks
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-01-29 6:20 ` Eugenio Perez Martin
@ 2026-01-30 2:18 ` Jason Wang
2026-01-30 7:56 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-01-30 2:18 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Thu, Jan 29, 2026 at 2:21 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Thu, Jan 29, 2026 at 2:17 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >
> > > The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> > > well after initial setup, and the device can read it afterwards.
> > >
> > > Ensure that reads and writes to vq->ready are SMP safe so that the
> > > caller can trust that virtqueue kicks and calls behave as expected
> > > immediately after the operation returns.
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > > drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> > > 1 file changed, 26 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > index 73d1d517dc6c..a4963aaf9332 100644
> > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> > > return mask;
> > > }
> > >
> > > +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> > > +{
> > > + /*
> > > + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> > > + * it while the VDUSE instance is reading it.
> > > + */
> > > + return smp_load_acquire(&vq->ready);
> > > +}
> > > +
> > > +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> > > +{
> > > + /*
> > > + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> > > + * it while the VDUSE instance is reading it.
> > > + */
> > > + smp_store_release(&vq->ready, ready);
> >
> > Assuming this is not used in the datapath, I wonder if we can simply
> > use vq_lock mutex.
> >
>
> The function vduse_vq_set/get_ready are not in the datapath, but
> vduse_vq_kick and vduse_vq_signal_irqfd are. I'm ok if you want to
> switch to vq_mutex if you want though, maybe it's even comparable with
> the cost of the ioctls or eventfd signaling.
I'd like to use mutex for simplicity.
Thanks
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-01-30 2:18 ` Jason Wang
@ 2026-01-30 7:56 ` Eugenio Perez Martin
2026-02-03 4:05 ` Jason Wang
0 siblings, 1 reply; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-01-30 7:56 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Fri, Jan 30, 2026 at 3:18 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Jan 29, 2026 at 2:21 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Thu, Jan 29, 2026 at 2:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> > > > well after initial setup, and the device can read it afterwards.
> > > >
> > > > Ensure that reads and writes to vq->ready are SMP safe so that the
> > > > caller can trust that virtqueue kicks and calls behave as expected
> > > > immediately after the operation returns.
> > > >
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > ---
> > > > drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> > > > 1 file changed, 26 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > index 73d1d517dc6c..a4963aaf9332 100644
> > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> > > > return mask;
> > > > }
> > > >
> > > > +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> > > > +{
> > > > + /*
> > > > + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> > > > + * it while the VDUSE instance is reading it.
> > > > + */
> > > > + return smp_load_acquire(&vq->ready);
> > > > +}
> > > > +
> > > > +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> > > > +{
> > > > + /*
> > > > + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> > > > + * it while the VDUSE instance is reading it.
> > > > + */
> > > > + smp_store_release(&vq->ready, ready);
> > >
> > > Assuming this is not used in the datapath, I wonder if we can simply
> > > use vq_lock mutex.
> > >
> >
> > The function vduse_vq_set/get_ready are not in the datapath, but
> > vduse_vq_kick and vduse_vq_signal_irqfd are. I'm ok if you want to
> > switch to vq_mutex if you want though, maybe it's even comparable with
> > the cost of the ioctls or eventfd signaling.
>
> I'd like to use mutex for simplicity.
>
I cannot move it to a mutex, as we need to take it in the critical
sections of the kick_lock and irq_lock spinlocks.
I can move it to a spinlock, but it seems more complicated to me. We
need to make sure we always take these kick_lock and irq_lock in the
same order as the ready_lock, to not create deadlocks, and they always
protect just the ready boolean. But sure, I'll send the spinlock
version for V2.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 3/6] vduse: Add API v2 definition
2026-01-30 2:17 ` Jason Wang
@ 2026-01-30 8:12 ` Eugenio Perez Martin
0 siblings, 0 replies; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-01-30 8:12 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Fri, Jan 30, 2026 at 3:17 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Jan 29, 2026 at 4:08 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Thu, Jan 29, 2026 at 3:00 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > Introduce the definition for VDUSE API V2. This version serves as a
> > > > gateway for feature negotiation.
> > > >
> > > > The kernel uses this version to determine if the userspace device
> > > > supports feature flags. Devices that do not explicitly negotiate API V2
> > > > will be blocked from querying available VDUSE features, ensuring
> > > > backward compatibility.
> > > >
> > > > The next patches implement the new feature incrementally, only enabling
> > > > the VDUSE device to set the V2 API version by the end of the series.
> > > >
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > ---
> > > > include/uapi/linux/vduse.h | 4 ++++
> > > > 1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
> > > > index 68b4287f9fac..dea89ed281a7 100644
> > > > --- a/include/uapi/linux/vduse.h
> > > > +++ b/include/uapi/linux/vduse.h
> > > > @@ -14,6 +14,10 @@
> > > >
> > > > #define VDUSE_API_VERSION_1 1
> > > >
> > > > +/* Features support */
> > > > +
> > > > +#define VDUSE_API_VERSION_2 2
> > >
> > > If we can catch the next release cycle, I would perfer not bumping
> > > VDUSE version twice in the same release.
> > >
> >
> > Following the number of versions that ASID required I don't expect
> > this one to be ready for the next release cycle :).
>
> Then it's probably fine.
>
> >
> > But if you ack the patches related with the vduse features handling,
> > MST is ok with that, and we're still in time to reach linux-next and
> > the next Linux, I can prepare a series where I add the features bits
> > to V1 API. Then, all ASID and VQ_ENABLE will be handled by feature
> > bits. Would that work for both of you?
>
> Maybe you can start with V1 and if we can't catch this release, a
> respin will be needed anyhow.
>
I'm fine with that if you ask it, but if it gets merged (unlikely)
we'll have a few commits with V1 that supports the VDUSE_GET_FEATURES
and others that do not.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-01-30 2:17 ` Jason Wang
@ 2026-01-30 8:14 ` Eugenio Perez Martin
2026-02-03 4:00 ` Jason Wang
0 siblings, 1 reply; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-01-30 8:14 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Fri, Jan 30, 2026 at 3:17 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Jan 29, 2026 at 2:26 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Thu, Jan 29, 2026 at 3:12 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> > > > to explicitly signal userspace when a specific virtqueue has been
> > > > enabled.
> > > >
> > > > In scenarios like Live Migration of VirtIO net devices, the dataplane
> > > > starts after the control virtqueue allowing QEMU to apply configuration
> > > > in the destination device.
> > > >
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > ---
> > > > drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> > > > include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> > > > 2 files changed, 46 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > index e7da69c2ad71..1d93b540db4d 100644
> > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > @@ -9,6 +9,7 @@
> > > > */
> > > >
> > > > #include "linux/virtio_net.h"
> > > > +#include <linux/bits.h>
> > > > #include <linux/cleanup.h>
> > > > #include <linux/init.h>
> > > > #include <linux/module.h>
> > > > @@ -53,7 +54,7 @@
> > > > #define IRQ_UNBOUND -1
> > > >
> > > > /* Supported VDUSE features */
> > > > -static const uint64_t vduse_features;
> > > > +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
> > > >
> > > > /*
> > > > * VDUSE instance have not asked the vduse API version, so assume 0.
> > > > @@ -120,6 +121,7 @@ struct vduse_dev {
> > > > char *name;
> > > > struct mutex lock;
> > > > spinlock_t msg_lock;
> > > > + u64 vduse_features;
> > > > u64 msg_unique;
> > > > u32 msg_timeout;
> > > > wait_queue_head_t waitq;
> > > > @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > > > {
> > > > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > > struct vduse_virtqueue *vq = dev->vqs[idx];
> > > > + struct vduse_dev_msg msg = { 0 };
> > > > + int r;
> > > > +
> > > > + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> > > > + goto out;
> > > > +
> > > > + msg.req.type = VDUSE_SET_VQ_READY;
> > > > + msg.req.vq_ready.num = idx;
> > > > + msg.req.vq_ready.ready = !!ready;
> > > > +
> > > > + r = vduse_dev_msg_sync(dev, &msg);
> > > >
> > > > + if (r < 0) {
> > > > + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> > > > + idx, ready);
> > > > +
> > > > + /* We can't do better than break the device in this case */
> > > > + spin_lock(&dev->msg_lock);
> > > > + vduse_dev_broken(dev);
> > >
> > > This has been done by vduse_dev_msg_sync().
> > >
> >
> > This is done by msg_sync() when userland does not reply in a
> > timeframe, but not when userland replies with VDUSE_REQ_RESULT_FAILED.
> > Should I add a comment?
>
> If this is not specific to Q_READY, I think we need to move it to
> msg_sync() as well.
>
It's specific to Q_READY for me, as it's the request that returns void
and has no possibility to inform of an error.
The VDUSE userland instance could reply to other requests with
REQ_RESULT_FAILED and the driver still has capacity to recover from
the failure. If we always break the device on every request fail, we
deny that possibility.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-01-30 8:14 ` Eugenio Perez Martin
@ 2026-02-03 4:00 ` Jason Wang
2026-02-03 7:27 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-02-03 4:00 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Fri, Jan 30, 2026 at 4:15 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Fri, Jan 30, 2026 at 3:17 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Thu, Jan 29, 2026 at 2:26 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Thu, Jan 29, 2026 at 3:12 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > >
> > > > > Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> > > > > to explicitly signal userspace when a specific virtqueue has been
> > > > > enabled.
> > > > >
> > > > > In scenarios like Live Migration of VirtIO net devices, the dataplane
> > > > > starts after the control virtqueue allowing QEMU to apply configuration
> > > > > in the destination device.
> > > > >
> > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > ---
> > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> > > > > include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> > > > > 2 files changed, 46 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > index e7da69c2ad71..1d93b540db4d 100644
> > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > @@ -9,6 +9,7 @@
> > > > > */
> > > > >
> > > > > #include "linux/virtio_net.h"
> > > > > +#include <linux/bits.h>
> > > > > #include <linux/cleanup.h>
> > > > > #include <linux/init.h>
> > > > > #include <linux/module.h>
> > > > > @@ -53,7 +54,7 @@
> > > > > #define IRQ_UNBOUND -1
> > > > >
> > > > > /* Supported VDUSE features */
> > > > > -static const uint64_t vduse_features;
> > > > > +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
> > > > >
> > > > > /*
> > > > > * VDUSE instance have not asked the vduse API version, so assume 0.
> > > > > @@ -120,6 +121,7 @@ struct vduse_dev {
> > > > > char *name;
> > > > > struct mutex lock;
> > > > > spinlock_t msg_lock;
> > > > > + u64 vduse_features;
> > > > > u64 msg_unique;
> > > > > u32 msg_timeout;
> > > > > wait_queue_head_t waitq;
> > > > > @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > > > > {
> > > > > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > > > struct vduse_virtqueue *vq = dev->vqs[idx];
> > > > > + struct vduse_dev_msg msg = { 0 };
> > > > > + int r;
> > > > > +
> > > > > + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> > > > > + goto out;
> > > > > +
> > > > > + msg.req.type = VDUSE_SET_VQ_READY;
> > > > > + msg.req.vq_ready.num = idx;
> > > > > + msg.req.vq_ready.ready = !!ready;
> > > > > +
> > > > > + r = vduse_dev_msg_sync(dev, &msg);
> > > > >
> > > > > + if (r < 0) {
> > > > > + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> > > > > + idx, ready);
> > > > > +
> > > > > + /* We can't do better than break the device in this case */
> > > > > + spin_lock(&dev->msg_lock);
> > > > > + vduse_dev_broken(dev);
> > > >
> > > > This has been done by vduse_dev_msg_sync().
> > > >
> > >
> > > This is done by msg_sync() when userland does not reply in a
> > > timeframe, but not when userland replies with VDUSE_REQ_RESULT_FAILED.
> > > Should I add a comment?
> >
> > If this is not specific to Q_READY, I think we need to move it to
> > msg_sync() as well.
> >
>
> It's specific to Q_READY for me, as it's the request that returns void
> and has no possibility to inform of an error.
I may miss something, I mean why consider the failure of Q_READY to be
more serious than the failure of other commands (e.g set_status()).
>
> The VDUSE userland instance could reply to other requests with
> REQ_RESULT_FAILED and the driver still has capacity to recover from
> the failure.
> If we always break the device on every request fail, we
> deny that possibility.
>
Thanks
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-01-30 7:56 ` Eugenio Perez Martin
@ 2026-02-03 4:05 ` Jason Wang
2026-02-03 10:35 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-02-03 4:05 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Fri, Jan 30, 2026 at 3:56 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Fri, Jan 30, 2026 at 3:18 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Thu, Jan 29, 2026 at 2:21 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Thu, Jan 29, 2026 at 2:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > >
> > > > > The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> > > > > well after initial setup, and the device can read it afterwards.
> > > > >
> > > > > Ensure that reads and writes to vq->ready are SMP safe so that the
> > > > > caller can trust that virtqueue kicks and calls behave as expected
> > > > > immediately after the operation returns.
> > > > >
> > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > ---
> > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> > > > > 1 file changed, 26 insertions(+), 8 deletions(-)
> > > > >
> > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > index 73d1d517dc6c..a4963aaf9332 100644
> > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> > > > > return mask;
> > > > > }
> > > > >
> > > > > +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> > > > > +{
> > > > > + /*
> > > > > + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> > > > > + * it while the VDUSE instance is reading it.
> > > > > + */
> > > > > + return smp_load_acquire(&vq->ready);
> > > > > +}
> > > > > +
> > > > > +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> > > > > +{
> > > > > + /*
> > > > > + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> > > > > + * it while the VDUSE instance is reading it.
> > > > > + */
> > > > > + smp_store_release(&vq->ready, ready);
> > > >
> > > > Assuming this is not used in the datapath, I wonder if we can simply
> > > > use vq_lock mutex.
> > > >
> > >
> > > The function vduse_vq_set/get_ready are not in the datapath, but
> > > vduse_vq_kick and vduse_vq_signal_irqfd are. I'm ok if you want to
> > > switch to vq_mutex if you want though, maybe it's even comparable with
> > > the cost of the ioctls or eventfd signaling.
> >
> > I'd like to use mutex for simplicity.
> >
>
> I cannot move it to a mutex, as we need to take it in the critical
> sections of the kick_lock and irq_lock spinlocks.
>
> I can move it to a spinlock, but it seems more complicated to me. We
> need to make sure we always take these kick_lock and irq_lock in the
> same order as the ready_lock, to not create deadlocks, and they always
> protect just the ready boolean. But sure, I'll send the spinlock
> version for V2.
Thinking about this, I'm not sure I understand the issue.
Maybe you can give me an example of the race.
THanks
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-02-03 4:00 ` Jason Wang
@ 2026-02-03 7:27 ` Eugenio Perez Martin
2026-02-04 2:44 ` Jason Wang
0 siblings, 1 reply; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-02-03 7:27 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Tue, Feb 3, 2026 at 5:00 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jan 30, 2026 at 4:15 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Fri, Jan 30, 2026 at 3:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Thu, Jan 29, 2026 at 2:26 PM Eugenio Perez Martin
> > > <eperezma@redhat.com> wrote:
> > > >
> > > > On Thu, Jan 29, 2026 at 3:12 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > >
> > > > > > Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> > > > > > to explicitly signal userspace when a specific virtqueue has been
> > > > > > enabled.
> > > > > >
> > > > > > In scenarios like Live Migration of VirtIO net devices, the dataplane
> > > > > > starts after the control virtqueue allowing QEMU to apply configuration
> > > > > > in the destination device.
> > > > > >
> > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > ---
> > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> > > > > > include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> > > > > > 2 files changed, 46 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > index e7da69c2ad71..1d93b540db4d 100644
> > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > @@ -9,6 +9,7 @@
> > > > > > */
> > > > > >
> > > > > > #include "linux/virtio_net.h"
> > > > > > +#include <linux/bits.h>
> > > > > > #include <linux/cleanup.h>
> > > > > > #include <linux/init.h>
> > > > > > #include <linux/module.h>
> > > > > > @@ -53,7 +54,7 @@
> > > > > > #define IRQ_UNBOUND -1
> > > > > >
> > > > > > /* Supported VDUSE features */
> > > > > > -static const uint64_t vduse_features;
> > > > > > +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
> > > > > >
> > > > > > /*
> > > > > > * VDUSE instance have not asked the vduse API version, so assume 0.
> > > > > > @@ -120,6 +121,7 @@ struct vduse_dev {
> > > > > > char *name;
> > > > > > struct mutex lock;
> > > > > > spinlock_t msg_lock;
> > > > > > + u64 vduse_features;
> > > > > > u64 msg_unique;
> > > > > > u32 msg_timeout;
> > > > > > wait_queue_head_t waitq;
> > > > > > @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > > > > > {
> > > > > > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > > > > struct vduse_virtqueue *vq = dev->vqs[idx];
> > > > > > + struct vduse_dev_msg msg = { 0 };
> > > > > > + int r;
> > > > > > +
> > > > > > + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> > > > > > + goto out;
> > > > > > +
> > > > > > + msg.req.type = VDUSE_SET_VQ_READY;
> > > > > > + msg.req.vq_ready.num = idx;
> > > > > > + msg.req.vq_ready.ready = !!ready;
> > > > > > +
> > > > > > + r = vduse_dev_msg_sync(dev, &msg);
> > > > > >
> > > > > > + if (r < 0) {
> > > > > > + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> > > > > > + idx, ready);
> > > > > > +
> > > > > > + /* We can't do better than break the device in this case */
> > > > > > + spin_lock(&dev->msg_lock);
> > > > > > + vduse_dev_broken(dev);
> > > > >
> > > > > This has been done by vduse_dev_msg_sync().
> > > > >
> > > >
> > > > This is done by msg_sync() when userland does not reply in a
> > > > timeframe, but not when userland replies with VDUSE_REQ_RESULT_FAILED.
> > > > Should I add a comment?
> > >
> > > If this is not specific to Q_READY, I think we need to move it to
> > > msg_sync() as well.
> > >
> >
> > It's specific to Q_READY for me, as it's the request that returns void
> > and has no possibility to inform of an error.
>
> I may miss something, I mean why consider the failure of Q_READY to be
> more serious than the failure of other commands (e.g set_status()).
>
I'm not considering the failure of Q_READY more serious than any other
failure. I'm breaking the device here as I cannot return the error to
the vDPA driver: This function returns void.
We can make the function return a bool or int, and then make
vhost_vdpa and virtio_vdpa react to that error. QEMU is already
prepared for VHOST_VDPA_SET_VRING_ENABLE to return an error, as it is
an ioctl, and hopefully the rest of the users of
VHOST_VDPA_SET_VRING_ENABLE are also prepared.
Should I change vdpa_config_ops->set_vq_ready so it can return an
error, as a prerequisite of this series?
> >
> > The VDUSE userland instance could reply to other requests with
> > REQ_RESULT_FAILED and the driver still has capacity to recover from
> > the failure.
> > If we always break the device on every request fail, we
> > deny that possibility.
> >
>
> Thanks
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-02-03 4:05 ` Jason Wang
@ 2026-02-03 10:35 ` Eugenio Perez Martin
2026-02-04 2:48 ` Jason Wang
0 siblings, 1 reply; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-02-03 10:35 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Tue, Feb 3, 2026 at 5:05 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jan 30, 2026 at 3:56 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Fri, Jan 30, 2026 at 3:18 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Thu, Jan 29, 2026 at 2:21 PM Eugenio Perez Martin
> > > <eperezma@redhat.com> wrote:
> > > >
> > > > On Thu, Jan 29, 2026 at 2:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > >
> > > > > > The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> > > > > > well after initial setup, and the device can read it afterwards.
> > > > > >
> > > > > > Ensure that reads and writes to vq->ready are SMP safe so that the
> > > > > > caller can trust that virtqueue kicks and calls behave as expected
> > > > > > immediately after the operation returns.
> > > > > >
> > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > ---
> > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> > > > > > 1 file changed, 26 insertions(+), 8 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > index 73d1d517dc6c..a4963aaf9332 100644
> > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> > > > > > return mask;
> > > > > > }
> > > > > >
> > > > > > +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> > > > > > +{
> > > > > > + /*
> > > > > > + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> > > > > > + * it while the VDUSE instance is reading it.
> > > > > > + */
> > > > > > + return smp_load_acquire(&vq->ready);
> > > > > > +}
> > > > > > +
> > > > > > +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> > > > > > +{
> > > > > > + /*
> > > > > > + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> > > > > > + * it while the VDUSE instance is reading it.
> > > > > > + */
> > > > > > + smp_store_release(&vq->ready, ready);
> > > > >
> > > > > Assuming this is not used in the datapath, I wonder if we can simply
> > > > > use vq_lock mutex.
> > > > >
> > > >
> > > > The function vduse_vq_set/get_ready are not in the datapath, but
> > > > vduse_vq_kick and vduse_vq_signal_irqfd are. I'm ok if you want to
> > > > switch to vq_mutex if you want though, maybe it's even comparable with
> > > > the cost of the ioctls or eventfd signaling.
> > >
> > > I'd like to use mutex for simplicity.
> > >
> >
> > I cannot move it to a mutex, as we need to take it in the critical
> > sections of the kick_lock and irq_lock spinlocks.
> >
> > I can move it to a spinlock, but it seems more complicated to me. We
> > need to make sure we always take these kick_lock and irq_lock in the
> > same order as the ready_lock, to not create deadlocks, and they always
> > protect just the ready boolean. But sure, I'll send the spinlock
> > version for V2.
>
> Thinking about this, I'm not sure I understand the issue.
>
> Maybe you can give me an example of the race.
>
It's the same issue as you describe with set_group_asid on cpu1 and
dma_map and dma_unmap in cpu1 in this mail [0], but now the functions
are set_vq_ready in the vdpa driver side and VDUSE_VQ_GET_INFO in the
VDUSE instance.
To me, the barriers should be in the driver and the VDUSE instance,
and the VDUSE instance should use the VDUSE_SET_VQ_READY message to
know when a VQ is ready. But I don't expect all the VDUSE instances to
opt-in for that, so the flow without smp barriers is:
[cpu0] .set_vq_ready(vq, true)
[cpu1] VDUSE_VQ_GET_INFO(vq)
[cpu1] check vq->ready, and it races with the set_vq_ready from cpu0.
[0] https://patchew.org/linux/20251113115558.1277981-1-eperezma@redhat.com/20251113115558.1277981-6-eperezma@redhat.com/#CACGkMEtMyte91bbdb0hfUo+YgSpZdjoWe0meKwbDc18CfWdATA@mail.gmail.com
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-02-03 7:27 ` Eugenio Perez Martin
@ 2026-02-04 2:44 ` Jason Wang
2026-02-04 7:34 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-02-04 2:44 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Tue, Feb 3, 2026 at 3:28 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
>
> On Tue, Feb 3, 2026 at 5:00 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Fri, Jan 30, 2026 at 4:15 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Fri, Jan 30, 2026 at 3:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Thu, Jan 29, 2026 at 2:26 PM Eugenio Perez Martin
> > > > <eperezma@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jan 29, 2026 at 3:12 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > > >
> > > > > > > Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> > > > > > > to explicitly signal userspace when a specific virtqueue has been
> > > > > > > enabled.
> > > > > > >
> > > > > > > In scenarios like Live Migration of VirtIO net devices, the dataplane
> > > > > > > starts after the control virtqueue allowing QEMU to apply configuration
> > > > > > > in the destination device.
> > > > > > >
> > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > ---
> > > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> > > > > > > include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> > > > > > > 2 files changed, 46 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > index e7da69c2ad71..1d93b540db4d 100644
> > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > @@ -9,6 +9,7 @@
> > > > > > > */
> > > > > > >
> > > > > > > #include "linux/virtio_net.h"
> > > > > > > +#include <linux/bits.h>
> > > > > > > #include <linux/cleanup.h>
> > > > > > > #include <linux/init.h>
> > > > > > > #include <linux/module.h>
> > > > > > > @@ -53,7 +54,7 @@
> > > > > > > #define IRQ_UNBOUND -1
> > > > > > >
> > > > > > > /* Supported VDUSE features */
> > > > > > > -static const uint64_t vduse_features;
> > > > > > > +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
> > > > > > >
> > > > > > > /*
> > > > > > > * VDUSE instance have not asked the vduse API version, so assume 0.
> > > > > > > @@ -120,6 +121,7 @@ struct vduse_dev {
> > > > > > > char *name;
> > > > > > > struct mutex lock;
> > > > > > > spinlock_t msg_lock;
> > > > > > > + u64 vduse_features;
> > > > > > > u64 msg_unique;
> > > > > > > u32 msg_timeout;
> > > > > > > wait_queue_head_t waitq;
> > > > > > > @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > > > > > > {
> > > > > > > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > > > > > struct vduse_virtqueue *vq = dev->vqs[idx];
> > > > > > > + struct vduse_dev_msg msg = { 0 };
> > > > > > > + int r;
> > > > > > > +
> > > > > > > + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> > > > > > > + goto out;
> > > > > > > +
> > > > > > > + msg.req.type = VDUSE_SET_VQ_READY;
> > > > > > > + msg.req.vq_ready.num = idx;
> > > > > > > + msg.req.vq_ready.ready = !!ready;
> > > > > > > +
> > > > > > > + r = vduse_dev_msg_sync(dev, &msg);
> > > > > > >
> > > > > > > + if (r < 0) {
> > > > > > > + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> > > > > > > + idx, ready);
> > > > > > > +
> > > > > > > + /* We can't do better than break the device in this case */
> > > > > > > + spin_lock(&dev->msg_lock);
> > > > > > > + vduse_dev_broken(dev);
> > > > > >
> > > > > > This has been done by vduse_dev_msg_sync().
> > > > > >
> > > > >
> > > > > This is done by msg_sync() when userland does not reply in a
> > > > > timeframe, but not when userland replies with VDUSE_REQ_RESULT_FAILED.
> > > > > Should I add a comment?
> > > >
> > > > If this is not specific to Q_READY, I think we need to move it to
> > > > msg_sync() as well.
> > > >
> > >
> > > It's specific to Q_READY for me, as it's the request that returns void
> > > and has no possibility to inform of an error.
> >
> > I may miss something, I mean why consider the failure of Q_READY to be
> > more serious than the failure of other commands (e.g set_status()).
> >
>
> I'm not considering the failure of Q_READY more serious than any other
> failure. I'm breaking the device here as I cannot return the error to
> the vDPA driver: This function returns void.
Yes, and set_status() return void as well.
void virtio_add_status(struct virtio_device *dev, unsigned int status)
{
might_sleep();
=> dev->config->set_status(dev, dev->config->get_status(dev) | status);
}
>
> We can make the function return a bool or int, and then make
> vhost_vdpa and virtio_vdpa react to that error. QEMU is already
> prepared for VHOST_VDPA_SET_VRING_ENABLE to return an error, as it is
> an ioctl,
But we did:
case VHOST_VDPA_SET_VRING_ENABLE:
if (copy_from_user(&s, argp, sizeof(s)))
return -EFAULT;
ops->set_vq_ready(vdpa, idx, s.num);
return 0;
So the failure come from copy_from_user()
> and hopefully the rest of the users of
> VHOST_VDPA_SET_VRING_ENABLE are also prepared.
>
> Should I change vdpa_config_ops->set_vq_ready so it can return an
> error, as a prerequisite of this series?
Or it would be better to leave the breaking of device on
REQ_RESULT_FAILED for future investigation (not blocking this series).
>
> > >
> > > The VDUSE userland instance could reply to other requests with
> > > REQ_RESULT_FAILED and the driver still has capacity to recover from
> > > the failure.
> > > If we always break the device on every request fail, we
> > > deny that possibility.
> > >
> >
> > Thanks
> >
>
Thanks
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-02-03 10:35 ` Eugenio Perez Martin
@ 2026-02-04 2:48 ` Jason Wang
2026-02-04 8:53 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-02-04 2:48 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Tue, Feb 3, 2026 at 6:35 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
>
> On Tue, Feb 3, 2026 at 5:05 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Fri, Jan 30, 2026 at 3:56 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Fri, Jan 30, 2026 at 3:18 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Thu, Jan 29, 2026 at 2:21 PM Eugenio Perez Martin
> > > > <eperezma@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jan 29, 2026 at 2:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > > >
> > > > > > > The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> > > > > > > well after initial setup, and the device can read it afterwards.
> > > > > > >
> > > > > > > Ensure that reads and writes to vq->ready are SMP safe so that the
> > > > > > > caller can trust that virtqueue kicks and calls behave as expected
> > > > > > > immediately after the operation returns.
> > > > > > >
> > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > ---
> > > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> > > > > > > 1 file changed, 26 insertions(+), 8 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > index 73d1d517dc6c..a4963aaf9332 100644
> > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> > > > > > > return mask;
> > > > > > > }
> > > > > > >
> > > > > > > +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> > > > > > > +{
> > > > > > > + /*
> > > > > > > + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> > > > > > > + * it while the VDUSE instance is reading it.
> > > > > > > + */
> > > > > > > + return smp_load_acquire(&vq->ready);
> > > > > > > +}
> > > > > > > +
> > > > > > > +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> > > > > > > +{
> > > > > > > + /*
> > > > > > > + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> > > > > > > + * it while the VDUSE instance is reading it.
> > > > > > > + */
> > > > > > > + smp_store_release(&vq->ready, ready);
> > > > > >
> > > > > > Assuming this is not used in the datapath, I wonder if we can simply
> > > > > > use vq_lock mutex.
> > > > > >
> > > > >
> > > > > The function vduse_vq_set/get_ready are not in the datapath, but
> > > > > vduse_vq_kick and vduse_vq_signal_irqfd are. I'm ok if you want to
> > > > > switch to vq_mutex if you want though, maybe it's even comparable with
> > > > > the cost of the ioctls or eventfd signaling.
> > > >
> > > > I'd like to use mutex for simplicity.
> > > >
> > >
> > > I cannot move it to a mutex, as we need to take it in the critical
> > > sections of the kick_lock and irq_lock spinlocks.
> > >
> > > I can move it to a spinlock, but it seems more complicated to me. We
> > > need to make sure we always take these kick_lock and irq_lock in the
> > > same order as the ready_lock, to not create deadlocks, and they always
> > > protect just the ready boolean. But sure, I'll send the spinlock
> > > version for V2.
> >
> > Thinking about this, I'm not sure I understand the issue.
> >
> > Maybe you can give me an example of the race.
> >
>
> It's the same issue as you describe with set_group_asid on cpu1 and
> dma_map and dma_unmap in cpu1 in this mail [0], but now the functions
> are set_vq_ready in the vdpa driver side and VDUSE_VQ_GET_INFO in the
> VDUSE instance.
>
> To me, the barriers should be in the driver and the VDUSE instance,
> and the VDUSE instance should use the VDUSE_SET_VQ_READY message to
> know when a VQ is ready. But I don't expect all the VDUSE instances to
> opt-in for that, so the flow without smp barriers is:
>
> [cpu0] .set_vq_ready(vq, true)
> [cpu1] VDUSE_VQ_GET_INFO(vq)
> [cpu1] check vq->ready, and it races with the set_vq_ready from cpu0.
>
Are there any side effects of this race? I meant the driver can do
set_vq_ready() at any time. And there would be message for notifying
the usersapce in this series.
> [0] https://patchew.org/linux/20251113115558.1277981-1-eperezma@redhat.com/20251113115558.1277981-6-eperezma@redhat.com/#CACGkMEtMyte91bbdb0hfUo+YgSpZdjoWe0meKwbDc18CfWdATA@mail.gmail.com
>
Thanks
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-02-04 2:44 ` Jason Wang
@ 2026-02-04 7:34 ` Eugenio Perez Martin
2026-02-05 4:08 ` Jason Wang
0 siblings, 1 reply; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-02-04 7:34 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Wed, Feb 4, 2026 at 3:44 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Tue, Feb 3, 2026 at 3:28 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> >
> > On Tue, Feb 3, 2026 at 5:00 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Fri, Jan 30, 2026 at 4:15 PM Eugenio Perez Martin
> > > <eperezma@redhat.com> wrote:
> > > >
> > > > On Fri, Jan 30, 2026 at 3:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jan 29, 2026 at 2:26 PM Eugenio Perez Martin
> > > > > <eperezma@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, Jan 29, 2026 at 3:12 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > >
> > > > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > > > >
> > > > > > > > Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> > > > > > > > to explicitly signal userspace when a specific virtqueue has been
> > > > > > > > enabled.
> > > > > > > >
> > > > > > > > In scenarios like Live Migration of VirtIO net devices, the dataplane
> > > > > > > > starts after the control virtqueue allowing QEMU to apply configuration
> > > > > > > > in the destination device.
> > > > > > > >
> > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > ---
> > > > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> > > > > > > > include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> > > > > > > > 2 files changed, 46 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > index e7da69c2ad71..1d93b540db4d 100644
> > > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > @@ -9,6 +9,7 @@
> > > > > > > > */
> > > > > > > >
> > > > > > > > #include "linux/virtio_net.h"
> > > > > > > > +#include <linux/bits.h>
> > > > > > > > #include <linux/cleanup.h>
> > > > > > > > #include <linux/init.h>
> > > > > > > > #include <linux/module.h>
> > > > > > > > @@ -53,7 +54,7 @@
> > > > > > > > #define IRQ_UNBOUND -1
> > > > > > > >
> > > > > > > > /* Supported VDUSE features */
> > > > > > > > -static const uint64_t vduse_features;
> > > > > > > > +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
> > > > > > > >
> > > > > > > > /*
> > > > > > > > * VDUSE instance have not asked the vduse API version, so assume 0.
> > > > > > > > @@ -120,6 +121,7 @@ struct vduse_dev {
> > > > > > > > char *name;
> > > > > > > > struct mutex lock;
> > > > > > > > spinlock_t msg_lock;
> > > > > > > > + u64 vduse_features;
> > > > > > > > u64 msg_unique;
> > > > > > > > u32 msg_timeout;
> > > > > > > > wait_queue_head_t waitq;
> > > > > > > > @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > > > > > > > {
> > > > > > > > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > > > > > > struct vduse_virtqueue *vq = dev->vqs[idx];
> > > > > > > > + struct vduse_dev_msg msg = { 0 };
> > > > > > > > + int r;
> > > > > > > > +
> > > > > > > > + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> > > > > > > > + goto out;
> > > > > > > > +
> > > > > > > > + msg.req.type = VDUSE_SET_VQ_READY;
> > > > > > > > + msg.req.vq_ready.num = idx;
> > > > > > > > + msg.req.vq_ready.ready = !!ready;
> > > > > > > > +
> > > > > > > > + r = vduse_dev_msg_sync(dev, &msg);
> > > > > > > >
> > > > > > > > + if (r < 0) {
> > > > > > > > + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> > > > > > > > + idx, ready);
> > > > > > > > +
> > > > > > > > + /* We can't do better than break the device in this case */
> > > > > > > > + spin_lock(&dev->msg_lock);
> > > > > > > > + vduse_dev_broken(dev);
> > > > > > >
> > > > > > > This has been done by vduse_dev_msg_sync().
> > > > > > >
> > > > > >
> > > > > > This is done by msg_sync() when userland does not reply in a
> > > > > > timeframe, but not when userland replies with VDUSE_REQ_RESULT_FAILED.
> > > > > > Should I add a comment?
> > > > >
> > > > > If this is not specific to Q_READY, I think we need to move it to
> > > > > msg_sync() as well.
> > > > >
> > > >
> > > > It's specific to Q_READY for me, as it's the request that returns void
> > > > and has no possibility to inform of an error.
> > >
> > > I may miss something, I mean why consider the failure of Q_READY to be
> > > more serious than the failure of other commands (e.g set_status()).
> > >
> >
> > I'm not considering the failure of Q_READY more serious than any other
> > failure. I'm breaking the device here as I cannot return the error to
> > the vDPA driver: This function returns void.
>
> Yes, and set_status() return void as well.
>
> void virtio_add_status(struct virtio_device *dev, unsigned int status)
> {
> might_sleep();
> => dev->config->set_status(dev, dev->config->get_status(dev) | status);
> }
>
Yes, I'm not saying all of the other users of vduse_dev_msg_sync don't
ignore the return code of the userland VDUSE instance. I'm saying that
we have cases where it is not ignored and the driver can react from
the error. After a fast look for them:
1) Case vhost_vdpa -> VHOST_GET_VRING_BASE -> ops->get_vq_state.
2) Case vhost_vdpa -> ops->vduse_vdpa_set_map -> vduse_dev_update_iotlb
If the userland VDUSE instance returns an error, -EIO is propagated to
vhost_vdpa user, and both can react and continue operating normally.
If we break the device, the same two userlands apps see a totally
different behavior: The device is totally unusable from that moment.
Do we really want to break the device from the moment that VDUSE
instance returns an error in these conditions, and do it in an user
visible change?
> >
> > We can make the function return a bool or int, and then make
> > vhost_vdpa and virtio_vdpa react to that error. QEMU is already
> > prepared for VHOST_VDPA_SET_VRING_ENABLE to return an error, as it is
> > an ioctl,
>
> But we did:
>
> case VHOST_VDPA_SET_VRING_ENABLE:
> if (copy_from_user(&s, argp, sizeof(s)))
> return -EFAULT;
> ops->set_vq_ready(vdpa, idx, s.num);
> return 0;
>
> So the failure come from copy_from_user()
>
Yes. Let me rewrite it as:
We can make ops->set_vq_ready return a bool or int, and then make
vhost_vdpa react to that error. The driver virtio_vdpa already checks
the same by calling get_vq_ready, but there is no equivalent in
vhost_vdpa. I can set a comment explaining the two methods for
checking the error of the call. QEMU is already prepared for handling
the return of an error from VHOST_VDPA_SET_VRING_ENABLE, as the ioctl
already returns errors like -EFAULT, and hopefully the rest of the
users of VHOST_VDPA_SET_VRING_ENABLE are also prepared.
> >
> > Should I change vdpa_config_ops->set_vq_ready so it can return an
> > error, as a prerequisite of this series?
>
> Or it would be better to leave the breaking of device on
> REQ_RESULT_FAILED for future investigation (not blocking this series).
>
I'd say it's the best option, yes. But my vote is to make
VHOST_VDPA_SET_VRING_ENABLE more robust actually :).
> >
> > > >
> > > > The VDUSE userland instance could reply to other requests with
> > > > REQ_RESULT_FAILED and the driver still has capacity to recover from
> > > > the failure.
> > > > If we always break the device on every request fail, we
> > > > deny that possibility.
> > > >
> > >
> > > Thanks
> > >
> >
>
> Thanks
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-02-04 2:48 ` Jason Wang
@ 2026-02-04 8:53 ` Eugenio Perez Martin
2026-02-05 4:04 ` Jason Wang
0 siblings, 1 reply; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-02-04 8:53 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Wed, Feb 4, 2026 at 3:48 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Tue, Feb 3, 2026 at 6:35 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> >
> > On Tue, Feb 3, 2026 at 5:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Fri, Jan 30, 2026 at 3:56 PM Eugenio Perez Martin
> > > <eperezma@redhat.com> wrote:
> > > >
> > > > On Fri, Jan 30, 2026 at 3:18 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jan 29, 2026 at 2:21 PM Eugenio Perez Martin
> > > > > <eperezma@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, Jan 29, 2026 at 2:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > >
> > > > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > > > >
> > > > > > > > The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> > > > > > > > well after initial setup, and the device can read it afterwards.
> > > > > > > >
> > > > > > > > Ensure that reads and writes to vq->ready are SMP safe so that the
> > > > > > > > caller can trust that virtqueue kicks and calls behave as expected
> > > > > > > > immediately after the operation returns.
> > > > > > > >
> > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > ---
> > > > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> > > > > > > > 1 file changed, 26 insertions(+), 8 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > index 73d1d517dc6c..a4963aaf9332 100644
> > > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> > > > > > > > return mask;
> > > > > > > > }
> > > > > > > >
> > > > > > > > +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> > > > > > > > +{
> > > > > > > > + /*
> > > > > > > > + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> > > > > > > > + * it while the VDUSE instance is reading it.
> > > > > > > > + */
> > > > > > > > + return smp_load_acquire(&vq->ready);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> > > > > > > > +{
> > > > > > > > + /*
> > > > > > > > + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> > > > > > > > + * it while the VDUSE instance is reading it.
> > > > > > > > + */
> > > > > > > > + smp_store_release(&vq->ready, ready);
> > > > > > >
> > > > > > > Assuming this is not used in the datapath, I wonder if we can simply
> > > > > > > use vq_lock mutex.
> > > > > > >
> > > > > >
> > > > > > The function vduse_vq_set/get_ready are not in the datapath, but
> > > > > > vduse_vq_kick and vduse_vq_signal_irqfd are. I'm ok if you want to
> > > > > > switch to vq_mutex if you want though, maybe it's even comparable with
> > > > > > the cost of the ioctls or eventfd signaling.
> > > > >
> > > > > I'd like to use mutex for simplicity.
> > > > >
> > > >
> > > > I cannot move it to a mutex, as we need to take it in the critical
> > > > sections of the kick_lock and irq_lock spinlocks.
> > > >
> > > > I can move it to a spinlock, but it seems more complicated to me. We
> > > > need to make sure we always take these kick_lock and irq_lock in the
> > > > same order as the ready_lock, to not create deadlocks, and they always
> > > > protect just the ready boolean. But sure, I'll send the spinlock
> > > > version for V2.
> > >
> > > Thinking about this, I'm not sure I understand the issue.
> > >
> > > Maybe you can give me an example of the race.
> > >
> >
> > It's the same issue as you describe with set_group_asid on cpu1 and
> > dma_map and dma_unmap in cpu1 in this mail [0], but now the functions
> > are set_vq_ready in the vdpa driver side and VDUSE_VQ_GET_INFO in the
> > VDUSE instance.
> >
> > To me, the barriers should be in the driver and the VDUSE instance,
> > and the VDUSE instance should use the VDUSE_SET_VQ_READY message to
> > know when a VQ is ready. But I don't expect all the VDUSE instances to
> > opt-in for that, so the flow without smp barriers is:
> >
> > [cpu0] .set_vq_ready(vq, true)
> > [cpu1] VDUSE_VQ_GET_INFO(vq)
> > [cpu1] check vq->ready, and it races with the set_vq_ready from cpu0.
> >
>
> Are there any side effects of this race? I meant the driver can do
> set_vq_ready() at any time. And there would be message for notifying
> the usersapce in this series.
>
Actually I'm moving to a semaphore for the moment. SMP barrier is not enough.
The message works for the driver to know that the device vq is ready
and it can send kicks. But there is a race for the device where it has
replied to the message but vq->ready is still not true:
static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
u16 idx, bool ready)
{
[...]
r = vduse_dev_msg_sync(dev, &msg);
// The device could try to call() from here...
[...]
out:
// to here
vduse_vq_set_ready(vq, ready);
}
To be honest, I'd prefer to just set ready = true at
vduse_dev_write_iter before returning success to the VDUSE userland.
This way the synchronization struct is omitted entirely. But let me
know if you prefer otherwise.
Thanks!
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-02-04 8:53 ` Eugenio Perez Martin
@ 2026-02-05 4:04 ` Jason Wang
2026-02-05 6:30 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-02-05 4:04 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Wed, Feb 4, 2026 at 4:54 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
>
> On Wed, Feb 4, 2026 at 3:48 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Tue, Feb 3, 2026 at 6:35 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > >
> > > On Tue, Feb 3, 2026 at 5:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Fri, Jan 30, 2026 at 3:56 PM Eugenio Perez Martin
> > > > <eperezma@redhat.com> wrote:
> > > > >
> > > > > On Fri, Jan 30, 2026 at 3:18 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, Jan 29, 2026 at 2:21 PM Eugenio Perez Martin
> > > > > > <eperezma@redhat.com> wrote:
> > > > > > >
> > > > > > > On Thu, Jan 29, 2026 at 2:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> > > > > > > > > well after initial setup, and the device can read it afterwards.
> > > > > > > > >
> > > > > > > > > Ensure that reads and writes to vq->ready are SMP safe so that the
> > > > > > > > > caller can trust that virtqueue kicks and calls behave as expected
> > > > > > > > > immediately after the operation returns.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > ---
> > > > > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> > > > > > > > > 1 file changed, 26 insertions(+), 8 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > index 73d1d517dc6c..a4963aaf9332 100644
> > > > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> > > > > > > > > return mask;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> > > > > > > > > +{
> > > > > > > > > + /*
> > > > > > > > > + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> > > > > > > > > + * it while the VDUSE instance is reading it.
> > > > > > > > > + */
> > > > > > > > > + return smp_load_acquire(&vq->ready);
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> > > > > > > > > +{
> > > > > > > > > + /*
> > > > > > > > > + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> > > > > > > > > + * it while the VDUSE instance is reading it.
> > > > > > > > > + */
> > > > > > > > > + smp_store_release(&vq->ready, ready);
> > > > > > > >
> > > > > > > > Assuming this is not used in the datapath, I wonder if we can simply
> > > > > > > > use vq_lock mutex.
> > > > > > > >
> > > > > > >
> > > > > > > The function vduse_vq_set/get_ready are not in the datapath, but
> > > > > > > vduse_vq_kick and vduse_vq_signal_irqfd are. I'm ok if you want to
> > > > > > > switch to vq_mutex if you want though, maybe it's even comparable with
> > > > > > > the cost of the ioctls or eventfd signaling.
> > > > > >
> > > > > > I'd like to use mutex for simplicity.
> > > > > >
> > > > >
> > > > > I cannot move it to a mutex, as we need to take it in the critical
> > > > > sections of the kick_lock and irq_lock spinlocks.
> > > > >
> > > > > I can move it to a spinlock, but it seems more complicated to me. We
> > > > > need to make sure we always take these kick_lock and irq_lock in the
> > > > > same order as the ready_lock, to not create deadlocks, and they always
> > > > > protect just the ready boolean. But sure, I'll send the spinlock
> > > > > version for V2.
> > > >
> > > > Thinking about this, I'm not sure I understand the issue.
> > > >
> > > > Maybe you can give me an example of the race.
> > > >
> > >
> > > It's the same issue as you describe with set_group_asid on cpu1 and
> > > dma_map and dma_unmap in cpu1 in this mail [0], but now the functions
> > > are set_vq_ready in the vdpa driver side and VDUSE_VQ_GET_INFO in the
> > > VDUSE instance.
> > >
> > > To me, the barriers should be in the driver and the VDUSE instance,
> > > and the VDUSE instance should use the VDUSE_SET_VQ_READY message to
> > > know when a VQ is ready. But I don't expect all the VDUSE instances to
> > > opt-in for that, so the flow without smp barriers is:
> > >
> > > [cpu0] .set_vq_ready(vq, true)
> > > [cpu1] VDUSE_VQ_GET_INFO(vq)
> > > [cpu1] check vq->ready, and it races with the set_vq_ready from cpu0.
> > >
> >
> > Are there any side effects of this race? I meant the driver can do
> > set_vq_ready() at any time. And there would be message for notifying
> > the usersapce in this series.
> >
>
> Actually I'm moving to a semaphore for the moment. SMP barrier is not enough.
>
> The message works for the driver to know that the device vq is ready
> and it can send kicks. But there is a race for the device where it has
> replied to the message but vq->ready is still not true:
>
> static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> u16 idx, bool ready)
> {
> [...]
> r = vduse_dev_msg_sync(dev, &msg);
> // The device could try to call() from here...
>
> [...]
> out:
> // to here
> vduse_vq_set_ready(vq, ready);
> }
>
> To be honest, I'd prefer to just set ready = true at
> vduse_dev_write_iter before returning success to the VDUSE userland.
I think you meant:
vduse_vq_set_ready(vq, ready);
r = vduse_dev_msg_sync(dev, &msg);
> This way the synchronization struct is omitted entirely. But let me
> know if you prefer otherwise.
Let's go this way.
Thanks
>
> Thanks!
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-02-04 7:34 ` Eugenio Perez Martin
@ 2026-02-05 4:08 ` Jason Wang
2026-02-05 6:38 ` Eugenio Perez Martin
0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2026-02-05 4:08 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Wed, Feb 4, 2026 at 3:35 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
>
> On Wed, Feb 4, 2026 at 3:44 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Tue, Feb 3, 2026 at 3:28 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > >
> > > On Tue, Feb 3, 2026 at 5:00 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Fri, Jan 30, 2026 at 4:15 PM Eugenio Perez Martin
> > > > <eperezma@redhat.com> wrote:
> > > > >
> > > > > On Fri, Jan 30, 2026 at 3:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, Jan 29, 2026 at 2:26 PM Eugenio Perez Martin
> > > > > > <eperezma@redhat.com> wrote:
> > > > > > >
> > > > > > > On Thu, Jan 29, 2026 at 3:12 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> > > > > > > > > to explicitly signal userspace when a specific virtqueue has been
> > > > > > > > > enabled.
> > > > > > > > >
> > > > > > > > > In scenarios like Live Migration of VirtIO net devices, the dataplane
> > > > > > > > > starts after the control virtqueue allowing QEMU to apply configuration
> > > > > > > > > in the destination device.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > ---
> > > > > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> > > > > > > > > include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> > > > > > > > > 2 files changed, 46 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > index e7da69c2ad71..1d93b540db4d 100644
> > > > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > @@ -9,6 +9,7 @@
> > > > > > > > > */
> > > > > > > > >
> > > > > > > > > #include "linux/virtio_net.h"
> > > > > > > > > +#include <linux/bits.h>
> > > > > > > > > #include <linux/cleanup.h>
> > > > > > > > > #include <linux/init.h>
> > > > > > > > > #include <linux/module.h>
> > > > > > > > > @@ -53,7 +54,7 @@
> > > > > > > > > #define IRQ_UNBOUND -1
> > > > > > > > >
> > > > > > > > > /* Supported VDUSE features */
> > > > > > > > > -static const uint64_t vduse_features;
> > > > > > > > > +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
> > > > > > > > >
> > > > > > > > > /*
> > > > > > > > > * VDUSE instance have not asked the vduse API version, so assume 0.
> > > > > > > > > @@ -120,6 +121,7 @@ struct vduse_dev {
> > > > > > > > > char *name;
> > > > > > > > > struct mutex lock;
> > > > > > > > > spinlock_t msg_lock;
> > > > > > > > > + u64 vduse_features;
> > > > > > > > > u64 msg_unique;
> > > > > > > > > u32 msg_timeout;
> > > > > > > > > wait_queue_head_t waitq;
> > > > > > > > > @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > > > > > > > > {
> > > > > > > > > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > > > > > > > struct vduse_virtqueue *vq = dev->vqs[idx];
> > > > > > > > > + struct vduse_dev_msg msg = { 0 };
> > > > > > > > > + int r;
> > > > > > > > > +
> > > > > > > > > + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> > > > > > > > > + goto out;
> > > > > > > > > +
> > > > > > > > > + msg.req.type = VDUSE_SET_VQ_READY;
> > > > > > > > > + msg.req.vq_ready.num = idx;
> > > > > > > > > + msg.req.vq_ready.ready = !!ready;
> > > > > > > > > +
> > > > > > > > > + r = vduse_dev_msg_sync(dev, &msg);
> > > > > > > > >
> > > > > > > > > + if (r < 0) {
> > > > > > > > > + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> > > > > > > > > + idx, ready);
> > > > > > > > > +
> > > > > > > > > + /* We can't do better than break the device in this case */
> > > > > > > > > + spin_lock(&dev->msg_lock);
> > > > > > > > > + vduse_dev_broken(dev);
> > > > > > > >
> > > > > > > > This has been done by vduse_dev_msg_sync().
> > > > > > > >
> > > > > > >
> > > > > > > This is done by msg_sync() when userland does not reply in a
> > > > > > > timeframe, but not when userland replies with VDUSE_REQ_RESULT_FAILED.
> > > > > > > Should I add a comment?
> > > > > >
> > > > > > If this is not specific to Q_READY, I think we need to move it to
> > > > > > msg_sync() as well.
> > > > > >
> > > > >
> > > > > It's specific to Q_READY for me, as it's the request that returns void
> > > > > and has no possibility to inform of an error.
> > > >
> > > > I may miss something, I mean why consider the failure of Q_READY to be
> > > > more serious than the failure of other commands (e.g set_status()).
> > > >
> > >
> > > I'm not considering the failure of Q_READY more serious than any other
> > > failure. I'm breaking the device here as I cannot return the error to
> > > the vDPA driver: This function returns void.
> >
> > Yes, and set_status() return void as well.
> >
> > void virtio_add_status(struct virtio_device *dev, unsigned int status)
> > {
> > might_sleep();
> > => dev->config->set_status(dev, dev->config->get_status(dev) | status);
> > }
> >
>
> Yes, I'm not saying all of the other users of vduse_dev_msg_sync don't
> ignore the return code of the userland VDUSE instance. I'm saying that
> we have cases where it is not ignored and the driver can react from
> the error. After a fast look for them:
>
> 1) Case vhost_vdpa -> VHOST_GET_VRING_BASE -> ops->get_vq_state.
> 2) Case vhost_vdpa -> ops->vduse_vdpa_set_map -> vduse_dev_update_iotlb
>
> If the userland VDUSE instance returns an error, -EIO is propagated to
> vhost_vdpa user, and both can react and continue operating normally.
> If we break the device, the same two userlands apps see a totally
> different behavior: The device is totally unusable from that moment.
>
> Do we really want to break the device from the moment that VDUSE
> instance returns an error in these conditions, and do it in an user
> visible change?
Ok, I think you worries about that if we do it for set_status() it
might break userspace. That makes sense.
>
> > >
> > > We can make the function return a bool or int, and then make
> > > vhost_vdpa and virtio_vdpa react to that error. QEMU is already
> > > prepared for VHOST_VDPA_SET_VRING_ENABLE to return an error, as it is
> > > an ioctl,
> >
> > But we did:
> >
> > case VHOST_VDPA_SET_VRING_ENABLE:
> > if (copy_from_user(&s, argp, sizeof(s)))
> > return -EFAULT;
> > ops->set_vq_ready(vdpa, idx, s.num);
> > return 0;
> >
> > So the failure come from copy_from_user()
> >
>
> Yes. Let me rewrite it as:
>
> We can make ops->set_vq_ready return a bool or int, and then make
> vhost_vdpa react to that error. The driver virtio_vdpa already checks
> the same by calling get_vq_ready, but there is no equivalent in
> vhost_vdpa. I can set a comment explaining the two methods for
> checking the error of the call. QEMU is already prepared for handling
> the return of an error from VHOST_VDPA_SET_VRING_ENABLE, as the ioctl
> already returns errors like -EFAULT, and hopefully the rest of the
> users of VHOST_VDPA_SET_VRING_ENABLE are also prepared.
>
> > >
> > > Should I change vdpa_config_ops->set_vq_ready so it can return an
> > > error, as a prerequisite of this series?
> >
> > Or it would be better to leave the breaking of device on
> > REQ_RESULT_FAILED for future investigation (not blocking this series).
> >
>
> I'd say it's the best option, yes. But my vote is to make
> VHOST_VDPA_SET_VRING_ENABLE more robust actually :).
Ok, I think then it would be better to use a separate patch in this series?
Thanks
>
> > >
> > > > >
> > > > > The VDUSE userland instance could reply to other requests with
> > > > > REQ_RESULT_FAILED and the driver still has capacity to recover from
> > > > > the failure.
> > > > > If we always break the device on every request fail, we
> > > > > deny that possibility.
> > > > >
> > > >
> > > > Thanks
> > > >
> > >
> >
> > Thanks
> >
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/6] vduse: ensure vq->ready access is smp safe
2026-02-05 4:04 ` Jason Wang
@ 2026-02-05 6:30 ` Eugenio Perez Martin
0 siblings, 0 replies; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-02-05 6:30 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Thu, Feb 5, 2026 at 5:04 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Feb 4, 2026 at 4:54 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> >
> > On Wed, Feb 4, 2026 at 3:48 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Tue, Feb 3, 2026 at 6:35 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > > >
> > > > On Tue, Feb 3, 2026 at 5:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Fri, Jan 30, 2026 at 3:56 PM Eugenio Perez Martin
> > > > > <eperezma@redhat.com> wrote:
> > > > > >
> > > > > > On Fri, Jan 30, 2026 at 3:18 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > >
> > > > > > > On Thu, Jan 29, 2026 at 2:21 PM Eugenio Perez Martin
> > > > > > > <eperezma@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Thu, Jan 29, 2026 at 2:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > The vduse_vdpa_set_vq_ready can be called in the lifetime of the device
> > > > > > > > > > well after initial setup, and the device can read it afterwards.
> > > > > > > > > >
> > > > > > > > > > Ensure that reads and writes to vq->ready are SMP safe so that the
> > > > > > > > > > caller can trust that virtqueue kicks and calls behave as expected
> > > > > > > > > > immediately after the operation returns.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > ---
> > > > > > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 34 +++++++++++++++++++++++-------
> > > > > > > > > > 1 file changed, 26 insertions(+), 8 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > > index 73d1d517dc6c..a4963aaf9332 100644
> > > > > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > > @@ -460,6 +460,24 @@ static __poll_t vduse_dev_poll(struct file *file, poll_table *wait)
> > > > > > > > > > return mask;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > +static bool vduse_vq_get_ready(const struct vduse_virtqueue *vq)
> > > > > > > > > > +{
> > > > > > > > > > + /*
> > > > > > > > > > + * Paired with vduse_vq_set_ready smp_store, as the driver may modify
> > > > > > > > > > + * it while the VDUSE instance is reading it.
> > > > > > > > > > + */
> > > > > > > > > > + return smp_load_acquire(&vq->ready);
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > > +static void vduse_vq_set_ready(struct vduse_virtqueue *vq, bool ready)
> > > > > > > > > > +{
> > > > > > > > > > + /*
> > > > > > > > > > + * Paired with vduse_vq_get_ready smp_load, as the driver may modify
> > > > > > > > > > + * it while the VDUSE instance is reading it.
> > > > > > > > > > + */
> > > > > > > > > > + smp_store_release(&vq->ready, ready);
> > > > > > > > >
> > > > > > > > > Assuming this is not used in the datapath, I wonder if we can simply
> > > > > > > > > use vq_lock mutex.
> > > > > > > > >
> > > > > > > >
> > > > > > > > The function vduse_vq_set/get_ready are not in the datapath, but
> > > > > > > > vduse_vq_kick and vduse_vq_signal_irqfd are. I'm ok if you want to
> > > > > > > > switch to vq_mutex if you want though, maybe it's even comparable with
> > > > > > > > the cost of the ioctls or eventfd signaling.
> > > > > > >
> > > > > > > I'd like to use mutex for simplicity.
> > > > > > >
> > > > > >
> > > > > > I cannot move it to a mutex, as we need to take it in the critical
> > > > > > sections of the kick_lock and irq_lock spinlocks.
> > > > > >
> > > > > > I can move it to a spinlock, but it seems more complicated to me. We
> > > > > > need to make sure we always take these kick_lock and irq_lock in the
> > > > > > same order as the ready_lock, to not create deadlocks, and they always
> > > > > > protect just the ready boolean. But sure, I'll send the spinlock
> > > > > > version for V2.
> > > > >
> > > > > Thinking about this, I'm not sure I understand the issue.
> > > > >
> > > > > Maybe you can give me an example of the race.
> > > > >
> > > >
> > > > It's the same issue as you describe with set_group_asid on cpu1 and
> > > > dma_map and dma_unmap in cpu1 in this mail [0], but now the functions
> > > > are set_vq_ready in the vdpa driver side and VDUSE_VQ_GET_INFO in the
> > > > VDUSE instance.
> > > >
> > > > To me, the barriers should be in the driver and the VDUSE instance,
> > > > and the VDUSE instance should use the VDUSE_SET_VQ_READY message to
> > > > know when a VQ is ready. But I don't expect all the VDUSE instances to
> > > > opt-in for that, so the flow without smp barriers is:
> > > >
> > > > [cpu0] .set_vq_ready(vq, true)
> > > > [cpu1] VDUSE_VQ_GET_INFO(vq)
> > > > [cpu1] check vq->ready, and it races with the set_vq_ready from cpu0.
> > > >
> > >
> > > Are there any side effects of this race? I meant the driver can do
> > > set_vq_ready() at any time. And there would be message for notifying
> > > the usersapce in this series.
> > >
> >
> > Actually I'm moving to a semaphore for the moment. SMP barrier is not enough.
> >
> > The message works for the driver to know that the device vq is ready
> > and it can send kicks. But there is a race for the device where it has
> > replied to the message but vq->ready is still not true:
> >
> > static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > u16 idx, bool ready)
> > {
> > [...]
> > r = vduse_dev_msg_sync(dev, &msg);
> > // The device could try to call() from here...
> >
> > [...]
> > out:
> > // to here
> > vduse_vq_set_ready(vq, ready);
> > }
> >
> > To be honest, I'd prefer to just set ready = true at
> > vduse_dev_write_iter before returning success to the VDUSE userland.
>
> I think you meant:
>
> vduse_vq_set_ready(vq, ready);
Since this function's error path is to break the device, I think it
can work, yes. I'm sending the V2 with this.
Thanks!
> r = vduse_dev_msg_sync(dev, &msg);
>
> > This way the synchronization struct is omitted entirely. But let me
> > know if you prefer otherwise.
>
> Let's go this way.
>
> Thanks
>
> >
> > Thanks!
> >
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 5/6] vduse: add F_QUEUE_READY feature
2026-02-05 4:08 ` Jason Wang
@ 2026-02-05 6:38 ` Eugenio Perez Martin
0 siblings, 0 replies; 33+ messages in thread
From: Eugenio Perez Martin @ 2026-02-05 6:38 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S . Tsirkin, Xuan Zhuo, Cindy Lu, Laurent Vivier,
Stefano Garzarella, linux-kernel, Maxime Coquelin, Yongji Xie,
virtualization
On Thu, Feb 5, 2026 at 5:09 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Feb 4, 2026 at 3:35 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> >
> > On Wed, Feb 4, 2026 at 3:44 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Tue, Feb 3, 2026 at 3:28 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > > >
> > > > On Tue, Feb 3, 2026 at 5:00 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Fri, Jan 30, 2026 at 4:15 PM Eugenio Perez Martin
> > > > > <eperezma@redhat.com> wrote:
> > > > > >
> > > > > > On Fri, Jan 30, 2026 at 3:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > >
> > > > > > > On Thu, Jan 29, 2026 at 2:26 PM Eugenio Perez Martin
> > > > > > > <eperezma@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Thu, Jan 29, 2026 at 3:12 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Jan 28, 2026 at 8:45 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Add the VDUSE_F_QUEUE_READY feature flag. This allows the kernel module
> > > > > > > > > > to explicitly signal userspace when a specific virtqueue has been
> > > > > > > > > > enabled.
> > > > > > > > > >
> > > > > > > > > > In scenarios like Live Migration of VirtIO net devices, the dataplane
> > > > > > > > > > starts after the control virtqueue allowing QEMU to apply configuration
> > > > > > > > > > in the destination device.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > ---
> > > > > > > > > > drivers/vdpa/vdpa_user/vduse_dev.c | 28 +++++++++++++++++++++++++++-
> > > > > > > > > > include/uapi/linux/vduse.h | 19 +++++++++++++++++++
> > > > > > > > > > 2 files changed, 46 insertions(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > > index e7da69c2ad71..1d93b540db4d 100644
> > > > > > > > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > > > > > > > > @@ -9,6 +9,7 @@
> > > > > > > > > > */
> > > > > > > > > >
> > > > > > > > > > #include "linux/virtio_net.h"
> > > > > > > > > > +#include <linux/bits.h>
> > > > > > > > > > #include <linux/cleanup.h>
> > > > > > > > > > #include <linux/init.h>
> > > > > > > > > > #include <linux/module.h>
> > > > > > > > > > @@ -53,7 +54,7 @@
> > > > > > > > > > #define IRQ_UNBOUND -1
> > > > > > > > > >
> > > > > > > > > > /* Supported VDUSE features */
> > > > > > > > > > -static const uint64_t vduse_features;
> > > > > > > > > > +static const uint64_t vduse_features = BIT_U64(VDUSE_F_QUEUE_READY);
> > > > > > > > > >
> > > > > > > > > > /*
> > > > > > > > > > * VDUSE instance have not asked the vduse API version, so assume 0.
> > > > > > > > > > @@ -120,6 +121,7 @@ struct vduse_dev {
> > > > > > > > > > char *name;
> > > > > > > > > > struct mutex lock;
> > > > > > > > > > spinlock_t msg_lock;
> > > > > > > > > > + u64 vduse_features;
> > > > > > > > > > u64 msg_unique;
> > > > > > > > > > u32 msg_timeout;
> > > > > > > > > > wait_queue_head_t waitq;
> > > > > > > > > > @@ -619,7 +621,30 @@ static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa,
> > > > > > > > > > {
> > > > > > > > > > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > > > > > > > > struct vduse_virtqueue *vq = dev->vqs[idx];
> > > > > > > > > > + struct vduse_dev_msg msg = { 0 };
> > > > > > > > > > + int r;
> > > > > > > > > > +
> > > > > > > > > > + if (!(dev->vduse_features & BIT_U64(VDUSE_F_QUEUE_READY)))
> > > > > > > > > > + goto out;
> > > > > > > > > > +
> > > > > > > > > > + msg.req.type = VDUSE_SET_VQ_READY;
> > > > > > > > > > + msg.req.vq_ready.num = idx;
> > > > > > > > > > + msg.req.vq_ready.ready = !!ready;
> > > > > > > > > > +
> > > > > > > > > > + r = vduse_dev_msg_sync(dev, &msg);
> > > > > > > > > >
> > > > > > > > > > + if (r < 0) {
> > > > > > > > > > + dev_dbg(&vdpa->dev, "device refuses to set vq %u ready %u",
> > > > > > > > > > + idx, ready);
> > > > > > > > > > +
> > > > > > > > > > + /* We can't do better than break the device in this case */
> > > > > > > > > > + spin_lock(&dev->msg_lock);
> > > > > > > > > > + vduse_dev_broken(dev);
> > > > > > > > >
> > > > > > > > > This has been done by vduse_dev_msg_sync().
> > > > > > > > >
> > > > > > > >
> > > > > > > > This is done by msg_sync() when userland does not reply in a
> > > > > > > > timeframe, but not when userland replies with VDUSE_REQ_RESULT_FAILED.
> > > > > > > > Should I add a comment?
> > > > > > >
> > > > > > > If this is not specific to Q_READY, I think we need to move it to
> > > > > > > msg_sync() as well.
> > > > > > >
> > > > > >
> > > > > > It's specific to Q_READY for me, as it's the request that returns void
> > > > > > and has no possibility to inform of an error.
> > > > >
> > > > > I may miss something, I mean why consider the failure of Q_READY to be
> > > > > more serious than the failure of other commands (e.g set_status()).
> > > > >
> > > >
> > > > I'm not considering the failure of Q_READY more serious than any other
> > > > failure. I'm breaking the device here as I cannot return the error to
> > > > the vDPA driver: This function returns void.
> > >
> > > Yes, and set_status() return void as well.
> > >
> > > void virtio_add_status(struct virtio_device *dev, unsigned int status)
> > > {
> > > might_sleep();
> > > => dev->config->set_status(dev, dev->config->get_status(dev) | status);
> > > }
> > >
> >
> > Yes, I'm not saying all of the other users of vduse_dev_msg_sync don't
> > ignore the return code of the userland VDUSE instance. I'm saying that
> > we have cases where it is not ignored and the driver can react from
> > the error. After a fast look for them:
> >
> > 1) Case vhost_vdpa -> VHOST_GET_VRING_BASE -> ops->get_vq_state.
> > 2) Case vhost_vdpa -> ops->vduse_vdpa_set_map -> vduse_dev_update_iotlb
> >
> > If the userland VDUSE instance returns an error, -EIO is propagated to
> > vhost_vdpa user, and both can react and continue operating normally.
> > If we break the device, the same two userlands apps see a totally
> > different behavior: The device is totally unusable from that moment.
> >
> > Do we really want to break the device from the moment that VDUSE
> > instance returns an error in these conditions, and do it in an user
> > visible change?
>
> Ok, I think you worries about that if we do it for set_status() it
> might break userspace. That makes sense.
>
> >
> > > >
> > > > We can make the function return a bool or int, and then make
> > > > vhost_vdpa and virtio_vdpa react to that error. QEMU is already
> > > > prepared for VHOST_VDPA_SET_VRING_ENABLE to return an error, as it is
> > > > an ioctl,
> > >
> > > But we did:
> > >
> > > case VHOST_VDPA_SET_VRING_ENABLE:
> > > if (copy_from_user(&s, argp, sizeof(s)))
> > > return -EFAULT;
> > > ops->set_vq_ready(vdpa, idx, s.num);
> > > return 0;
> > >
> > > So the failure come from copy_from_user()
> > >
> >
> > Yes. Let me rewrite it as:
> >
> > We can make ops->set_vq_ready return a bool or int, and then make
> > vhost_vdpa react to that error. The driver virtio_vdpa already checks
> > the same by calling get_vq_ready, but there is no equivalent in
> > vhost_vdpa. I can set a comment explaining the two methods for
> > checking the error of the call. QEMU is already prepared for handling
> > the return of an error from VHOST_VDPA_SET_VRING_ENABLE, as the ioctl
> > already returns errors like -EFAULT, and hopefully the rest of the
> > users of VHOST_VDPA_SET_VRING_ENABLE are also prepared.
> >
> > > >
> > > > Should I change vdpa_config_ops->set_vq_ready so it can return an
> > > > error, as a prerequisite of this series?
> > >
> > > Or it would be better to leave the breaking of device on
> > > REQ_RESULT_FAILED for future investigation (not blocking this series).
> > >
> >
> > I'd say it's the best option, yes. But my vote is to make
> > VHOST_VDPA_SET_VRING_ENABLE more robust actually :).
>
> Ok, I think then it would be better to use a separate patch in this series?
>
Yes, I'm ok with leaving this as a change for future series.
Thanks!
^ permalink raw reply [flat|nested] 33+ messages in thread
end of thread, other threads:[~2026-02-05 6:38 UTC | newest]
Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-28 12:45 [PATCH 0/6] Add queue ready message to VDUSE Eugenio Pérez
2026-01-28 12:45 ` [PATCH 1/6] vduse: ensure vq->ready access is smp safe Eugenio Pérez
2026-01-29 1:16 ` Jason Wang
2026-01-29 6:20 ` Eugenio Perez Martin
2026-01-30 2:18 ` Jason Wang
2026-01-30 7:56 ` Eugenio Perez Martin
2026-02-03 4:05 ` Jason Wang
2026-02-03 10:35 ` Eugenio Perez Martin
2026-02-04 2:48 ` Jason Wang
2026-02-04 8:53 ` Eugenio Perez Martin
2026-02-05 4:04 ` Jason Wang
2026-02-05 6:30 ` Eugenio Perez Martin
2026-01-28 12:45 ` [PATCH 2/6] vduse: store control device pointer Eugenio Pérez
2026-01-28 12:45 ` [PATCH 3/6] vduse: Add API v2 definition Eugenio Pérez
2026-01-29 2:00 ` Jason Wang
2026-01-29 8:07 ` Eugenio Perez Martin
2026-01-30 2:17 ` Jason Wang
2026-01-30 8:12 ` Eugenio Perez Martin
2026-01-28 12:45 ` [PATCH 4/6] vduse: add VDUSE_GET_FEATURES ioctl Eugenio Pérez
2026-01-29 2:10 ` Jason Wang
2026-01-29 8:03 ` Eugenio Perez Martin
2026-01-28 12:45 ` [PATCH 5/6] vduse: add F_QUEUE_READY feature Eugenio Pérez
2026-01-29 2:12 ` Jason Wang
2026-01-29 6:26 ` Eugenio Perez Martin
2026-01-30 2:17 ` Jason Wang
2026-01-30 8:14 ` Eugenio Perez Martin
2026-02-03 4:00 ` Jason Wang
2026-02-03 7:27 ` Eugenio Perez Martin
2026-02-04 2:44 ` Jason Wang
2026-02-04 7:34 ` Eugenio Perez Martin
2026-02-05 4:08 ` Jason Wang
2026-02-05 6:38 ` Eugenio Perez Martin
2026-01-28 12:45 ` [PATCH 6/6] vduse: advertise API V2 support Eugenio Pérez
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox