* [PATCH v3 0/6] Add multiple address spaces support to VDUSE
@ 2025-09-19 9:32 Eugenio Pérez
2025-09-19 9:32 ` [PATCH v3 1/6] vduse: make domain_lock an rwlock Eugenio Pérez
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Eugenio Pérez @ 2025-09-19 9:32 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Maxime Coquelin, jasowang, Xuan Zhuo, linux-kernel,
Laurent Vivier, Stefano Garzarella, Yongji Xie,
Eugenio Pérez, Cindy Lu, virtualization
When the device is used by the vhost-vDPA bus driver for a VM, the
control virtqueue should be shadowed by the userspace VMM (QEMU) instead
of being assigned directly to the guest. This is because QEMU needs to
know the device state in order to start and stop the device correctly
(e.g. for live migration).
This requires isolating the memory mapping for the control virtqueue
presented by vhost-vDPA, to prevent the guest from accessing it directly.
This series adds support for multiple address spaces to the VDUSE
device, allowing selective virtqueue isolation through address space IDs
(ASIDs).
The VDUSE device needs to report:
* The number of virtqueue groups
* The association of each virtqueue with its vq group
* The number of address spaces supported
Then, the vDPA driver can modify the ASID assigned to each vq group to
isolate its memory address space. This aligns VDUSE with vdpa_sim and
NVIDIA mlx5 devices, which already support ASIDs.
This isolates the environment of the virtqueues that will not be
assigned directly to the guest. E.g. in the case of virtio-net, the
control virtqueue will not be assigned directly to the guest.
Also, to be able to test this patch, the user needs to manually revert
56e71885b034 ("vduse: Temporarily fail if control queue feature requested").
PATCH v3:
* Make the default group an invalid group as long as VDUSE device does
not set it to some valid u32 value. Modify the vdpa core to take that
into account (Jason). Adapt all the virtio_map_ops callbacks to it.
* Make setting the DRIVER_OK status fail if a vq group is not valid.
* Create the VDUSE_DEV_MAX_GROUPS and VDUSE_DEV_MAX_AS constants instead of
using magic numbers.
* Remove the _int name suffix from struct vduse_vq_group.
* Get the vduse domain through the vduse_as in the map functions (Jason).
* Squash the patch implementing the AS logic with the patch creating the
vduse_as struct (Jason).
PATCH v2:
* Now the vq group is in vduse_vq_config struct instead of issuing one
VDUSE message per vq.
* Convert the use of mutex to rwlock (Xie Yongji).
PATCH v1:
* Fix: Remove BIT_ULL(VIRTIO_S_*), as _S_ is already the bit (Maxime)
* Using vduse_vq_group_int directly instead of an empty struct in union
virtio_map.
RFC v3:
* Increase VDUSE_MAX_VQ_GROUPS to 0xffff (Jason). It was set to a lower
value to reduce memory consumption, but vqs are already limited to
that value and userspace VDUSE is able to allocate that many vqs. Also, it's
a dynamic array now. Same with ASID.
* Move the valid vq groups range check to vduse_validate_config.
* Embed vduse_iotlb_entry into vduse_iotlb_entry_v2.
* Use of array_index_nospec in VDUSE device ioctls.
* Move the umem mutex to asid struct so there is no contention between
ASIDs.
* Remove the descs vq group capability as it will not be used and we can
add it on top.
* Do not send vq group messages if the number of vq groups is < 2.
* Remove TODO about merging VDUSE_IOTLB_GET_FD ioctl with
VDUSE_IOTLB_GET_INFO.
RFC v2:
* Cache group information in kernel, as we need to provide the vq map
tokens properly.
* Add descs vq group to optimize SVQ forwarding and support indirect
descriptors out of the box.
* Make iotlb entry the last one of vduse_iotlb_entry_v2 so the first
part of the struct is the same.
* Fixes detected testing with OVS+VDUSE.
Eugenio Pérez (6):
vduse: make domain_lock an rwlock
vduse: add v1 API definition
vduse: add vq group support
vduse: return internal vq group struct as map token
vduse: add vq group asid support
vduse: bump version number
drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +-
drivers/vdpa/vdpa_user/vduse_dev.c | 506 ++++++++++++++++++++++-------
drivers/vhost/vdpa.c | 11 +-
include/linux/vdpa.h | 5 +-
include/linux/virtio.h | 6 +-
include/uapi/linux/vduse.h | 65 +++-
8 files changed, 467 insertions(+), 132 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v3 1/6] vduse: make domain_lock an rwlock
2025-09-19 9:32 [PATCH v3 0/6] Add multiple address spaces support to VDUSE Eugenio Pérez
@ 2025-09-19 9:32 ` Eugenio Pérez
2025-09-19 9:32 ` [PATCH v3 2/6] vduse: add v1 API definition Eugenio Pérez
2025-09-19 9:36 ` [PATCH v3 0/6] Add multiple address spaces support to VDUSE Eugenio Perez Martin
2 siblings, 0 replies; 5+ messages in thread
From: Eugenio Pérez @ 2025-09-19 9:32 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Maxime Coquelin, jasowang, Xuan Zhuo, linux-kernel,
Laurent Vivier, Stefano Garzarella, Yongji Xie,
Eugenio Pérez, Cindy Lu, virtualization
Upcoming patches will take this lock read-only in a few more scenarios,
so convert it to an rwlock to make it more scalable.
Suggested-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
v2: New in v2
---
drivers/vdpa/vdpa_user/vduse_dev.c | 41 +++++++++++++++---------------
1 file changed, 21 insertions(+), 20 deletions(-)
diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index e7bced0b5542..2b6a8958ffe0 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -14,6 +14,7 @@
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/eventfd.h>
+#include <linux/rwlock.h>
#include <linux/slab.h>
#include <linux/wait.h>
#include <linux/dma-map-ops.h>
@@ -117,7 +118,7 @@ struct vduse_dev {
struct vduse_umem *umem;
struct mutex mem_lock;
unsigned int bounce_size;
- struct mutex domain_lock;
+ rwlock_t domain_lock;
};
struct vduse_dev_msg {
@@ -1176,9 +1177,9 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
if (entry.start > entry.last)
break;
- mutex_lock(&dev->domain_lock);
+ read_lock(&dev->domain_lock);
if (!dev->domain) {
- mutex_unlock(&dev->domain_lock);
+ read_unlock(&dev->domain_lock);
break;
}
spin_lock(&dev->domain->iotlb_lock);
@@ -1193,7 +1194,7 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
entry.perm = map->perm;
}
spin_unlock(&dev->domain->iotlb_lock);
- mutex_unlock(&dev->domain_lock);
+ read_unlock(&dev->domain_lock);
ret = -EINVAL;
if (!f)
break;
@@ -1346,10 +1347,10 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
sizeof(umem.reserved)))
break;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
ret = vduse_dev_reg_umem(dev, umem.iova,
umem.uaddr, umem.size);
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
break;
}
case VDUSE_IOTLB_DEREG_UMEM: {
@@ -1363,10 +1364,10 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
if (!is_mem_zero((const char *)umem.reserved,
sizeof(umem.reserved)))
break;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
ret = vduse_dev_dereg_umem(dev, umem.iova,
umem.size);
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
break;
}
case VDUSE_IOTLB_GET_INFO: {
@@ -1385,9 +1386,9 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
sizeof(info.reserved)))
break;
- mutex_lock(&dev->domain_lock);
+ read_lock(&dev->domain_lock);
if (!dev->domain) {
- mutex_unlock(&dev->domain_lock);
+ read_unlock(&dev->domain_lock);
break;
}
spin_lock(&dev->domain->iotlb_lock);
@@ -1402,7 +1403,7 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
info.capability |= VDUSE_IOVA_CAP_UMEM;
}
spin_unlock(&dev->domain->iotlb_lock);
- mutex_unlock(&dev->domain_lock);
+ read_unlock(&dev->domain_lock);
if (!map)
break;
@@ -1425,10 +1426,10 @@ static int vduse_dev_release(struct inode *inode, struct file *file)
{
struct vduse_dev *dev = file->private_data;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
if (dev->domain)
vduse_dev_dereg_umem(dev, 0, dev->domain->bounce_size);
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
spin_lock(&dev->msg_lock);
/* Make sure the inflight messages can processed after reconncection */
list_splice_init(&dev->recv_list, &dev->send_list);
@@ -1647,7 +1648,7 @@ static struct vduse_dev *vduse_dev_create(void)
mutex_init(&dev->lock);
mutex_init(&dev->mem_lock);
- mutex_init(&dev->domain_lock);
+ rwlock_init(&dev->domain_lock);
spin_lock_init(&dev->msg_lock);
INIT_LIST_HEAD(&dev->send_list);
INIT_LIST_HEAD(&dev->recv_list);
@@ -1805,7 +1806,7 @@ static ssize_t bounce_size_store(struct device *device,
int ret;
ret = -EPERM;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
if (dev->domain)
goto unlock;
@@ -1821,7 +1822,7 @@ static ssize_t bounce_size_store(struct device *device,
dev->bounce_size = bounce_size & PAGE_MASK;
ret = count;
unlock:
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
return ret;
}
@@ -2045,11 +2046,11 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
if (ret)
return ret;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
if (!dev->domain)
dev->domain = vduse_domain_create(VDUSE_IOVA_SIZE - 1,
dev->bounce_size);
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
if (!dev->domain) {
put_device(&dev->vdev->vdpa.dev);
return -ENOMEM;
@@ -2059,10 +2060,10 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
ret = _vdpa_register_device(&dev->vdev->vdpa, dev->vq_num);
if (ret) {
put_device(&dev->vdev->vdpa.dev);
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
vduse_domain_destroy(dev->domain);
dev->domain = NULL;
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
return ret;
}
--
2.51.0
* [PATCH v3 2/6] vduse: add v1 API definition
2025-09-19 9:32 [PATCH v3 0/6] Add multiple address spaces support to VDUSE Eugenio Pérez
2025-09-19 9:32 ` [PATCH v3 1/6] vduse: make domain_lock an rwlock Eugenio Pérez
@ 2025-09-19 9:32 ` Eugenio Pérez
2025-09-19 9:36 ` [PATCH v3 0/6] Add multiple address spaces support to VDUSE Eugenio Perez Martin
2 siblings, 0 replies; 5+ messages in thread
From: Eugenio Pérez @ 2025-09-19 9:32 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Maxime Coquelin, jasowang, Xuan Zhuo, linux-kernel,
Laurent Vivier, Stefano Garzarella, Yongji Xie,
Eugenio Pérez, Cindy Lu, virtualization
This allows the kernel to detect whether the userspace VDUSE device
supports the vq group and ASID features. VDUSE devices that don't set
the V1 API will not receive the new messages, and the vdpa device will
be created with only one vq group and ASID.
The next patches implement the new features incrementally; the VDUSE
device is only allowed to set the V1 API version at the end of the
series.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
include/uapi/linux/vduse.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
index 10ad71aa00d6..ccb92a1efce0 100644
--- a/include/uapi/linux/vduse.h
+++ b/include/uapi/linux/vduse.h
@@ -10,6 +10,10 @@
#define VDUSE_API_VERSION 0
+/* VQ groups and ASID support */
+
+#define VDUSE_API_VERSION_1 1
+
/*
* Get the version of VDUSE API that kernel supported (VDUSE_API_VERSION).
* This is used for future extension.
--
2.51.0
* [PATCH v3 1/6] vduse: make domain_lock an rwlock
2025-09-19 9:33 Eugenio Pérez
@ 2025-09-19 9:33 ` Eugenio Pérez
0 siblings, 0 replies; 5+ messages in thread
From: Eugenio Pérez @ 2025-09-19 9:33 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Eugenio Pérez, Yongji Xie, Maxime Coquelin, linux-kernel,
Xuan Zhuo, virtualization, Cindy Lu, jasowang, Laurent Vivier,
Stefano Garzarella
Upcoming patches will take this lock read-only in a few more scenarios,
so convert it to an rwlock to make it more scalable.
Suggested-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
v2: New in v2
---
drivers/vdpa/vdpa_user/vduse_dev.c | 41 +++++++++++++++---------------
1 file changed, 21 insertions(+), 20 deletions(-)
diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index e7bced0b5542..2b6a8958ffe0 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -14,6 +14,7 @@
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/eventfd.h>
+#include <linux/rwlock.h>
#include <linux/slab.h>
#include <linux/wait.h>
#include <linux/dma-map-ops.h>
@@ -117,7 +118,7 @@ struct vduse_dev {
struct vduse_umem *umem;
struct mutex mem_lock;
unsigned int bounce_size;
- struct mutex domain_lock;
+ rwlock_t domain_lock;
};
struct vduse_dev_msg {
@@ -1176,9 +1177,9 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
if (entry.start > entry.last)
break;
- mutex_lock(&dev->domain_lock);
+ read_lock(&dev->domain_lock);
if (!dev->domain) {
- mutex_unlock(&dev->domain_lock);
+ read_unlock(&dev->domain_lock);
break;
}
spin_lock(&dev->domain->iotlb_lock);
@@ -1193,7 +1194,7 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
entry.perm = map->perm;
}
spin_unlock(&dev->domain->iotlb_lock);
- mutex_unlock(&dev->domain_lock);
+ read_unlock(&dev->domain_lock);
ret = -EINVAL;
if (!f)
break;
@@ -1346,10 +1347,10 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
sizeof(umem.reserved)))
break;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
ret = vduse_dev_reg_umem(dev, umem.iova,
umem.uaddr, umem.size);
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
break;
}
case VDUSE_IOTLB_DEREG_UMEM: {
@@ -1363,10 +1364,10 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
if (!is_mem_zero((const char *)umem.reserved,
sizeof(umem.reserved)))
break;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
ret = vduse_dev_dereg_umem(dev, umem.iova,
umem.size);
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
break;
}
case VDUSE_IOTLB_GET_INFO: {
@@ -1385,9 +1386,9 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
sizeof(info.reserved)))
break;
- mutex_lock(&dev->domain_lock);
+ read_lock(&dev->domain_lock);
if (!dev->domain) {
- mutex_unlock(&dev->domain_lock);
+ read_unlock(&dev->domain_lock);
break;
}
spin_lock(&dev->domain->iotlb_lock);
@@ -1402,7 +1403,7 @@ static long vduse_dev_ioctl(struct file *file, unsigned int cmd,
info.capability |= VDUSE_IOVA_CAP_UMEM;
}
spin_unlock(&dev->domain->iotlb_lock);
- mutex_unlock(&dev->domain_lock);
+ read_unlock(&dev->domain_lock);
if (!map)
break;
@@ -1425,10 +1426,10 @@ static int vduse_dev_release(struct inode *inode, struct file *file)
{
struct vduse_dev *dev = file->private_data;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
if (dev->domain)
vduse_dev_dereg_umem(dev, 0, dev->domain->bounce_size);
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
spin_lock(&dev->msg_lock);
/* Make sure the inflight messages can processed after reconncection */
list_splice_init(&dev->recv_list, &dev->send_list);
@@ -1647,7 +1648,7 @@ static struct vduse_dev *vduse_dev_create(void)
mutex_init(&dev->lock);
mutex_init(&dev->mem_lock);
- mutex_init(&dev->domain_lock);
+ rwlock_init(&dev->domain_lock);
spin_lock_init(&dev->msg_lock);
INIT_LIST_HEAD(&dev->send_list);
INIT_LIST_HEAD(&dev->recv_list);
@@ -1805,7 +1806,7 @@ static ssize_t bounce_size_store(struct device *device,
int ret;
ret = -EPERM;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
if (dev->domain)
goto unlock;
@@ -1821,7 +1822,7 @@ static ssize_t bounce_size_store(struct device *device,
dev->bounce_size = bounce_size & PAGE_MASK;
ret = count;
unlock:
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
return ret;
}
@@ -2045,11 +2046,11 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
if (ret)
return ret;
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
if (!dev->domain)
dev->domain = vduse_domain_create(VDUSE_IOVA_SIZE - 1,
dev->bounce_size);
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
if (!dev->domain) {
put_device(&dev->vdev->vdpa.dev);
return -ENOMEM;
@@ -2059,10 +2060,10 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
ret = _vdpa_register_device(&dev->vdev->vdpa, dev->vq_num);
if (ret) {
put_device(&dev->vdev->vdpa.dev);
- mutex_lock(&dev->domain_lock);
+ write_lock(&dev->domain_lock);
vduse_domain_destroy(dev->domain);
dev->domain = NULL;
- mutex_unlock(&dev->domain_lock);
+ write_unlock(&dev->domain_lock);
return ret;
}
--
2.51.0
* Re: [PATCH v3 0/6] Add multiple address spaces support to VDUSE
2025-09-19 9:32 [PATCH v3 0/6] Add multiple address spaces support to VDUSE Eugenio Pérez
2025-09-19 9:32 ` [PATCH v3 1/6] vduse: make domain_lock an rwlock Eugenio Pérez
2025-09-19 9:32 ` [PATCH v3 2/6] vduse: add v1 API definition Eugenio Pérez
@ 2025-09-19 9:36 ` Eugenio Perez Martin
2 siblings, 0 replies; 5+ messages in thread
From: Eugenio Perez Martin @ 2025-09-19 9:36 UTC (permalink / raw)
To: Michael S . Tsirkin
Cc: Maxime Coquelin, jasowang, Xuan Zhuo, linux-kernel,
Laurent Vivier, Stefano Garzarella, Yongji Xie, Cindy Lu,
virtualization
On Fri, Sep 19, 2025 at 11:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
[...]
Sorry I hit Ctrl-C in the wrong terminal, so this series lacks the
final patches.
end of thread, other threads:[~2025-09-19 9:36 UTC | newest]
Thread overview: 5+ messages
2025-09-19 9:32 [PATCH v3 0/6] Add multiple address spaces support to VDUSE Eugenio Pérez
2025-09-19 9:32 ` [PATCH v3 1/6] vduse: make domain_lock an rwlock Eugenio Pérez
2025-09-19 9:32 ` [PATCH v3 2/6] vduse: add v1 API definition Eugenio Pérez
2025-09-19 9:36 ` [PATCH v3 0/6] Add multiple address spaces support to VDUSE Eugenio Perez Martin
-- strict thread matches above, loose matches on Subject: below --
2025-09-19 9:33 Eugenio Pérez
2025-09-19 9:33 ` [PATCH v3 1/6] vduse: make domain_lock an rwlock Eugenio Pérez