qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu
@ 2025-11-09  5:33 Vivek Kasireddy
  2025-11-09  5:33 ` [PATCH v2 01/10] virtio-gpu: Recreate the resource's dmabuf if new backing is attached Vivek Kasireddy
                   ` (10 more replies)
  0 siblings, 11 replies; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Vivek Kasireddy, Marc-André Lureau, Alex Bennée,
	Akihiko Odaki, Dmitry Osipenko, Alex Williamson,
	Cédric Le Goater, Leon Romanovsky, Leon Romanovsky,
	Jason Gunthorpe, Dongwon Kim

The virtio-gpu driver running in the Guest VM can create Guest blob
resources (by importing dmabufs) that are backed by System RAM. This
is made possible by making use of memfd memory backend and udmabuf
driver on the Host side. However, in order to create Guest blobs
that are backed by vfio-pci device regions (which happens when
virtio-gpu imports dmabufs from devices that have local memory such
as dGPU VFs), we have to implement VFIO_DEVICE_FEATURE_DMA_BUF and
leverage it in virtio-gpu.

So, while creating the blobs we use memory_region_is_ram_device() to
figure out if the blob is backed by memfd or a vfio-pci device. If
it is determined that the blob is backed by vfio-pci device region,
instead of calling into udmabuf driver to create a dmabuf fd we would
now call into vfio-pci driver to have a dmabuf fd created on the Host.

Changelog:
v1 -> v2:
- Drop the patch that uses res->blob_size instead of res->blob to
  identify blob resources (Akihiko)
- Remove the res->dmabuf_fd < 0 check while attaching backing to a
  resource (Akihiko)
- Remove cmd->cmd_hdr.type != VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB
  check while attaching backing (Akihiko)
- Improve vfio_get_region_index_from_mr() and add documentation (Cedric)
- Remove rcu_read_lock/unlock around qemu_ram_block_from_host()
  (Akihiko, Cedric)
- Improve, document and rename vfio_device_create_dmabuf() to
  vfio_device_create_dmabuf_fd() (Cedric)
- Add a new helper to lookup VFIO device from memory region (Cedric)
- Document vfio_device_get_region_info() (Cedric)
- Ensure that info variable (in vfio_dmabuf_mmap()) is initialized
  before use (Cedric)
- Rename udmabuf files and helpers to dmabuf (Akihiko)
- Remove the redundant check for virtio_gpu_have_udmabuf() in
  virtio_gpu_init_dmabuf() (Akihiko)
- Add a helper to check whether all the entries of a dmabuf belong
  to a single memory region or not (Akihiko)

RFC -> v1:
- Create the CPU mapping using vfio device fd if the dmabuf exporter
  (vfio-pci) does not provide mmap() support (Akihiko)
- Log a warning with LOG_GUEST_ERROR instead of warn_report() when
  dmabuf cannot be created using Guest provided addresses (Akihiko)
- Use address_space_translate() instead of gpa2hva() to obtain the
  Host addresses (Akihiko)
- Rearrange the patches and improve the commit messages (Akihiko)
- Fix compilation error when VFIO is not enabled (Alex)
- Add a new helper to obtain VFIO region index from memory region
- Move vfio_device_create_dmabuf() to hw/vfio/device.c

Tested with an SRIOV enabled Intel dGPU (B60) by running Gnome Wayland
(in the VM) and Qemu with the following (relevant) parameters:
-device vfio-pci,host=0000:03:00.1
-device virtio-vga,max_outputs=1,xres=1920,yres=1080,blob=true
-display gtk,gl=on

Associated vfio-pci kernel driver series:
https://lore.kernel.org/dri-devel/cover.1754311439.git.leon@kernel.org/
Associated virtio-gpu kernel driver series (merged):
https://lore.kernel.org/dri-devel/20241126031643.3490496-1-vivek.kasireddy@intel.com/

---
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>

Vivek Kasireddy (10):
  virtio-gpu: Recreate the resource's dmabuf if new backing is attached
  virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  vfio: Document vfio_device_get_region_info()
  vfio/region: Add a helper to get region index from memory region
  vfio/device: Add a helper to lookup VFIODevice from memory region
  linux-headers: Update vfio.h to include VFIO_DEVICE_FEATURE_DMA_BUF
  vfio/device: Add support for VFIO_DEVICE_FEATURE_DMA_BUF
  virtio-gpu: Rename udmabuf files and helpers to dmabuf
  virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions()
  virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO
    devices

 hw/display/Kconfig                            |   5 +
 hw/display/meson.build                        |   4 +-
 ...abuf-stubs.c => virtio-gpu-dmabuf-stubs.c} |   4 +-
 ...rtio-gpu-udmabuf.c => virtio-gpu-dmabuf.c} | 149 ++++++++++++++++--
 hw/display/virtio-gpu.c                       |  34 +++-
 hw/vfio/device.c                              |  52 ++++++
 hw/vfio/region.c                              |  14 ++
 include/hw/vfio/vfio-device.h                 |  46 ++++++
 include/hw/virtio/virtio-gpu.h                |   6 +-
 linux-headers/linux/vfio.h                    |  25 +++
 10 files changed, 316 insertions(+), 23 deletions(-)
 rename hw/display/{virtio-gpu-udmabuf-stubs.c => virtio-gpu-dmabuf-stubs.c} (79%)
 rename hw/display/{virtio-gpu-udmabuf.c => virtio-gpu-dmabuf.c} (56%)

-- 
2.50.1



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 01/10] virtio-gpu: Recreate the resource's dmabuf if new backing is attached
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-10  4:59   ` Akihiko Odaki
  2025-11-09  5:33 ` [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device Vivek Kasireddy
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Vivek Kasireddy, Marc-André Lureau, Alex Bennée,
	Akihiko Odaki, Dmitry Osipenko, Alex Williamson,
	Cédric Le Goater

There are cases when a blob resource's backing might get detached
and re-attached again such as when the underlying object is getting
migrated in the Guest. In these situations, we need to obtain a new
dmabuf fd, which can be done by calling virtio_gpu_init_udmabuf().

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 hw/display/virtio-gpu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 43e88a4daf..199b18c746 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -937,6 +937,10 @@ virtio_gpu_resource_attach_backing(VirtIOGPU *g,
         cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
         return;
     }
+
+    if (res->blob_size) {
+        virtio_gpu_init_udmabuf(res);
+    }
 }
 
 static void
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
  2025-11-09  5:33 ` [PATCH v2 01/10] virtio-gpu: Recreate the resource's dmabuf if new backing is attached Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-10  5:00   ` Akihiko Odaki
  2025-11-11  4:45   ` Akihiko Odaki
  2025-11-09  5:33 ` [PATCH v2 03/10] vfio: Document vfio_device_get_region_info() Vivek Kasireddy
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Vivek Kasireddy, Marc-André Lureau, Alex Bennée,
	Akihiko Odaki, Dmitry Osipenko, Alex Williamson,
	Cédric Le Goater

If the Guest provides a DMA address that is associated with a ram
device (such as a PCI device region and not its system memory),
then we can obtain the hva (host virtual address) by invoking
address_space_translate() followed by memory_region_get_ram_ptr().

This is because the ram device's address space is not accessible
to virtio-gpu directly and hence dma_memory_map() cannot be used.
Therefore, we first need to identify the memory region associated
with the DMA address and figure out if it belongs to a ram device
or not and decide how to obtain the host address accordingly.

Note that we take a reference on the memory region if it belongs
to a ram device but we would still call dma_memory_unmap() later
(to unref mr) regardless of how we obtained the hva.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 199b18c746..d352b5afd6 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -798,6 +798,26 @@ static void virtio_gpu_set_scanout_blob(VirtIOGPU *g,
                               &fb, res, &ss.r, &cmd->error);
 }
 
+static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
+                                       struct virtio_gpu_ctrl_command *cmd,
+                                       uint64_t a, hwaddr *len)
+{
+    MemoryRegion *mr = NULL;
+    hwaddr xlat;
+
+    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a, &xlat, len,
+                                 DMA_DIRECTION_TO_DEVICE,
+                                 MEMTXATTRS_UNSPECIFIED);
+    if (memory_region_is_ram_device(mr)) {
+        memory_region_ref(mr);
+        return memory_region_get_ram_ptr(mr) + xlat;
+    }
+
+    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
+                          DMA_DIRECTION_TO_DEVICE,
+                          MEMTXATTRS_UNSPECIFIED);
+}
+
 int virtio_gpu_create_mapping_iov(VirtIOGPU *g,
                                   uint32_t nr_entries, uint32_t offset,
                                   struct virtio_gpu_ctrl_command *cmd,
@@ -839,9 +859,7 @@ int virtio_gpu_create_mapping_iov(VirtIOGPU *g,
 
         do {
             len = l;
-            map = dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, &len,
-                                 DMA_DIRECTION_TO_DEVICE,
-                                 MEMTXATTRS_UNSPECIFIED);
+            map = virtio_gpu_dma_memory_map(g, cmd, a, &len);
             if (!map) {
                 qemu_log_mask(LOG_GUEST_ERROR, "%s: failed to map MMIO memory for"
                               " element %d\n", __func__, e);
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 03/10] vfio: Document vfio_device_get_region_info()
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
  2025-11-09  5:33 ` [PATCH v2 01/10] virtio-gpu: Recreate the resource's dmabuf if new backing is attached Vivek Kasireddy
  2025-11-09  5:33 ` [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-09  5:33 ` [PATCH v2 04/10] vfio/region: Add a helper to get region index from memory region Vivek Kasireddy
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vivek Kasireddy, Alex Williamson, Cédric Le Goater

Add documentation for vfio_device_get_region_info() and clarify the
expectations around its usage.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 include/hw/vfio/vfio-device.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index 0fe6c60ba2..bb28123faf 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -257,6 +257,19 @@ void vfio_device_prepare(VFIODevice *vbasedev, VFIOContainer *bcontainer,
 
 void vfio_device_unprepare(VFIODevice *vbasedev);
 
+/**
+ * Return the region info for a given region index. The region info includes
+ * details such as size, offset, and capabilities. Note that the returned
+ * info pointer is either a cached copy or newly allocated by
+ * vfio_device_get_region_info(), so the caller is not expected to allocate
+ * or free it.
+ *
+ * @vbasedev: #VFIODevice to use
+ * @index: region index
+ * @info: pointer to store the region info
+ *
+ * Returns 0 on success or a negative value on error.
+ */
 int vfio_device_get_region_info(VFIODevice *vbasedev, int index,
                                 struct vfio_region_info **info);
 int vfio_device_get_region_info_type(VFIODevice *vbasedev, uint32_t type,
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 04/10] vfio/region: Add a helper to get region index from memory region
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
                   ` (2 preceding siblings ...)
  2025-11-09  5:33 ` [PATCH v2 03/10] vfio: Document vfio_device_get_region_info() Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-09  5:33 ` [PATCH v2 05/10] vfio/device: Add a helper to lookup VFIODevice " Vivek Kasireddy
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vivek Kasireddy, Alex Williamson, Cédric Le Goater

Having a way to figure out the region index (or bar) associated
with a memory region is helpful in various scenarios. For example,
this capability can be useful in retrieving the region info needed
for mapping a part of a VFIO region or creating a dmabuf.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 hw/vfio/region.c              | 14 ++++++++++++++
 include/hw/vfio/vfio-device.h | 10 ++++++++++
 2 files changed, 24 insertions(+)

diff --git a/hw/vfio/region.c b/hw/vfio/region.c
index b165ab0b93..046adfaa2c 100644
--- a/hw/vfio/region.c
+++ b/hw/vfio/region.c
@@ -398,3 +398,17 @@ void vfio_region_mmaps_set_enabled(VFIORegion *region, bool enabled)
     trace_vfio_region_mmaps_set_enabled(memory_region_name(region->mem),
                                         enabled);
 }
+
+int vfio_get_region_index_from_mr(MemoryRegion *mr)
+{
+    VFIORegion *region;
+
+    while (mr->container) {
+        if (mr->ops == &vfio_region_ops) {
+            region = mr->opaque;
+	    return region->nr;
+        }
+        mr = mr->container;
+    }
+    return -1;
+}
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index bb28123faf..44cacd3728 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -290,6 +290,16 @@ bool vfio_device_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_t
 
 int vfio_device_get_irq_info(VFIODevice *vbasedev, int index,
                                 struct vfio_irq_info *info);
+
+/**
+ * Return the region index associated with a given MemoryRegion. The index
+ * can be useful in retrieving region info.
+ *
+ * @mr: MemoryRegion to use
+ *
+ * Returns the region index or -1 on error.
+ */
+int vfio_get_region_index_from_mr(MemoryRegion *mr);
 #endif
 
 /* Returns 0 on success, or a negative errno. */
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 05/10] vfio/device: Add a helper to lookup VFIODevice from memory region
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
                   ` (3 preceding siblings ...)
  2025-11-09  5:33 ` [PATCH v2 04/10] vfio/region: Add a helper to get region index from memory region Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-09  5:33 ` [PATCH v2 06/10] linux-headers: Update vfio.h to include VFIO_DEVICE_FEATURE_DMA_BUF Vivek Kasireddy
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vivek Kasireddy, Alex Williamson, Cédric Le Goater

Instead of iterating over all QOM devices to find the VFIODevice
associated with a memory region, it is faster to just use the
vfio_device_list to lookup the VFIODevice.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 hw/vfio/device.c              | 12 ++++++++++++
 include/hw/vfio/vfio-device.h |  9 +++++++++
 2 files changed, 21 insertions(+)

diff --git a/hw/vfio/device.c b/hw/vfio/device.c
index 76869828fc..9ff73f9941 100644
--- a/hw/vfio/device.c
+++ b/hw/vfio/device.c
@@ -603,3 +603,15 @@ static VFIODeviceIOOps vfio_device_io_ops_ioctl = {
     .region_read = vfio_device_io_region_read,
     .region_write = vfio_device_io_region_write,
 };
+
+VFIODevice *vfio_device_lookup(MemoryRegion *mr)
+{
+    VFIODevice *vbasedev;
+
+    QLIST_FOREACH(vbasedev, &vfio_device_list, next) {
+        if (vbasedev->dev == mr->dev) {
+            return vbasedev;
+        }
+    }
+    return NULL;
+}
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index 44cacd3728..2f8087f133 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -300,6 +300,15 @@ int vfio_device_get_irq_info(VFIODevice *vbasedev, int index,
  * Returns the region index or -1 on error.
  */
 int vfio_get_region_index_from_mr(MemoryRegion *mr);
+
+/**
+ * Return the VFIO device associated with a given MemoryRegion.
+ *
+ * @mr: MemoryRegion to use
+ *
+ * Returns the VFIO device if found or NULL.
+ */
+VFIODevice *vfio_device_lookup(MemoryRegion *mr);
 #endif
 
 /* Returns 0 on success, or a negative errno. */
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 06/10] linux-headers: Update vfio.h to include VFIO_DEVICE_FEATURE_DMA_BUF
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
                   ` (4 preceding siblings ...)
  2025-11-09  5:33 ` [PATCH v2 05/10] vfio/device: Add a helper to lookup VFIODevice " Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-09  6:49   ` Leon Romanovsky
  2025-11-09  5:33 ` [PATCH v2 07/10] vfio/device: Add support for VFIO_DEVICE_FEATURE_DMA_BUF Vivek Kasireddy
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vivek Kasireddy, Alex Williamson, Cédric Le Goater

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 linux-headers/linux/vfio.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 4d96d1fc12..bc11ca3663 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -1478,6 +1478,31 @@ struct vfio_device_feature_bus_master {
 };
 #define VFIO_DEVICE_FEATURE_BUS_MASTER 10
 
+/**
+ * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
+ * regions selected.
+ *
+ * open_flags are the typical flags passed to open(2), eg O_RDWR, O_CLOEXEC,
+ * etc. offset/length specify a slice of the region to create the dmabuf from.
+ * nr_ranges is the total number of (P2P DMA) ranges that comprise the dmabuf.
+ *
+ * Return: The fd number on success, -1 and errno is set on failure.
+ */
+#define VFIO_DEVICE_FEATURE_DMA_BUF 11
+
+struct vfio_region_dma_range {
+	__u64 offset;
+	__u64 length;
+};
+
+struct vfio_device_feature_dma_buf {
+	__u32	region_index;
+	__u32	open_flags;
+	__u32   flags;
+	__u32   nr_ranges;
+	struct vfio_region_dma_range dma_ranges[];
+};
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 07/10] vfio/device: Add support for VFIO_DEVICE_FEATURE_DMA_BUF
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
                   ` (5 preceding siblings ...)
  2025-11-09  5:33 ` [PATCH v2 06/10] linux-headers: Update vfio.h to include VFIO_DEVICE_FEATURE_DMA_BUF Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-09  5:33 ` [PATCH v2 08/10] virtio-gpu: Rename udmabuf files and helpers to dmabuf Vivek Kasireddy
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vivek Kasireddy, Alex Williamson, Cédric Le Goater

In order to implement VFIO_DEVICE_FEATURE_DMA_BUF, we first need
to identify the VFIO region and index the buffer (represented by
iovec) belongs to and then translate its addresses to offsets
within that region.

The qemu_ram_block_from_host() API gives us both the region and the
offset info we need to populate the dma ranges in order to invoke
this feature.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 hw/vfio/device.c              | 40 +++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-device.h | 14 ++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/hw/vfio/device.c b/hw/vfio/device.c
index 9ff73f9941..5417142482 100644
--- a/hw/vfio/device.c
+++ b/hw/vfio/device.c
@@ -21,6 +21,7 @@
 #include "qemu/osdep.h"
 #include <sys/ioctl.h>
 
+#include "system/ramblock.h"
 #include "hw/vfio/vfio-device.h"
 #include "hw/vfio/pci.h"
 #include "hw/hw.h"
@@ -615,3 +616,42 @@ VFIODevice *vfio_device_lookup(MemoryRegion *mr)
     }
     return NULL;
 }
+
+int vfio_device_create_dmabuf_fd(VFIODevice *vbasedev,
+                                 struct iovec *iov, unsigned int iov_cnt)
+{
+    g_autofree struct vfio_device_feature *feature = NULL;
+    struct vfio_device_feature_dma_buf *dma_buf;
+    ram_addr_t offset;
+    RAMBlock *rb;
+    size_t argsz;
+    int i, index;
+
+    argsz = sizeof(*feature) + sizeof (*dma_buf) +
+            sizeof(struct vfio_region_dma_range) * iov_cnt;
+    feature = g_malloc0(argsz);
+    dma_buf = (struct vfio_device_feature_dma_buf *)feature->data;
+
+    for (i = 0; i < iov_cnt; i++) {
+        rb = qemu_ram_block_from_host(iov[i].iov_base, false, &offset);
+        if (!rb) {
+            return -1;
+        }
+
+        index = vfio_get_region_index_from_mr(rb->mr);
+        if (index < 0) {
+            return -1;
+        }
+
+        dma_buf->region_index = index;
+        dma_buf->dma_ranges[i].offset = offset;
+        dma_buf->dma_ranges[i].length = iov[i].iov_len;
+    }
+
+    dma_buf->nr_ranges = iov_cnt;
+    dma_buf->open_flags = O_RDONLY | O_CLOEXEC;
+    feature->argsz = argsz;
+    feature->flags = VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_DMA_BUF;
+
+    return vbasedev->io_ops->device_feature(vbasedev, feature);
+}
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index 2f8087f133..7fc2912f15 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -309,6 +309,20 @@ int vfio_get_region_index_from_mr(MemoryRegion *mr);
  * Returns the VFIO device if found or NULL.
  */
 VFIODevice *vfio_device_lookup(MemoryRegion *mr);
+
+/**
+ * Create and return a dmabuf fd by first translating the addresses in the
+ * iovec array into VFIO region offsets and then invoking the
+ * VFIO_DEVICE_FEATURE_DMA_BUF feature.
+ *
+ * @vbasedev: #VFIODevice to use
+ * @iov: array of iovec entries associated with a buffer
+ * @iov_cnt: number of entries in the iovec array
+ *
+ * Returns the newly created dmabuf fd or -1 on error.
+ */
+int vfio_device_create_dmabuf_fd(VFIODevice *vbasedev,
+                                 struct iovec *iov, unsigned int iov_cnt);
 #endif
 
 /* Returns 0 on success, or a negative errno. */
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 08/10] virtio-gpu: Rename udmabuf files and helpers to dmabuf
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
                   ` (6 preceding siblings ...)
  2025-11-09  5:33 ` [PATCH v2 07/10] vfio/device: Add support for VFIO_DEVICE_FEATURE_DMA_BUF Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-10  5:00   ` Akihiko Odaki
  2025-11-09  5:33 ` [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions() Vivek Kasireddy
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Vivek Kasireddy, Marc-André Lureau, Alex Bennée,
	Akihiko Odaki, Dmitry Osipenko, Alex Williamson,
	Cédric Le Goater

This is prep-work for adding the ability to create dmabuf fds from
VFIO devices in addition to udmabuf.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 hw/display/meson.build                               |  4 ++--
 ...gpu-udmabuf-stubs.c => virtio-gpu-dmabuf-stubs.c} |  4 ++--
 .../{virtio-gpu-udmabuf.c => virtio-gpu-dmabuf.c}    | 12 ++++++------
 hw/display/virtio-gpu.c                              |  8 ++++----
 include/hw/virtio/virtio-gpu.h                       |  6 +++---
 5 files changed, 17 insertions(+), 17 deletions(-)
 rename hw/display/{virtio-gpu-udmabuf-stubs.c => virtio-gpu-dmabuf-stubs.c} (79%)
 rename hw/display/{virtio-gpu-udmabuf.c => virtio-gpu-dmabuf.c} (94%)

diff --git a/hw/display/meson.build b/hw/display/meson.build
index 90e6c041bd..f5f92b1068 100644
--- a/hw/display/meson.build
+++ b/hw/display/meson.build
@@ -69,9 +69,9 @@ if config_all_devices.has_key('CONFIG_VIRTIO_GPU')
   virtio_gpu_ss.add(when: 'CONFIG_VIRTIO_GPU',
                     if_true: [files('virtio-gpu-base.c', 'virtio-gpu.c'), pixman])
   if host_os == 'linux'
-    virtio_gpu_ss.add(files('virtio-gpu-udmabuf.c'))
+    virtio_gpu_ss.add(files('virtio-gpu-dmabuf.c'))
   else
-    virtio_gpu_ss.add(files('virtio-gpu-udmabuf-stubs.c'))
+    virtio_gpu_ss.add(files('virtio-gpu-dmabuf-stubs.c'))
   endif
   virtio_gpu_ss.add(when: 'CONFIG_VHOST_USER_GPU', if_true: files('vhost-user-gpu.c'))
   hw_display_modules += {'virtio-gpu': virtio_gpu_ss}
diff --git a/hw/display/virtio-gpu-udmabuf-stubs.c b/hw/display/virtio-gpu-dmabuf-stubs.c
similarity index 79%
rename from hw/display/virtio-gpu-udmabuf-stubs.c
rename to hw/display/virtio-gpu-dmabuf-stubs.c
index f692e13510..01067e246d 100644
--- a/hw/display/virtio-gpu-udmabuf-stubs.c
+++ b/hw/display/virtio-gpu-dmabuf-stubs.c
@@ -7,12 +7,12 @@ bool virtio_gpu_have_udmabuf(void)
     return false;
 }
 
-void virtio_gpu_init_udmabuf(struct virtio_gpu_simple_resource *res)
+void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
 {
     /* nothing (stub) */
 }
 
-void virtio_gpu_fini_udmabuf(struct virtio_gpu_simple_resource *res)
+void virtio_gpu_fini_dmabuf(struct virtio_gpu_simple_resource *res)
 {
     /* nothing (stub) */
 }
diff --git a/hw/display/virtio-gpu-udmabuf.c b/hw/display/virtio-gpu-dmabuf.c
similarity index 94%
rename from hw/display/virtio-gpu-udmabuf.c
rename to hw/display/virtio-gpu-dmabuf.c
index d804f321aa..c34d4c85bc 100644
--- a/hw/display/virtio-gpu-udmabuf.c
+++ b/hw/display/virtio-gpu-dmabuf.c
@@ -68,7 +68,7 @@ static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource *res)
     g_free(list);
 }
 
-static void virtio_gpu_remap_udmabuf(struct virtio_gpu_simple_resource *res)
+static void virtio_gpu_remap_dmabuf(struct virtio_gpu_simple_resource *res)
 {
     res->remapped = mmap(NULL, res->blob_size, PROT_READ,
                          MAP_SHARED, res->dmabuf_fd, 0);
@@ -79,7 +79,7 @@ static void virtio_gpu_remap_udmabuf(struct virtio_gpu_simple_resource *res)
     }
 }
 
-static void virtio_gpu_destroy_udmabuf(struct virtio_gpu_simple_resource *res)
+static void virtio_gpu_destroy_dmabuf(struct virtio_gpu_simple_resource *res)
 {
     if (res->remapped) {
         munmap(res->remapped, res->blob_size);
@@ -128,7 +128,7 @@ bool virtio_gpu_have_udmabuf(void)
     return memfd_backend;
 }
 
-void virtio_gpu_init_udmabuf(struct virtio_gpu_simple_resource *res)
+void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
 {
     void *pdata = NULL;
 
@@ -141,7 +141,7 @@ void virtio_gpu_init_udmabuf(struct virtio_gpu_simple_resource *res)
         if (res->dmabuf_fd < 0) {
             return;
         }
-        virtio_gpu_remap_udmabuf(res);
+        virtio_gpu_remap_dmabuf(res);
         if (!res->remapped) {
             return;
         }
@@ -151,10 +151,10 @@ void virtio_gpu_init_udmabuf(struct virtio_gpu_simple_resource *res)
     res->blob = pdata;
 }
 
-void virtio_gpu_fini_udmabuf(struct virtio_gpu_simple_resource *res)
+void virtio_gpu_fini_dmabuf(struct virtio_gpu_simple_resource *res)
 {
     if (res->remapped) {
-        virtio_gpu_destroy_udmabuf(res);
+        virtio_gpu_destroy_dmabuf(res);
     }
 }
 
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index d352b5afd6..da005f2342 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -360,7 +360,7 @@ static void virtio_gpu_resource_create_blob(VirtIOGPU *g,
         return;
     }
 
-    virtio_gpu_init_udmabuf(res);
+    virtio_gpu_init_dmabuf(res);
     QTAILQ_INSERT_HEAD(&g->reslist, res, next);
 }
 
@@ -920,7 +920,7 @@ void virtio_gpu_cleanup_mapping(VirtIOGPU *g,
     res->addrs = NULL;
 
     if (res->blob) {
-        virtio_gpu_fini_udmabuf(res);
+        virtio_gpu_fini_dmabuf(res);
     }
 }
 
@@ -957,7 +957,7 @@ virtio_gpu_resource_attach_backing(VirtIOGPU *g,
     }
 
     if (res->blob_size) {
-        virtio_gpu_init_udmabuf(res);
+        virtio_gpu_init_dmabuf(res);
     }
 }
 
@@ -1443,7 +1443,7 @@ static int virtio_gpu_blob_load(QEMUFile *f, void *opaque, size_t size,
             return -EINVAL;
         }
 
-        virtio_gpu_init_udmabuf(res);
+        virtio_gpu_init_dmabuf(res);
 
         resource_id = qemu_get_be32(f);
     }
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 58e0f91fda..e3d351a2d4 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -354,10 +354,10 @@ bool virtio_gpu_scanout_blob_to_fb(struct virtio_gpu_framebuffer *fb,
                                    struct virtio_gpu_set_scanout_blob *ss,
                                    uint64_t blob_size);
 
-/* virtio-gpu-udmabuf.c */
+/* virtio-gpu-dmabuf.c */
 bool virtio_gpu_have_udmabuf(void);
-void virtio_gpu_init_udmabuf(struct virtio_gpu_simple_resource *res);
-void virtio_gpu_fini_udmabuf(struct virtio_gpu_simple_resource *res);
+void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res);
+void virtio_gpu_fini_dmabuf(struct virtio_gpu_simple_resource *res);
 int virtio_gpu_update_dmabuf(VirtIOGPU *g,
                              uint32_t scanout_id,
                              struct virtio_gpu_simple_resource *res,
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions()
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
                   ` (7 preceding siblings ...)
  2025-11-09  5:33 ` [PATCH v2 08/10] virtio-gpu: Rename udmabuf files and helpers to dmabuf Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-10  5:19   ` Akihiko Odaki
  2025-11-09  5:33 ` [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices Vivek Kasireddy
  2025-11-17  9:04 ` [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Cédric Le Goater
  10 siblings, 1 reply; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Vivek Kasireddy, Marc-André Lureau, Alex Bennée,
	Akihiko Odaki, Dmitry Osipenko, Alex Williamson,
	Cédric Le Goater

Add a helper to check whether the addresses in an iovec array
belong to the same memory region or not. This is useful to verify
before trying to create a dmabuf from an iovec array.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 hw/display/virtio-gpu-dmabuf.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-dmabuf.c
index c34d4c85bc..80143034d4 100644
--- a/hw/display/virtio-gpu-dmabuf.c
+++ b/hw/display/virtio-gpu-dmabuf.c
@@ -27,6 +27,31 @@
 #include "standard-headers/linux/udmabuf.h"
 #include "standard-headers/drm/drm_fourcc.h"
 
+static bool qemu_iovec_same_memory_regions(const struct iovec *iov, int iov_cnt)
+{
+    RAMBlock *rb, *curr_rb;
+    ram_addr_t offset;
+    int i;
+
+    rb = qemu_ram_block_from_host(iov[0].iov_base, false, &offset);
+    if (!rb) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: Could not find ramblock/memory region\n", __func__);
+        return false;
+    }
+
+    for (i = 1; i < iov_cnt; i++) {
+	curr_rb = qemu_ram_block_from_host(iov[i].iov_base, false, &offset);
+	if (curr_rb != rb) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: memory regions not same for iov entries\n",
+                          __func__);
+            return false;
+	}
+    }
+    return true;
+}
+
 static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource *res)
 {
     struct udmabuf_create_list *list;
@@ -137,6 +162,10 @@ void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
         res->iov[0].iov_len < 4096) {
         pdata = res->iov[0].iov_base;
     } else {
+        if (!qemu_iovec_same_memory_regions(res->iov, res->iov_cnt)) {
+            return;
+        }
+
         virtio_gpu_create_udmabuf(res);
         if (res->dmabuf_fd < 0) {
             return;
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
                   ` (8 preceding siblings ...)
  2025-11-09  5:33 ` [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions() Vivek Kasireddy
@ 2025-11-09  5:33 ` Vivek Kasireddy
  2025-11-10  4:55   ` Akihiko Odaki
  2025-11-17  9:04 ` [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Cédric Le Goater
  10 siblings, 1 reply; 45+ messages in thread
From: Vivek Kasireddy @ 2025-11-09  5:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Vivek Kasireddy, Marc-André Lureau, Alex Bennée,
	Akihiko Odaki, Dmitry Osipenko, Alex Williamson,
	Cédric Le Goater

In addition to memfd, a blob resource can also have its backing
storage in a VFIO device region. Therefore, we first need to figure
out if the blob is backed by a VFIO device region or a memfd before
we can call the right API to get a dmabuf fd created.

So, once we have the ramblock and the associated mr, we rely on
memory_region_is_ram_device() to tell us where the backing storage
is located. If the blob resource is VFIO backed, we try to find the
right VFIO device that contains the blob and then invoke the API
vfio_device_create_dmabuf().

Note that in virtio_gpu_remap_udmabuf(), we first try to test if
the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
use the VFIO device fd directly to create the CPU mapping.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 hw/display/Kconfig             |   5 ++
 hw/display/virtio-gpu-dmabuf.c | 114 +++++++++++++++++++++++++++++++--
 2 files changed, 112 insertions(+), 7 deletions(-)

diff --git a/hw/display/Kconfig b/hw/display/Kconfig
index 1e95ab28ef..0d090f25f5 100644
--- a/hw/display/Kconfig
+++ b/hw/display/Kconfig
@@ -106,6 +106,11 @@ config VIRTIO_VGA
     depends on VIRTIO_PCI
     select VGA
 
+config VIRTIO_GPU_VFIO_BLOB
+    bool
+    default y
+    depends on VFIO
+
 config VHOST_USER_GPU
     bool
     default y
diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-dmabuf.c
index 80143034d4..940b0e0411 100644
--- a/hw/display/virtio-gpu-dmabuf.c
+++ b/hw/display/virtio-gpu-dmabuf.c
@@ -18,6 +18,7 @@
 #include "ui/console.h"
 #include "hw/virtio/virtio-gpu.h"
 #include "hw/virtio/virtio-gpu-pixman.h"
+#include "hw/vfio/vfio-device.h"
 #include "trace.h"
 #include "system/ramblock.h"
 #include "system/hostmem.h"
@@ -52,6 +53,19 @@ static bool qemu_iovec_same_memory_regions(const struct iovec *iov, int iov_cnt)
     return true;
 }
 
+static void vfio_create_dmabuf(VFIODevice *vdev,
+                               struct virtio_gpu_simple_resource *res)
+{
+#if defined(VIRTIO_GPU_VFIO_BLOB)
+    res->dmabuf_fd = vfio_device_create_dmabuf_fd(vdev, res->iov, res->iov_cnt);
+    if (res->dmabuf_fd < 0) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: VFIO_DEVICE_FEATURE_DMA_BUF: %s\n",
+                      __func__, strerror(errno));
+    }
+#endif
+}
+
 static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource *res)
 {
     struct udmabuf_create_list *list;
@@ -93,11 +107,69 @@ static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource *res)
     g_free(list);
 }
 
-static void virtio_gpu_remap_dmabuf(struct virtio_gpu_simple_resource *res)
+static void *vfio_dmabuf_mmap(struct virtio_gpu_simple_resource *res,
+                              VFIODevice *vdev)
+{
+    struct vfio_region_info *info = NULL;
+    ram_addr_t offset, len = 0;
+    void *map, *submap;
+    int i, ret = -1;
+    RAMBlock *rb;
+
+    /*
+     * We first reserve a contiguous chunk of address space for the entire
+     * dmabuf, then replace it with smaller mappings that correspond to the
+     * individual segments of the dmabuf.
+     */
+    map = mmap(NULL, res->blob_size, PROT_READ, MAP_SHARED, vdev->fd, 0);
+    if (map == MAP_FAILED) {
+        return map;
+    }
+
+    for (i = 0; i < res->iov_cnt; i++) {
+        rb = qemu_ram_block_from_host(res->iov[i].iov_base, false, &offset);
+	if (!rb) {
+            goto err;
+        }
+#if defined(VIRTIO_GPU_VFIO_BLOB)
+        ret = vfio_get_region_index_from_mr(rb->mr);
+        if (ret < 0) {
+            goto err;
+        }
+
+        ret = vfio_device_get_region_info(vdev, ret, &info);
+#endif
+        if (ret < 0 || !info) {
+            goto err;
+        }
+
+        submap = mmap(map + len, res->iov[i].iov_len, PROT_READ,
+                      MAP_SHARED | MAP_FIXED, vdev->fd,
+                      info->offset + offset);
+        if (submap == MAP_FAILED) {
+            goto err;
+        }
+
+        len += res->iov[i].iov_len;
+    }
+    return map;
+err:
+    munmap(map, res->blob_size);
+    return MAP_FAILED;
+}
+
+static void virtio_gpu_remap_dmabuf(struct virtio_gpu_simple_resource *res,
+                                    VFIODevice *vdev)
 {
     res->remapped = mmap(NULL, res->blob_size, PROT_READ,
                          MAP_SHARED, res->dmabuf_fd, 0);
     if (res->remapped == MAP_FAILED) {
+        if (vdev) {
+            res->remapped = vfio_dmabuf_mmap(res, vdev);
+            if (res->remapped != MAP_FAILED) {
+                return;
+            }
+        }
         warn_report("%s: dmabuf mmap failed: %s", __func__,
                     strerror(errno));
         res->remapped = NULL;
@@ -155,7 +227,10 @@ bool virtio_gpu_have_udmabuf(void)
 
 void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
 {
+    VFIODevice *vdev = NULL;
     void *pdata = NULL;
+    ram_addr_t offset;
+    RAMBlock *rb;
 
     res->dmabuf_fd = -1;
     if (res->iov_cnt == 1 &&
@@ -166,11 +241,38 @@ void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
             return;
         }
 
-        virtio_gpu_create_udmabuf(res);
-        if (res->dmabuf_fd < 0) {
+        rb = qemu_ram_block_from_host(res->iov[0].iov_base, false, &offset);
+        if (memory_region_is_ram_device(rb->mr)) {
+            vdev = vfio_device_lookup(rb->mr);
+            if (!vdev) {
+                qemu_log_mask(LOG_GUEST_ERROR,
+                              "%s: Could not find device to create dmabuf\n",
+                              __func__);
+                return;
+            }
+
+            vfio_create_dmabuf(vdev, res);
+            if (res->dmabuf_fd < 0) {
+                qemu_log_mask(LOG_GUEST_ERROR,
+                              "%s: Could not create dmabuf from vfio device\n",
+                              __func__);
+                return;
+            }
+        } else if (memory_region_is_ram(rb->mr)) {
+            virtio_gpu_create_udmabuf(res);
+            if (res->dmabuf_fd < 0) {
+                qemu_log_mask(LOG_GUEST_ERROR,
+                              "%s: Could not create dmabuf from memfd\n",
+                              __func__);
+                return;
+            }
+        } else {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: memory region cannot be used to create dmabuf\n",
+                          __func__);
             return;
         }
-        virtio_gpu_remap_dmabuf(res);
+        virtio_gpu_remap_dmabuf(res, vdev);
         if (!res->remapped) {
             return;
         }
@@ -182,9 +284,7 @@ void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
 
 void virtio_gpu_fini_dmabuf(struct virtio_gpu_simple_resource *res)
 {
-    if (res->remapped) {
-        virtio_gpu_destroy_dmabuf(res);
-    }
+    virtio_gpu_destroy_dmabuf(res);
 }
 
 static void virtio_gpu_free_dmabuf(VirtIOGPU *g, VGPUDMABuf *dmabuf)
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 06/10] linux-headers: Update vfio.h to include VFIO_DEVICE_FEATURE_DMA_BUF
  2025-11-09  5:33 ` [PATCH v2 06/10] linux-headers: Update vfio.h to include VFIO_DEVICE_FEATURE_DMA_BUF Vivek Kasireddy
@ 2025-11-09  6:49   ` Leon Romanovsky
  2025-11-17  9:02     ` Cédric Le Goater
  0 siblings, 1 reply; 45+ messages in thread
From: Leon Romanovsky @ 2025-11-09  6:49 UTC (permalink / raw)
  To: Vivek Kasireddy, qemu-devel; +Cc: Alex Williamson, Cédric Le Goater



On Sun, Nov 9, 2025, at 07:33, Vivek Kasireddy wrote:
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> ---
>  linux-headers/linux/vfio.h | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 4d96d1fc12..bc11ca3663 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -1478,6 +1478,31 @@ struct vfio_device_feature_bus_master {
>  };
>  #define VFIO_DEVICE_FEATURE_BUS_MASTER 10
> 
> +/**
> + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
> + * regions selected.
> + *
> + * open_flags are the typical flags passed to open(2), eg O_RDWR, O_CLOEXEC,
> + * etc. offset/length specify a slice of the region to create the dmabuf from.
> + * nr_ranges is the total number of (P2P DMA) ranges that comprise the dmabuf.
> + *
> + * Return: The fd number on success, -1 and errno is set on failure.
> + */
> +#define VFIO_DEVICE_FEATURE_DMA_BUF 11
> +
> +struct vfio_region_dma_range {
> +	__u64 offset;
> +	__u64 length;
> +};
> +
> +struct vfio_device_feature_dma_buf {
> +	__u32	region_index;
> +	__u32	open_flags;
> +	__u32   flags;
> +	__u32   nr_ranges;
> +	struct vfio_region_dma_range dma_ranges[];

Not important comment at all, but in last versions of UAPI, this line is
struct vfio_region_dma_range dma_ranges[] __counted_by(nr_ranges);
https://lore.kernel.org/kvm/20251106-dmabuf-vfio-v7-10-2503bf390699@nvidia.com/T/#Z2e.:..:20251106-dmabuf-vfio-v7-10-2503bf390699::40nvidia.com:1include:uapi:linux:vfio.h

> +};
> +
>  /* -------- API for Type1 VFIO IOMMU -------- */
> 
>  /**
> -- 
> 2.50.1


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-09  5:33 ` [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices Vivek Kasireddy
@ 2025-11-10  4:55   ` Akihiko Odaki
  2025-11-12  4:26     ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-10  4:55 UTC (permalink / raw)
  To: Vivek Kasireddy, qemu-devel
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/09 14:33, Vivek Kasireddy wrote:
> In addition to memfd, a blob resource can also have its backing
> storage in a VFIO device region. Therefore, we first need to figure
> out if the blob is backed by a VFIO device region or a memfd before
> we can call the right API to get a dmabuf fd created.
> 
> So, once we have the ramblock and the associated mr, we rely on
> memory_region_is_ram_device() to tell us where the backing storage
> is located. If the blob resource is VFIO backed, we try to find the
> right VFIO device that contains the blob and then invoke the API
> vfio_device_create_dmabuf().
> 
> Note that in virtio_gpu_remap_udmabuf(), we first try to test if
> the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
> use the VFIO device fd directly to create the CPU mapping.

I have just remembered that mremap() will work for either of udmabuf and 
VFIO. That will avoid having two different methods and make 
vfio_get_region_index_from_mr() and vfio_device_get_region_info() 
unnecessary.

> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Alex Bennée <alex.bennee@linaro.org>
> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> ---
>   hw/display/Kconfig             |   5 ++
>   hw/display/virtio-gpu-dmabuf.c | 114 +++++++++++++++++++++++++++++++--
>   2 files changed, 112 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/display/Kconfig b/hw/display/Kconfig
> index 1e95ab28ef..0d090f25f5 100644
> --- a/hw/display/Kconfig
> +++ b/hw/display/Kconfig
> @@ -106,6 +106,11 @@ config VIRTIO_VGA
>       depends on VIRTIO_PCI
>       select VGA
>   
> +config VIRTIO_GPU_VFIO_BLOB
> +    bool
> +    default y
> +    depends on VFIO
> +
>   config VHOST_USER_GPU
>       bool
>       default y
> diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-dmabuf.c
> index 80143034d4..940b0e0411 100644
> --- a/hw/display/virtio-gpu-dmabuf.c
> +++ b/hw/display/virtio-gpu-dmabuf.c
> @@ -18,6 +18,7 @@
>   #include "ui/console.h"
>   #include "hw/virtio/virtio-gpu.h"
>   #include "hw/virtio/virtio-gpu-pixman.h"
> +#include "hw/vfio/vfio-device.h"
>   #include "trace.h"
>   #include "system/ramblock.h"
>   #include "system/hostmem.h"
> @@ -52,6 +53,19 @@ static bool qemu_iovec_same_memory_regions(const struct iovec *iov, int iov_cnt)
>       return true;
>   }
>   
> +static void vfio_create_dmabuf(VFIODevice *vdev,
> +                               struct virtio_gpu_simple_resource *res)
> +{
> +#if defined(VIRTIO_GPU_VFIO_BLOB)
> +    res->dmabuf_fd = vfio_device_create_dmabuf_fd(vdev, res->iov, res->iov_cnt);
> +    if (res->dmabuf_fd < 0) {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "%s: VFIO_DEVICE_FEATURE_DMA_BUF: %s\n",
> +                      __func__, strerror(errno));
> +    }
> +#endif
> +}
> +
>   static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource *res)
>   {
>       struct udmabuf_create_list *list;
> @@ -93,11 +107,69 @@ static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource *res)
>       g_free(list);
>   }
>   
> -static void virtio_gpu_remap_dmabuf(struct virtio_gpu_simple_resource *res)
> +static void *vfio_dmabuf_mmap(struct virtio_gpu_simple_resource *res,
> +                              VFIODevice *vdev)
> +{
> +    struct vfio_region_info *info = NULL;
> +    ram_addr_t offset, len = 0;
> +    void *map, *submap;
> +    int i, ret = -1;
> +    RAMBlock *rb;
> +
> +    /*
> +     * We first reserve a contiguous chunk of address space for the entire
> +     * dmabuf, then replace it with smaller mappings that correspond to the
> +     * individual segments of the dmabuf.
> +     */
> +    map = mmap(NULL, res->blob_size, PROT_READ, MAP_SHARED, vdev->fd, 0);
> +    if (map == MAP_FAILED) {
> +        return map;
> +    }
> +
> +    for (i = 0; i < res->iov_cnt; i++) {
> +        rb = qemu_ram_block_from_host(res->iov[i].iov_base, false, &offset);
> +	if (!rb) {
> +            goto err;
> +        }
> +#if defined(VIRTIO_GPU_VFIO_BLOB)
> +        ret = vfio_get_region_index_from_mr(rb->mr);
> +        if (ret < 0) {
> +            goto err;
> +        }
> +
> +        ret = vfio_device_get_region_info(vdev, ret, &info);
> +#endif
> +        if (ret < 0 || !info) {
> +            goto err;
> +        }
> +
> +        submap = mmap(map + len, res->iov[i].iov_len, PROT_READ,
> +                      MAP_SHARED | MAP_FIXED, vdev->fd,
> +                      info->offset + offset);
> +        if (submap == MAP_FAILED) {
> +            goto err;
> +        }
> +
> +        len += res->iov[i].iov_len;
> +    }
> +    return map;
> +err:
> +    munmap(map, res->blob_size);
> +    return MAP_FAILED;
> +}
> +
> +static void virtio_gpu_remap_dmabuf(struct virtio_gpu_simple_resource *res,
> +                                    VFIODevice *vdev)
>   {
>       res->remapped = mmap(NULL, res->blob_size, PROT_READ,
>                            MAP_SHARED, res->dmabuf_fd, 0);
>       if (res->remapped == MAP_FAILED) {
> +        if (vdev) {
> +            res->remapped = vfio_dmabuf_mmap(res, vdev);
> +            if (res->remapped != MAP_FAILED) {
> +                return;
> +            }
> +        }
>           warn_report("%s: dmabuf mmap failed: %s", __func__,
>                       strerror(errno));
>           res->remapped = NULL;
> @@ -155,7 +227,10 @@ bool virtio_gpu_have_udmabuf(void)
>   
>   void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
>   {
> +    VFIODevice *vdev = NULL;
>       void *pdata = NULL;
> +    ram_addr_t offset;
> +    RAMBlock *rb;
>   
>       res->dmabuf_fd = -1;
>       if (res->iov_cnt == 1 &&
> @@ -166,11 +241,38 @@ void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
>               return;
>           }
>   
> -        virtio_gpu_create_udmabuf(res);
> -        if (res->dmabuf_fd < 0) {
> +        rb = qemu_ram_block_from_host(res->iov[0].iov_base, false, &offset);
> +        if (memory_region_is_ram_device(rb->mr)) {
> +            vdev = vfio_device_lookup(rb->mr);
> +            if (!vdev) {
> +                qemu_log_mask(LOG_GUEST_ERROR,
> +                              "%s: Could not find device to create dmabuf\n",
> +                              __func__);

This should say "VFIO device" since no other RAM device is supported.

> +                return;
> +            }
> +
> +            vfio_create_dmabuf(vdev, res);
> +            if (res->dmabuf_fd < 0) {
> +                qemu_log_mask(LOG_GUEST_ERROR,
> +                              "%s: Could not create dmabuf from vfio device\n",

Nitpick: let's make VFIO uppercase as other user-visible messages do.

> +                              __func__);
> +                return;
> +            }
> +        } else if (memory_region_is_ram(rb->mr)) {
 > +            virtio_gpu_create_udmabuf(res);> +            if 
(res->dmabuf_fd < 0) {
> +                qemu_log_mask(LOG_GUEST_ERROR,
> +                              "%s: Could not create dmabuf from memfd\n",
> +                              __func__);
> +                return;
> +            }
> +        } else {
> +            qemu_log_mask(LOG_GUEST_ERROR,
> +                          "%s: memory region cannot be used to create dmabuf\n",
> +                          __func__);
>               return;
>           }
> -        virtio_gpu_remap_dmabuf(res);
> +        virtio_gpu_remap_dmabuf(res, vdev);
>           if (!res->remapped) {
>               return;
>           }
> @@ -182,9 +284,7 @@ void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
>   
>   void virtio_gpu_fini_dmabuf(struct virtio_gpu_simple_resource *res)
>   {
> -    if (res->remapped) {
> -        virtio_gpu_destroy_dmabuf(res);
> -    }
> +    virtio_gpu_destroy_dmabuf(res);
>   }
>   
>   static void virtio_gpu_free_dmabuf(VirtIOGPU *g, VGPUDMABuf *dmabuf)



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/10] virtio-gpu: Recreate the resource's dmabuf if new backing is attached
  2025-11-09  5:33 ` [PATCH v2 01/10] virtio-gpu: Recreate the resource's dmabuf if new backing is attached Vivek Kasireddy
@ 2025-11-10  4:59   ` Akihiko Odaki
  0 siblings, 0 replies; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-10  4:59 UTC (permalink / raw)
  To: Vivek Kasireddy, qemu-devel
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/09 14:33, Vivek Kasireddy wrote:
> There are cases when a blob resource's backing might get detached
> and re-attached again such as when the underlying object is getting
> migrated in the Guest. In these situations, we need to obtain a new
> dmabuf fd, which can be done by calling virtio_gpu_init_udmabuf().
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Alex Bennée <alex.bennee@linaro.org>
> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>

Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-09  5:33 ` [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device Vivek Kasireddy
@ 2025-11-10  5:00   ` Akihiko Odaki
  2025-11-11  4:45   ` Akihiko Odaki
  1 sibling, 0 replies; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-10  5:00 UTC (permalink / raw)
  To: Vivek Kasireddy, qemu-devel
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/09 14:33, Vivek Kasireddy wrote:
> If the Guest provides a DMA address that is associated with a ram
> device (such as a PCI device region and not its system memory),
> then we can obtain the hva (host virtual address) by invoking
> address_space_translate() followed by memory_region_get_ram_ptr().
> 
> This is because the ram device's address space is not accessible
> to virtio-gpu directly and hence dma_memory_map() cannot be used.
> Therefore, we first need to identify the memory region associated
> with the DMA address and figure out if it belongs to a ram device
> or not and decide how to obtain the host address accordingly.
> 
> Note that we take a reference on the memory region if it belongs
> to a ram device but we would still call dma_memory_unmap() later
> (to unref mr) regardless of how we obtained the hva.
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Alex Bennée <alex.bennee@linaro.org>
> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>

Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>

> ---
>   hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
>   1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> index 199b18c746..d352b5afd6 100644
> --- a/hw/display/virtio-gpu.c
> +++ b/hw/display/virtio-gpu.c
> @@ -798,6 +798,26 @@ static void virtio_gpu_set_scanout_blob(VirtIOGPU *g,
>                                 &fb, res, &ss.r, &cmd->error);
>   }
>   
> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
> +                                       struct virtio_gpu_ctrl_command *cmd,
> +                                       uint64_t a, hwaddr *len)
> +{
> +    MemoryRegion *mr = NULL;
> +    hwaddr xlat;
> +
> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a, &xlat, len,
> +                                 DMA_DIRECTION_TO_DEVICE,
> +                                 MEMTXATTRS_UNSPECIFIED);
> +    if (memory_region_is_ram_device(mr)) {
> +        memory_region_ref(mr);
> +        return memory_region_get_ram_ptr(mr) + xlat;
> +    }
> +
> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
> +                          DMA_DIRECTION_TO_DEVICE,
> +                          MEMTXATTRS_UNSPECIFIED);
> +}
> +
>   int virtio_gpu_create_mapping_iov(VirtIOGPU *g,
>                                     uint32_t nr_entries, uint32_t offset,
>                                     struct virtio_gpu_ctrl_command *cmd,
> @@ -839,9 +859,7 @@ int virtio_gpu_create_mapping_iov(VirtIOGPU *g,
>   
>           do {
>               len = l;
> -            map = dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, &len,
> -                                 DMA_DIRECTION_TO_DEVICE,
> -                                 MEMTXATTRS_UNSPECIFIED);
> +            map = virtio_gpu_dma_memory_map(g, cmd, a, &len);
>               if (!map) {
>                   qemu_log_mask(LOG_GUEST_ERROR, "%s: failed to map MMIO memory for"
>                                 " element %d\n", __func__, e);



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/10] virtio-gpu: Rename udmabuf files and helpers to dmabuf
  2025-11-09  5:33 ` [PATCH v2 08/10] virtio-gpu: Rename udmabuf files and helpers to dmabuf Vivek Kasireddy
@ 2025-11-10  5:00   ` Akihiko Odaki
  0 siblings, 0 replies; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-10  5:00 UTC (permalink / raw)
  To: Vivek Kasireddy, qemu-devel
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/09 14:33, Vivek Kasireddy wrote:
> This is prep-work for adding the ability to create dmabuf fds from
> VFIO devices in addition to udmabuf.
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Alex Bennée <alex.bennee@linaro.org>
> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>

Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions()
  2025-11-09  5:33 ` [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions() Vivek Kasireddy
@ 2025-11-10  5:19   ` Akihiko Odaki
  2025-11-12  4:24     ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-10  5:19 UTC (permalink / raw)
  To: Vivek Kasireddy, qemu-devel
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/09 14:33, Vivek Kasireddy wrote:
> Add a helper to check whether the addresses in an iovec array
> belong to the same memory region or not. This is useful to verify
> before trying to create a dmabuf from an iovec array.
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Alex Bennée <alex.bennee@linaro.org>
> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> ---
>   hw/display/virtio-gpu-dmabuf.c | 29 +++++++++++++++++++++++++++++
>   1 file changed, 29 insertions(+)
> 
> diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-dmabuf.c
> index c34d4c85bc..80143034d4 100644
> --- a/hw/display/virtio-gpu-dmabuf.c
> +++ b/hw/display/virtio-gpu-dmabuf.c
> @@ -27,6 +27,31 @@
>   #include "standard-headers/linux/udmabuf.h"
>   #include "standard-headers/drm/drm_fourcc.h"
>   
> +static bool qemu_iovec_same_memory_regions(const struct iovec *iov, int iov_cnt)
> +{
> +    RAMBlock *rb, *curr_rb;
> +    ram_addr_t offset;
> +    int i;
> +
> +    rb = qemu_ram_block_from_host(iov[0].iov_base, false, &offset);
> +    if (!rb) {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "%s: Could not find ramblock/memory region\n", __func__);
> +        return false;
> +    }
> +
> +    for (i = 1; i < iov_cnt; i++) {
> +	curr_rb = qemu_ram_block_from_host(iov[i].iov_base, false, &offset);
> +	if (curr_rb != rb) {
> +            qemu_log_mask(LOG_GUEST_ERROR,
> +                          "%s: memory regions not same for iov entries\n",
> +                          __func__);
> +            return false;
> +	}
> +    }
> +    return true;
> +}
> +
>   static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource *res)
>   {
>       struct udmabuf_create_list *list;
> @@ -137,6 +162,10 @@ void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
>           res->iov[0].iov_len < 4096) {
>           pdata = res->iov[0].iov_base;
>       } else {
> +        if (!qemu_iovec_same_memory_regions(res->iov, res->iov_cnt)) {
> +            return;
> +        }
> +

This check is unnecessary. Perhaps rejecting iov with different memory 
regions may be fine if that simplifies the code, but this actually adds 
some code.

>           virtio_gpu_create_udmabuf(res);
>           if (res->dmabuf_fd < 0) {
>               return;

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-09  5:33 ` [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device Vivek Kasireddy
  2025-11-10  5:00   ` Akihiko Odaki
@ 2025-11-11  4:45   ` Akihiko Odaki
  2025-11-12  4:30     ` Kasireddy, Vivek
  1 sibling, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-11  4:45 UTC (permalink / raw)
  To: Vivek Kasireddy, qemu-devel
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/09 14:33, Vivek Kasireddy wrote:
> If the Guest provides a DMA address that is associated with a ram
> device (such as a PCI device region and not its system memory),
> then we can obtain the hva (host virtual address) by invoking
> address_space_translate() followed by memory_region_get_ram_ptr().
> 
> This is because the ram device's address space is not accessible
> to virtio-gpu directly and hence dma_memory_map() cannot be used.
> Therefore, we first need to identify the memory region associated
> with the DMA address and figure out if it belongs to a ram device
> or not and decide how to obtain the host address accordingly.
> 
> Note that we take a reference on the memory region if it belongs
> to a ram device but we would still call dma_memory_unmap() later
> (to unref mr) regardless of how we obtained the hva.
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Alex Bennée <alex.bennee@linaro.org>
> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> ---
>   hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
>   1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> index 199b18c746..d352b5afd6 100644
> --- a/hw/display/virtio-gpu.c
> +++ b/hw/display/virtio-gpu.c
> @@ -798,6 +798,26 @@ static void virtio_gpu_set_scanout_blob(VirtIOGPU *g,
>                                 &fb, res, &ss.r, &cmd->error);
>   }
>   
> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
> +                                       struct virtio_gpu_ctrl_command *cmd,
> +                                       uint64_t a, hwaddr *len)
> +{
> +    MemoryRegion *mr = NULL;
> +    hwaddr xlat;
> +
> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a, &xlat, len,
> +                                 DMA_DIRECTION_TO_DEVICE,
> +                                 MEMTXATTRS_UNSPECIFIED);
> +    if (memory_region_is_ram_device(mr)) {
> +        memory_region_ref(mr);
> +        return memory_region_get_ram_ptr(mr) + xlat;
> +    }
> +
> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
> +                          DMA_DIRECTION_TO_DEVICE,
> +                          MEMTXATTRS_UNSPECIFIED);

This function should:
- call memory_region_get_ram_ptr(mr)
   if memory_region_is_ram(mr)
- return NULL otherwise

There are a few reasons. First, the documentation of dma_memory_map() 
tells to use it "only for reads OR writes - not for read-modify-write 
operations." It can be used for read-modify-write operations so 
dma_memory_map() should be avoided.

Second, it ensures that the mapped pointer is writable.
"[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated 
with VFIO devices" adds checks for memory_region_is_ram() and 
memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but the other 
callers also use the function to map writable pointers.

It also makes the check of memory_region_is_ram_device() and 
memory_region_is_ram() unnecessary for virtio_gpu_init_dmabuf(), 
reducing the overall complexity.

Regards,
Akihiko Odaki

> +}
> +
>   int virtio_gpu_create_mapping_iov(VirtIOGPU *g,
>                                     uint32_t nr_entries, uint32_t offset,
>                                     struct virtio_gpu_ctrl_command *cmd,
> @@ -839,9 +859,7 @@ int virtio_gpu_create_mapping_iov(VirtIOGPU *g,
>   
>           do {
>               len = l;
> -            map = dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, &len,
> -                                 DMA_DIRECTION_TO_DEVICE,
> -                                 MEMTXATTRS_UNSPECIFIED);
> +            map = virtio_gpu_dma_memory_map(g, cmd, a, &len);
>               if (!map) {
>                   qemu_log_mask(LOG_GUEST_ERROR, "%s: failed to map MMIO memory for"
>                                 " element %d\n", __func__, e);



^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions()
  2025-11-10  5:19   ` Akihiko Odaki
@ 2025-11-12  4:24     ` Kasireddy, Vivek
  2025-11-12  8:21       ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-12  4:24 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce
> qemu_iovec_same_memory_regions()
> 
> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> > Add a helper to check whether the addresses in an iovec array
> > belong to the same memory region or not. This is useful to verify
> > before trying to create a dmabuf from an iovec array.
> >
> > Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> > Cc: Alex Bennée <alex.bennee@linaro.org>
> > Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> > Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Cédric Le Goater <clg@redhat.com>
> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> > ---
> >   hw/display/virtio-gpu-dmabuf.c | 29 +++++++++++++++++++++++++++++
> >   1 file changed, 29 insertions(+)
> >
> > diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-
> dmabuf.c
> > index c34d4c85bc..80143034d4 100644
> > --- a/hw/display/virtio-gpu-dmabuf.c
> > +++ b/hw/display/virtio-gpu-dmabuf.c
> > @@ -27,6 +27,31 @@
> >   #include "standard-headers/linux/udmabuf.h"
> >   #include "standard-headers/drm/drm_fourcc.h"
> >
> > +static bool qemu_iovec_same_memory_regions(const struct iovec *iov,
> int iov_cnt)
> > +{
> > +    RAMBlock *rb, *curr_rb;
> > +    ram_addr_t offset;
> > +    int i;
> > +
> > +    rb = qemu_ram_block_from_host(iov[0].iov_base, false, &offset);
> > +    if (!rb) {
> > +        qemu_log_mask(LOG_GUEST_ERROR,
> > +                      "%s: Could not find ramblock/memory region\n", __func__);
> > +        return false;
> > +    }
> > +
> > +    for (i = 1; i < iov_cnt; i++) {
> > +	curr_rb = qemu_ram_block_from_host(iov[i].iov_base, false,
> &offset);
> > +	if (curr_rb != rb) {
> > +            qemu_log_mask(LOG_GUEST_ERROR,
> > +                          "%s: memory regions not same for iov entries\n",
> > +                          __func__);
> > +            return false;
> > +	}
> > +    }
> > +    return true;
> > +}
> > +
> >   static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource
> *res)
> >   {
> >       struct udmabuf_create_list *list;
> > @@ -137,6 +162,10 @@ void virtio_gpu_init_dmabuf(struct
> virtio_gpu_simple_resource *res)
> >           res->iov[0].iov_len < 4096) {
> >           pdata = res->iov[0].iov_base;
> >       } else {
> > +        if (!qemu_iovec_same_memory_regions(res->iov, res->iov_cnt)) {
> > +            return;
> > +        }
> > +
> 
> This check is unnecessary. Perhaps rejecting iov with different memory
> regions may be fine if that simplifies the code, but this actually adds
> some code.
I think we can keep this sanity check but I don't really mind dropping this
patch given that buffers with mixed memory regions are not encountered
in practical situations. Or, I guess I could move the if (curr_rb != rb) check
to virtio_gpu_create_udmabuf() and vfio_device_create_dmabuf_fd()
like you suggested previously.

Thanks,
Vivek

> 
> >           virtio_gpu_create_udmabuf(res);
> >           if (res->dmabuf_fd < 0) {
> >               return;
> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-10  4:55   ` Akihiko Odaki
@ 2025-11-12  4:26     ` Kasireddy, Vivek
  2025-11-12  8:17       ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-12  4:26 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> associated with VFIO devices
> 
> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> > In addition to memfd, a blob resource can also have its backing
> > storage in a VFIO device region. Therefore, we first need to figure
> > out if the blob is backed by a VFIO device region or a memfd before
> > we can call the right API to get a dmabuf fd created.
> >
> > So, once we have the ramblock and the associated mr, we rely on
> > memory_region_is_ram_device() to tell us where the backing storage
> > is located. If the blob resource is VFIO backed, we try to find the
> > right VFIO device that contains the blob and then invoke the API
> > vfio_device_create_dmabuf().
> >
> > Note that in virtio_gpu_remap_udmabuf(), we first try to test if
> > the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
> > use the VFIO device fd directly to create the CPU mapping.
> 
> I have just remembered that mremap() will work for either of udmabuf and
> VFIO. That will avoid having two different methods and make
> vfio_get_region_index_from_mr() and vfio_device_get_region_info()
> unnecessary.
IIUC, the name virtio_gpu_remap_dmabuf() is misleading because we are not
actually doing remap but are simply calling mmap(). In other words, we are not
expanding or shrinking existing mapping but are creating a new mapping.
And, for dmabufs associated with VFIO devices, without having to call
vfio_get_region_index_from_mr() and vfio_device_get_region_info(), I don't see
any other way to determine the region offset.

So, I guess I'll create a new patch to do s/remapped/map.
> 
> >
> > Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> > Cc: Alex Bennée <alex.bennee@linaro.org>
> > Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> > Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Cédric Le Goater <clg@redhat.com>
> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> > ---
> >   hw/display/Kconfig             |   5 ++
> >   hw/display/virtio-gpu-dmabuf.c | 114
> +++++++++++++++++++++++++++++++--
> >   2 files changed, 112 insertions(+), 7 deletions(-)
> >
> > diff --git a/hw/display/Kconfig b/hw/display/Kconfig
> > index 1e95ab28ef..0d090f25f5 100644
> > --- a/hw/display/Kconfig
> > +++ b/hw/display/Kconfig
> > @@ -106,6 +106,11 @@ config VIRTIO_VGA
> >       depends on VIRTIO_PCI
> >       select VGA
> >
> > +config VIRTIO_GPU_VFIO_BLOB
> > +    bool
> > +    default y
> > +    depends on VFIO
> > +
> >   config VHOST_USER_GPU
> >       bool
> >       default y
> > diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-
> dmabuf.c
> > index 80143034d4..940b0e0411 100644
> > --- a/hw/display/virtio-gpu-dmabuf.c
> > +++ b/hw/display/virtio-gpu-dmabuf.c
> > @@ -18,6 +18,7 @@
> >   #include "ui/console.h"
> >   #include "hw/virtio/virtio-gpu.h"
> >   #include "hw/virtio/virtio-gpu-pixman.h"
> > +#include "hw/vfio/vfio-device.h"
> >   #include "trace.h"
> >   #include "system/ramblock.h"
> >   #include "system/hostmem.h"
> > @@ -52,6 +53,19 @@ static bool
> qemu_iovec_same_memory_regions(const struct iovec *iov, int iov_cnt)
> >       return true;
> >   }
> >
> > +static void vfio_create_dmabuf(VFIODevice *vdev,
> > +                               struct virtio_gpu_simple_resource *res)
> > +{
> > +#if defined(VIRTIO_GPU_VFIO_BLOB)
> > +    res->dmabuf_fd = vfio_device_create_dmabuf_fd(vdev, res->iov, res-
> >iov_cnt);
> > +    if (res->dmabuf_fd < 0) {
> > +        qemu_log_mask(LOG_GUEST_ERROR,
> > +                      "%s: VFIO_DEVICE_FEATURE_DMA_BUF: %s\n",
> > +                      __func__, strerror(errno));
> > +    }
> > +#endif
> > +}
> > +
> >   static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource
> *res)
> >   {
> >       struct udmabuf_create_list *list;
> > @@ -93,11 +107,69 @@ static void virtio_gpu_create_udmabuf(struct
> virtio_gpu_simple_resource *res)
> >       g_free(list);
> >   }
> >
> > -static void virtio_gpu_remap_dmabuf(struct virtio_gpu_simple_resource
> *res)
> > +static void *vfio_dmabuf_mmap(struct virtio_gpu_simple_resource *res,
> > +                              VFIODevice *vdev)
> > +{
> > +    struct vfio_region_info *info = NULL;
> > +    ram_addr_t offset, len = 0;
> > +    void *map, *submap;
> > +    int i, ret = -1;
> > +    RAMBlock *rb;
> > +
> > +    /*
> > +     * We first reserve a contiguous chunk of address space for the entire
> > +     * dmabuf, then replace it with smaller mappings that correspond to
> the
> > +     * individual segments of the dmabuf.
> > +     */
> > +    map = mmap(NULL, res->blob_size, PROT_READ, MAP_SHARED, vdev-
> >fd, 0);
> > +    if (map == MAP_FAILED) {
> > +        return map;
> > +    }
> > +
> > +    for (i = 0; i < res->iov_cnt; i++) {
> > +        rb = qemu_ram_block_from_host(res->iov[i].iov_base, false,
> &offset);
> > +	if (!rb) {
> > +            goto err;
> > +        }
> > +#if defined(VIRTIO_GPU_VFIO_BLOB)
> > +        ret = vfio_get_region_index_from_mr(rb->mr);
> > +        if (ret < 0) {
> > +            goto err;
> > +        }
> > +
> > +        ret = vfio_device_get_region_info(vdev, ret, &info);
> > +#endif
> > +        if (ret < 0 || !info) {
> > +            goto err;
> > +        }
> > +
> > +        submap = mmap(map + len, res->iov[i].iov_len, PROT_READ,
> > +                      MAP_SHARED | MAP_FIXED, vdev->fd,
> > +                      info->offset + offset);
> > +        if (submap == MAP_FAILED) {
> > +            goto err;
> > +        }
> > +
> > +        len += res->iov[i].iov_len;
> > +    }
> > +    return map;
> > +err:
> > +    munmap(map, res->blob_size);
> > +    return MAP_FAILED;
> > +}
> > +
> > +static void virtio_gpu_remap_dmabuf(struct virtio_gpu_simple_resource
> *res,
> > +                                    VFIODevice *vdev)
> >   {
> >       res->remapped = mmap(NULL, res->blob_size, PROT_READ,
> >                            MAP_SHARED, res->dmabuf_fd, 0);
> >       if (res->remapped == MAP_FAILED) {
> > +        if (vdev) {
> > +            res->remapped = vfio_dmabuf_mmap(res, vdev);
> > +            if (res->remapped != MAP_FAILED) {
> > +                return;
> > +            }
> > +        }
> >           warn_report("%s: dmabuf mmap failed: %s", __func__,
> >                       strerror(errno));
> >           res->remapped = NULL;
> > @@ -155,7 +227,10 @@ bool virtio_gpu_have_udmabuf(void)
> >
> >   void virtio_gpu_init_dmabuf(struct virtio_gpu_simple_resource *res)
> >   {
> > +    VFIODevice *vdev = NULL;
> >       void *pdata = NULL;
> > +    ram_addr_t offset;
> > +    RAMBlock *rb;
> >
> >       res->dmabuf_fd = -1;
> >       if (res->iov_cnt == 1 &&
> > @@ -166,11 +241,38 @@ void virtio_gpu_init_dmabuf(struct
> virtio_gpu_simple_resource *res)
> >               return;
> >           }
> >
> > -        virtio_gpu_create_udmabuf(res);
> > -        if (res->dmabuf_fd < 0) {
> > +        rb = qemu_ram_block_from_host(res->iov[0].iov_base, false,
> &offset);
> > +        if (memory_region_is_ram_device(rb->mr)) {
> > +            vdev = vfio_device_lookup(rb->mr);
> > +            if (!vdev) {
> > +                qemu_log_mask(LOG_GUEST_ERROR,
> > +                              "%s: Could not find device to create dmabuf\n",
> > +                              __func__);
> 
> This should say "VFIO device" since no other RAM device is supported.
Ok, I'll make the change.

> 
> > +                return;
> > +            }
> > +
> > +            vfio_create_dmabuf(vdev, res);
> > +            if (res->dmabuf_fd < 0) {
> > +                qemu_log_mask(LOG_GUEST_ERROR,
> > +                              "%s: Could not create dmabuf from vfio device\n",
> 
> Nitpick: let's make VFIO uppercase as other user-visible messages do.
Sure, will do.

Thanks,
Vivek

> 
> > +                              __func__);
> > +                return;
> > +            }
> > +        } else if (memory_region_is_ram(rb->mr)) {
>  > +            virtio_gpu_create_udmabuf(res);> +            if
> (res->dmabuf_fd < 0) {
> > +                qemu_log_mask(LOG_GUEST_ERROR,
> > +                              "%s: Could not create dmabuf from memfd\n",
> > +                              __func__);
> > +                return;
> > +            }
> > +        } else {
> > +            qemu_log_mask(LOG_GUEST_ERROR,
> > +                          "%s: memory region cannot be used to create dmabuf\n",
> > +                          __func__);
> >               return;
> >           }
> > -        virtio_gpu_remap_dmabuf(res);
> > +        virtio_gpu_remap_dmabuf(res, vdev);
> >           if (!res->remapped) {
> >               return;
> >           }
> > @@ -182,9 +284,7 @@ void virtio_gpu_init_dmabuf(struct
> virtio_gpu_simple_resource *res)
> >
> >   void virtio_gpu_fini_dmabuf(struct virtio_gpu_simple_resource *res)
> >   {
> > -    if (res->remapped) {
> > -        virtio_gpu_destroy_dmabuf(res);
> > -    }
> > +    virtio_gpu_destroy_dmabuf(res);
> >   }
> >
> >   static void virtio_gpu_free_dmabuf(VirtIOGPU *g, VGPUDMABuf
> *dmabuf)



^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-11  4:45   ` Akihiko Odaki
@ 2025-11-12  4:30     ` Kasireddy, Vivek
  2025-11-12  8:14       ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-12  4:30 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> associated with a ram device
> 
> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> > If the Guest provides a DMA address that is associated with a ram
> > device (such as a PCI device region and not its system memory), then
> > we can obtain the hva (host virtual address) by invoking
> > address_space_translate() followed by memory_region_get_ram_ptr().
> >
> > This is because the ram device's address space is not accessible to
> > virtio-gpu directly and hence dma_memory_map() cannot be used.
> > Therefore, we first need to identify the memory region associated with
> > the DMA address and figure out if it belongs to a ram device or not
> > and decide how to obtain the host address accordingly.
> >
> > Note that we take a reference on the memory region if it belongs to a
> > ram device but we would still call dma_memory_unmap() later (to unref
> > mr) regardless of how we obtained the hva.
> >
> > Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> > Cc: Alex Bennée <alex.bennee@linaro.org>
> > Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> > Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Cédric Le Goater <clg@redhat.com>
> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> > ---
> >   hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
> >   1 file changed, 21 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index
> > 199b18c746..d352b5afd6 100644
> > --- a/hw/display/virtio-gpu.c
> > +++ b/hw/display/virtio-gpu.c
> > @@ -798,6 +798,26 @@ static void virtio_gpu_set_scanout_blob(VirtIOGPU
> *g,
> >                                 &fb, res, &ss.r, &cmd->error);
> >   }
> >
> > +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
> > +                                       struct virtio_gpu_ctrl_command *cmd,
> > +                                       uint64_t a, hwaddr *len) {
> > +    MemoryRegion *mr = NULL;
> > +    hwaddr xlat;
> > +
> > +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a, &xlat, len,
> > +                                 DMA_DIRECTION_TO_DEVICE,
> > +                                 MEMTXATTRS_UNSPECIFIED);
> > +    if (memory_region_is_ram_device(mr)) {
> > +        memory_region_ref(mr);
> > +        return memory_region_get_ram_ptr(mr) + xlat;
> > +    }
> > +
> > +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
> > +                          DMA_DIRECTION_TO_DEVICE,
> > +                          MEMTXATTRS_UNSPECIFIED);
> 
> This function should:
> - call memory_region_get_ram_ptr(mr)
>    if memory_region_is_ram(mr)
> - return NULL otherwise
> 
> There are a few reasons. First, the documentation of dma_memory_map()
> tells to use it "only for reads OR writes - not for read-modify-write
> operations." It can be used for read-modify-write operations so
> dma_memory_map() should be avoided.
This patch series only deals with non-virgl use-cases where AFAICS resources
are not written to on the Host.

> 
> Second, it ensures that the mapped pointer is writable.
> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated
> with VFIO devices" adds checks for memory_region_is_ram() and
> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but the other
> callers also use the function to map writable pointers.
Unless I am missing something, I don't see where writable pointers are used
in non-virgl use-cases?

> 
> It also makes the check of memory_region_is_ram_device() and
> memory_region_is_ram() unnecessary for virtio_gpu_init_dmabuf(), reducing
> the overall complexity.
Since buffers reside completely in either ram or ram_device regions, using both
memory_region_is_ram_device() and memory_region_is_ram() to check where
they are located seems necessary and unavoidable.

Thanks,
Vivek

> 
> Regards,
> Akihiko Odaki
> 
> > +}
> > +
> >   int virtio_gpu_create_mapping_iov(VirtIOGPU *g,
> >                                     uint32_t nr_entries, uint32_t offset,
> >                                     struct virtio_gpu_ctrl_command
> > *cmd, @@ -839,9 +859,7 @@ int virtio_gpu_create_mapping_iov(VirtIOGPU
> > *g,
> >
> >           do {
> >               len = l;
> > -            map = dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, &len,
> > -                                 DMA_DIRECTION_TO_DEVICE,
> > -                                 MEMTXATTRS_UNSPECIFIED);
> > +            map = virtio_gpu_dma_memory_map(g, cmd, a, &len);
> >               if (!map) {
> >                   qemu_log_mask(LOG_GUEST_ERROR, "%s: failed to map MMIO
> memory for"
> >                                 " element %d\n", __func__, e);



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-12  4:30     ` Kasireddy, Vivek
@ 2025-11-12  8:14       ` Akihiko Odaki
  2025-11-13  3:39         ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-12  8:14 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/12 13:30, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>> associated with a ram device
>>
>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>> If the Guest provides a DMA address that is associated with a ram
>>> device (such as a PCI device region and not its system memory), then
>>> we can obtain the hva (host virtual address) by invoking
>>> address_space_translate() followed by memory_region_get_ram_ptr().
>>>
>>> This is because the ram device's address space is not accessible to
>>> virtio-gpu directly and hence dma_memory_map() cannot be used.
>>> Therefore, we first need to identify the memory region associated with
>>> the DMA address and figure out if it belongs to a ram device or not
>>> and decide how to obtain the host address accordingly.
>>>
>>> Note that we take a reference on the memory region if it belongs to a
>>> ram device but we would still call dma_memory_unmap() later (to unref
>>> mr) regardless of how we obtained the hva.
>>>
>>> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
>>> Cc: Alex Bennée <alex.bennee@linaro.org>
>>> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
>>> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>>> Cc: Alex Williamson <alex.williamson@redhat.com>
>>> Cc: Cédric Le Goater <clg@redhat.com>
>>> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
>>> ---
>>>    hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
>>>    1 file changed, 21 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index
>>> 199b18c746..d352b5afd6 100644
>>> --- a/hw/display/virtio-gpu.c
>>> +++ b/hw/display/virtio-gpu.c
>>> @@ -798,6 +798,26 @@ static void virtio_gpu_set_scanout_blob(VirtIOGPU
>> *g,
>>>                                  &fb, res, &ss.r, &cmd->error);
>>>    }
>>>
>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
>>> +                                       struct virtio_gpu_ctrl_command *cmd,
>>> +                                       uint64_t a, hwaddr *len) {
>>> +    MemoryRegion *mr = NULL;
>>> +    hwaddr xlat;
>>> +
>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a, &xlat, len,
>>> +                                 DMA_DIRECTION_TO_DEVICE,
>>> +                                 MEMTXATTRS_UNSPECIFIED);
>>> +    if (memory_region_is_ram_device(mr)) {
>>> +        memory_region_ref(mr);
>>> +        return memory_region_get_ram_ptr(mr) + xlat;
>>> +    }
>>> +
>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
>>> +                          DMA_DIRECTION_TO_DEVICE,
>>> +                          MEMTXATTRS_UNSPECIFIED);
>>
>> This function should:
>> - call memory_region_get_ram_ptr(mr)
>>     if memory_region_is_ram(mr)
>> - return NULL otherwise
>>
>> There are a few reasons. First, the documentation of dma_memory_map()
>> tells to use it "only for reads OR writes - not for read-modify-write
>> operations." It can be used for read-modify-write operations so
>> dma_memory_map() should be avoided.
> This patch series only deals with non-virgl use-cases where AFAICS resources
> are not written to on the Host.
> 
>>
>> Second, it ensures that the mapped pointer is writable.
>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated
>> with VFIO devices" adds checks for memory_region_is_ram() and
>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but the other
>> callers also use the function to map writable pointers.
> Unless I am missing something, I don't see where writable pointers are used
> in non-virgl use-cases?

Rutabaga uses too, but you are right about that 2D operations won't use it.

That said, exposing non-writable memory to Virgl and Rutabaga lets the 
guest corrupt memory so should be avoided. On the other hand, it is 
unlikely that rejecting non-writable memory will cause any problem. You 
can also add another code path to use 
memory_region_supports_direct_access() instead of memory_region_is_ram() 
for virtio-gpu for 2D and avoid calling memory_region_is_ram() in 
virtio_gpu_init_dmabuf() if you want to keep non-writable memory working.

> 
>>
>> It also makes the check of memory_region_is_ram_device() and
>> memory_region_is_ram() unnecessary for virtio_gpu_init_dmabuf(), reducing
>> the overall complexity.
> Since buffers reside completely in either ram or ram_device regions, using both
> memory_region_is_ram_device() and memory_region_is_ram() to check where
> they are located seems necessary and unavoidable.

It can unconditionally call virtio_gpu_create_udmabuf(), and if the 
function finds the memory is incompatible with udmabuf, it can call 
vfio_device_lookup() to tell if the memory belongs to VFIO or not.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-12  4:26     ` Kasireddy, Vivek
@ 2025-11-12  8:17       ` Akihiko Odaki
  2025-11-13  3:17         ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-12  8:17 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/12 13:26, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>> associated with VFIO devices
>>
>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>> In addition to memfd, a blob resource can also have its backing
>>> storage in a VFIO device region. Therefore, we first need to figure
>>> out if the blob is backed by a VFIO device region or a memfd before
>>> we can call the right API to get a dmabuf fd created.
>>>
>>> So, once we have the ramblock and the associated mr, we rely on
>>> memory_region_is_ram_device() to tell us where the backing storage
>>> is located. If the blob resource is VFIO backed, we try to find the
>>> right VFIO device that contains the blob and then invoke the API
>>> vfio_device_create_dmabuf().
>>>
>>> Note that in virtio_gpu_remap_udmabuf(), we first try to test if
>>> the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
>>> use the VFIO device fd directly to create the CPU mapping.
>>
>> I have just remembered that mremap() will work for either of udmabuf and
>> VFIO. That will avoid having two different methods and make
>> vfio_get_region_index_from_mr() and vfio_device_get_region_info()
>> unnecessary.
> IIUC, the name virtio_gpu_remap_dmabuf() is misleading because we are not
> actually doing remap but are simply calling mmap(). In other words, we are not
> expanding or shrinking existing mapping but are creating a new mapping.
> And, for dmabufs associated with VFIO devices, without having to call
> vfio_get_region_index_from_mr() and vfio_device_get_region_info(), I don't see
> any other way to determine the region offset.
> 
> So, I guess I'll create a new patch to do s/remapped/map.

I mean calling mremap() with 0 as the old_size parameter. The man page says:
 > If the value of old_size is zero, and old_address refers to a
 > shareable mapping (see the description of MAP_SHARED in mmap(2)), then
 > mremap() will create a new mapping of the same pages.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions()
  2025-11-12  4:24     ` Kasireddy, Vivek
@ 2025-11-12  8:21       ` Akihiko Odaki
  2025-11-13  3:14         ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-12  8:21 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/12 13:24, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce
>> qemu_iovec_same_memory_regions()
>>
>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>> Add a helper to check whether the addresses in an iovec array
>>> belong to the same memory region or not. This is useful to verify
>>> before trying to create a dmabuf from an iovec array.
>>>
>>> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
>>> Cc: Alex Bennée <alex.bennee@linaro.org>
>>> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
>>> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>>> Cc: Alex Williamson <alex.williamson@redhat.com>
>>> Cc: Cédric Le Goater <clg@redhat.com>
>>> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
>>> ---
>>>    hw/display/virtio-gpu-dmabuf.c | 29 +++++++++++++++++++++++++++++
>>>    1 file changed, 29 insertions(+)
>>>
>>> diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-
>> dmabuf.c
>>> index c34d4c85bc..80143034d4 100644
>>> --- a/hw/display/virtio-gpu-dmabuf.c
>>> +++ b/hw/display/virtio-gpu-dmabuf.c
>>> @@ -27,6 +27,31 @@
>>>    #include "standard-headers/linux/udmabuf.h"
>>>    #include "standard-headers/drm/drm_fourcc.h"
>>>
>>> +static bool qemu_iovec_same_memory_regions(const struct iovec *iov,
>> int iov_cnt)
>>> +{
>>> +    RAMBlock *rb, *curr_rb;
>>> +    ram_addr_t offset;
>>> +    int i;
>>> +
>>> +    rb = qemu_ram_block_from_host(iov[0].iov_base, false, &offset);
>>> +    if (!rb) {
>>> +        qemu_log_mask(LOG_GUEST_ERROR,
>>> +                      "%s: Could not find ramblock/memory region\n", __func__);
>>> +        return false;
>>> +    }
>>> +
>>> +    for (i = 1; i < iov_cnt; i++) {
>>> +	curr_rb = qemu_ram_block_from_host(iov[i].iov_base, false,
>> &offset);
>>> +	if (curr_rb != rb) {
>>> +            qemu_log_mask(LOG_GUEST_ERROR,
>>> +                          "%s: memory regions not same for iov entries\n",
>>> +                          __func__);
>>> +            return false;
>>> +	}
>>> +    }
>>> +    return true;
>>> +}
>>> +
>>>    static void virtio_gpu_create_udmabuf(struct virtio_gpu_simple_resource
>> *res)
>>>    {
>>>        struct udmabuf_create_list *list;
>>> @@ -137,6 +162,10 @@ void virtio_gpu_init_dmabuf(struct
>> virtio_gpu_simple_resource *res)
>>>            res->iov[0].iov_len < 4096) {
>>>            pdata = res->iov[0].iov_base;
>>>        } else {
>>> +        if (!qemu_iovec_same_memory_regions(res->iov, res->iov_cnt)) {
>>> +            return;
>>> +        }
>>> +
>>
>> This check is unnecessary. Perhaps rejecting iov with different memory
>> regions may be fine if that simplifies the code, but this actually adds
>> some code.
> I think we can keep this sanity check but I don't really mind dropping this
> patch given that buffers with mixed memory regions are not encountered
> in practical situations. Or, I guess I could move the if (curr_rb != rb) check
> to virtio_gpu_create_udmabuf() and vfio_device_create_dmabuf_fd()
> like you suggested previously.

I won't call it a "sanity check"; it is "unlikely" to have different 
memory regions, but it is still not "wrong" and is sane.

The VFIO code path still needs to check if the memory regions belong to 
one VFIO device. Trying to create one DMA-BUF using multiple VFIO 
devices is wrong.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions()
  2025-11-12  8:21       ` Akihiko Odaki
@ 2025-11-13  3:14         ` Kasireddy, Vivek
  2025-11-13  3:28           ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-13  3:14 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce
> qemu_iovec_same_memory_regions()
> 
> On 2025/11/12 13:24, Kasireddy, Vivek wrote:
> > Hi Akihiko,
> >
> >> Subject: Re: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce
> >> qemu_iovec_same_memory_regions()
> >>
> >> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>> Add a helper to check whether the addresses in an iovec array
> >>> belong to the same memory region or not. This is useful to verify
> >>> before trying to create a dmabuf from an iovec array.
> >>>
> >>> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> >>> Cc: Alex Bennée <alex.bennee@linaro.org>
> >>> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> >>> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> >>> Cc: Alex Williamson <alex.williamson@redhat.com>
> >>> Cc: Cédric Le Goater <clg@redhat.com>
> >>> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> >>> ---
> >>>    hw/display/virtio-gpu-dmabuf.c | 29
> +++++++++++++++++++++++++++++
> >>>    1 file changed, 29 insertions(+)
> >>>
> >>> diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-
> >> dmabuf.c
> >>> index c34d4c85bc..80143034d4 100644
> >>> --- a/hw/display/virtio-gpu-dmabuf.c
> >>> +++ b/hw/display/virtio-gpu-dmabuf.c
> >>> @@ -27,6 +27,31 @@
> >>>    #include "standard-headers/linux/udmabuf.h"
> >>>    #include "standard-headers/drm/drm_fourcc.h"
> >>>
> >>> +static bool qemu_iovec_same_memory_regions(const struct iovec *iov,
> >> int iov_cnt)
> >>> +{
> >>> +    RAMBlock *rb, *curr_rb;
> >>> +    ram_addr_t offset;
> >>> +    int i;
> >>> +
> >>> +    rb = qemu_ram_block_from_host(iov[0].iov_base, false, &offset);
> >>> +    if (!rb) {
> >>> +        qemu_log_mask(LOG_GUEST_ERROR,
> >>> +                      "%s: Could not find ramblock/memory region\n",
> __func__);
> >>> +        return false;
> >>> +    }
> >>> +
> >>> +    for (i = 1; i < iov_cnt; i++) {
> >>> +	curr_rb = qemu_ram_block_from_host(iov[i].iov_base, false,
> >> &offset);
> >>> +	if (curr_rb != rb) {
> >>> +            qemu_log_mask(LOG_GUEST_ERROR,
> >>> +                          "%s: memory regions not same for iov entries\n",
> >>> +                          __func__);
> >>> +            return false;
> >>> +	}
> >>> +    }
> >>> +    return true;
> >>> +}
> >>> +
> >>>    static void virtio_gpu_create_udmabuf(struct
> virtio_gpu_simple_resource
> >> *res)
> >>>    {
> >>>        struct udmabuf_create_list *list;
> >>> @@ -137,6 +162,10 @@ void virtio_gpu_init_dmabuf(struct
> >> virtio_gpu_simple_resource *res)
> >>>            res->iov[0].iov_len < 4096) {
> >>>            pdata = res->iov[0].iov_base;
> >>>        } else {
> >>> +        if (!qemu_iovec_same_memory_regions(res->iov, res->iov_cnt)) {
> >>> +            return;
> >>> +        }
> >>> +
> >>
> >> This check is unnecessary. Perhaps rejecting iov with different memory
> >> regions may be fine if that simplifies the code, but this actually adds
> >> some code.
> > I think we can keep this sanity check but I don't really mind dropping this
> > patch given that buffers with mixed memory regions are not encountered
> > in practical situations. Or, I guess I could move the if (curr_rb != rb) check
> > to virtio_gpu_create_udmabuf() and vfio_device_create_dmabuf_fd()
> > like you suggested previously.
> 
> I won't call it a "sanity check"; it is "unlikely" to have different
> memory regions, but it is still not "wrong" and is sane.
I'd say "very unlikely". The only scenario I can imagine where this might
happen is if a buffer is partially migrated (from one memory region to another).
And, if we do come across such a buffer, it has to be rejected because
there is no way we can create a dmabuf (via vfio or udmabuf) for such a
buffer/resource.

> 
> The VFIO code path still needs to check if the memory regions belong to
> one VFIO device. Trying to create one DMA-BUF using multiple VFIO
> devices is wrong.
Ok, I'll add this check in vfio_device_create_dmabuf_fd() to avoid iterating
over all the entries separately.

Thanks,
Vivek

> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-12  8:17       ` Akihiko Odaki
@ 2025-11-13  3:17         ` Kasireddy, Vivek
  2025-11-13  3:31           ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-13  3:17 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> associated with VFIO devices
> 
> On 2025/11/12 13:26, Kasireddy, Vivek wrote:
> > Hi Akihiko,
> >
> >> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
> blobs
> >> associated with VFIO devices
> >>
> >> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>> In addition to memfd, a blob resource can also have its backing
> >>> storage in a VFIO device region. Therefore, we first need to figure
> >>> out if the blob is backed by a VFIO device region or a memfd before
> >>> we can call the right API to get a dmabuf fd created.
> >>>
> >>> So, once we have the ramblock and the associated mr, we rely on
> >>> memory_region_is_ram_device() to tell us where the backing storage
> >>> is located. If the blob resource is VFIO backed, we try to find the
> >>> right VFIO device that contains the blob and then invoke the API
> >>> vfio_device_create_dmabuf().
> >>>
> >>> Note that in virtio_gpu_remap_udmabuf(), we first try to test if
> >>> the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
> >>> use the VFIO device fd directly to create the CPU mapping.
> >>
> >> I have just remembered that mremap() will work for either of udmabuf
> and
> >> VFIO. That will avoid having two different methods and make
> >> vfio_get_region_index_from_mr() and vfio_device_get_region_info()
> >> unnecessary.
> > IIUC, the name virtio_gpu_remap_dmabuf() is misleading because we are
> not
> > actually doing remap but are simply calling mmap(). In other words, we
> are not
> > expanding or shrinking existing mapping but are creating a new mapping.
> > And, for dmabufs associated with VFIO devices, without having to call
> > vfio_get_region_index_from_mr() and vfio_device_get_region_info(), I
> don't see
> > any other way to determine the region offset.
> >
> > So, I guess I'll create a new patch to do s/remapped/map.
> 
> I mean calling mremap() with 0 as the old_size parameter. The man page
> says:
>  > If the value of old_size is zero, and old_address refers to a
>  > shareable mapping (see the description of MAP_SHARED in mmap(2)),
> then
>  > mremap() will create a new mapping of the same pages.
It might be possible to use mremap() here but using mmap() seems very
straightforward given that we are actually not shrinking or expanding
an existing mapping but are instead creating a new mapping. Also, I am
wondering what benefit would mremap() bring as opposed to just using
mmap()?

Thanks,
Vivek

> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions()
  2025-11-13  3:14         ` Kasireddy, Vivek
@ 2025-11-13  3:28           ` Akihiko Odaki
  0 siblings, 0 replies; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-13  3:28 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/13 12:14, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce
>> qemu_iovec_same_memory_regions()
>>
>> On 2025/11/12 13:24, Kasireddy, Vivek wrote:
>>> Hi Akihiko,
>>>
>>>> Subject: Re: [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce
>>>> qemu_iovec_same_memory_regions()
>>>>
>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>>>> Add a helper to check whether the addresses in an iovec array
>>>>> belong to the same memory region or not. This is useful to verify
>>>>> before trying to create a dmabuf from an iovec array.
>>>>>
>>>>> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>>> Cc: Alex Bennée <alex.bennee@linaro.org>
>>>>> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
>>>>> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>>>>> Cc: Alex Williamson <alex.williamson@redhat.com>
>>>>> Cc: Cédric Le Goater <clg@redhat.com>
>>>>> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
>>>>> ---
>>>>>     hw/display/virtio-gpu-dmabuf.c | 29
>> +++++++++++++++++++++++++++++
>>>>>     1 file changed, 29 insertions(+)
>>>>>
>>>>> diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-
>>>> dmabuf.c
>>>>> index c34d4c85bc..80143034d4 100644
>>>>> --- a/hw/display/virtio-gpu-dmabuf.c
>>>>> +++ b/hw/display/virtio-gpu-dmabuf.c
>>>>> @@ -27,6 +27,31 @@
>>>>>     #include "standard-headers/linux/udmabuf.h"
>>>>>     #include "standard-headers/drm/drm_fourcc.h"
>>>>>
>>>>> +static bool qemu_iovec_same_memory_regions(const struct iovec *iov,
>>>> int iov_cnt)
>>>>> +{
>>>>> +    RAMBlock *rb, *curr_rb;
>>>>> +    ram_addr_t offset;
>>>>> +    int i;
>>>>> +
>>>>> +    rb = qemu_ram_block_from_host(iov[0].iov_base, false, &offset);
>>>>> +    if (!rb) {
>>>>> +        qemu_log_mask(LOG_GUEST_ERROR,
>>>>> +                      "%s: Could not find ramblock/memory region\n",
>> __func__);
>>>>> +        return false;
>>>>> +    }
>>>>> +
>>>>> +    for (i = 1; i < iov_cnt; i++) {
>>>>> +	curr_rb = qemu_ram_block_from_host(iov[i].iov_base, false,
>>>> &offset);
>>>>> +	if (curr_rb != rb) {
>>>>> +            qemu_log_mask(LOG_GUEST_ERROR,
>>>>> +                          "%s: memory regions not same for iov entries\n",
>>>>> +                          __func__);
>>>>> +            return false;
>>>>> +	}
>>>>> +    }
>>>>> +    return true;
>>>>> +}
>>>>> +
>>>>>     static void virtio_gpu_create_udmabuf(struct
>> virtio_gpu_simple_resource
>>>> *res)
>>>>>     {
>>>>>         struct udmabuf_create_list *list;
>>>>> @@ -137,6 +162,10 @@ void virtio_gpu_init_dmabuf(struct
>>>> virtio_gpu_simple_resource *res)
>>>>>             res->iov[0].iov_len < 4096) {
>>>>>             pdata = res->iov[0].iov_base;
>>>>>         } else {
>>>>> +        if (!qemu_iovec_same_memory_regions(res->iov, res->iov_cnt)) {
>>>>> +            return;
>>>>> +        }
>>>>> +
>>>>
>>>> This check is unnecessary. Perhaps rejecting iov with different memory
>>>> regions may be fine if that simplifies the code, but this actually adds
>>>> some code.
>>> I think we can keep this sanity check but I don't really mind dropping this
>>> patch given that buffers with mixed memory regions are not encountered
>>> in practical situations. Or, I guess I could move the if (curr_rb != rb) check
>>> to virtio_gpu_create_udmabuf() and vfio_device_create_dmabuf_fd()
>>> like you suggested previously.
>>
>> I won't call it a "sanity check"; it is "unlikely" to have different
>> memory regions, but it is still not "wrong" and is sane.
> I'd say "very unlikely". The only scenario I can imagine where this might
> happen is if a buffer is partially migrated (from one memory region to another).
> And, if we do come across such a buffer, it has to be rejected because
> there is no way we can create a dmabuf (via vfio or udmabuf) for such a
> buffer/resource.

The check virtio_gpu_create_udmabuf() already has and the check I 
suggested for vfio_device_create_dmabuf_fd() (checking if all memory 
regions belong to one VFIO device) should be sufficient to reject all 
cases that DMA-BUF cannot be created.

Regards,
Akihiko Odaki

> 
>>
>> The VFIO code path still needs to check if the memory regions belong to
>> one VFIO device. Trying to create one DMA-BUF using multiple VFIO
>> devices is wrong.
> Ok, I'll add this check in vfio_device_create_dmabuf_fd() to avoid iterating
> over all the entries separately.
> 
> Thanks,
> Vivek
> 
>>
>> Regards,
>> Akihiko Odaki



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-13  3:17         ` Kasireddy, Vivek
@ 2025-11-13  3:31           ` Akihiko Odaki
  2025-11-17  4:19             ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-13  3:31 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/13 12:17, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>> associated with VFIO devices
>>
>> On 2025/11/12 13:26, Kasireddy, Vivek wrote:
>>> Hi Akihiko,
>>>
>>>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
>> blobs
>>>> associated with VFIO devices
>>>>
>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>>>> In addition to memfd, a blob resource can also have its backing
>>>>> storage in a VFIO device region. Therefore, we first need to figure
>>>>> out if the blob is backed by a VFIO device region or a memfd before
>>>>> we can call the right API to get a dmabuf fd created.
>>>>>
>>>>> So, once we have the ramblock and the associated mr, we rely on
>>>>> memory_region_is_ram_device() to tell us where the backing storage
>>>>> is located. If the blob resource is VFIO backed, we try to find the
>>>>> right VFIO device that contains the blob and then invoke the API
>>>>> vfio_device_create_dmabuf().
>>>>>
>>>>> Note that in virtio_gpu_remap_udmabuf(), we first try to test if
>>>>> the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
>>>>> use the VFIO device fd directly to create the CPU mapping.
>>>>
>>>> I have just remembered that mremap() will work for either of udmabuf
>> and
>>>> VFIO. That will avoid having two different methods and make
>>>> vfio_get_region_index_from_mr() and vfio_device_get_region_info()
>>>> unnecessary.
>>> IIUC, the name virtio_gpu_remap_dmabuf() is misleading because we are
>> not
>>> actually doing remap but are simply calling mmap(). In other words, we
>> are not
>>> expanding or shrinking existing mapping but are creating a new mapping.
>>> And, for dmabufs associated with VFIO devices, without having to call
>>> vfio_get_region_index_from_mr() and vfio_device_get_region_info(), I
>> don't see
>>> any other way to determine the region offset.
>>>
>>> So, I guess I'll create a new patch to do s/remapped/map.
>>
>> I mean calling mremap() with 0 as the old_size parameter. The man page
>> says:
>>   > If the value of old_size is zero, and old_address refers to a
>>   > shareable mapping (see the description of MAP_SHARED in mmap(2)),
>> then
>>   > mremap() will create a new mapping of the same pages.
> It might be possible to use mremap() here but using mmap() seems very
> straightforward given that we are actually not shrinking or expanding
> an existing mapping but are instead creating a new mapping. Also, I am
> wondering what benefit would mremap() bring as opposed to just using
> mmap()?

As I noted earlier, mremap() removes the need of having two different 
paths for udmabuf and VFIO, and make vfio_get_region_index_from_mr() and 
vfio_device_get_region_info() unnecessary, reducing code complexity.

mremap() is also sufficiently straightforward. The man page explicitly 
states it is designed to create a new mapping. Using it for the purpose 
(not shrinking or expanding an existing mapping) is not an abuse of the API.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-12  8:14       ` Akihiko Odaki
@ 2025-11-13  3:39         ` Kasireddy, Vivek
  2025-11-13  4:08           ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-13  3:39 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> associated with a ram device
> 
> On 2025/11/12 13:30, Kasireddy, Vivek wrote:
> > Hi Akihiko,
> >
> >> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> >> associated with a ram device
> >>
> >> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>> If the Guest provides a DMA address that is associated with a ram
> >>> device (such as a PCI device region and not its system memory), then
> >>> we can obtain the hva (host virtual address) by invoking
> >>> address_space_translate() followed by memory_region_get_ram_ptr().
> >>>
> >>> This is because the ram device's address space is not accessible to
> >>> virtio-gpu directly and hence dma_memory_map() cannot be used.
> >>> Therefore, we first need to identify the memory region associated with
> >>> the DMA address and figure out if it belongs to a ram device or not
> >>> and decide how to obtain the host address accordingly.
> >>>
> >>> Note that we take a reference on the memory region if it belongs to a
> >>> ram device but we would still call dma_memory_unmap() later (to unref
> >>> mr) regardless of how we obtained the hva.
> >>>
> >>> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> >>> Cc: Alex Bennée <alex.bennee@linaro.org>
> >>> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> >>> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> >>> Cc: Alex Williamson <alex.williamson@redhat.com>
> >>> Cc: Cédric Le Goater <clg@redhat.com>
> >>> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> >>> ---
> >>>    hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
> >>>    1 file changed, 21 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index
> >>> 199b18c746..d352b5afd6 100644
> >>> --- a/hw/display/virtio-gpu.c
> >>> +++ b/hw/display/virtio-gpu.c
> >>> @@ -798,6 +798,26 @@ static void
> virtio_gpu_set_scanout_blob(VirtIOGPU
> >> *g,
> >>>                                  &fb, res, &ss.r, &cmd->error);
> >>>    }
> >>>
> >>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
> >>> +                                       struct virtio_gpu_ctrl_command *cmd,
> >>> +                                       uint64_t a, hwaddr *len) {
> >>> +    MemoryRegion *mr = NULL;
> >>> +    hwaddr xlat;
> >>> +
> >>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a, &xlat,
> len,
> >>> +                                 DMA_DIRECTION_TO_DEVICE,
> >>> +                                 MEMTXATTRS_UNSPECIFIED);
> >>> +    if (memory_region_is_ram_device(mr)) {
> >>> +        memory_region_ref(mr);
> >>> +        return memory_region_get_ram_ptr(mr) + xlat;
> >>> +    }
> >>> +
> >>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
> >>> +                          DMA_DIRECTION_TO_DEVICE,
> >>> +                          MEMTXATTRS_UNSPECIFIED);
> >>
> >> This function should:
> >> - call memory_region_get_ram_ptr(mr)
> >>     if memory_region_is_ram(mr)
> >> - return NULL otherwise
> >>
> >> There are a few reasons. First, the documentation of
> dma_memory_map()
> >> tells to use it "only for reads OR writes - not for read-modify-write
> >> operations." It can be used for read-modify-write operations so
> >> dma_memory_map() should be avoided.
> > This patch series only deals with non-virgl use-cases where AFAICS
> resources
> > are not written to on the Host.
> >
> >>
> >> Second, it ensures that the mapped pointer is writable.
> >> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> associated
> >> with VFIO devices" adds checks for memory_region_is_ram() and
> >> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but the
> other
> >> callers also use the function to map writable pointers.
> > Unless I am missing something, I don't see where writable pointers are
> used
> > in non-virgl use-cases?
> 
> Rutabaga uses too, but you are right about that 2D operations won't use it.
> 
> That said, exposing non-writable memory to Virgl and Rutabaga lets the
> guest corrupt memory so should be avoided. On the other hand, it is
> unlikely that rejecting non-writable memory will cause any problem. You
> can also add another code path to use
> memory_region_supports_direct_access() instead of
> memory_region_is_ram()
> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
> virtio_gpu_init_dmabuf() if you want to keep non-writable memory working.
AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-rutabaga code.
And, this patch series and my use-case (GPU SRIOV) only needs to deal with
non-writeable memory because the rendering is already done by the Guest and
the Host only needs to display the Guest's FB.

However, I see that virtio_gpu_create_mapping_iov() is used by virgl/rutabaga
code as well, so I am wondering how do things work right now given that
virtio_gpu_create_mapping_iov() always calls dma_memory_map()? 
In other words, is there no problem currently with non-writeable memory
in virgl/rutabaga use-cases?

> 
> >
> >>
> >> It also makes the check of memory_region_is_ram_device() and
> >> memory_region_is_ram() unnecessary for virtio_gpu_init_dmabuf(),
> reducing
> >> the overall complexity.
> > Since buffers reside completely in either ram or ram_device regions, using
> both
> > memory_region_is_ram_device() and memory_region_is_ram() to check
> where
> > they are located seems necessary and unavoidable.
> 
> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
> function finds the memory is incompatible with udmabuf, it can call
> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
Yeah, what you suggest is doable but seems a bit convoluted to have to
first call virtio_gpu_create_udmabuf() and if it fails then call VFIO related
functions. 

I think using memory_region_is_ram_device() and memory_region_is_ram()
to identify the right memory region and calling either virtio_gpu_create_udmabuf()
or vfio_create_dmabuf() is much more intuitive and readable.

Thanks,
Vivek

> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-13  3:39         ` Kasireddy, Vivek
@ 2025-11-13  4:08           ` Akihiko Odaki
  2025-11-17  4:14             ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-13  4:08 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater



On 2025/11/13 12:39, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>> associated with a ram device
>>
>> On 2025/11/12 13:30, Kasireddy, Vivek wrote:
>>> Hi Akihiko,
>>>
>>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>>>> associated with a ram device
>>>>
>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>>>> If the Guest provides a DMA address that is associated with a ram
>>>>> device (such as a PCI device region and not its system memory), then
>>>>> we can obtain the hva (host virtual address) by invoking
>>>>> address_space_translate() followed by memory_region_get_ram_ptr().
>>>>>
>>>>> This is because the ram device's address space is not accessible to
>>>>> virtio-gpu directly and hence dma_memory_map() cannot be used.
>>>>> Therefore, we first need to identify the memory region associated with
>>>>> the DMA address and figure out if it belongs to a ram device or not
>>>>> and decide how to obtain the host address accordingly.
>>>>>
>>>>> Note that we take a reference on the memory region if it belongs to a
>>>>> ram device but we would still call dma_memory_unmap() later (to unref
>>>>> mr) regardless of how we obtained the hva.
>>>>>
>>>>> Cc: Marc-André Lureau<marcandre.lureau@redhat.com>
>>>>> Cc: Alex Bennée<alex.bennee@linaro.org>
>>>>> Cc: Akihiko Odaki<odaki@rsg.ci.i.u-tokyo.ac.jp>
>>>>> Cc: Dmitry Osipenko<dmitry.osipenko@collabora.com>
>>>>> Cc: Alex Williamson<alex.williamson@redhat.com>
>>>>> Cc: Cédric Le Goater<clg@redhat.com>
>>>>> Signed-off-by: Vivek Kasireddy<vivek.kasireddy@intel.com>
>>>>> ---
>>>>>     hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
>>>>>     1 file changed, 21 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index
>>>>> 199b18c746..d352b5afd6 100644
>>>>> --- a/hw/display/virtio-gpu.c
>>>>> +++ b/hw/display/virtio-gpu.c
>>>>> @@ -798,6 +798,26 @@ static void
>> virtio_gpu_set_scanout_blob(VirtIOGPU
>>>> *g,
>>>>>                                   &fb, res, &ss.r, &cmd->error);
>>>>>     }
>>>>>
>>>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
>>>>> +                                       struct virtio_gpu_ctrl_command *cmd,
>>>>> +                                       uint64_t a, hwaddr *len) {
>>>>> +    MemoryRegion *mr = NULL;
>>>>> +    hwaddr xlat;
>>>>> +
>>>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a, &xlat,
>> len,
>>>>> +                                 DMA_DIRECTION_TO_DEVICE,
>>>>> +                                 MEMTXATTRS_UNSPECIFIED);
>>>>> +    if (memory_region_is_ram_device(mr)) {
>>>>> +        memory_region_ref(mr);
>>>>> +        return memory_region_get_ram_ptr(mr) + xlat;
>>>>> +    }
>>>>> +
>>>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
>>>>> +                          DMA_DIRECTION_TO_DEVICE,
>>>>> +                          MEMTXATTRS_UNSPECIFIED);
>>>> This function should:
>>>> - call memory_region_get_ram_ptr(mr)
>>>>      if memory_region_is_ram(mr)
>>>> - return NULL otherwise
>>>>
>>>> There are a few reasons. First, the documentation of
>> dma_memory_map()
>>>> tells to use it "only for reads OR writes - not for read-modify-write
>>>> operations." It can be used for read-modify-write operations so
>>>> dma_memory_map() should be avoided.
>>> This patch series only deals with non-virgl use-cases where AFAICS
>> resources
>>> are not written to on the Host.
>>>
>>>> Second, it ensures that the mapped pointer is writable.
>>>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>> associated
>>>> with VFIO devices" adds checks for memory_region_is_ram() and
>>>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but the
>> other
>>>> callers also use the function to map writable pointers.
>>> Unless I am missing something, I don't see where writable pointers are
>> used
>>> in non-virgl use-cases?
>> Rutabaga uses too, but you are right about that 2D operations won't use it.
>>
>> That said, exposing non-writable memory to Virgl and Rutabaga lets the
>> guest corrupt memory so should be avoided. On the other hand, it is
>> unlikely that rejecting non-writable memory will cause any problem. You
>> can also add another code path to use
>> memory_region_supports_direct_access() instead of
>> memory_region_is_ram()
>> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
>> virtio_gpu_init_dmabuf() if you want to keep non-writable memory working.
> AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-rutabaga code.
> And, this patch series and my use-case (GPU SRIOV) only needs to deal with
> non-writeable memory because the rendering is already done by the Guest and
> the Host only needs to display the Guest's FB.
> 
> However, I see that virtio_gpu_create_mapping_iov() is used by virgl/rutabaga
> code as well, so I am wondering how do things work right now given that
> virtio_gpu_create_mapping_iov() always calls dma_memory_map()?
> In other words, is there no problem currently with non-writeable memory
> in virgl/rutabaga use-cases?

The current code is problematic, and using memory_region_is_ram() will 
fix it.

> 
>>>> It also makes the check of memory_region_is_ram_device() and
>>>> memory_region_is_ram() unnecessary for virtio_gpu_init_dmabuf(),
>> reducing
>>>> the overall complexity.
>>> Since buffers reside completely in either ram or ram_device regions, using
>> both
>>> memory_region_is_ram_device() and memory_region_is_ram() to check
>> where
>>> they are located seems necessary and unavoidable.
>> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
>> function finds the memory is incompatible with udmabuf, it can call
>> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
> Yeah, what you suggest is doable but seems a bit convoluted to have to
> first call virtio_gpu_create_udmabuf() and if it fails then call VFIO related
> functions.
> 
> I think using memory_region_is_ram_device() and memory_region_is_ram()
> to identify the right memory region and calling either virtio_gpu_create_udmabuf()
> or vfio_create_dmabuf() is much more intuitive and readable.

memory_region_is_ram_device() and memory_region_is_ram() are not 
sufficient to identify the right memory region. 
memory_region_is_ram_device() returns true for RAM device created by 
non-VFIO devices, and memory_region_is_ram() returns true for memory 
regions created with memory_region_init_ram_ptr(), which is not backed 
with memfd.

In general, checking prerequisite conditions before performing 
operations implemented elsewhere is dangerous because the checks and the 
operation implementation can be diverged, and these facts demonstrate that.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-13  4:08           ` Akihiko Odaki
@ 2025-11-17  4:14             ` Kasireddy, Vivek
  2025-11-17  6:56               ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-17  4:14 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> associated with a ram device
> 
> 
> 
> On 2025/11/13 12:39, Kasireddy, Vivek wrote:
> > Hi Akihiko,
> >
> >> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> >> associated with a ram device
> >>
> >> On 2025/11/12 13:30, Kasireddy, Vivek wrote:
> >>> Hi Akihiko,
> >>>
> >>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA
> addr
> >>>> associated with a ram device
> >>>>
> >>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>>>> If the Guest provides a DMA address that is associated with a ram
> >>>>> device (such as a PCI device region and not its system memory), then
> >>>>> we can obtain the hva (host virtual address) by invoking
> >>>>> address_space_translate() followed by
> memory_region_get_ram_ptr().
> >>>>>
> >>>>> This is because the ram device's address space is not accessible to
> >>>>> virtio-gpu directly and hence dma_memory_map() cannot be used.
> >>>>> Therefore, we first need to identify the memory region associated
> with
> >>>>> the DMA address and figure out if it belongs to a ram device or not
> >>>>> and decide how to obtain the host address accordingly.
> >>>>>
> >>>>> Note that we take a reference on the memory region if it belongs to a
> >>>>> ram device but we would still call dma_memory_unmap() later (to
> unref
> >>>>> mr) regardless of how we obtained the hva.
> >>>>>
> >>>>> Cc: Marc-André Lureau<marcandre.lureau@redhat.com>
> >>>>> Cc: Alex Bennée<alex.bennee@linaro.org>
> >>>>> Cc: Akihiko Odaki<odaki@rsg.ci.i.u-tokyo.ac.jp>
> >>>>> Cc: Dmitry Osipenko<dmitry.osipenko@collabora.com>
> >>>>> Cc: Alex Williamson<alex.williamson@redhat.com>
> >>>>> Cc: Cédric Le Goater<clg@redhat.com>
> >>>>> Signed-off-by: Vivek Kasireddy<vivek.kasireddy@intel.com>
> >>>>> ---
> >>>>>     hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
> >>>>>     1 file changed, 21 insertions(+), 3 deletions(-)
> >>>>>
> >>>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index
> >>>>> 199b18c746..d352b5afd6 100644
> >>>>> --- a/hw/display/virtio-gpu.c
> >>>>> +++ b/hw/display/virtio-gpu.c
> >>>>> @@ -798,6 +798,26 @@ static void
> >> virtio_gpu_set_scanout_blob(VirtIOGPU
> >>>> *g,
> >>>>>                                   &fb, res, &ss.r, &cmd->error);
> >>>>>     }
> >>>>>
> >>>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
> >>>>> +                                       struct virtio_gpu_ctrl_command *cmd,
> >>>>> +                                       uint64_t a, hwaddr *len) {
> >>>>> +    MemoryRegion *mr = NULL;
> >>>>> +    hwaddr xlat;
> >>>>> +
> >>>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a,
> &xlat,
> >> len,
> >>>>> +                                 DMA_DIRECTION_TO_DEVICE,
> >>>>> +                                 MEMTXATTRS_UNSPECIFIED);
> >>>>> +    if (memory_region_is_ram_device(mr)) {
> >>>>> +        memory_region_ref(mr);
> >>>>> +        return memory_region_get_ram_ptr(mr) + xlat;
> >>>>> +    }
> >>>>> +
> >>>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
> >>>>> +                          DMA_DIRECTION_TO_DEVICE,
> >>>>> +                          MEMTXATTRS_UNSPECIFIED);
> >>>> This function should:
> >>>> - call memory_region_get_ram_ptr(mr)
> >>>>      if memory_region_is_ram(mr)
> >>>> - return NULL otherwise
> >>>>
> >>>> There are a few reasons. First, the documentation of
> >> dma_memory_map()
> >>>> tells to use it "only for reads OR writes - not for read-modify-write
> >>>> operations." It can be used for read-modify-write operations so
> >>>> dma_memory_map() should be avoided.
> >>> This patch series only deals with non-virgl use-cases where AFAICS
> >> resources
> >>> are not written to on the Host.
> >>>
> >>>> Second, it ensures that the mapped pointer is writable.
> >>>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> >> associated
> >>>> with VFIO devices" adds checks for memory_region_is_ram() and
> >>>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but the
> >> other
> >>>> callers also use the function to map writable pointers.
> >>> Unless I am missing something, I don't see where writable pointers are
> >> used
> >>> in non-virgl use-cases?
> >> Rutabaga uses too, but you are right about that 2D operations won't use
> it.
> >>
> >> That said, exposing non-writable memory to Virgl and Rutabaga lets the
> >> guest corrupt memory so should be avoided. On the other hand, it is
> >> unlikely that rejecting non-writable memory will cause any problem. You
> >> can also add another code path to use
> >> memory_region_supports_direct_access() instead of
> >> memory_region_is_ram()
> >> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
> >> virtio_gpu_init_dmabuf() if you want to keep non-writable memory
> working.
> > AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-rutabaga
> code.
> > And, this patch series and my use-case (GPU SRIOV) only needs to deal
> with
> > non-writeable memory because the rendering is already done by the
> Guest and
> > the Host only needs to display the Guest's FB.
> >
> > However, I see that virtio_gpu_create_mapping_iov() is used by
> virgl/rutabaga
> > code as well, so I am wondering how do things work right now given that
> > virtio_gpu_create_mapping_iov() always calls dma_memory_map()?
> > In other words, is there no problem currently with non-writeable memory
> > in virgl/rutabaga use-cases?
> 
> The current code is problematic, and using memory_region_is_ram() will
> fix it.
Ok, I'll make the change.

> 
> >
> >>>> It also makes the check of memory_region_is_ram_device() and
> >>>> memory_region_is_ram() unnecessary for virtio_gpu_init_dmabuf(),
> >> reducing
> >>>> the overall complexity.
> >>> Since buffers reside completely in either ram or ram_device regions,
> using
> >> both
> >>> memory_region_is_ram_device() and memory_region_is_ram() to check
> >> where
> >>> they are located seems necessary and unavoidable.
> >> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
> >> function finds the memory is incompatible with udmabuf, it can call
> >> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
> > Yeah, what you suggest is doable but seems a bit convoluted to have to
> > first call virtio_gpu_create_udmabuf() and if it fails then call VFIO related
> > functions.
> >
> > I think using memory_region_is_ram_device() and
> memory_region_is_ram()
> > to identify the right memory region and calling either
> virtio_gpu_create_udmabuf()
> > or vfio_create_dmabuf() is much more intuitive and readable.
> 
> memory_region_is_ram_device() and memory_region_is_ram() are not
> sufficient to identify the right memory region.
> memory_region_is_ram_device() returns true for RAM device created by
> non-VFIO devices, and memory_region_is_ram() returns true for memory
> regions created with memory_region_init_ram_ptr(), which is not backed
> with memfd.
Right, but structuring the code in the following way would address your concerns
and make it more robust:
        if (memory_region_is_ram_device(rb->mr) && (vdev = vfio_device_lookup(rb->mr))) {
	vfio_create_dmabuf(vdev, res);
        } else if (memory_region_is_ram(rb->mr) && virtio_gpu_have_udmabuf()) {
	virtio_gpu_create_udmabuf(res);
        } else {
	...
        }

Thanks,
Vivek

> 
> In general, checking prerequisite conditions before performing
> operations implemented elsewhere is dangerous because the checks and
> the
> operation implementation can be diverged, and these facts demonstrate
> that.
> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-13  3:31           ` Akihiko Odaki
@ 2025-11-17  4:19             ` Kasireddy, Vivek
  2025-11-17  7:13               ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-17  4:19 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> associated with VFIO devices
> 
> On 2025/11/13 12:17, Kasireddy, Vivek wrote:
> > Hi Akihiko,
> >
> >> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
> blobs
> >> associated with VFIO devices
> >>
> >> On 2025/11/12 13:26, Kasireddy, Vivek wrote:
> >>> Hi Akihiko,
> >>>
> >>>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
> >> blobs
> >>>> associated with VFIO devices
> >>>>
> >>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>>>> In addition to memfd, a blob resource can also have its backing
> >>>>> storage in a VFIO device region. Therefore, we first need to figure
> >>>>> out if the blob is backed by a VFIO device region or a memfd before
> >>>>> we can call the right API to get a dmabuf fd created.
> >>>>>
> >>>>> So, once we have the ramblock and the associated mr, we rely on
> >>>>> memory_region_is_ram_device() to tell us where the backing storage
> >>>>> is located. If the blob resource is VFIO backed, we try to find the
> >>>>> right VFIO device that contains the blob and then invoke the API
> >>>>> vfio_device_create_dmabuf().
> >>>>>
> >>>>> Note that in virtio_gpu_remap_udmabuf(), we first try to test if
> >>>>> the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
> >>>>> use the VFIO device fd directly to create the CPU mapping.
> >>>>
> >>>> I have just remembered that mremap() will work for either of udmabuf
> >> and
> >>>> VFIO. That will avoid having two different methods and make
> >>>> vfio_get_region_index_from_mr() and vfio_device_get_region_info()
> >>>> unnecessary.
> >>> IIUC, the name virtio_gpu_remap_dmabuf() is misleading because we
> are
> >> not
> >>> actually doing remap but are simply calling mmap(). In other words, we
> >> are not
> >>> expanding or shrinking existing mapping but are creating a new
> mapping.
> >>> And, for dmabufs associated with VFIO devices, without having to call
> >>> vfio_get_region_index_from_mr() and vfio_device_get_region_info(), I
> >> don't see
> >>> any other way to determine the region offset.
> >>>
> >>> So, I guess I'll create a new patch to do s/remapped/map.
> >>
> >> I mean calling mremap() with 0 as the old_size parameter. The man page
> >> says:
> >>   > If the value of old_size is zero, and old_address refers to a
> >>   > shareable mapping (see the description of MAP_SHARED in mmap(2)),
> >> then
> >>   > mremap() will create a new mapping of the same pages.
> > It might be possible to use mremap() here but using mmap() seems very
> > straightforward given that we are actually not shrinking or expanding
> > an existing mapping but are instead creating a new mapping. Also, I am
> > wondering what benefit would mremap() bring as opposed to just using
> > mmap()?
> 
> As I noted earlier, mremap() removes the need of having two different
> paths for udmabuf and VFIO, and make vfio_get_region_index_from_mr()
> and
> vfio_device_get_region_info() unnecessary, reducing code complexity.
Sorry, I should have researched thoroughly before but after looking at the code
again, I don't see how mremap() removes the need for having two different
paths for udmabuf and VFIO and make vfio_get_region_index_from_mr()
and vfio_device_get_region_info() unnecessary. Could you please elaborate
how it can be done?

Thanks,
Vivek

> 
> mremap() is also sufficiently straightforward. The man page explicitly
> states it is designed to create a new mapping. Using it for the purpose
> (not shrinking or expanding an existing mapping) is not an abuse of the API.
> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-17  4:14             ` Kasireddy, Vivek
@ 2025-11-17  6:56               ` Akihiko Odaki
  2025-11-18  5:26                 ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-17  6:56 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/17 13:14, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>> associated with a ram device
>>
>>
>>
>> On 2025/11/13 12:39, Kasireddy, Vivek wrote:
>>> Hi Akihiko,
>>>
>>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>>>> associated with a ram device
>>>>
>>>> On 2025/11/12 13:30, Kasireddy, Vivek wrote:
>>>>> Hi Akihiko,
>>>>>
>>>>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA
>> addr
>>>>>> associated with a ram device
>>>>>>
>>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>>>>>> If the Guest provides a DMA address that is associated with a ram
>>>>>>> device (such as a PCI device region and not its system memory), then
>>>>>>> we can obtain the hva (host virtual address) by invoking
>>>>>>> address_space_translate() followed by
>> memory_region_get_ram_ptr().
>>>>>>>
>>>>>>> This is because the ram device's address space is not accessible to
>>>>>>> virtio-gpu directly and hence dma_memory_map() cannot be used.
>>>>>>> Therefore, we first need to identify the memory region associated
>> with
>>>>>>> the DMA address and figure out if it belongs to a ram device or not
>>>>>>> and decide how to obtain the host address accordingly.
>>>>>>>
>>>>>>> Note that we take a reference on the memory region if it belongs to a
>>>>>>> ram device but we would still call dma_memory_unmap() later (to
>> unref
>>>>>>> mr) regardless of how we obtained the hva.
>>>>>>>
>>>>>>> Cc: Marc-André Lureau<marcandre.lureau@redhat.com>
>>>>>>> Cc: Alex Bennée<alex.bennee@linaro.org>
>>>>>>> Cc: Akihiko Odaki<odaki@rsg.ci.i.u-tokyo.ac.jp>
>>>>>>> Cc: Dmitry Osipenko<dmitry.osipenko@collabora.com>
>>>>>>> Cc: Alex Williamson<alex.williamson@redhat.com>
>>>>>>> Cc: Cédric Le Goater<clg@redhat.com>
>>>>>>> Signed-off-by: Vivek Kasireddy<vivek.kasireddy@intel.com>
>>>>>>> ---
>>>>>>>      hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
>>>>>>>      1 file changed, 21 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index
>>>>>>> 199b18c746..d352b5afd6 100644
>>>>>>> --- a/hw/display/virtio-gpu.c
>>>>>>> +++ b/hw/display/virtio-gpu.c
>>>>>>> @@ -798,6 +798,26 @@ static void
>>>> virtio_gpu_set_scanout_blob(VirtIOGPU
>>>>>> *g,
>>>>>>>                                    &fb, res, &ss.r, &cmd->error);
>>>>>>>      }
>>>>>>>
>>>>>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
>>>>>>> +                                       struct virtio_gpu_ctrl_command *cmd,
>>>>>>> +                                       uint64_t a, hwaddr *len) {
>>>>>>> +    MemoryRegion *mr = NULL;
>>>>>>> +    hwaddr xlat;
>>>>>>> +
>>>>>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a,
>> &xlat,
>>>> len,
>>>>>>> +                                 DMA_DIRECTION_TO_DEVICE,
>>>>>>> +                                 MEMTXATTRS_UNSPECIFIED);
>>>>>>> +    if (memory_region_is_ram_device(mr)) {
>>>>>>> +        memory_region_ref(mr);
>>>>>>> +        return memory_region_get_ram_ptr(mr) + xlat;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
>>>>>>> +                          DMA_DIRECTION_TO_DEVICE,
>>>>>>> +                          MEMTXATTRS_UNSPECIFIED);
>>>>>> This function should:
>>>>>> - call memory_region_get_ram_ptr(mr)
>>>>>>       if memory_region_is_ram(mr)
>>>>>> - return NULL otherwise
>>>>>>
>>>>>> There are a few reasons. First, the documentation of
>>>> dma_memory_map()
>>>>>> tells to use it "only for reads OR writes - not for read-modify-write
>>>>>> operations." It can be used for read-modify-write operations so
>>>>>> dma_memory_map() should be avoided.
>>>>> This patch series only deals with non-virgl use-cases where AFAICS
>>>> resources
>>>>> are not written to on the Host.
>>>>>
>>>>>> Second, it ensures that the mapped pointer is writable.
>>>>>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>>>> associated
>>>>>> with VFIO devices" adds checks for memory_region_is_ram() and
>>>>>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but the
>>>> other
>>>>>> callers also use the function to map writable pointers.
>>>>> Unless I am missing something, I don't see where writable pointers are
>>>> used
>>>>> in non-virgl use-cases?
>>>> Rutabaga uses too, but you are right about that 2D operations won't use
>> it.
>>>>
>>>> That said, exposing non-writable memory to Virgl and Rutabaga lets the
>>>> guest corrupt memory so should be avoided. On the other hand, it is
>>>> unlikely that rejecting non-writable memory will cause any problem. You
>>>> can also add another code path to use
>>>> memory_region_supports_direct_access() instead of
>>>> memory_region_is_ram()
>>>> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
>>>> virtio_gpu_init_dmabuf() if you want to keep non-writable memory
>> working.
>>> AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-rutabaga
>> code.
>>> And, this patch series and my use-case (GPU SRIOV) only needs to deal
>> with
>>> non-writeable memory because the rendering is already done by the
>> Guest and
>>> the Host only needs to display the Guest's FB.
>>>
>>> However, I see that virtio_gpu_create_mapping_iov() is used by
>> virgl/rutabaga
>>> code as well, so I am wondering how do things work right now given that
>>> virtio_gpu_create_mapping_iov() always calls dma_memory_map()?
>>> In other words, is there no problem currently with non-writeable memory
>>> in virgl/rutabaga use-cases?
>>
>> The current code is problematic, and using memory_region_is_ram() will
>> fix it.
> Ok, I'll make the change.
> 
>>
>>>
>>>>>> It also makes the check of memory_region_is_ram_device() and
>>>>>> memory_region_is_ram() unnecessary for virtio_gpu_init_dmabuf(),
>>>> reducing
>>>>>> the overall complexity.
>>>>> Since buffers reside completely in either ram or ram_device regions,
>> using
>>>> both
>>>>> memory_region_is_ram_device() and memory_region_is_ram() to check
>>>> where
>>>>> they are located seems necessary and unavoidable.
>>>> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
>>>> function finds the memory is incompatible with udmabuf, it can call
>>>> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
>>> Yeah, what you suggest is doable but seems a bit convoluted to have to
>>> first call virtio_gpu_create_udmabuf() and if it fails then call VFIO related
>>> functions.
>>>
>>> I think using memory_region_is_ram_device() and
>> memory_region_is_ram()
>>> to identify the right memory region and calling either
>> virtio_gpu_create_udmabuf()
>>> or vfio_create_dmabuf() is much more intuitive and readable.
>>
>> memory_region_is_ram_device() and memory_region_is_ram() are not
>> sufficient to identify the right memory region.
>> memory_region_is_ram_device() returns true for RAM device created by
>> non-VFIO devices, and memory_region_is_ram() returns true for memory
>> regions created with memory_region_init_ram_ptr(), which is not backed
>> with memfd.
> Right, but structuring the code in the following way would address your concerns
> and make it more robust:
>          if (memory_region_is_ram_device(rb->mr) && (vdev = vfio_device_lookup(rb->mr))) {
> 	vfio_create_dmabuf(vdev, res);
>          } else if (memory_region_is_ram(rb->mr) && virtio_gpu_have_udmabuf()) {
> 	virtio_gpu_create_udmabuf(res);
>          } else {
> 	...
>          }

One of the concerns I raised is that having such checks has an inherent 
hazard that they can be inconsistent with the actual implementations.

The original checks had such inconsistency, and the updated one still 
have too. memory_region_is_ram(rb->mr) && virtio_gpu_have_udmabuf() can 
be still true even for memory regions that do not have memfd; please 
refer to the example of memory_region_init_ram_ptr() I pointed out in 
the last email.

Even if you somehow managed to write checks that match with the 
implementations, it is still possible that a future change can break it. 
Letting the implementations check their prerequisite conditions 
completely prevents such an error by construction and makes the code 
more robust.

Another concern is complexity. Calling another function 
(virtio_gpu_have_udmabuf()) do not reduce complexity.

I do not understand why having such extra function calls and 
conditionals will make the code more robust.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-17  4:19             ` Kasireddy, Vivek
@ 2025-11-17  7:13               ` Akihiko Odaki
  2025-11-18  5:22                 ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-17  7:13 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/17 13:19, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>> associated with VFIO devices
>>
>> On 2025/11/13 12:17, Kasireddy, Vivek wrote:
>>> Hi Akihiko,
>>>
>>>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
>> blobs
>>>> associated with VFIO devices
>>>>
>>>> On 2025/11/12 13:26, Kasireddy, Vivek wrote:
>>>>> Hi Akihiko,
>>>>>
>>>>>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
>>>> blobs
>>>>>> associated with VFIO devices
>>>>>>
>>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>>>>>> In addition to memfd, a blob resource can also have its backing
>>>>>>> storage in a VFIO device region. Therefore, we first need to figure
>>>>>>> out if the blob is backed by a VFIO device region or a memfd before
>>>>>>> we can call the right API to get a dmabuf fd created.
>>>>>>>
>>>>>>> So, once we have the ramblock and the associated mr, we rely on
>>>>>>> memory_region_is_ram_device() to tell us where the backing storage
>>>>>>> is located. If the blob resource is VFIO backed, we try to find the
>>>>>>> right VFIO device that contains the blob and then invoke the API
>>>>>>> vfio_device_create_dmabuf().
>>>>>>>
>>>>>>> Note that in virtio_gpu_remap_udmabuf(), we first try to test if
>>>>>>> the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
>>>>>>> use the VFIO device fd directly to create the CPU mapping.
>>>>>>
>>>>>> I have just remembered that mremap() will work for either of udmabuf
>>>> and
>>>>>> VFIO. That will avoid having two different methods and make
>>>>>> vfio_get_region_index_from_mr() and vfio_device_get_region_info()
>>>>>> unnecessary.
>>>>> IIUC, the name virtio_gpu_remap_dmabuf() is misleading because we
>> are
>>>> not
>>>>> actually doing remap but are simply calling mmap(). In other words, we
>>>> are not
>>>>> expanding or shrinking existing mapping but are creating a new
>> mapping.
>>>>> And, for dmabufs associated with VFIO devices, without having to call
>>>>> vfio_get_region_index_from_mr() and vfio_device_get_region_info(), I
>>>> don't see
>>>>> any other way to determine the region offset.
>>>>>
>>>>> So, I guess I'll create a new patch to do s/remapped/map.
>>>>
>>>> I mean calling mremap() with 0 as the old_size parameter. The man page
>>>> says:
>>>>    > If the value of old_size is zero, and old_address refers to a
>>>>    > shareable mapping (see the description of MAP_SHARED in mmap(2)),
>>>> then
>>>>    > mremap() will create a new mapping of the same pages.
>>> It might be possible to use mremap() here but using mmap() seems very
>>> straightforward given that we are actually not shrinking or expanding
>>> an existing mapping but are instead creating a new mapping. Also, I am
>>> wondering what benefit would mremap() bring as opposed to just using
>>> mmap()?
>>
>> As I noted earlier, mremap() removes the need of having two different
>> paths for udmabuf and VFIO, and make vfio_get_region_index_from_mr()
>> and
>> vfio_device_get_region_info() unnecessary, reducing code complexity.
> Sorry, I should have researched thoroughly before but after looking at the code
> again, I don't see how mremap() removes the need for having two different
> paths for udmabuf and VFIO and make vfio_get_region_index_from_mr()
> and vfio_device_get_region_info() unnecessary. Could you please elaborate
> how it can be done?

Not tested, but something like the following:

head = qemu_ram_mmap(-1, res->blob_size, qemu_real_host_page_size(),
                      QEMU_MAP_READONLY | QEMU_MAP_SHARED, 0);
if (head == MAP_FAILED) {
     return NULL;
}

cursor = head;

for (i = 0; i < res->iov_cnt; i++) {
     if (mremap(res->iov[i].iov_base, 0, res->iov[i].iov_len,
                MREMAP_FIXED, cursor) == MAP_FAILED) {
         qemu_ram_munmap(-1, head, res->blob_size);
         return NULL;
     }

     cursor += res->iov[i].iov_len;
}

return head;

Regards,
Akihiko Odaki

> 
> Thanks,
> Vivek
> 
>>
>> mremap() is also sufficiently straightforward. The man page explicitly
>> states it is designed to create a new mapping. Using it for the purpose
>> (not shrinking or expanding an existing mapping) is not an abuse of the API.
>>
>> Regards,
>> Akihiko Odaki



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 06/10] linux-headers: Update vfio.h to include VFIO_DEVICE_FEATURE_DMA_BUF
  2025-11-09  6:49   ` Leon Romanovsky
@ 2025-11-17  9:02     ` Cédric Le Goater
  0 siblings, 0 replies; 45+ messages in thread
From: Cédric Le Goater @ 2025-11-17  9:02 UTC (permalink / raw)
  To: Leon Romanovsky, Vivek Kasireddy, qemu-devel; +Cc: Alex Williamson

Hi Leon,

On 11/9/25 07:49, Leon Romanovsky wrote:
> 
> 
> On Sun, Nov 9, 2025, at 07:33, Vivek Kasireddy wrote:
>> Cc: Alex Williamson <alex.williamson@redhat.com>
>> Cc: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
>> ---
>>   linux-headers/linux/vfio.h | 25 +++++++++++++++++++++++++
>>   1 file changed, 25 insertions(+)
>>
>> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
>> index 4d96d1fc12..bc11ca3663 100644
>> --- a/linux-headers/linux/vfio.h
>> +++ b/linux-headers/linux/vfio.h
>> @@ -1478,6 +1478,31 @@ struct vfio_device_feature_bus_master {
>>   };
>>   #define VFIO_DEVICE_FEATURE_BUS_MASTER 10
>>
>> +/**
>> + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
>> + * regions selected.
>> + *
>> + * open_flags are the typical flags passed to open(2), eg O_RDWR, O_CLOEXEC,
>> + * etc. offset/length specify a slice of the region to create the dmabuf from.
>> + * nr_ranges is the total number of (P2P DMA) ranges that comprise the dmabuf.
>> + *
>> + * Return: The fd number on success, -1 and errno is set on failure.
>> + */
>> +#define VFIO_DEVICE_FEATURE_DMA_BUF 11
>> +
>> +struct vfio_region_dma_range {
>> +	__u64 offset;
>> +	__u64 length;
>> +};
>> +
>> +struct vfio_device_feature_dma_buf {
>> +	__u32	region_index;
>> +	__u32	open_flags;
>> +	__u32   flags;
>> +	__u32   nr_ranges;
>> +	struct vfio_region_dma_range dma_ranges[];
> 
> Not important comment at all, but in last versions of UAPI, this line is
> struct vfio_region_dma_range dma_ranges[] __counted_by(nr_ranges);
> https://lore.kernel.org/kvm/20251106-dmabuf-vfio-v7-10-2503bf390699@nvidia.com/T/#Z2e.:..:20251106-dmabuf-vfio-v7-10-2503bf390699::40nvidia.com:1include:uapi:linux:vfio.h

The kernel dma-buf definitions will be included in the QEMU linux-headers/
directory with the update-linux-headers.sh script.

First, v8 of "vfio/pci: Allow MMIO regions to be exported" series [1] needs
be merged and I expect this will happen in the Linux v6.19 cycle. The QEMU
changes are then targeting QEMU 11.0 whose cycle will start next year now.

We still have some time to polish.

Thanks,

C.

[1] https://lore.kernel.org//qemu-devel/20251111-dmabuf-vfio-v8-0-fd9aa5df478f@nvidia.com



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu
  2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
                   ` (9 preceding siblings ...)
  2025-11-09  5:33 ` [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices Vivek Kasireddy
@ 2025-11-17  9:04 ` Cédric Le Goater
  10 siblings, 0 replies; 45+ messages in thread
From: Cédric Le Goater @ 2025-11-17  9:04 UTC (permalink / raw)
  To: Vivek Kasireddy, qemu-devel
  Cc: Marc-André Lureau, Alex Bennée, Akihiko Odaki,
	Dmitry Osipenko, Leon Romanovsky, Leon Romanovsky,
	Jason Gunthorpe, Dongwon Kim, Alex Williamson

Vivek,

On 11/9/25 06:33, Vivek Kasireddy wrote:
> The virtio-gpu driver running in the Guest VM can create Guest blob
> resources (by importing dmabufs) that are backed by System RAM. This
> is made possible by making use of memfd memory backend and udmabuf
> driver on the Host side. However, in order to create Guest blobs
> that are backed by vfio-pci device regions (which happens when
> virtio-gpu imports dmabufs from devices that have local memory such
> as dGPU VFs), we have to implement VFIO_DEVICE_FEATURE_DMA_BUF and
> leverage it in virtio-gpu.
> 
> So, while creating the blobs we use memory_region_is_ram_device() to
> figure out if the blob is backed by memfd or a vfio-pci device. If
> it is determined that the blob is backed by vfio-pci device region,
> instead of calling into udmabuf driver to create a dmabuf fd we would
> now call into vfio-pci driver to have a dmabuf fd created on the Host.
> 
> Changelog:
> v1 -> v2:
> - Drop the patch that uses res->blob_size instead of res->blob to
>    identify blob resources (Akihiko)
> - Remove the res->dmabuf_fd < 0 check while attaching backing to a
>    resource (Akihiko)
> - Remove cmd->cmd_hdr.type != VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB
>    check while attaching backing (Akihiko)
> - Improve vfio_get_region_index_from_mr() and add documentation (Cedric)
> - Remove rcu_read_lock/unlock around qemu_ram_block_from_host()
>    (Akihiko, Cedric)
> - Improve, document and rename vfio_device_create_dmabuf() to
>    vfio_device_create_dmabuf_fd() (Cedric)
> - Add a new helper to lookup VFIO device from memory region (Cedric)
> - Document vfio_device_get_region_info() (Cedric)
> - Ensure that info variable (in vfio_dmabuf_mmap()) is initialized
>    before use (Cedric)
> - Rename udmabuf files and helpers to dmabuf (Akihiko)
> - Remove the redundant check for virtio_gpu_have_udmabuf() in
>    virtio_gpu_init_dmabuf() (Akihiko)
> - Add a helper to check whether all the entries of a dmabuf belong
>    to a single memory region or not (Akihiko)
> 
> RFC -> v1:
> - Create the CPU mapping using vfio device fd if the dmabuf exporter
>    (vfio-pci) does not provide mmap() support (Akihiko)
> - Log a warning with LOG_GUEST_ERROR instead of warn_report() when
>    dmabuf cannot be created using Guest provided addresses (Akihiko)
> - Use address_space_translate() instead of gpa2hva() to obtain the
>    Host addresses (Akihiko)
> - Rearrange the patches and improve the commit messages (Akihiko)
> - Fix compilation error when VFIO is not enabled (Alex)
> - Add a new helper to obtain VFIO region index from memory region
> - Move vfio_device_create_dmabuf() to hw/vfio/device.c
> 
> Tested with an SRIOV enabled Intel dGPU (B60) by running Gnome Wayland
> (in the VM) and Qemu with the following (relevant) parameters:
> -device vfio-pci,host=0000:03:00.1
> -device virtio-vga,max_outputs=1,xres=1920,yres=1080,blob=true
> -display gtk,gl=on
> 
> Associated vfio-pci kernel driver series:
> https://lore.kernel.org/dri-devel/cover.1754311439.git.leon@kernel.org/
> Associated virtio-gpu kernel driver series (merged):
> https://lore.kernel.org/dri-devel/20241126031643.3490496-1-vivek.kasireddy@intel.com/
> 
> ---
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Alex Bennée <alex.bennee@linaro.org>
> Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>

Please adjust Alex's email :

   Alex Williamson <alex@shazbot.org>

Thanks,

C.



> Cc: Cédric Le Goater <clg@redhat.com>
> Cc: Leon Romanovsky <leon@kernel.org>
> Cc: Leon Romanovsky <leonro@nvidia.com>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Dongwon Kim <dongwon.kim@intel.com>
> 
> Vivek Kasireddy (10):
>    virtio-gpu: Recreate the resource's dmabuf if new backing is attached
>    virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
>    vfio: Document vfio_device_get_region_info()
>    vfio/region: Add a helper to get region index from memory region
>    vfio/device: Add a helper to lookup VFIODevice from memory region
>    linux-headers: Update vfio.h to include VFIO_DEVICE_FEATURE_DMA_BUF
>    vfio/device: Add support for VFIO_DEVICE_FEATURE_DMA_BUF
>    virtio-gpu: Rename udmabuf files and helpers to dmabuf
>    virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions()
>    virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO
>      devices
> 
>   hw/display/Kconfig                            |   5 +
>   hw/display/meson.build                        |   4 +-
>   ...abuf-stubs.c => virtio-gpu-dmabuf-stubs.c} |   4 +-
>   ...rtio-gpu-udmabuf.c => virtio-gpu-dmabuf.c} | 149 ++++++++++++++++--
>   hw/display/virtio-gpu.c                       |  34 +++-
>   hw/vfio/device.c                              |  52 ++++++
>   hw/vfio/region.c                              |  14 ++
>   include/hw/vfio/vfio-device.h                 |  46 ++++++
>   include/hw/virtio/virtio-gpu.h                |   6 +-
>   linux-headers/linux/vfio.h                    |  25 +++
>   10 files changed, 316 insertions(+), 23 deletions(-)
>   rename hw/display/{virtio-gpu-udmabuf-stubs.c => virtio-gpu-dmabuf-stubs.c} (79%)
>   rename hw/display/{virtio-gpu-udmabuf.c => virtio-gpu-dmabuf.c} (56%)
> 



^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-17  7:13               ` Akihiko Odaki
@ 2025-11-18  5:22                 ` Kasireddy, Vivek
  2025-11-18  6:02                   ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-18  5:22 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> associated with VFIO devices
> 
> On 2025/11/17 13:19, Kasireddy, Vivek wrote:
> > Hi Akihiko,
> >
> >> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> >> associated with VFIO devices
> >>
> >> On 2025/11/13 12:17, Kasireddy, Vivek wrote:
> >>> Hi Akihiko,
> >>>
> >>>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
> >> blobs
> >>>> associated with VFIO devices
> >>>>
> >>>> On 2025/11/12 13:26, Kasireddy, Vivek wrote:
> >>>>> Hi Akihiko,
> >>>>>
> >>>>>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
> >>>> blobs
> >>>>>> associated with VFIO devices
> >>>>>>
> >>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>>>>>> In addition to memfd, a blob resource can also have its backing
> >>>>>>> storage in a VFIO device region. Therefore, we first need to figure
> >>>>>>> out if the blob is backed by a VFIO device region or a memfd before
> >>>>>>> we can call the right API to get a dmabuf fd created.
> >>>>>>>
> >>>>>>> So, once we have the ramblock and the associated mr, we rely on
> >>>>>>> memory_region_is_ram_device() to tell us where the backing storage
> >>>>>>> is located. If the blob resource is VFIO backed, we try to find the
> >>>>>>> right VFIO device that contains the blob and then invoke the API
> >>>>>>> vfio_device_create_dmabuf().
> >>>>>>>
> >>>>>>> Note that in virtio_gpu_remap_udmabuf(), we first try to test if
> >>>>>>> the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
> >>>>>>> use the VFIO device fd directly to create the CPU mapping.
> >>>>>>
> >>>>>> I have just remembered that mremap() will work for either of udmabuf
> >>>> and
> >>>>>> VFIO. That will avoid having two different methods and make
> >>>>>> vfio_get_region_index_from_mr() and vfio_device_get_region_info()
> >>>>>> unnecessary.
> >>>>> IIUC, the name virtio_gpu_remap_dmabuf() is misleading because we
> >> are
> >>>> not
> >>>>> actually doing remap but are simply calling mmap(). In other words, we
> >>>> are not
> >>>>> expanding or shrinking existing mapping but are creating a new
> >> mapping.
> >>>>> And, for dmabufs associated with VFIO devices, without having to call
> >>>>> vfio_get_region_index_from_mr() and vfio_device_get_region_info(), I
> >>>> don't see
> >>>>> any other way to determine the region offset.
> >>>>>
> >>>>> So, I guess I'll create a new patch to do s/remapped/map.
> >>>>
> >>>> I mean calling mremap() with 0 as the old_size parameter. The man page
> >>>> says:
> >>>>    > If the value of old_size is zero, and old_address refers to a
> >>>>    > shareable mapping (see the description of MAP_SHARED in
> mmap(2)),
> >>>> then
> >>>>    > mremap() will create a new mapping of the same pages.
> >>> It might be possible to use mremap() here but using mmap() seems very
> >>> straightforward given that we are actually not shrinking or expanding
> >>> an existing mapping but are instead creating a new mapping. Also, I am
> >>> wondering what benefit would mremap() bring as opposed to just using
> >>> mmap()?
> >>
> >> As I noted earlier, mremap() removes the need of having two different
> >> paths for udmabuf and VFIO, and make vfio_get_region_index_from_mr()
> >> and
> >> vfio_device_get_region_info() unnecessary, reducing code complexity.
> > Sorry, I should have researched thoroughly before but after looking at the
> code
> > again, I don't see how mremap() removes the need for having two different
> > paths for udmabuf and VFIO and make vfio_get_region_index_from_mr()
> > and vfio_device_get_region_info() unnecessary. Could you please elaborate
> > how it can be done?
> 
> Not tested, but something like the following:
> 
> head = qemu_ram_mmap(-1, res->blob_size, qemu_real_host_page_size(),
>                       QEMU_MAP_READONLY | QEMU_MAP_SHARED, 0);
> if (head == MAP_FAILED) {
>      return NULL;
> }
> 
> cursor = head;
> 
> for (i = 0; i < res->iov_cnt; i++) {
>      if (mremap(res->iov[i].iov_base, 0, res->iov[i].iov_len,
>                 MREMAP_FIXED, cursor) == MAP_FAILED) {
This is very elegant and I can now see how it is expected to work. However,
I went ahead and tested it and it does not seem to work for VFIO backed
buffers. It works for buffers based out of System RAM though. Here is the
actual code I tested with that I am unconditionally calling for both VFIO
and udmabuf cases:
static void *vfio_dmabuf_mmap2(struct virtio_gpu_simple_resource *res,
                               VFIODevice *vdev)
{
    void *head, *cursor;
    int i;

    head = qemu_ram_mmap(-1, res->blob_size, qemu_real_host_page_size(),
                         			QEMU_MAP_READONLY | QEMU_MAP_SHARED, 0);
    if (head == MAP_FAILED) {
        return head;
    }

    cursor = head;
    for (i = 0; i < res->iov_cnt; i++) {
         if (mremap(res->iov[i].iov_base, 0, res->iov[i].iov_len,
             MREMAP_FIXED | MREMAP_MAYMOVE, cursor) == MAP_FAILED) {
             goto err;
         }
         cursor += res->iov[i].iov_len;
    }
    return head;
err:
    qemu_ram_munmap(-1, head, res->blob_size);
    return MAP_FAILED;
}

It (mremap) initially errored with -EINVAL in all cases but adding MREMAP_MAYMOVE
fixed it for buffers based out of RAM but for VFIO backed buffers, it seems to be
throwing -EFAULT/Bad Address error. I did not yet check why or where the kernel
driver is returning this error from.

Thanks,
Vivek


>          qemu_ram_munmap(-1, head, res->blob_size);
>          return NULL;
>      }
> 
>      cursor += res->iov[i].iov_len;
> }
> 
> return head;
> 
> Regards,
> Akihiko Odaki
> 
> >
> > Thanks,
> > Vivek
> >
> >>
> >> mremap() is also sufficiently straightforward. The man page explicitly
> >> states it is designed to create a new mapping. Using it for the purpose
> >> (not shrinking or expanding an existing mapping) is not an abuse of the API.
> >>
> >> Regards,
> >> Akihiko Odaki



^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-17  6:56               ` Akihiko Odaki
@ 2025-11-18  5:26                 ` Kasireddy, Vivek
  2025-11-18  5:37                   ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-18  5:26 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> associated with a ram device
> 
> On 2025/11/17 13:14, Kasireddy, Vivek wrote:
> > Hi Akihiko,
> >
> >> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> >> associated with a ram device
> >>
> >>
> >>
> >> On 2025/11/13 12:39, Kasireddy, Vivek wrote:
> >>> Hi Akihiko,
> >>>
> >>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> >>>> associated with a ram device
> >>>>
> >>>> On 2025/11/12 13:30, Kasireddy, Vivek wrote:
> >>>>> Hi Akihiko,
> >>>>>
> >>>>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA
> >> addr
> >>>>>> associated with a ram device
> >>>>>>
> >>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>>>>>> If the Guest provides a DMA address that is associated with a ram
> >>>>>>> device (such as a PCI device region and not its system memory), then
> >>>>>>> we can obtain the hva (host virtual address) by invoking
> >>>>>>> address_space_translate() followed by
> >> memory_region_get_ram_ptr().
> >>>>>>>
> >>>>>>> This is because the ram device's address space is not accessible to
> >>>>>>> virtio-gpu directly and hence dma_memory_map() cannot be used.
> >>>>>>> Therefore, we first need to identify the memory region associated
> >> with
> >>>>>>> the DMA address and figure out if it belongs to a ram device or not
> >>>>>>> and decide how to obtain the host address accordingly.
> >>>>>>>
> >>>>>>> Note that we take a reference on the memory region if it belongs to a
> >>>>>>> ram device but we would still call dma_memory_unmap() later (to
> >> unref
> >>>>>>> mr) regardless of how we obtained the hva.
> >>>>>>>
> >>>>>>> Cc: Marc-André Lureau<marcandre.lureau@redhat.com>
> >>>>>>> Cc: Alex Bennée<alex.bennee@linaro.org>
> >>>>>>> Cc: Akihiko Odaki<odaki@rsg.ci.i.u-tokyo.ac.jp>
> >>>>>>> Cc: Dmitry Osipenko<dmitry.osipenko@collabora.com>
> >>>>>>> Cc: Alex Williamson<alex.williamson@redhat.com>
> >>>>>>> Cc: Cédric Le Goater<clg@redhat.com>
> >>>>>>> Signed-off-by: Vivek Kasireddy<vivek.kasireddy@intel.com>
> >>>>>>> ---
> >>>>>>>      hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
> >>>>>>>      1 file changed, 21 insertions(+), 3 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index
> >>>>>>> 199b18c746..d352b5afd6 100644
> >>>>>>> --- a/hw/display/virtio-gpu.c
> >>>>>>> +++ b/hw/display/virtio-gpu.c
> >>>>>>> @@ -798,6 +798,26 @@ static void
> >>>> virtio_gpu_set_scanout_blob(VirtIOGPU
> >>>>>> *g,
> >>>>>>>                                    &fb, res, &ss.r, &cmd->error);
> >>>>>>>      }
> >>>>>>>
> >>>>>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
> >>>>>>> +                                       struct virtio_gpu_ctrl_command *cmd,
> >>>>>>> +                                       uint64_t a, hwaddr *len) {
> >>>>>>> +    MemoryRegion *mr = NULL;
> >>>>>>> +    hwaddr xlat;
> >>>>>>> +
> >>>>>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a,
> >> &xlat,
> >>>> len,
> >>>>>>> +                                 DMA_DIRECTION_TO_DEVICE,
> >>>>>>> +                                 MEMTXATTRS_UNSPECIFIED);
> >>>>>>> +    if (memory_region_is_ram_device(mr)) {
> >>>>>>> +        memory_region_ref(mr);
> >>>>>>> +        return memory_region_get_ram_ptr(mr) + xlat;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
> >>>>>>> +                          DMA_DIRECTION_TO_DEVICE,
> >>>>>>> +                          MEMTXATTRS_UNSPECIFIED);
> >>>>>> This function should:
> >>>>>> - call memory_region_get_ram_ptr(mr)
> >>>>>>       if memory_region_is_ram(mr)
> >>>>>> - return NULL otherwise
> >>>>>>
> >>>>>> There are a few reasons. First, the documentation of
> >>>> dma_memory_map()
> >>>>>> tells to use it "only for reads OR writes - not for read-modify-write
> >>>>>> operations." It can be used for read-modify-write operations so
> >>>>>> dma_memory_map() should be avoided.
> >>>>> This patch series only deals with non-virgl use-cases where AFAICS
> >>>> resources
> >>>>> are not written to on the Host.
> >>>>>
> >>>>>> Second, it ensures that the mapped pointer is writable.
> >>>>>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> >>>> associated
> >>>>>> with VFIO devices" adds checks for memory_region_is_ram() and
> >>>>>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but the
> >>>> other
> >>>>>> callers also use the function to map writable pointers.
> >>>>> Unless I am missing something, I don't see where writable pointers are
> >>>> used
> >>>>> in non-virgl use-cases?
> >>>> Rutabaga uses too, but you are right about that 2D operations won't use
> >> it.
> >>>>
> >>>> That said, exposing non-writable memory to Virgl and Rutabaga lets the
> >>>> guest corrupt memory so should be avoided. On the other hand, it is
> >>>> unlikely that rejecting non-writable memory will cause any problem. You
> >>>> can also add another code path to use
> >>>> memory_region_supports_direct_access() instead of
> >>>> memory_region_is_ram()
> >>>> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
> >>>> virtio_gpu_init_dmabuf() if you want to keep non-writable memory
> >> working.
> >>> AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-rutabaga
> >> code.
> >>> And, this patch series and my use-case (GPU SRIOV) only needs to deal
> >> with
> >>> non-writeable memory because the rendering is already done by the
> >> Guest and
> >>> the Host only needs to display the Guest's FB.
> >>>
> >>> However, I see that virtio_gpu_create_mapping_iov() is used by
> >> virgl/rutabaga
> >>> code as well, so I am wondering how do things work right now given that
> >>> virtio_gpu_create_mapping_iov() always calls dma_memory_map()?
> >>> In other words, is there no problem currently with non-writeable memory
> >>> in virgl/rutabaga use-cases?
> >>
> >> The current code is problematic, and using memory_region_is_ram() will
> >> fix it.
> > Ok, I'll make the change.
> >
> >>
> >>>
> >>>>>> It also makes the check of memory_region_is_ram_device() and
> >>>>>> memory_region_is_ram() unnecessary for virtio_gpu_init_dmabuf(),
> >>>> reducing
> >>>>>> the overall complexity.
> >>>>> Since buffers reside completely in either ram or ram_device regions,
> >> using
> >>>> both
> >>>>> memory_region_is_ram_device() and memory_region_is_ram() to
> check
> >>>> where
> >>>>> they are located seems necessary and unavoidable.
> >>>> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
> >>>> function finds the memory is incompatible with udmabuf, it can call
> >>>> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
> >>> Yeah, what you suggest is doable but seems a bit convoluted to have to
> >>> first call virtio_gpu_create_udmabuf() and if it fails then call VFIO related
> >>> functions.
> >>>
> >>> I think using memory_region_is_ram_device() and
> >> memory_region_is_ram()
> >>> to identify the right memory region and calling either
> >> virtio_gpu_create_udmabuf()
> >>> or vfio_create_dmabuf() is much more intuitive and readable.
> >>
> >> memory_region_is_ram_device() and memory_region_is_ram() are not
> >> sufficient to identify the right memory region.
> >> memory_region_is_ram_device() returns true for RAM device created by
> >> non-VFIO devices, and memory_region_is_ram() returns true for memory
> >> regions created with memory_region_init_ram_ptr(), which is not backed
> >> with memfd.
> > Right, but structuring the code in the following way would address your
> concerns
> > and make it more robust:
> >          if (memory_region_is_ram_device(rb->mr) && (vdev =
> vfio_device_lookup(rb->mr))) {
> > 	vfio_create_dmabuf(vdev, res);
> >          } else if (memory_region_is_ram(rb->mr) &&
> virtio_gpu_have_udmabuf()) {
> > 	virtio_gpu_create_udmabuf(res);
> >          } else {
> > 	...
> >          }
> 
> One of the concerns I raised is that having such checks has an inherent
> hazard that they can be inconsistent with the actual implementations.
> 
> The original checks had such inconsistency, and the updated one still
> have too. memory_region_is_ram(rb->mr) && virtio_gpu_have_udmabuf()
> can
> be still true even for memory regions that do not have memfd; please
> refer to the example of memory_region_init_ram_ptr() I pointed out in
> the last email.
> 
> Even if you somehow managed to write checks that match with the
> implementations, it is still possible that a future change can break it.
> Letting the implementations check their prerequisite conditions
> completely prevents such an error by construction and makes the code
> more robust.
IIUC, your suggestion is to add a check for memory_region_supports_direct_access()
inside virtio_gpu_create_udmabuf() and call it unconditionally right?
And, also move the (memory_region_is_ram_device(rb->mr) && (vdev = vfio_device_lookup(rb->mr)))
check inside vfio_create_dmabuf() right?

Thanks,
Vivek
> 
> Another concern is complexity. Calling another function
> (virtio_gpu_have_udmabuf()) do not reduce complexity.
> 
> I do not understand why having such extra function calls and
> conditionals will make the code more robust.
> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-18  5:26                 ` Kasireddy, Vivek
@ 2025-11-18  5:37                   ` Akihiko Odaki
  2025-11-19  5:42                     ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-18  5:37 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/18 14:26, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>> associated with a ram device
>>
>> On 2025/11/17 13:14, Kasireddy, Vivek wrote:
>>> Hi Akihiko,
>>>
>>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>>>> associated with a ram device
>>>>
>>>>
>>>>
>>>> On 2025/11/13 12:39, Kasireddy, Vivek wrote:
>>>>> Hi Akihiko,
>>>>>
>>>>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>>>>>> associated with a ram device
>>>>>>
>>>>>> On 2025/11/12 13:30, Kasireddy, Vivek wrote:
>>>>>>> Hi Akihiko,
>>>>>>>
>>>>>>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA
>>>> addr
>>>>>>>> associated with a ram device
>>>>>>>>
>>>>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>>>>>>>> If the Guest provides a DMA address that is associated with a ram
>>>>>>>>> device (such as a PCI device region and not its system memory), then
>>>>>>>>> we can obtain the hva (host virtual address) by invoking
>>>>>>>>> address_space_translate() followed by
>>>> memory_region_get_ram_ptr().
>>>>>>>>>
>>>>>>>>> This is because the ram device's address space is not accessible to
>>>>>>>>> virtio-gpu directly and hence dma_memory_map() cannot be used.
>>>>>>>>> Therefore, we first need to identify the memory region associated
>>>> with
>>>>>>>>> the DMA address and figure out if it belongs to a ram device or not
>>>>>>>>> and decide how to obtain the host address accordingly.
>>>>>>>>>
>>>>>>>>> Note that we take a reference on the memory region if it belongs to a
>>>>>>>>> ram device but we would still call dma_memory_unmap() later (to
>>>> unref
>>>>>>>>> mr) regardless of how we obtained the hva.
>>>>>>>>>
>>>>>>>>> Cc: Marc-André Lureau<marcandre.lureau@redhat.com>
>>>>>>>>> Cc: Alex Bennée<alex.bennee@linaro.org>
>>>>>>>>> Cc: Akihiko Odaki<odaki@rsg.ci.i.u-tokyo.ac.jp>
>>>>>>>>> Cc: Dmitry Osipenko<dmitry.osipenko@collabora.com>
>>>>>>>>> Cc: Alex Williamson<alex.williamson@redhat.com>
>>>>>>>>> Cc: Cédric Le Goater<clg@redhat.com>
>>>>>>>>> Signed-off-by: Vivek Kasireddy<vivek.kasireddy@intel.com>
>>>>>>>>> ---
>>>>>>>>>       hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
>>>>>>>>>       1 file changed, 21 insertions(+), 3 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index
>>>>>>>>> 199b18c746..d352b5afd6 100644
>>>>>>>>> --- a/hw/display/virtio-gpu.c
>>>>>>>>> +++ b/hw/display/virtio-gpu.c
>>>>>>>>> @@ -798,6 +798,26 @@ static void
>>>>>> virtio_gpu_set_scanout_blob(VirtIOGPU
>>>>>>>> *g,
>>>>>>>>>                                     &fb, res, &ss.r, &cmd->error);
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
>>>>>>>>> +                                       struct virtio_gpu_ctrl_command *cmd,
>>>>>>>>> +                                       uint64_t a, hwaddr *len) {
>>>>>>>>> +    MemoryRegion *mr = NULL;
>>>>>>>>> +    hwaddr xlat;
>>>>>>>>> +
>>>>>>>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a,
>>>> &xlat,
>>>>>> len,
>>>>>>>>> +                                 DMA_DIRECTION_TO_DEVICE,
>>>>>>>>> +                                 MEMTXATTRS_UNSPECIFIED);
>>>>>>>>> +    if (memory_region_is_ram_device(mr)) {
>>>>>>>>> +        memory_region_ref(mr);
>>>>>>>>> +        return memory_region_get_ram_ptr(mr) + xlat;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
>>>>>>>>> +                          DMA_DIRECTION_TO_DEVICE,
>>>>>>>>> +                          MEMTXATTRS_UNSPECIFIED);
>>>>>>>> This function should:
>>>>>>>> - call memory_region_get_ram_ptr(mr)
>>>>>>>>        if memory_region_is_ram(mr)
>>>>>>>> - return NULL otherwise
>>>>>>>>
>>>>>>>> There are a few reasons. First, the documentation of
>>>>>> dma_memory_map()
>>>>>>>> tells to use it "only for reads OR writes - not for read-modify-write
>>>>>>>> operations." It can be used for read-modify-write operations so
>>>>>>>> dma_memory_map() should be avoided.
>>>>>>> This patch series only deals with non-virgl use-cases where AFAICS
>>>>>> resources
>>>>>>> are not written to on the Host.
>>>>>>>
>>>>>>>> Second, it ensures that the mapped pointer is writable.
>>>>>>>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>>>>>> associated
>>>>>>>> with VFIO devices" adds checks for memory_region_is_ram() and
>>>>>>>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but the
>>>>>> other
>>>>>>>> callers also use the function to map writable pointers.
>>>>>>> Unless I am missing something, I don't see where writable pointers are
>>>>>> used
>>>>>>> in non-virgl use-cases?
>>>>>> Rutabaga uses too, but you are right about that 2D operations won't use
>>>> it.
>>>>>>
>>>>>> That said, exposing non-writable memory to Virgl and Rutabaga lets the
>>>>>> guest corrupt memory so should be avoided. On the other hand, it is
>>>>>> unlikely that rejecting non-writable memory will cause any problem. You
>>>>>> can also add another code path to use
>>>>>> memory_region_supports_direct_access() instead of
>>>>>> memory_region_is_ram()
>>>>>> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
>>>>>> virtio_gpu_init_dmabuf() if you want to keep non-writable memory
>>>> working.
>>>>> AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-rutabaga
>>>> code.
>>>>> And, this patch series and my use-case (GPU SRIOV) only needs to deal
>>>> with
>>>>> non-writeable memory because the rendering is already done by the
>>>> Guest and
>>>>> the Host only needs to display the Guest's FB.
>>>>>
>>>>> However, I see that virtio_gpu_create_mapping_iov() is used by
>>>> virgl/rutabaga
>>>>> code as well, so I am wondering how do things work right now given that
>>>>> virtio_gpu_create_mapping_iov() always calls dma_memory_map()?
>>>>> In other words, is there no problem currently with non-writeable memory
>>>>> in virgl/rutabaga use-cases?
>>>>
>>>> The current code is problematic, and using memory_region_is_ram() will
>>>> fix it.
>>> Ok, I'll make the change.
>>>
>>>>
>>>>>
>>>>>>>> It also makes the check of memory_region_is_ram_device() and
>>>>>>>> memory_region_is_ram() unnecessary for virtio_gpu_init_dmabuf(),
>>>>>> reducing
>>>>>>>> the overall complexity.
>>>>>>> Since buffers reside completely in either ram or ram_device regions,
>>>> using
>>>>>> both
>>>>>>> memory_region_is_ram_device() and memory_region_is_ram() to
>> check
>>>>>> where
>>>>>>> they are located seems necessary and unavoidable.
>>>>>> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
>>>>>> function finds the memory is incompatible with udmabuf, it can call
>>>>>> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
>>>>> Yeah, what you suggest is doable but seems a bit convoluted to have to
>>>>> first call virtio_gpu_create_udmabuf() and if it fails then call VFIO related
>>>>> functions.
>>>>>
>>>>> I think using memory_region_is_ram_device() and
>>>> memory_region_is_ram()
>>>>> to identify the right memory region and calling either
>>>> virtio_gpu_create_udmabuf()
>>>>> or vfio_create_dmabuf() is much more intuitive and readable.
>>>>
>>>> memory_region_is_ram_device() and memory_region_is_ram() are not
>>>> sufficient to identify the right memory region.
>>>> memory_region_is_ram_device() returns true for RAM device created by
>>>> non-VFIO devices, and memory_region_is_ram() returns true for memory
>>>> regions created with memory_region_init_ram_ptr(), which is not backed
>>>> with memfd.
>>> Right, but structuring the code in the following way would address your
>> concerns
>>> and make it more robust:
>>>           if (memory_region_is_ram_device(rb->mr) && (vdev =
>> vfio_device_lookup(rb->mr))) {
>>> 	vfio_create_dmabuf(vdev, res);
>>>           } else if (memory_region_is_ram(rb->mr) &&
>> virtio_gpu_have_udmabuf()) {
>>> 	virtio_gpu_create_udmabuf(res);
>>>           } else {
>>> 	...
>>>           }
>>
>> One of the concerns I raised is that having such checks has an inherent
>> hazard that they can be inconsistent with the actual implementations.
>>
>> The original checks had such inconsistency, and the updated one still
>> have too. memory_region_is_ram(rb->mr) && virtio_gpu_have_udmabuf()
>> can
>> be still true even for memory regions that do not have memfd; please
>> refer to the example of memory_region_init_ram_ptr() I pointed out in
>> the last email.
>>
>> Even if you somehow managed to write checks that match with the
>> implementations, it is still possible that a future change can break it.
>> Letting the implementations check their prerequisite conditions
>> completely prevents such an error by construction and makes the code
>> more robust.
> IIUC, your suggestion is to add a check for memory_region_supports_direct_access()
> inside virtio_gpu_create_udmabuf() and call it unconditionally right?

No, my suggestion is to remove it at all. Creating udmabuf only requires 
that the memory regions are backed with memfd. 
memory_region_supports_direct_access() on the other hand tells if the 
host can access it directly by normal load and store instructions, which 
is irrelevant when creating udmabuf.

> And, also move the (memory_region_is_ram_device(rb->mr) && (vdev = vfio_device_lookup(rb->mr)))
> check inside vfio_create_dmabuf() right?

Moving vdev = vfio_device_lookup(rb->mr) into vfio_create_dmabuf() 
sounds good to me.
memory_region_is_ram_device(rb->mr) is again irrelevant; it tells if the 
memory region has a flag that prevents MMIO-incompatible operations, but 
we only care if it belongs to a VFIO device.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-18  5:22                 ` Kasireddy, Vivek
@ 2025-11-18  6:02                   ` Akihiko Odaki
  2025-11-19  5:33                     ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-18  6:02 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/18 14:22, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>> associated with VFIO devices
>>
>> On 2025/11/17 13:19, Kasireddy, Vivek wrote:
>>> Hi Akihiko,
>>>
>>>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>>>> associated with VFIO devices
>>>>
>>>> On 2025/11/13 12:17, Kasireddy, Vivek wrote:
>>>>> Hi Akihiko,
>>>>>
>>>>>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
>>>> blobs
>>>>>> associated with VFIO devices
>>>>>>
>>>>>> On 2025/11/12 13:26, Kasireddy, Vivek wrote:
>>>>>>> Hi Akihiko,
>>>>>>>
>>>>>>>> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for
>>>>>> blobs
>>>>>>>> associated with VFIO devices
>>>>>>>>
>>>>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>>>>>>>> In addition to memfd, a blob resource can also have its backing
>>>>>>>>> storage in a VFIO device region. Therefore, we first need to figure
>>>>>>>>> out if the blob is backed by a VFIO device region or a memfd before
>>>>>>>>> we can call the right API to get a dmabuf fd created.
>>>>>>>>>
>>>>>>>>> So, once we have the ramblock and the associated mr, we rely on
>>>>>>>>> memory_region_is_ram_device() to tell us where the backing storage
>>>>>>>>> is located. If the blob resource is VFIO backed, we try to find the
>>>>>>>>> right VFIO device that contains the blob and then invoke the API
>>>>>>>>> vfio_device_create_dmabuf().
>>>>>>>>>
>>>>>>>>> Note that in virtio_gpu_remap_udmabuf(), we first try to test if
>>>>>>>>> the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
>>>>>>>>> use the VFIO device fd directly to create the CPU mapping.
>>>>>>>>
>>>>>>>> I have just remembered that mremap() will work for either of udmabuf
>>>>>> and
>>>>>>>> VFIO. That will avoid having two different methods and make
>>>>>>>> vfio_get_region_index_from_mr() and vfio_device_get_region_info()
>>>>>>>> unnecessary.
>>>>>>> IIUC, the name virtio_gpu_remap_dmabuf() is misleading because we
>>>> are
>>>>>> not
>>>>>>> actually doing remap but are simply calling mmap(). In other words, we
>>>>>> are not
>>>>>>> expanding or shrinking existing mapping but are creating a new
>>>> mapping.
>>>>>>> And, for dmabufs associated with VFIO devices, without having to call
>>>>>>> vfio_get_region_index_from_mr() and vfio_device_get_region_info(), I
>>>>>> don't see
>>>>>>> any other way to determine the region offset.
>>>>>>>
>>>>>>> So, I guess I'll create a new patch to do s/remapped/map.
>>>>>>
>>>>>> I mean calling mremap() with 0 as the old_size parameter. The man page
>>>>>> says:
>>>>>>     > If the value of old_size is zero, and old_address refers to a
>>>>>>     > shareable mapping (see the description of MAP_SHARED in
>> mmap(2)),
>>>>>> then
>>>>>>     > mremap() will create a new mapping of the same pages.
>>>>> It might be possible to use mremap() here but using mmap() seems very
>>>>> straightforward given that we are actually not shrinking or expanding
>>>>> an existing mapping but are instead creating a new mapping. Also, I am
>>>>> wondering what benefit would mremap() bring as opposed to just using
>>>>> mmap()?
>>>>
>>>> As I noted earlier, mremap() removes the need of having two different
>>>> paths for udmabuf and VFIO, and make vfio_get_region_index_from_mr()
>>>> and
>>>> vfio_device_get_region_info() unnecessary, reducing code complexity.
>>> Sorry, I should have researched thoroughly before but after looking at the
>> code
>>> again, I don't see how mremap() removes the need for having two different
>>> paths for udmabuf and VFIO and make vfio_get_region_index_from_mr()
>>> and vfio_device_get_region_info() unnecessary. Could you please elaborate
>>> how it can be done?
>>
>> Not tested, but something like the following:
>>
>> head = qemu_ram_mmap(-1, res->blob_size, qemu_real_host_page_size(),
>>                        QEMU_MAP_READONLY | QEMU_MAP_SHARED, 0);
>> if (head == MAP_FAILED) {
>>       return NULL;
>> }
>>
>> cursor = head;
>>
>> for (i = 0; i < res->iov_cnt; i++) {
>>       if (mremap(res->iov[i].iov_base, 0, res->iov[i].iov_len,
>>                  MREMAP_FIXED, cursor) == MAP_FAILED) {
> This is very elegant and I can now see how it is expected to work. However,
> I went ahead and tested it and it does not seem to work for VFIO backed
> buffers. It works for buffers based out of System RAM though. Here is the
> actual code I tested with that I am unconditionally calling for both VFIO
> and udmabuf cases:
> static void *vfio_dmabuf_mmap2(struct virtio_gpu_simple_resource *res,
>                                 VFIODevice *vdev)
> {
>      void *head, *cursor;
>      int i;
> 
>      head = qemu_ram_mmap(-1, res->blob_size, qemu_real_host_page_size(),
>                           			QEMU_MAP_READONLY | QEMU_MAP_SHARED, 0);

By the way, please do:
head = mmap(NULL, res->blob_size, PROT_NONE, MAP_SHARED, -1, 0);

I forgot that we don't need to map a RAM but mmap() with PROT_NONE is 
sufficient. It will catch a bug that fails to mmap() a real resource on 
top of it.

>      if (head == MAP_FAILED) {
>          return head;
>      }
> 
>      cursor = head;
>      for (i = 0; i < res->iov_cnt; i++) {
>           if (mremap(res->iov[i].iov_base, 0, res->iov[i].iov_len,
>               MREMAP_FIXED | MREMAP_MAYMOVE, cursor) == MAP_FAILED) {
>               goto err;
>           }
>           cursor += res->iov[i].iov_len;
>      }
>      return head;
> err:
>      qemu_ram_munmap(-1, head, res->blob_size);
>      return MAP_FAILED;
> }
> 
> It (mremap) initially errored with -EINVAL in all cases but adding MREMAP_MAYMOVE
> fixed it for buffers based out of RAM but for VFIO backed buffers, it seems to be
> throwing -EFAULT/Bad Address error. I did not yet check why or where the kernel
> driver is returning this error from.

The man page says that EFAULT means:
 > Some address in the range old_address to old_address+old_size is an
 > invalid virtual memory address for this process. You can also get
 > EFAULT even if there exist mappings that cover the whole address space
 > requested, but those mappings are of different types.

None of this should be true so it should be a bug, though I'm not sure 
if it is a bug of QEMU, Linux, or the man page (i.e., the man page 
failed to mention another failure scenario). In any case it needs to be 
debugged.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices
  2025-11-18  6:02                   ` Akihiko Odaki
@ 2025-11-19  5:33                     ` Kasireddy, Vivek
  0 siblings, 0 replies; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-19  5:33 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> associated with VFIO devices
> >>>>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>>>>>>>> In addition to memfd, a blob resource can also have its backing
> >>>>>>>>> storage in a VFIO device region. Therefore, we first need to figure
> >>>>>>>>> out if the blob is backed by a VFIO device region or a memfd
> before
> >>>>>>>>> we can call the right API to get a dmabuf fd created.
> >>>>>>>>>
> >>>>>>>>> So, once we have the ramblock and the associated mr, we rely
> on
> >>>>>>>>> memory_region_is_ram_device() to tell us where the backing
> storage
> >>>>>>>>> is located. If the blob resource is VFIO backed, we try to find the
> >>>>>>>>> right VFIO device that contains the blob and then invoke the API
> >>>>>>>>> vfio_device_create_dmabuf().
> >>>>>>>>>
> >>>>>>>>> Note that in virtio_gpu_remap_udmabuf(), we first try to test if
> >>>>>>>>> the VFIO dmabuf exporter supports mmap or not. If it doesn't,
> we
> >>>>>>>>> use the VFIO device fd directly to create the CPU mapping.
> >>>>>>>>
> >>>>>>>> I have just remembered that mremap() will work for either of
> udmabuf
> >>>>>> and
> >>>>>>>> VFIO. That will avoid having two different methods and make
> >>>>>>>> vfio_get_region_index_from_mr() and
> vfio_device_get_region_info()
> >>>>>>>> unnecessary.
> >>>>>>> IIUC, the name virtio_gpu_remap_dmabuf() is misleading because
> we
> >>>> are
> >>>>>> not
> >>>>>>> actually doing remap but are simply calling mmap(). In other
> words, we
> >>>>>> are not
> >>>>>>> expanding or shrinking existing mapping but are creating a new
> >>>> mapping.
> >>>>>>> And, for dmabufs associated with VFIO devices, without having to
> call
> >>>>>>> vfio_get_region_index_from_mr() and
> vfio_device_get_region_info(), I
> >>>>>> don't see
> >>>>>>> any other way to determine the region offset.
> >>>>>>>
> >>>>>>> So, I guess I'll create a new patch to do s/remapped/map.
> >>>>>>
> >>>>>> I mean calling mremap() with 0 as the old_size parameter. The man
> page
> >>>>>> says:
> >>>>>>     > If the value of old_size is zero, and old_address refers to a
> >>>>>>     > shareable mapping (see the description of MAP_SHARED in
> >> mmap(2)),
> >>>>>> then
> >>>>>>     > mremap() will create a new mapping of the same pages.
> >>>>> It might be possible to use mremap() here but using mmap() seems
> very
> >>>>> straightforward given that we are actually not shrinking or expanding
> >>>>> an existing mapping but are instead creating a new mapping. Also, I
> am
> >>>>> wondering what benefit would mremap() bring as opposed to just
> using
> >>>>> mmap()?
> >>>>
> >>>> As I noted earlier, mremap() removes the need of having two different
> >>>> paths for udmabuf and VFIO, and make
> vfio_get_region_index_from_mr()
> >>>> and
> >>>> vfio_device_get_region_info() unnecessary, reducing code complexity.
> >>> Sorry, I should have researched thoroughly before but after looking at
> the
> >> code
> >>> again, I don't see how mremap() removes the need for having two
> different
> >>> paths for udmabuf and VFIO and make
> vfio_get_region_index_from_mr()
> >>> and vfio_device_get_region_info() unnecessary. Could you please
> elaborate
> >>> how it can be done?
> >>
> >> Not tested, but something like the following:
> >>
> >> head = qemu_ram_mmap(-1, res->blob_size,
> qemu_real_host_page_size(),
> >>                        QEMU_MAP_READONLY | QEMU_MAP_SHARED, 0);
> >> if (head == MAP_FAILED) {
> >>       return NULL;
> >> }
> >>
> >> cursor = head;
> >>
> >> for (i = 0; i < res->iov_cnt; i++) {
> >>       if (mremap(res->iov[i].iov_base, 0, res->iov[i].iov_len,
> >>                  MREMAP_FIXED, cursor) == MAP_FAILED) {
> > This is very elegant and I can now see how it is expected to work.
> However,
> > I went ahead and tested it and it does not seem to work for VFIO backed
> > buffers. It works for buffers based out of System RAM though. Here is the
> > actual code I tested with that I am unconditionally calling for both VFIO
> > and udmabuf cases:
> > static void *vfio_dmabuf_mmap2(struct virtio_gpu_simple_resource *res,
> >                                 VFIODevice *vdev)
> > {
> >      void *head, *cursor;
> >      int i;
> >
> >      head = qemu_ram_mmap(-1, res->blob_size,
> qemu_real_host_page_size(),
> >                           			QEMU_MAP_READONLY |
> QEMU_MAP_SHARED, 0);
> 
> By the way, please do:
> head = mmap(NULL, res->blob_size, PROT_NONE, MAP_SHARED, -1, 0);
> 
> I forgot that we don't need to map a RAM but mmap() with PROT_NONE is
> sufficient. It will catch a bug that fails to mmap() a real resource on
> top of it.
> 
> >      if (head == MAP_FAILED) {
> >          return head;
> >      }
> >
> >      cursor = head;
> >      for (i = 0; i < res->iov_cnt; i++) {
> >           if (mremap(res->iov[i].iov_base, 0, res->iov[i].iov_len,
> >               MREMAP_FIXED | MREMAP_MAYMOVE, cursor) == MAP_FAILED) {
> >               goto err;
> >           }
> >           cursor += res->iov[i].iov_len;
> >      }
> >      return head;
> > err:
> >      qemu_ram_munmap(-1, head, res->blob_size);
> >      return MAP_FAILED;
> > }
> >
> > It (mremap) initially errored with -EINVAL in all cases but adding
> MREMAP_MAYMOVE
> > fixed it for buffers based out of RAM but for VFIO backed buffers, it seems
> to be
> > throwing -EFAULT/Bad Address error. I did not yet check why or where the
> kernel
> > driver is returning this error from.
> 
> The man page says that EFAULT means:
>  > Some address in the range old_address to old_address+old_size is an
>  > invalid virtual memory address for this process. You can also get
>  > EFAULT even if there exist mappings that cover the whole address space
>  > requested, but those mappings are of different types.
> 
> None of this should be true so it should be a bug, though I'm not sure
> if it is a bug of QEMU, Linux, or the man page (i.e., the man page
> failed to mention another failure scenario). In any case it needs to be
> debugged.
I found that the check that fails due to which -EFAULT is returned is this:
https://elixir.bootlin.com/linux/v6.18-rc6/source/mm/mremap.c#L1736

And, removing the check makes it (mremap) work. So, it is not clear whether
mremap() is supported for VMAs that have VM_DONTEXPAND and VM_PFNMAP
flags set, like in the case of VFIO. I guess I'll send a patch removing this check
to linux-mm to discuss this issue with mm developers.

Thanks,
Vivek

> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-18  5:37                   ` Akihiko Odaki
@ 2025-11-19  5:42                     ` Kasireddy, Vivek
  2025-11-19  5:55                       ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-19  5:42 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> associated with a ram device
> >>>>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>>>>>>>> If the Guest provides a DMA address that is associated with a
> ram
> >>>>>>>>> device (such as a PCI device region and not its system memory),
> then
> >>>>>>>>> we can obtain the hva (host virtual address) by invoking
> >>>>>>>>> address_space_translate() followed by
> >>>> memory_region_get_ram_ptr().
> >>>>>>>>>
> >>>>>>>>> This is because the ram device's address space is not accessible
> to
> >>>>>>>>> virtio-gpu directly and hence dma_memory_map() cannot be
> used.
> >>>>>>>>> Therefore, we first need to identify the memory region
> associated
> >>>> with
> >>>>>>>>> the DMA address and figure out if it belongs to a ram device or
> not
> >>>>>>>>> and decide how to obtain the host address accordingly.
> >>>>>>>>>
> >>>>>>>>> Note that we take a reference on the memory region if it belongs
> to a
> >>>>>>>>> ram device but we would still call dma_memory_unmap() later
> (to
> >>>> unref
> >>>>>>>>> mr) regardless of how we obtained the hva.
> >>>>>>>>>
> >>>>>>>>> Cc: Marc-André Lureau<marcandre.lureau@redhat.com>
> >>>>>>>>> Cc: Alex Bennée<alex.bennee@linaro.org>
> >>>>>>>>> Cc: Akihiko Odaki<odaki@rsg.ci.i.u-tokyo.ac.jp>
> >>>>>>>>> Cc: Dmitry Osipenko<dmitry.osipenko@collabora.com>
> >>>>>>>>> Cc: Alex Williamson<alex.williamson@redhat.com>
> >>>>>>>>> Cc: Cédric Le Goater<clg@redhat.com>
> >>>>>>>>> Signed-off-by: Vivek Kasireddy<vivek.kasireddy@intel.com>
> >>>>>>>>> ---
> >>>>>>>>>       hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
> >>>>>>>>>       1 file changed, 21 insertions(+), 3 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> index
> >>>>>>>>> 199b18c746..d352b5afd6 100644
> >>>>>>>>> --- a/hw/display/virtio-gpu.c
> >>>>>>>>> +++ b/hw/display/virtio-gpu.c
> >>>>>>>>> @@ -798,6 +798,26 @@ static void
> >>>>>> virtio_gpu_set_scanout_blob(VirtIOGPU
> >>>>>>>> *g,
> >>>>>>>>>                                     &fb, res, &ss.r, &cmd->error);
> >>>>>>>>>       }
> >>>>>>>>>
> >>>>>>>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
> >>>>>>>>> +                                       struct virtio_gpu_ctrl_command *cmd,
> >>>>>>>>> +                                       uint64_t a, hwaddr *len) {
> >>>>>>>>> +    MemoryRegion *mr = NULL;
> >>>>>>>>> +    hwaddr xlat;
> >>>>>>>>> +
> >>>>>>>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a,
> >>>> &xlat,
> >>>>>> len,
> >>>>>>>>> +                                 DMA_DIRECTION_TO_DEVICE,
> >>>>>>>>> +                                 MEMTXATTRS_UNSPECIFIED);
> >>>>>>>>> +    if (memory_region_is_ram_device(mr)) {
> >>>>>>>>> +        memory_region_ref(mr);
> >>>>>>>>> +        return memory_region_get_ram_ptr(mr) + xlat;
> >>>>>>>>> +    }
> >>>>>>>>> +
> >>>>>>>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
> >>>>>>>>> +                          DMA_DIRECTION_TO_DEVICE,
> >>>>>>>>> +                          MEMTXATTRS_UNSPECIFIED);
> >>>>>>>> This function should:
> >>>>>>>> - call memory_region_get_ram_ptr(mr)
> >>>>>>>>        if memory_region_is_ram(mr)
> >>>>>>>> - return NULL otherwise
> >>>>>>>>
> >>>>>>>> There are a few reasons. First, the documentation of
> >>>>>> dma_memory_map()
> >>>>>>>> tells to use it "only for reads OR writes - not for read-modify-write
> >>>>>>>> operations." It can be used for read-modify-write operations so
> >>>>>>>> dma_memory_map() should be avoided.
> >>>>>>> This patch series only deals with non-virgl use-cases where AFAICS
> >>>>>> resources
> >>>>>>> are not written to on the Host.
> >>>>>>>
> >>>>>>>> Second, it ensures that the mapped pointer is writable.
> >>>>>>>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> >>>>>> associated
> >>>>>>>> with VFIO devices" adds checks for memory_region_is_ram() and
> >>>>>>>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but
> the
> >>>>>> other
> >>>>>>>> callers also use the function to map writable pointers.
> >>>>>>> Unless I am missing something, I don't see where writable pointers
> are
> >>>>>> used
> >>>>>>> in non-virgl use-cases?
> >>>>>> Rutabaga uses too, but you are right about that 2D operations won't
> use
> >>>> it.
> >>>>>>
> >>>>>> That said, exposing non-writable memory to Virgl and Rutabaga lets
> the
> >>>>>> guest corrupt memory so should be avoided. On the other hand, it is
> >>>>>> unlikely that rejecting non-writable memory will cause any problem.
> You
> >>>>>> can also add another code path to use
> >>>>>> memory_region_supports_direct_access() instead of
> >>>>>> memory_region_is_ram()
> >>>>>> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
> >>>>>> virtio_gpu_init_dmabuf() if you want to keep non-writable memory
> >>>> working.
> >>>>> AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-
> rutabaga
> >>>> code.
> >>>>> And, this patch series and my use-case (GPU SRIOV) only needs to
> deal
> >>>> with
> >>>>> non-writeable memory because the rendering is already done by the
> >>>> Guest and
> >>>>> the Host only needs to display the Guest's FB.
> >>>>>
> >>>>> However, I see that virtio_gpu_create_mapping_iov() is used by
> >>>> virgl/rutabaga
> >>>>> code as well, so I am wondering how do things work right now given
> that
> >>>>> virtio_gpu_create_mapping_iov() always calls dma_memory_map()?
> >>>>> In other words, is there no problem currently with non-writeable
> memory
> >>>>> in virgl/rutabaga use-cases?
> >>>>
> >>>> The current code is problematic, and using memory_region_is_ram()
> will
> >>>> fix it.
> >>> Ok, I'll make the change.
> >>>
> >>>>
> >>>>>
> >>>>>>>> It also makes the check of memory_region_is_ram_device() and
> >>>>>>>> memory_region_is_ram() unnecessary for
> virtio_gpu_init_dmabuf(),
> >>>>>> reducing
> >>>>>>>> the overall complexity.
> >>>>>>> Since buffers reside completely in either ram or ram_device
> regions,
> >>>> using
> >>>>>> both
> >>>>>>> memory_region_is_ram_device() and memory_region_is_ram() to
> >> check
> >>>>>> where
> >>>>>>> they are located seems necessary and unavoidable.
> >>>>>> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
> >>>>>> function finds the memory is incompatible with udmabuf, it can call
> >>>>>> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
> >>>>> Yeah, what you suggest is doable but seems a bit convoluted to have
> to
> >>>>> first call virtio_gpu_create_udmabuf() and if it fails then call VFIO
> related
> >>>>> functions.
> >>>>>
> >>>>> I think using memory_region_is_ram_device() and
> >>>> memory_region_is_ram()
> >>>>> to identify the right memory region and calling either
> >>>> virtio_gpu_create_udmabuf()
> >>>>> or vfio_create_dmabuf() is much more intuitive and readable.
> >>>>
> >>>> memory_region_is_ram_device() and memory_region_is_ram() are not
> >>>> sufficient to identify the right memory region.
> >>>> memory_region_is_ram_device() returns true for RAM device created
> by
> >>>> non-VFIO devices, and memory_region_is_ram() returns true for
> memory
> >>>> regions created with memory_region_init_ram_ptr(), which is not
> backed
> >>>> with memfd.
> >>> Right, but structuring the code in the following way would address your
> >> concerns
> >>> and make it more robust:
> >>>           if (memory_region_is_ram_device(rb->mr) && (vdev =
> >> vfio_device_lookup(rb->mr))) {
> >>> 	vfio_create_dmabuf(vdev, res);
> >>>           } else if (memory_region_is_ram(rb->mr) &&
> >> virtio_gpu_have_udmabuf()) {
> >>> 	virtio_gpu_create_udmabuf(res);
> >>>           } else {
> >>> 	...
> >>>           }
> >>
> >> One of the concerns I raised is that having such checks has an inherent
> >> hazard that they can be inconsistent with the actual implementations.
> >>
> >> The original checks had such inconsistency, and the updated one still
> >> have too. memory_region_is_ram(rb->mr) &&
> virtio_gpu_have_udmabuf()
> >> can
> >> be still true even for memory regions that do not have memfd; please
> >> refer to the example of memory_region_init_ram_ptr() I pointed out in
> >> the last email.
> >>
> >> Even if you somehow managed to write checks that match with the
> >> implementations, it is still possible that a future change can break it.
> >> Letting the implementations check their prerequisite conditions
> >> completely prevents such an error by construction and makes the code
> >> more robust.
> > IIUC, your suggestion is to add a check for
> memory_region_supports_direct_access()
> > inside virtio_gpu_create_udmabuf() and call it unconditionally right?
> 
> No, my suggestion is to remove it at all. Creating udmabuf only requires
> that the memory regions are backed with memfd.
If we unconditionally call virtio_gpu_create_udmabuf() for VFIO backed buffers,
I think it makes sense to return early without having to iterate over all the iov
entries to check their memory regions. So, I am thinking adding a
memory_region_supports_direct_access() or !memory_region_is_ram_device()
check would help with this.

Thanks,
Vivek 

> memory_region_supports_direct_access() on the other hand tells if the
> host can access it directly by normal load and store instructions, which
> is irrelevant when creating udmabuf.
> 
> > And, also move the (memory_region_is_ram_device(rb->mr) && (vdev =
> vfio_device_lookup(rb->mr)))
> > check inside vfio_create_dmabuf() right?
> 
> Moving vdev = vfio_device_lookup(rb->mr) into vfio_create_dmabuf()
> sounds good to me.
> memory_region_is_ram_device(rb->mr) is again irrelevant; it tells if the
> memory region has a flag that prevents MMIO-incompatible operations, but
> we only care if it belongs to a VFIO device.
> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-19  5:42                     ` Kasireddy, Vivek
@ 2025-11-19  5:55                       ` Akihiko Odaki
  2025-11-21  6:56                         ` Kasireddy, Vivek
  0 siblings, 1 reply; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-19  5:55 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/19 14:42, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>> associated with a ram device
>>>>>>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>>>>>>>>>> If the Guest provides a DMA address that is associated with a
>> ram
>>>>>>>>>>> device (such as a PCI device region and not its system memory),
>> then
>>>>>>>>>>> we can obtain the hva (host virtual address) by invoking
>>>>>>>>>>> address_space_translate() followed by
>>>>>> memory_region_get_ram_ptr().
>>>>>>>>>>>
>>>>>>>>>>> This is because the ram device's address space is not accessible
>> to
>>>>>>>>>>> virtio-gpu directly and hence dma_memory_map() cannot be
>> used.
>>>>>>>>>>> Therefore, we first need to identify the memory region
>> associated
>>>>>> with
>>>>>>>>>>> the DMA address and figure out if it belongs to a ram device or
>> not
>>>>>>>>>>> and decide how to obtain the host address accordingly.
>>>>>>>>>>>
>>>>>>>>>>> Note that we take a reference on the memory region if it belongs
>> to a
>>>>>>>>>>> ram device but we would still call dma_memory_unmap() later
>> (to
>>>>>> unref
>>>>>>>>>>> mr) regardless of how we obtained the hva.
>>>>>>>>>>>
>>>>>>>>>>> Cc: Marc-André Lureau<marcandre.lureau@redhat.com>
>>>>>>>>>>> Cc: Alex Bennée<alex.bennee@linaro.org>
>>>>>>>>>>> Cc: Akihiko Odaki<odaki@rsg.ci.i.u-tokyo.ac.jp>
>>>>>>>>>>> Cc: Dmitry Osipenko<dmitry.osipenko@collabora.com>
>>>>>>>>>>> Cc: Alex Williamson<alex.williamson@redhat.com>
>>>>>>>>>>> Cc: Cédric Le Goater<clg@redhat.com>
>>>>>>>>>>> Signed-off-by: Vivek Kasireddy<vivek.kasireddy@intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>>        hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
>>>>>>>>>>>        1 file changed, 21 insertions(+), 3 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
>> index
>>>>>>>>>>> 199b18c746..d352b5afd6 100644
>>>>>>>>>>> --- a/hw/display/virtio-gpu.c
>>>>>>>>>>> +++ b/hw/display/virtio-gpu.c
>>>>>>>>>>> @@ -798,6 +798,26 @@ static void
>>>>>>>> virtio_gpu_set_scanout_blob(VirtIOGPU
>>>>>>>>>> *g,
>>>>>>>>>>>                                      &fb, res, &ss.r, &cmd->error);
>>>>>>>>>>>        }
>>>>>>>>>>>
>>>>>>>>>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
>>>>>>>>>>> +                                       struct virtio_gpu_ctrl_command *cmd,
>>>>>>>>>>> +                                       uint64_t a, hwaddr *len) {
>>>>>>>>>>> +    MemoryRegion *mr = NULL;
>>>>>>>>>>> +    hwaddr xlat;
>>>>>>>>>>> +
>>>>>>>>>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as, a,
>>>>>> &xlat,
>>>>>>>> len,
>>>>>>>>>>> +                                 DMA_DIRECTION_TO_DEVICE,
>>>>>>>>>>> +                                 MEMTXATTRS_UNSPECIFIED);
>>>>>>>>>>> +    if (memory_region_is_ram_device(mr)) {
>>>>>>>>>>> +        memory_region_ref(mr);
>>>>>>>>>>> +        return memory_region_get_ram_ptr(mr) + xlat;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, len,
>>>>>>>>>>> +                          DMA_DIRECTION_TO_DEVICE,
>>>>>>>>>>> +                          MEMTXATTRS_UNSPECIFIED);
>>>>>>>>>> This function should:
>>>>>>>>>> - call memory_region_get_ram_ptr(mr)
>>>>>>>>>>         if memory_region_is_ram(mr)
>>>>>>>>>> - return NULL otherwise
>>>>>>>>>>
>>>>>>>>>> There are a few reasons. First, the documentation of
>>>>>>>> dma_memory_map()
>>>>>>>>>> tells to use it "only for reads OR writes - not for read-modify-write
>>>>>>>>>> operations." It can be used for read-modify-write operations so
>>>>>>>>>> dma_memory_map() should be avoided.
>>>>>>>>> This patch series only deals with non-virgl use-cases where AFAICS
>>>>>>>> resources
>>>>>>>>> are not written to on the Host.
>>>>>>>>>
>>>>>>>>>> Second, it ensures that the mapped pointer is writable.
>>>>>>>>>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>>>>>>>> associated
>>>>>>>>>> with VFIO devices" adds checks for memory_region_is_ram() and
>>>>>>>>>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(), but
>> the
>>>>>>>> other
>>>>>>>>>> callers also use the function to map writable pointers.
>>>>>>>>> Unless I am missing something, I don't see where writable pointers
>> are
>>>>>>>> used
>>>>>>>>> in non-virgl use-cases?
>>>>>>>> Rutabaga uses too, but you are right about that 2D operations won't
>> use
>>>>>> it.
>>>>>>>>
>>>>>>>> That said, exposing non-writable memory to Virgl and Rutabaga lets
>> the
>>>>>>>> guest corrupt memory so should be avoided. On the other hand, it is
>>>>>>>> unlikely that rejecting non-writable memory will cause any problem.
>> You
>>>>>>>> can also add another code path to use
>>>>>>>> memory_region_supports_direct_access() instead of
>>>>>>>> memory_region_is_ram()
>>>>>>>> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
>>>>>>>> virtio_gpu_init_dmabuf() if you want to keep non-writable memory
>>>>>> working.
>>>>>>> AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-
>> rutabaga
>>>>>> code.
>>>>>>> And, this patch series and my use-case (GPU SRIOV) only needs to
>> deal
>>>>>> with
>>>>>>> non-writeable memory because the rendering is already done by the
>>>>>> Guest and
>>>>>>> the Host only needs to display the Guest's FB.
>>>>>>>
>>>>>>> However, I see that virtio_gpu_create_mapping_iov() is used by
>>>>>> virgl/rutabaga
>>>>>>> code as well, so I am wondering how do things work right now given
>> that
>>>>>>> virtio_gpu_create_mapping_iov() always calls dma_memory_map()?
>>>>>>> In other words, is there no problem currently with non-writeable
>> memory
>>>>>>> in virgl/rutabaga use-cases?
>>>>>>
>>>>>> The current code is problematic, and using memory_region_is_ram()
>> will
>>>>>> fix it.
>>>>> Ok, I'll make the change.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>>> It also makes the check of memory_region_is_ram_device() and
>>>>>>>>>> memory_region_is_ram() unnecessary for
>> virtio_gpu_init_dmabuf(),
>>>>>>>> reducing
>>>>>>>>>> the overall complexity.
>>>>>>>>> Since buffers reside completely in either ram or ram_device
>> regions,
>>>>>> using
>>>>>>>> both
>>>>>>>>> memory_region_is_ram_device() and memory_region_is_ram() to
>>>> check
>>>>>>>> where
>>>>>>>>> they are located seems necessary and unavoidable.
>>>>>>>> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
>>>>>>>> function finds the memory is incompatible with udmabuf, it can call
>>>>>>>> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
>>>>>>> Yeah, what you suggest is doable but seems a bit convoluted to have
>> to
>>>>>>> first call virtio_gpu_create_udmabuf() and if it fails then call VFIO
>> related
>>>>>>> functions.
>>>>>>>
>>>>>>> I think using memory_region_is_ram_device() and
>>>>>> memory_region_is_ram()
>>>>>>> to identify the right memory region and calling either
>>>>>> virtio_gpu_create_udmabuf()
>>>>>>> or vfio_create_dmabuf() is much more intuitive and readable.
>>>>>>
>>>>>> memory_region_is_ram_device() and memory_region_is_ram() are not
>>>>>> sufficient to identify the right memory region.
>>>>>> memory_region_is_ram_device() returns true for RAM device created
>> by
>>>>>> non-VFIO devices, and memory_region_is_ram() returns true for
>> memory
>>>>>> regions created with memory_region_init_ram_ptr(), which is not
>> backed
>>>>>> with memfd.
>>>>> Right, but structuring the code in the following way would address your
>>>> concerns
>>>>> and make it more robust:
>>>>>            if (memory_region_is_ram_device(rb->mr) && (vdev =
>>>> vfio_device_lookup(rb->mr))) {
>>>>> 	vfio_create_dmabuf(vdev, res);
>>>>>            } else if (memory_region_is_ram(rb->mr) &&
>>>> virtio_gpu_have_udmabuf()) {
>>>>> 	virtio_gpu_create_udmabuf(res);
>>>>>            } else {
>>>>> 	...
>>>>>            }
>>>>
>>>> One of the concerns I raised is that having such checks has an inherent
>>>> hazard that they can be inconsistent with the actual implementations.
>>>>
>>>> The original checks had such inconsistency, and the updated one still
>>>> have too. memory_region_is_ram(rb->mr) &&
>> virtio_gpu_have_udmabuf()
>>>> can
>>>> be still true even for memory regions that do not have memfd; please
>>>> refer to the example of memory_region_init_ram_ptr() I pointed out in
>>>> the last email.
>>>>
>>>> Even if you somehow managed to write checks that match with the
>>>> implementations, it is still possible that a future change can break it.
>>>> Letting the implementations check their prerequisite conditions
>>>> completely prevents such an error by construction and makes the code
>>>> more robust.
>>> IIUC, your suggestion is to add a check for
>> memory_region_supports_direct_access()
>>> inside virtio_gpu_create_udmabuf() and call it unconditionally right?
>>
>> No, my suggestion is to remove it at all. Creating udmabuf only requires
>> that the memory regions are backed with memfd.
> If we unconditionally call virtio_gpu_create_udmabuf() for VFIO backed buffers,
> I think it makes sense to return early without having to iterate over all the iov
> entries to check their memory regions. So, I am thinking adding a
> memory_region_supports_direct_access() or !memory_region_is_ram_device()
> check would help with this.

They don't make a difference.

If the first memory region is backed by VFIO, the loop ends with the 
first iteration and you do not need to iterate over all the iov entries.

If the first memory region is backed by memfd but a succeeding memory 
regions is backed by VFIO, checking 
memory_region_supports_direct_access() or !memory_region_is_ram_device() 
only for the first memory region is ineffective.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-19  5:55                       ` Akihiko Odaki
@ 2025-11-21  6:56                         ` Kasireddy, Vivek
  2025-11-21  7:08                           ` Akihiko Odaki
  0 siblings, 1 reply; 45+ messages in thread
From: Kasireddy, Vivek @ 2025-11-21  6:56 UTC (permalink / raw)
  To: Akihiko Odaki, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

Hi Akihiko,

> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> associated with a ram device
> 
> On 2025/11/19 14:42, Kasireddy, Vivek wrote:
> > Hi Akihiko,
> >
> >> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
> >> associated with a ram device
> >>>>>>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
> >>>>>>>>>>> If the Guest provides a DMA address that is associated with a
> >> ram
> >>>>>>>>>>> device (such as a PCI device region and not its system memory),
> >> then
> >>>>>>>>>>> we can obtain the hva (host virtual address) by invoking
> >>>>>>>>>>> address_space_translate() followed by
> >>>>>> memory_region_get_ram_ptr().
> >>>>>>>>>>>
> >>>>>>>>>>> This is because the ram device's address space is not accessible
> >> to
> >>>>>>>>>>> virtio-gpu directly and hence dma_memory_map() cannot be
> >> used.
> >>>>>>>>>>> Therefore, we first need to identify the memory region
> >> associated
> >>>>>> with
> >>>>>>>>>>> the DMA address and figure out if it belongs to a ram device or
> >> not
> >>>>>>>>>>> and decide how to obtain the host address accordingly.
> >>>>>>>>>>>
> >>>>>>>>>>> Note that we take a reference on the memory region if it
> belongs
> >> to a
> >>>>>>>>>>> ram device but we would still call dma_memory_unmap() later
> >> (to
> >>>>>> unref
> >>>>>>>>>>> mr) regardless of how we obtained the hva.
> >>>>>>>>>>>
> >>>>>>>>>>> Cc: Marc-André Lureau<marcandre.lureau@redhat.com>
> >>>>>>>>>>> Cc: Alex Bennée<alex.bennee@linaro.org>
> >>>>>>>>>>> Cc: Akihiko Odaki<odaki@rsg.ci.i.u-tokyo.ac.jp>
> >>>>>>>>>>> Cc: Dmitry Osipenko<dmitry.osipenko@collabora.com>
> >>>>>>>>>>> Cc: Alex Williamson<alex.williamson@redhat.com>
> >>>>>>>>>>> Cc: Cédric Le Goater<clg@redhat.com>
> >>>>>>>>>>> Signed-off-by: Vivek Kasireddy<vivek.kasireddy@intel.com>
> >>>>>>>>>>> ---
> >>>>>>>>>>>        hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
> >>>>>>>>>>>        1 file changed, 21 insertions(+), 3 deletions(-)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> >> index
> >>>>>>>>>>> 199b18c746..d352b5afd6 100644
> >>>>>>>>>>> --- a/hw/display/virtio-gpu.c
> >>>>>>>>>>> +++ b/hw/display/virtio-gpu.c
> >>>>>>>>>>> @@ -798,6 +798,26 @@ static void
> >>>>>>>> virtio_gpu_set_scanout_blob(VirtIOGPU
> >>>>>>>>>> *g,
> >>>>>>>>>>>                                      &fb, res, &ss.r, &cmd->error);
> >>>>>>>>>>>        }
> >>>>>>>>>>>
> >>>>>>>>>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
> >>>>>>>>>>> +                                       struct virtio_gpu_ctrl_command *cmd,
> >>>>>>>>>>> +                                       uint64_t a, hwaddr *len) {
> >>>>>>>>>>> +    MemoryRegion *mr = NULL;
> >>>>>>>>>>> +    hwaddr xlat;
> >>>>>>>>>>> +
> >>>>>>>>>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as,
> a,
> >>>>>> &xlat,
> >>>>>>>> len,
> >>>>>>>>>>> +                                 DMA_DIRECTION_TO_DEVICE,
> >>>>>>>>>>> +                                 MEMTXATTRS_UNSPECIFIED);
> >>>>>>>>>>> +    if (memory_region_is_ram_device(mr)) {
> >>>>>>>>>>> +        memory_region_ref(mr);
> >>>>>>>>>>> +        return memory_region_get_ram_ptr(mr) + xlat;
> >>>>>>>>>>> +    }
> >>>>>>>>>>> +
> >>>>>>>>>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a,
> len,
> >>>>>>>>>>> +                          DMA_DIRECTION_TO_DEVICE,
> >>>>>>>>>>> +                          MEMTXATTRS_UNSPECIFIED);
> >>>>>>>>>> This function should:
> >>>>>>>>>> - call memory_region_get_ram_ptr(mr)
> >>>>>>>>>>         if memory_region_is_ram(mr)
> >>>>>>>>>> - return NULL otherwise
> >>>>>>>>>>
> >>>>>>>>>> There are a few reasons. First, the documentation of
> >>>>>>>> dma_memory_map()
> >>>>>>>>>> tells to use it "only for reads OR writes - not for read-modify-write
> >>>>>>>>>> operations." It can be used for read-modify-write operations so
> >>>>>>>>>> dma_memory_map() should be avoided.
> >>>>>>>>> This patch series only deals with non-virgl use-cases where AFAICS
> >>>>>>>> resources
> >>>>>>>>> are not written to on the Host.
> >>>>>>>>>
> >>>>>>>>>> Second, it ensures that the mapped pointer is writable.
> >>>>>>>>>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
> >>>>>>>> associated
> >>>>>>>>>> with VFIO devices" adds checks for memory_region_is_ram() and
> >>>>>>>>>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(),
> but
> >> the
> >>>>>>>> other
> >>>>>>>>>> callers also use the function to map writable pointers.
> >>>>>>>>> Unless I am missing something, I don't see where writable pointers
> >> are
> >>>>>>>> used
> >>>>>>>>> in non-virgl use-cases?
> >>>>>>>> Rutabaga uses too, but you are right about that 2D operations
> won't
> >> use
> >>>>>> it.
> >>>>>>>>
> >>>>>>>> That said, exposing non-writable memory to Virgl and Rutabaga lets
> >> the
> >>>>>>>> guest corrupt memory so should be avoided. On the other hand, it
> is
> >>>>>>>> unlikely that rejecting non-writable memory will cause any
> problem.
> >> You
> >>>>>>>> can also add another code path to use
> >>>>>>>> memory_region_supports_direct_access() instead of
> >>>>>>>> memory_region_is_ram()
> >>>>>>>> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
> >>>>>>>> virtio_gpu_init_dmabuf() if you want to keep non-writable memory
> >>>>>> working.
> >>>>>>> AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-
> >> rutabaga
> >>>>>> code.
> >>>>>>> And, this patch series and my use-case (GPU SRIOV) only needs to
> >> deal
> >>>>>> with
> >>>>>>> non-writeable memory because the rendering is already done by the
> >>>>>> Guest and
> >>>>>>> the Host only needs to display the Guest's FB.
> >>>>>>>
> >>>>>>> However, I see that virtio_gpu_create_mapping_iov() is used by
> >>>>>> virgl/rutabaga
> >>>>>>> code as well, so I am wondering how do things work right now given
> >> that
> >>>>>>> virtio_gpu_create_mapping_iov() always calls dma_memory_map()?
> >>>>>>> In other words, is there no problem currently with non-writeable
> >> memory
> >>>>>>> in virgl/rutabaga use-cases?
> >>>>>>
> >>>>>> The current code is problematic, and using memory_region_is_ram()
> >> will
> >>>>>> fix it.
> >>>>> Ok, I'll make the change.
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>>>> It also makes the check of memory_region_is_ram_device() and
> >>>>>>>>>> memory_region_is_ram() unnecessary for
> >> virtio_gpu_init_dmabuf(),
> >>>>>>>> reducing
> >>>>>>>>>> the overall complexity.
> >>>>>>>>> Since buffers reside completely in either ram or ram_device
> >> regions,
> >>>>>> using
> >>>>>>>> both
> >>>>>>>>> memory_region_is_ram_device() and memory_region_is_ram() to
> >>>> check
> >>>>>>>> where
> >>>>>>>>> they are located seems necessary and unavoidable.
> >>>>>>>> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
> >>>>>>>> function finds the memory is incompatible with udmabuf, it can call
> >>>>>>>> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
> >>>>>>> Yeah, what you suggest is doable but seems a bit convoluted to have
> >> to
> >>>>>>> first call virtio_gpu_create_udmabuf() and if it fails then call VFIO
> >> related
> >>>>>>> functions.
> >>>>>>>
> >>>>>>> I think using memory_region_is_ram_device() and
> >>>>>> memory_region_is_ram()
> >>>>>>> to identify the right memory region and calling either
> >>>>>> virtio_gpu_create_udmabuf()
> >>>>>>> or vfio_create_dmabuf() is much more intuitive and readable.
> >>>>>>
> >>>>>> memory_region_is_ram_device() and memory_region_is_ram() are
> not
> >>>>>> sufficient to identify the right memory region.
> >>>>>> memory_region_is_ram_device() returns true for RAM device created
> >> by
> >>>>>> non-VFIO devices, and memory_region_is_ram() returns true for
> >> memory
> >>>>>> regions created with memory_region_init_ram_ptr(), which is not
> >> backed
> >>>>>> with memfd.
> >>>>> Right, but structuring the code in the following way would address your
> >>>> concerns
> >>>>> and make it more robust:
> >>>>>            if (memory_region_is_ram_device(rb->mr) && (vdev =
> >>>> vfio_device_lookup(rb->mr))) {
> >>>>> 	vfio_create_dmabuf(vdev, res);
> >>>>>            } else if (memory_region_is_ram(rb->mr) &&
> >>>> virtio_gpu_have_udmabuf()) {
> >>>>> 	virtio_gpu_create_udmabuf(res);
> >>>>>            } else {
> >>>>> 	...
> >>>>>            }
> >>>>
> >>>> One of the concerns I raised is that having such checks has an inherent
> >>>> hazard that they can be inconsistent with the actual implementations.
> >>>>
> >>>> The original checks had such inconsistency, and the updated one still
> >>>> have too. memory_region_is_ram(rb->mr) &&
> >> virtio_gpu_have_udmabuf()
> >>>> can
> >>>> be still true even for memory regions that do not have memfd; please
> >>>> refer to the example of memory_region_init_ram_ptr() I pointed out in
> >>>> the last email.
> >>>>
> >>>> Even if you somehow managed to write checks that match with the
> >>>> implementations, it is still possible that a future change can break it.
> >>>> Letting the implementations check their prerequisite conditions
> >>>> completely prevents such an error by construction and makes the code
> >>>> more robust.
> >>> IIUC, your suggestion is to add a check for
> >> memory_region_supports_direct_access()
> >>> inside virtio_gpu_create_udmabuf() and call it unconditionally right?
> >>
> >> No, my suggestion is to remove it at all. Creating udmabuf only requires
> >> that the memory regions are backed with memfd.
> > If we unconditionally call virtio_gpu_create_udmabuf() for VFIO backed
> buffers,
> > I think it makes sense to return early without having to iterate over all the
> iov
> > entries to check their memory regions. So, I am thinking adding a
> > memory_region_supports_direct_access() or
> !memory_region_is_ram_device()
> > check would help with this.
> 
> They don't make a difference.
> 
> If the first memory region is backed by VFIO, the loop ends with the
> first iteration and you do not need to iterate over all the iov entries.
> 
> If the first memory region is backed by memfd but a succeeding memory
> regions is backed by VFIO, checking
> memory_region_supports_direct_access() or
> !memory_region_is_ram_device()
> only for the first memory region is ineffective.
Given that we have not yet encountered buffers with mixed memory regions
(memfd and VFIO) and since this is only a theoretical possibility at this point
(as we have no evidence of their existence), I think it is reasonable to assume
that the first entry's memory region is representative of the whole buffer.

In other words, if the first memory region is backed by memfd, then we have
to assume that memory regions associated with other entries are also backed
by memfd.

Thanks,
Vivek

> 
> Regards,
> Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device
  2025-11-21  6:56                         ` Kasireddy, Vivek
@ 2025-11-21  7:08                           ` Akihiko Odaki
  0 siblings, 0 replies; 45+ messages in thread
From: Akihiko Odaki @ 2025-11-21  7:08 UTC (permalink / raw)
  To: Kasireddy, Vivek, qemu-devel@nongnu.org
  Cc: Marc-André Lureau, Alex Bennée, Dmitry Osipenko,
	Alex Williamson, Cédric Le Goater

On 2025/11/21 15:56, Kasireddy, Vivek wrote:
> Hi Akihiko,
> 
>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>> associated with a ram device
>>
>> On 2025/11/19 14:42, Kasireddy, Vivek wrote:
>>> Hi Akihiko,
>>>
>>>> Subject: Re: [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr
>>>> associated with a ram device
>>>>>>>>>>>> On 2025/11/09 14:33, Vivek Kasireddy wrote:
>>>>>>>>>>>>> If the Guest provides a DMA address that is associated with a
>>>> ram
>>>>>>>>>>>>> device (such as a PCI device region and not its system memory),
>>>> then
>>>>>>>>>>>>> we can obtain the hva (host virtual address) by invoking
>>>>>>>>>>>>> address_space_translate() followed by
>>>>>>>> memory_region_get_ram_ptr().
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is because the ram device's address space is not accessible
>>>> to
>>>>>>>>>>>>> virtio-gpu directly and hence dma_memory_map() cannot be
>>>> used.
>>>>>>>>>>>>> Therefore, we first need to identify the memory region
>>>> associated
>>>>>>>> with
>>>>>>>>>>>>> the DMA address and figure out if it belongs to a ram device or
>>>> not
>>>>>>>>>>>>> and decide how to obtain the host address accordingly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Note that we take a reference on the memory region if it
>> belongs
>>>> to a
>>>>>>>>>>>>> ram device but we would still call dma_memory_unmap() later
>>>> (to
>>>>>>>> unref
>>>>>>>>>>>>> mr) regardless of how we obtained the hva.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cc: Marc-André Lureau<marcandre.lureau@redhat.com>
>>>>>>>>>>>>> Cc: Alex Bennée<alex.bennee@linaro.org>
>>>>>>>>>>>>> Cc: Akihiko Odaki<odaki@rsg.ci.i.u-tokyo.ac.jp>
>>>>>>>>>>>>> Cc: Dmitry Osipenko<dmitry.osipenko@collabora.com>
>>>>>>>>>>>>> Cc: Alex Williamson<alex.williamson@redhat.com>
>>>>>>>>>>>>> Cc: Cédric Le Goater<clg@redhat.com>
>>>>>>>>>>>>> Signed-off-by: Vivek Kasireddy<vivek.kasireddy@intel.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>         hw/display/virtio-gpu.c | 24 +++++++++++++++++++++---
>>>>>>>>>>>>>         1 file changed, 21 insertions(+), 3 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
>>>> index
>>>>>>>>>>>>> 199b18c746..d352b5afd6 100644
>>>>>>>>>>>>> --- a/hw/display/virtio-gpu.c
>>>>>>>>>>>>> +++ b/hw/display/virtio-gpu.c
>>>>>>>>>>>>> @@ -798,6 +798,26 @@ static void
>>>>>>>>>> virtio_gpu_set_scanout_blob(VirtIOGPU
>>>>>>>>>>>> *g,
>>>>>>>>>>>>>                                       &fb, res, &ss.r, &cmd->error);
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>
>>>>>>>>>>>>> +static void *virtio_gpu_dma_memory_map(VirtIOGPU *g,
>>>>>>>>>>>>> +                                       struct virtio_gpu_ctrl_command *cmd,
>>>>>>>>>>>>> +                                       uint64_t a, hwaddr *len) {
>>>>>>>>>>>>> +    MemoryRegion *mr = NULL;
>>>>>>>>>>>>> +    hwaddr xlat;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +    mr = address_space_translate(VIRTIO_DEVICE(g)->dma_as,
>> a,
>>>>>>>> &xlat,
>>>>>>>>>> len,
>>>>>>>>>>>>> +                                 DMA_DIRECTION_TO_DEVICE,
>>>>>>>>>>>>> +                                 MEMTXATTRS_UNSPECIFIED);
>>>>>>>>>>>>> +    if (memory_region_is_ram_device(mr)) {
>>>>>>>>>>>>> +        memory_region_ref(mr);
>>>>>>>>>>>>> +        return memory_region_get_ram_ptr(mr) + xlat;
>>>>>>>>>>>>> +    }
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +    return dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a,
>> len,
>>>>>>>>>>>>> +                          DMA_DIRECTION_TO_DEVICE,
>>>>>>>>>>>>> +                          MEMTXATTRS_UNSPECIFIED);
>>>>>>>>>>>> This function should:
>>>>>>>>>>>> - call memory_region_get_ram_ptr(mr)
>>>>>>>>>>>>          if memory_region_is_ram(mr)
>>>>>>>>>>>> - return NULL otherwise
>>>>>>>>>>>>
>>>>>>>>>>>> There are a few reasons. First, the documentation of
>>>>>>>>>> dma_memory_map()
>>>>>>>>>>>> tells to use it "only for reads OR writes - not for read-modify-write
>>>>>>>>>>>> operations." It can be used for read-modify-write operations so
>>>>>>>>>>>> dma_memory_map() should be avoided.
>>>>>>>>>>> This patch series only deals with non-virgl use-cases where AFAICS
>>>>>>>>>> resources
>>>>>>>>>>> are not written to on the Host.
>>>>>>>>>>>
>>>>>>>>>>>> Second, it ensures that the mapped pointer is writable.
>>>>>>>>>>>> "[PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs
>>>>>>>>>> associated
>>>>>>>>>>>> with VFIO devices" adds checks for memory_region_is_ram() and
>>>>>>>>>>>> memory_region_is_ram_device() to virtio_gpu_init_dmabuf(),
>> but
>>>> the
>>>>>>>>>> other
>>>>>>>>>>>> callers also use the function to map writable pointers.
>>>>>>>>>>> Unless I am missing something, I don't see where writable pointers
>>>> are
>>>>>>>>>> used
>>>>>>>>>>> in non-virgl use-cases?
>>>>>>>>>> Rutabaga uses too, but you are right about that 2D operations
>> won't
>>>> use
>>>>>>>> it.
>>>>>>>>>>
>>>>>>>>>> That said, exposing non-writable memory to Virgl and Rutabaga lets
>>>> the
>>>>>>>>>> guest corrupt memory so should be avoided. On the other hand, it
>> is
>>>>>>>>>> unlikely that rejecting non-writable memory will cause any
>> problem.
>>>> You
>>>>>>>>>> can also add another code path to use
>>>>>>>>>> memory_region_supports_direct_access() instead of
>>>>>>>>>> memory_region_is_ram()
>>>>>>>>>> for virtio-gpu for 2D and avoid calling memory_region_is_ram() in
>>>>>>>>>> virtio_gpu_init_dmabuf() if you want to keep non-writable memory
>>>>>>>> working.
>>>>>>>>> AFAICS, virtio_gpu_init_dmabuf() is only called in non-virgl/non-
>>>> rutabaga
>>>>>>>> code.
>>>>>>>>> And, this patch series and my use-case (GPU SRIOV) only needs to
>>>> deal
>>>>>>>> with
>>>>>>>>> non-writeable memory because the rendering is already done by the
>>>>>>>> Guest and
>>>>>>>>> the Host only needs to display the Guest's FB.
>>>>>>>>>
>>>>>>>>> However, I see that virtio_gpu_create_mapping_iov() is used by
>>>>>>>> virgl/rutabaga
>>>>>>>>> code as well, so I am wondering how do things work right now given
>>>> that
>>>>>>>>> virtio_gpu_create_mapping_iov() always calls dma_memory_map()?
>>>>>>>>> In other words, is there no problem currently with non-writeable
>>>> memory
>>>>>>>>> in virgl/rutabaga use-cases?
>>>>>>>>
>>>>>>>> The current code is problematic, and using memory_region_is_ram()
>>>> will
>>>>>>>> fix it.
>>>>>>> Ok, I'll make the change.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>> It also makes the check of memory_region_is_ram_device() and
>>>>>>>>>>>> memory_region_is_ram() unnecessary for
>>>> virtio_gpu_init_dmabuf(),
>>>>>>>>>> reducing
>>>>>>>>>>>> the overall complexity.
>>>>>>>>>>> Since buffers reside completely in either ram or ram_device
>>>> regions,
>>>>>>>> using
>>>>>>>>>> both
>>>>>>>>>>> memory_region_is_ram_device() and memory_region_is_ram() to
>>>>>> check
>>>>>>>>>> where
>>>>>>>>>>> they are located seems necessary and unavoidable.
>>>>>>>>>> It can unconditionally call virtio_gpu_create_udmabuf(), and if the
>>>>>>>>>> function finds the memory is incompatible with udmabuf, it can call
>>>>>>>>>> vfio_device_lookup() to tell if the memory belongs to VFIO or not.
>>>>>>>>> Yeah, what you suggest is doable but seems a bit convoluted to have
>>>> to
>>>>>>>>> first call virtio_gpu_create_udmabuf() and if it fails then call VFIO
>>>> related
>>>>>>>>> functions.
>>>>>>>>>
>>>>>>>>> I think using memory_region_is_ram_device() and
>>>>>>>> memory_region_is_ram()
>>>>>>>>> to identify the right memory region and calling either
>>>>>>>> virtio_gpu_create_udmabuf()
>>>>>>>>> or vfio_create_dmabuf() is much more intuitive and readable.
>>>>>>>>
>>>>>>>> memory_region_is_ram_device() and memory_region_is_ram() are
>> not
>>>>>>>> sufficient to identify the right memory region.
>>>>>>>> memory_region_is_ram_device() returns true for RAM device created
>>>> by
>>>>>>>> non-VFIO devices, and memory_region_is_ram() returns true for
>>>> memory
>>>>>>>> regions created with memory_region_init_ram_ptr(), which is not
>>>> backed
>>>>>>>> with memfd.
>>>>>>> Right, but structuring the code in the following way would address your
>>>>>> concerns
>>>>>>> and make it more robust:
>>>>>>>             if (memory_region_is_ram_device(rb->mr) && (vdev =
>>>>>> vfio_device_lookup(rb->mr))) {
>>>>>>> 	vfio_create_dmabuf(vdev, res);
>>>>>>>             } else if (memory_region_is_ram(rb->mr) &&
>>>>>> virtio_gpu_have_udmabuf()) {
>>>>>>> 	virtio_gpu_create_udmabuf(res);
>>>>>>>             } else {
>>>>>>> 	...
>>>>>>>             }
>>>>>>
>>>>>> One of the concerns I raised is that having such checks has an inherent
>>>>>> hazard that they can be inconsistent with the actual implementations.
>>>>>>
>>>>>> The original checks had such inconsistency, and the updated one still
>>>>>> have too. memory_region_is_ram(rb->mr) &&
>>>> virtio_gpu_have_udmabuf()
>>>>>> can
>>>>>> be still true even for memory regions that do not have memfd; please
>>>>>> refer to the example of memory_region_init_ram_ptr() I pointed out in
>>>>>> the last email.
>>>>>>
>>>>>> Even if you somehow managed to write checks that match with the
>>>>>> implementations, it is still possible that a future change can break it.
>>>>>> Letting the implementations check their prerequisite conditions
>>>>>> completely prevents such an error by construction and makes the code
>>>>>> more robust.
>>>>> IIUC, your suggestion is to add a check for
>>>> memory_region_supports_direct_access()
>>>>> inside virtio_gpu_create_udmabuf() and call it unconditionally right?
>>>>
>>>> No, my suggestion is to remove it at all. Creating udmabuf only requires
>>>> that the memory regions are backed with memfd.
>>> If we unconditionally call virtio_gpu_create_udmabuf() for VFIO backed
>> buffers,
>>> I think it makes sense to return early without having to iterate over all the
>> iov
>>> entries to check their memory regions. So, I am thinking adding a
>>> memory_region_supports_direct_access() or
>> !memory_region_is_ram_device()
>>> check would help with this.
>>
>> They don't make a difference.
>>
>> If the first memory region is backed by VFIO, the loop ends with the
>> first iteration and you do not need to iterate over all the iov entries.
>>
>> If the first memory region is backed by memfd but a succeeding memory
>> regions is backed by VFIO, checking
>> memory_region_supports_direct_access() or
>> !memory_region_is_ram_device()
>> only for the first memory region is ineffective.
> Given that we have not yet encountered buffers with mixed memory regions
> (memfd and VFIO) and since this is only a theoretical possibility at this point
> (as we have no evidence of their existence), I think it is reasonable to assume
> that the first entry's memory region is representative of the whole buffer.

Input validation is mandated by docs/devel/secure-coding-practices.rst.

> 
> In other words, if the first memory region is backed by memfd, then we have
> to assume that memory regions associated with other entries are also backed
> by memfd.

Again, memory_region_supports_direct_access() and 
!memory_region_is_ram_device() do *not* tell if it is backed by memfd.

And memory_region_supports_direct_access() and 
!memory_region_is_ram_device() still don't make difference even when 
assuming that all elements point to one memory region and they tell if 
it is backed by memfd.

If the memory region is backed by memfd, the checks of 
memory_region_supports_direct_access() and 
!memory_region_is_ram_device() passes so there will be no change by 
having them in virtio_gpu_create_udmabuf().

If the memory region is not backed by memfd, the loop in 
virtio_gpu_create_udmabuf() ends with the first iteration, so there is 
no real difference.

Regards,
Akihiko Odaki


^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2025-11-21  7:41 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-09  5:33 [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Vivek Kasireddy
2025-11-09  5:33 ` [PATCH v2 01/10] virtio-gpu: Recreate the resource's dmabuf if new backing is attached Vivek Kasireddy
2025-11-10  4:59   ` Akihiko Odaki
2025-11-09  5:33 ` [PATCH v2 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device Vivek Kasireddy
2025-11-10  5:00   ` Akihiko Odaki
2025-11-11  4:45   ` Akihiko Odaki
2025-11-12  4:30     ` Kasireddy, Vivek
2025-11-12  8:14       ` Akihiko Odaki
2025-11-13  3:39         ` Kasireddy, Vivek
2025-11-13  4:08           ` Akihiko Odaki
2025-11-17  4:14             ` Kasireddy, Vivek
2025-11-17  6:56               ` Akihiko Odaki
2025-11-18  5:26                 ` Kasireddy, Vivek
2025-11-18  5:37                   ` Akihiko Odaki
2025-11-19  5:42                     ` Kasireddy, Vivek
2025-11-19  5:55                       ` Akihiko Odaki
2025-11-21  6:56                         ` Kasireddy, Vivek
2025-11-21  7:08                           ` Akihiko Odaki
2025-11-09  5:33 ` [PATCH v2 03/10] vfio: Document vfio_device_get_region_info() Vivek Kasireddy
2025-11-09  5:33 ` [PATCH v2 04/10] vfio/region: Add a helper to get region index from memory region Vivek Kasireddy
2025-11-09  5:33 ` [PATCH v2 05/10] vfio/device: Add a helper to lookup VFIODevice " Vivek Kasireddy
2025-11-09  5:33 ` [PATCH v2 06/10] linux-headers: Update vfio.h to include VFIO_DEVICE_FEATURE_DMA_BUF Vivek Kasireddy
2025-11-09  6:49   ` Leon Romanovsky
2025-11-17  9:02     ` Cédric Le Goater
2025-11-09  5:33 ` [PATCH v2 07/10] vfio/device: Add support for VFIO_DEVICE_FEATURE_DMA_BUF Vivek Kasireddy
2025-11-09  5:33 ` [PATCH v2 08/10] virtio-gpu: Rename udmabuf files and helpers to dmabuf Vivek Kasireddy
2025-11-10  5:00   ` Akihiko Odaki
2025-11-09  5:33 ` [PATCH v2 09/10] virtio-gpu-dmabuf: Introduce qemu_iovec_same_memory_regions() Vivek Kasireddy
2025-11-10  5:19   ` Akihiko Odaki
2025-11-12  4:24     ` Kasireddy, Vivek
2025-11-12  8:21       ` Akihiko Odaki
2025-11-13  3:14         ` Kasireddy, Vivek
2025-11-13  3:28           ` Akihiko Odaki
2025-11-09  5:33 ` [PATCH v2 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices Vivek Kasireddy
2025-11-10  4:55   ` Akihiko Odaki
2025-11-12  4:26     ` Kasireddy, Vivek
2025-11-12  8:17       ` Akihiko Odaki
2025-11-13  3:17         ` Kasireddy, Vivek
2025-11-13  3:31           ` Akihiko Odaki
2025-11-17  4:19             ` Kasireddy, Vivek
2025-11-17  7:13               ` Akihiko Odaki
2025-11-18  5:22                 ` Kasireddy, Vivek
2025-11-18  6:02                   ` Akihiko Odaki
2025-11-19  5:33                     ` Kasireddy, Vivek
2025-11-17  9:04 ` [PATCH v2 00/10] vfio: Implement VFIO_DEVICE_FEATURE_DMA_BUF and use it in virtio-gpu Cédric Le Goater

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).