qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support
@ 2025-05-30  9:35 Zhenzhong Duan
  2025-05-30  9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Zhenzhong Duan @ 2025-05-30  9:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, jasowang, peterx, ddutile,
	jgg, nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Hi,

The first 6 patches of [1] are all VFIO or IOMMUFD related additions.
Split them out per Cédric and seek for quick acceptance.

I didn't copy changelog from [1] as it's a mix of the whole nesting series.

For who want a quick view of the whole nesting series [2].

Test done:
- VFIO devices hotplug/unplug
- build test on Windows

[1] https://lists.gnu.org/archive/html/qemu-devel/2025-05/msg05002.html
[2] https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting.v1.wip

Thanks
Zhenzhong

Changelog:
v2:
- report kernel BUG as error instead of assert (Cédric)
- merge patch2 and patch3 (Cédric)
- handle vendor cap check directly from vtd_check_hiod, so patch6 removed (Cédric)
- s/data_ptr/data (Cédric)
- s/totally/total (Donald)

v1:
- changed to save raw data in VendorCaps, so we can keep all vendor structure
  decoding inside the backend and VFIO wouldn't need to care about types nor
  what's inside the data.


Zhenzhong Duan (4):
  backends/iommufd: Add a helper to invalidate user-managed HWPT
  vfio/iommufd: Add properties and handlers to
    TYPE_HOST_IOMMU_DEVICE_IOMMUFD
  vfio/iommufd: Implement [at|de]tach_hwpt handlers
  vfio/iommufd: Save vendor specific device info

 include/system/host_iommu_device.h | 11 ++++++
 include/system/iommufd.h           | 54 ++++++++++++++++++++++++++++
 backends/iommufd.c                 | 58 ++++++++++++++++++++++++++++++
 hw/vfio/iommufd.c                  | 36 ++++++++++++++++---
 backends/trace-events              |  1 +
 5 files changed, 155 insertions(+), 5 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
  2025-05-30  9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
@ 2025-05-30  9:35 ` Zhenzhong Duan
  2025-06-01 13:57   ` Cédric Le Goater
  2025-06-03 12:21   ` Eric Auger
  2025-05-30  9:35 ` [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD Zhenzhong Duan
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Zhenzhong Duan @ 2025-05-30  9:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, jasowang, peterx, ddutile,
	jgg, nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

This helper passes cache invalidation request from guest to invalidate
stage-1 page table cache in host hardware.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/system/iommufd.h |  4 ++++
 backends/iommufd.c       | 36 ++++++++++++++++++++++++++++++++++++
 backends/trace-events    |  1 +
 3 files changed, 41 insertions(+)

diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index cbab75bfbf..83ab8e1e4c 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -61,6 +61,10 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
                                       uint64_t iova, ram_addr_t size,
                                       uint64_t page_size, uint64_t *data,
                                       Error **errp);
+bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
+                                      uint32_t data_type, uint32_t entry_len,
+                                      uint32_t *entry_num, void *data,
+                                      Error **errp);
 
 #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
 #endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index b73f75cd0b..8bcdb60fe7 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -311,6 +311,42 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
     return true;
 }
 
+bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
+                                      uint32_t data_type, uint32_t entry_len,
+                                      uint32_t *entry_num, void *data,
+                                      Error **errp)
+{
+    int ret, fd = be->fd;
+    uint32_t total_entries = *entry_num;
+    struct iommu_hwpt_invalidate cache = {
+        .size = sizeof(cache),
+        .hwpt_id = id,
+        .data_type = data_type,
+        .entry_len = entry_len,
+        .entry_num = total_entries,
+        .data_uptr = (uintptr_t)data,
+    };
+
+    ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
+    trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
+                                           total_entries, cache.entry_num,
+                                           (uintptr_t)data, ret ? errno : 0);
+    *entry_num = cache.entry_num;
+
+    if (ret) {
+        error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
+                         " total %d entries, processed %d entries",
+                         total_entries, cache.entry_num);
+    } else if (total_entries != cache.entry_num) {
+        error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with unprocessed"
+                         " entries: total %d entries, processed %d entries."
+                         " Kernel BUG?!", total_entries, cache.entry_num);
+        return false;
+    }
+
+    return !ret;
+}
+
 static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
 {
     HostIOMMUDeviceCaps *caps = &hiod->caps;
diff --git a/backends/trace-events b/backends/trace-events
index 40811a3162..7278214ea5 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_
 iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
 iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
 iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
+iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
  2025-05-30  9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
  2025-05-30  9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
@ 2025-05-30  9:35 ` Zhenzhong Duan
  2025-05-30 21:00   ` Nicolin Chen
  2025-06-03 12:27   ` Eric Auger
  2025-05-30  9:35 ` [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers Zhenzhong Duan
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Zhenzhong Duan @ 2025-05-30  9:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, jasowang, peterx, ddutile,
	jgg, nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
to the iommufd BE + 2 new class functions.

IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
re-attach IOMMUFD backed device to its default VFIO sub-system created
hwpt, i.e., when vIOMMU is disabled by guest. These properties are
initialized in hiod::realize() after attachment.

2 new class functions are [at|de]tach_hwpt(). They are used to
attach/detach hwpt. VFIO and VDPA can have different implementions,
so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
e.g., in HostIOMMUDeviceIOMMUFDVFIO.

Add two wrappers host_iommu_device_iommufd_[at|de]tach_hwpt to wrap the
two functions.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
 include/system/iommufd.h | 50 ++++++++++++++++++++++++++++++++++++++++
 backends/iommufd.c       | 22 ++++++++++++++++++
 hw/vfio/iommufd.c        |  6 +++++
 3 files changed, 78 insertions(+)

diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index 83ab8e1e4c..283861b924 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -67,4 +67,54 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
                                       Error **errp);
 
 #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
+OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD, HostIOMMUDeviceIOMMUFDClass,
+                    HOST_IOMMU_DEVICE_IOMMUFD)
+
+/* Overload of the host IOMMU device for the iommufd backend */
+struct HostIOMMUDeviceIOMMUFD {
+    HostIOMMUDevice parent_obj;
+
+    IOMMUFDBackend *iommufd;
+    uint32_t devid;
+    uint32_t hwpt_id;
+};
+
+struct HostIOMMUDeviceIOMMUFDClass {
+    HostIOMMUDeviceClass parent_class;
+
+    /**
+     * @attach_hwpt: attach host IOMMU device to IOMMUFD hardware page table.
+     * VFIO and VDPA device can have different implementation.
+     *
+     * Mandatory callback.
+     *
+     * @idev: host IOMMU device backed by IOMMUFD backend.
+     *
+     * @hwpt_id: ID of IOMMUFD hardware page table.
+     *
+     * @errp: pass an Error out when attachment fails.
+     *
+     * Returns: true on success, false on failure.
+     */
+    bool (*attach_hwpt)(HostIOMMUDeviceIOMMUFD *idev, uint32_t hwpt_id,
+                        Error **errp);
+    /**
+     * @detach_hwpt: detach host IOMMU device from IOMMUFD hardware page table.
+     * VFIO and VDPA device can have different implementation.
+     *
+     * Mandatory callback.
+     *
+     * @idev: host IOMMU device backed by IOMMUFD backend.
+     *
+     * @errp: pass an Error out when attachment fails.
+     *
+     * Returns: true on success, false on failure.
+     */
+    bool (*detach_hwpt)(HostIOMMUDeviceIOMMUFD *idev, Error **errp);
+};
+
+bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+                                           uint32_t hwpt_id, Error **errp);
+bool host_iommu_device_iommufd_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+                                           Error **errp);
 #endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 8bcdb60fe7..c2c47abf7e 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -347,6 +347,26 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
     return !ret;
 }
 
+bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+                                           uint32_t hwpt_id, Error **errp)
+{
+    HostIOMMUDeviceIOMMUFDClass *idevc =
+        HOST_IOMMU_DEVICE_IOMMUFD_GET_CLASS(idev);
+
+    g_assert(idevc->attach_hwpt);
+    return idevc->attach_hwpt(idev, hwpt_id, errp);
+}
+
+bool host_iommu_device_iommufd_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+                                           Error **errp)
+{
+    HostIOMMUDeviceIOMMUFDClass *idevc =
+        HOST_IOMMU_DEVICE_IOMMUFD_GET_CLASS(idev);
+
+    g_assert(idevc->detach_hwpt);
+    return idevc->detach_hwpt(idev, errp);
+}
+
 static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
 {
     HostIOMMUDeviceCaps *caps = &hiod->caps;
@@ -385,6 +405,8 @@ static const TypeInfo types[] = {
     }, {
         .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
         .parent = TYPE_HOST_IOMMU_DEVICE,
+        .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
+        .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
         .class_init = hiod_iommufd_class_init,
         .abstract = true,
     }
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index af1c7ab10a..5fde2b633a 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -814,6 +814,7 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
                                       Error **errp)
 {
     VFIODevice *vdev = opaque;
+    HostIOMMUDeviceIOMMUFD *idev;
     HostIOMMUDeviceCaps *caps = &hiod->caps;
     enum iommu_hw_info_type type;
     union {
@@ -833,6 +834,11 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
     caps->type = type;
     caps->hw_caps = hw_caps;
 
+    idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
+    idev->iommufd = vdev->iommufd;
+    idev->devid = vdev->devid;
+    idev->hwpt_id = vdev->hwpt->hwpt_id;
+
     return true;
 }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
  2025-05-30  9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
  2025-05-30  9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
  2025-05-30  9:35 ` [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD Zhenzhong Duan
@ 2025-05-30  9:35 ` Zhenzhong Duan
  2025-05-30 20:31   ` Nicolin Chen via
  2025-06-03 17:38   ` Eric Auger
  2025-05-30  9:35 ` [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info Zhenzhong Duan
  2025-06-01 18:04 ` [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Cédric Le Goater
  4 siblings, 2 replies; 20+ messages in thread
From: Zhenzhong Duan @ 2025-05-30  9:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, jasowang, peterx, ddutile,
	jgg, nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
utilizes them to attach to or detach from hwpt on host side.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
 hw/vfio/iommufd.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 5fde2b633a..d661737c17 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -810,6 +810,24 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, const void *data)
     vioc->query_dirty_bitmap = iommufd_query_dirty_bitmap;
 };
 
+static bool
+host_iommu_device_iommufd_vfio_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+                                           uint32_t hwpt_id, Error **errp)
+{
+    VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
+
+    return !iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt_id, errp);
+}
+
+static bool
+host_iommu_device_iommufd_vfio_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+                                           Error **errp)
+{
+    VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
+
+    return iommufd_cdev_detach_ioas_hwpt(vbasedev, errp);
+}
+
 static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
                                       Error **errp)
 {
@@ -864,10 +882,14 @@ hiod_iommufd_vfio_get_page_size_mask(HostIOMMUDevice *hiod)
 static void hiod_iommufd_vfio_class_init(ObjectClass *oc, const void *data)
 {
     HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
+    HostIOMMUDeviceIOMMUFDClass *idevc = HOST_IOMMU_DEVICE_IOMMUFD_CLASS(oc);
 
     hiodc->realize = hiod_iommufd_vfio_realize;
     hiodc->get_iova_ranges = hiod_iommufd_vfio_get_iova_ranges;
     hiodc->get_page_size_mask = hiod_iommufd_vfio_get_page_size_mask;
+
+    idevc->attach_hwpt = host_iommu_device_iommufd_vfio_attach_hwpt;
+    idevc->detach_hwpt = host_iommu_device_iommufd_vfio_detach_hwpt;
 };
 
 static const TypeInfo types[] = {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
  2025-05-30  9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
                   ` (2 preceding siblings ...)
  2025-05-30  9:35 ` [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers Zhenzhong Duan
@ 2025-05-30  9:35 ` Zhenzhong Duan
  2025-05-30 21:05   ` Nicolin Chen
  2025-06-03 12:40   ` Eric Auger
  2025-06-01 18:04 ` [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Cédric Le Goater
  4 siblings, 2 replies; 20+ messages in thread
From: Zhenzhong Duan @ 2025-05-30  9:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, jasowang, peterx, ddutile,
	jgg, nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng,
	Zhenzhong Duan

Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
specific. Save them as raw data in a union supporting different vendors,
then vendor IOMMU can query the raw data with its fixed format for
capability directly.

Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
capability related structures with CONFIG_LINUX.

Suggested-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/system/host_iommu_device.h | 11 +++++++++++
 hw/vfio/iommufd.c                  |  8 +++-----
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/include/system/host_iommu_device.h b/include/system/host_iommu_device.h
index 809cced4ba..10fccc10be 100644
--- a/include/system/host_iommu_device.h
+++ b/include/system/host_iommu_device.h
@@ -14,6 +14,13 @@
 
 #include "qom/object.h"
 #include "qapi/error.h"
+#ifdef CONFIG_LINUX
+#include "linux/iommufd.h"
+
+typedef union VendorCaps {
+    struct iommu_hw_info_vtd vtd;
+    struct iommu_hw_info_arm_smmuv3 smmuv3;
+} VendorCaps;
 
 /**
  * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
@@ -26,7 +33,9 @@
 typedef struct HostIOMMUDeviceCaps {
     uint32_t type;
     uint64_t hw_caps;
+    VendorCaps vendor_caps;
 } HostIOMMUDeviceCaps;
+#endif
 
 #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
 OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
@@ -38,7 +47,9 @@ struct HostIOMMUDevice {
     void *agent; /* pointer to agent device, ie. VFIO or VDPA device */
     PCIBus *aliased_bus;
     int aliased_devfn;
+#ifdef CONFIG_LINUX
     HostIOMMUDeviceCaps caps;
+#endif
 };
 
 /**
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index d661737c17..fbf47cab09 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -834,16 +834,14 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
     VFIODevice *vdev = opaque;
     HostIOMMUDeviceIOMMUFD *idev;
     HostIOMMUDeviceCaps *caps = &hiod->caps;
+    VendorCaps *vendor_caps = &caps->vendor_caps;
     enum iommu_hw_info_type type;
-    union {
-        struct iommu_hw_info_vtd vtd;
-    } data;
     uint64_t hw_caps;
 
     hiod->agent = opaque;
 
-    if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
-                                         &type, &data, sizeof(data),
+    if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid, &type,
+                                         vendor_caps, sizeof(*vendor_caps),
                                          &hw_caps, errp)) {
         return false;
     }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
  2025-05-30  9:35 ` [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers Zhenzhong Duan
@ 2025-05-30 20:31   ` Nicolin Chen via
  2025-06-03  3:03     ` Duan, Zhenzhong
  2025-06-03 17:38   ` Eric Auger
  1 sibling, 1 reply; 20+ messages in thread
From: Nicolin Chen via @ 2025-05-30 20:31 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: qemu-devel, alex.williamson, clg, eric.auger, mst, jasowang,
	peterx, ddutile, jgg, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng

On Fri, May 30, 2025 at 05:35:11PM +0800, Zhenzhong Duan wrote:
> Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
> utilizes them to attach to or detach from hwpt on host side.
> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Reviewed-by: Cédric Le Goater <clg@redhat.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

> +static bool
> +host_iommu_device_iommufd_vfio_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> +                                           uint32_t hwpt_id, Error **errp)
> +{
> +    VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
> +
> +    return !iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt_id, errp);
> +}
> +
> +static bool
> +host_iommu_device_iommufd_vfio_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> +                                           Error **errp)
> +{
> +    VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
> +
> +    return iommufd_cdev_detach_ioas_hwpt(vbasedev, errp);
> +}

Could be a separate patch though:

So, we have the attach API returning "int" while the detach API
returning "bool". Is errno returned back to the attach caller(s)
so the attach API can be a "bool" type too?

Nicolin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
  2025-05-30  9:35 ` [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD Zhenzhong Duan
@ 2025-05-30 21:00   ` Nicolin Chen
  2025-06-03  3:12     ` Duan, Zhenzhong
  2025-06-03 12:27   ` Eric Auger
  1 sibling, 1 reply; 20+ messages in thread
From: Nicolin Chen @ 2025-05-30 21:00 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: qemu-devel, alex.williamson, clg, eric.auger, mst, jasowang,
	peterx, ddutile, jgg, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng

On Fri, May 30, 2025 at 05:35:10PM +0800, Zhenzhong Duan wrote:
> Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
> to the iommufd BE + 2 new class functions.
> 
> IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
> and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
> re-attach IOMMUFD backed device to its default VFIO sub-system created
> hwpt, i.e., when vIOMMU is disabled by guest. These properties are
> initialized in hiod::realize() after attachment.
>
> 2 new class functions are [at|de]tach_hwpt(). They are used to
> attach/detach hwpt. VFIO and VDPA can have different implementions,
> so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
> e.g., in HostIOMMUDeviceIOMMUFDVFIO.

You mean HostIOMMUDeviceVFIO and HostIOMMUDeviceVDPA?

I wonder what are the difference between their implementations.

Main reason for asking this is that we have alloc_hwpt (and will
have alloc_viommu soon) too, and should that/those be a part of
the Class op(s) too?

> @@ -833,6 +834,11 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>      caps->type = type;
>      caps->hw_caps = hw_caps;
>  
> +    idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
> +    idev->iommufd = vdev->iommufd;
> +    idev->devid = vdev->devid;
> +    idev->hwpt_id = vdev->hwpt->hwpt_id;

Since it's the default hwpt that VFIO contain allocated, perhaps
it's better to call it default_hwpt_id, indicating that won't be
changed by further HWPT attachment.

Thanks
Nicolin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
  2025-05-30  9:35 ` [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info Zhenzhong Duan
@ 2025-05-30 21:05   ` Nicolin Chen
  2025-06-03  3:50     ` Duan, Zhenzhong
  2025-06-03 12:40   ` Eric Auger
  1 sibling, 1 reply; 20+ messages in thread
From: Nicolin Chen @ 2025-05-30 21:05 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: qemu-devel, alex.williamson, clg, eric.auger, mst, jasowang,
	peterx, ddutile, jgg, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng

On Fri, May 30, 2025 at 05:35:12PM +0800, Zhenzhong Duan wrote:
> Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
> specific. Save them as raw data in a union supporting different vendors,
> then vendor IOMMU can query the raw data with its fixed format for
> capability directly.
> 
> Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
> capability related structures with CONFIG_LINUX.
> 
> Suggested-by: Eric Auger <eric.auger@redhat.com>
> Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

I like this that we eventually moved all vendor stuff back to the
vendor vIOMMU code.

> +typedef union VendorCaps {
> +    struct iommu_hw_info_vtd vtd;
> +    struct iommu_hw_info_arm_smmuv3 smmuv3;
> +} VendorCaps;

Nit: can it be just:

typedef union {
	...
} VendorCaps; 

?

>  typedef struct HostIOMMUDeviceCaps {
>      uint32_t type;
>      uint64_t hw_caps;
> +    VendorCaps vendor_caps;
>  } HostIOMMUDeviceCaps;

Ditto.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
  2025-05-30  9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
@ 2025-06-01 13:57   ` Cédric Le Goater
  2025-06-03 12:21   ` Eric Auger
  1 sibling, 0 replies; 20+ messages in thread
From: Cédric Le Goater @ 2025-06-01 13:57 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, jasowang, peterx, ddutile, jgg,
	nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng

On 5/30/25 11:35, Zhenzhong Duan wrote:
> This helper passes cache invalidation request from guest to invalidate
> stage-1 page table cache in host hardware.
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>



Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.


> ---
>   include/system/iommufd.h |  4 ++++
>   backends/iommufd.c       | 36 ++++++++++++++++++++++++++++++++++++
>   backends/trace-events    |  1 +
>   3 files changed, 41 insertions(+)
> 
> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
> index cbab75bfbf..83ab8e1e4c 100644
> --- a/include/system/iommufd.h
> +++ b/include/system/iommufd.h
> @@ -61,6 +61,10 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
>                                         uint64_t iova, ram_addr_t size,
>                                         uint64_t page_size, uint64_t *data,
>                                         Error **errp);
> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> +                                      uint32_t data_type, uint32_t entry_len,
> +                                      uint32_t *entry_num, void *data,
> +                                      Error **errp);
>   
>   #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
>   #endif
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index b73f75cd0b..8bcdb60fe7 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -311,6 +311,42 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>       return true;
>   }
>   
> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> +                                      uint32_t data_type, uint32_t entry_len,
> +                                      uint32_t *entry_num, void *data,
> +                                      Error **errp)
> +{
> +    int ret, fd = be->fd;
> +    uint32_t total_entries = *entry_num;
> +    struct iommu_hwpt_invalidate cache = {
> +        .size = sizeof(cache),
> +        .hwpt_id = id,
> +        .data_type = data_type,
> +        .entry_len = entry_len,
> +        .entry_num = total_entries,
> +        .data_uptr = (uintptr_t)data,
> +    };
> +
> +    ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
> +    trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
> +                                           total_entries, cache.entry_num,
> +                                           (uintptr_t)data, ret ? errno : 0);
> +    *entry_num = cache.entry_num;
> +
> +    if (ret) {
> +        error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
> +                         " total %d entries, processed %d entries",
> +                         total_entries, cache.entry_num);
> +    } else if (total_entries != cache.entry_num) {
> +        error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with unprocessed"
> +                         " entries: total %d entries, processed %d entries."
> +                         " Kernel BUG?!", total_entries, cache.entry_num);
> +        return false;
> +    }
> +
> +    return !ret;
> +}
> +
>   static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
>   {
>       HostIOMMUDeviceCaps *caps = &hiod->caps;
> diff --git a/backends/trace-events b/backends/trace-events
> index 40811a3162..7278214ea5 100644
> --- a/backends/trace-events
> +++ b/backends/trace-events
> @@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_
>   iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
>   iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
>   iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
> +iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support
  2025-05-30  9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
                   ` (3 preceding siblings ...)
  2025-05-30  9:35 ` [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info Zhenzhong Duan
@ 2025-06-01 18:04 ` Cédric Le Goater
  4 siblings, 0 replies; 20+ messages in thread
From: Cédric Le Goater @ 2025-06-01 18:04 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, jasowang, peterx, ddutile, jgg,
	nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng

On 5/30/25 11:35, Zhenzhong Duan wrote:
> Hi,
> 
> The first 6 patches of [1] are all VFIO or IOMMUFD related additions.
> Split them out per Cédric and seek for quick acceptance.
> 
> I didn't copy changelog from [1] as it's a mix of the whole nesting series.
> 
> For who want a quick view of the whole nesting series [2].
> 
> Test done:
> - VFIO devices hotplug/unplug
> - build test on Windows
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2025-05/msg05002.html
> [2] https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting.v1.wip
> 
> Thanks
> Zhenzhong
> 
> Changelog:
> v2:
> - report kernel BUG as error instead of assert (Cédric)
> - merge patch2 and patch3 (Cédric)
> - handle vendor cap check directly from vtd_check_hiod, so patch6 removed (Cédric)
> - s/data_ptr/data (Cédric)
> - s/totally/total (Donald)


Applied to vfio-next.

Thanks,

C.



> 
> v1:
> - changed to save raw data in VendorCaps, so we can keep all vendor structure
>    decoding inside the backend and VFIO wouldn't need to care about types nor
>    what's inside the data.
> 
> 
> Zhenzhong Duan (4):
>    backends/iommufd: Add a helper to invalidate user-managed HWPT
>    vfio/iommufd: Add properties and handlers to
>      TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>    vfio/iommufd: Implement [at|de]tach_hwpt handlers
>    vfio/iommufd: Save vendor specific device info
> 
>   include/system/host_iommu_device.h | 11 ++++++
>   include/system/iommufd.h           | 54 ++++++++++++++++++++++++++++
>   backends/iommufd.c                 | 58 ++++++++++++++++++++++++++++++
>   hw/vfio/iommufd.c                  | 36 ++++++++++++++++---
>   backends/trace-events              |  1 +
>   5 files changed, 155 insertions(+), 5 deletions(-)
> 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
  2025-05-30 20:31   ` Nicolin Chen via
@ 2025-06-03  3:03     ` Duan, Zhenzhong
  0 siblings, 0 replies; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-03  3:03 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: qemu-devel@nongnu.org, alex.williamson@redhat.com, clg@redhat.com,
	eric.auger@redhat.com, mst@redhat.com, jasowang@redhat.com,
	peterx@redhat.com, ddutile@redhat.com, jgg@nvidia.com,
	shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
	clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
	Peng, Chao P



>-----Original Message-----
>From: Nicolin Chen <nicolinc@nvidia.com>
>Subject: Re: [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
>
>On Fri, May 30, 2025 at 05:35:11PM +0800, Zhenzhong Duan wrote:
>> Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
>> utilizes them to attach to or detach from hwpt on host side.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> Reviewed-by: Cédric Le Goater <clg@redhat.com>
>
>Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
>
>> +static bool
>>
>+host_iommu_device_iommufd_vfio_attach_hwpt(HostIOMMUDeviceIOMMUFD
>*idev,
>> +                                           uint32_t hwpt_id, Error **errp)
>> +{
>> +    VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
>> +
>> +    return !iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt_id, errp);
>> +}
>> +
>> +static bool
>>
>+host_iommu_device_iommufd_vfio_detach_hwpt(HostIOMMUDeviceIOMMUF
>D *idev,
>> +                                           Error **errp)
>> +{
>> +    VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
>> +
>> +    return iommufd_cdev_detach_ioas_hwpt(vbasedev, errp);
>> +}
>
>Could be a separate patch though:
>
>So, we have the attach API returning "int" while the detach API
>returning "bool". Is errno returned back to the attach caller(s)
>so the attach API can be a "bool" type too?

Errno is returned through errp. We didn't check errno here,
just treat all errors as incompatible.

Zhenzhong


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
  2025-05-30 21:00   ` Nicolin Chen
@ 2025-06-03  3:12     ` Duan, Zhenzhong
  0 siblings, 0 replies; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-03  3:12 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: qemu-devel@nongnu.org, alex.williamson@redhat.com, clg@redhat.com,
	eric.auger@redhat.com, mst@redhat.com, jasowang@redhat.com,
	peterx@redhat.com, ddutile@redhat.com, jgg@nvidia.com,
	shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
	clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
	Peng, Chao P



>-----Original Message-----
>From: Nicolin Chen <nicolinc@nvidia.com>
>Subject: Re: [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to
>TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>
>On Fri, May 30, 2025 at 05:35:10PM +0800, Zhenzhong Duan wrote:
>> Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
>> to the iommufd BE + 2 new class functions.
>>
>> IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
>> and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
>> re-attach IOMMUFD backed device to its default VFIO sub-system created
>> hwpt, i.e., when vIOMMU is disabled by guest. These properties are
>> initialized in hiod::realize() after attachment.
>>
>> 2 new class functions are [at|de]tach_hwpt(). They are used to
>> attach/detach hwpt. VFIO and VDPA can have different implementions,
>> so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
>> e.g., in HostIOMMUDeviceIOMMUFDVFIO.
>
>You mean HostIOMMUDeviceVFIO and HostIOMMUDeviceVDPA?

HostIOMMUDeviceIOMMUFDVFIO and HostIOMMUDeviceIOMMUFDVDPA.

>
>I wonder what are the difference between their implementations.

They have different way to transform HOST_IOMMU_DEVICE(idev)->agent.
See host_iommu_device_iommufd_vfio_attach_hwpt()

>
>Main reason for asking this is that we have alloc_hwpt (and will
>have alloc_viommu soon) too, and should that/those be a part of
>the Class op(s) too?

I think vIOMMU could call alloc_viommu and alloc_hwpt directly,
they are common IOMMUFD interface no matter VFIO or VDPA backend device.

>
>> @@ -833,6 +834,11 @@ static bool
>hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>>      caps->type = type;
>>      caps->hw_caps = hw_caps;
>>
>> +    idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
>> +    idev->iommufd = vdev->iommufd;
>> +    idev->devid = vdev->devid;
>> +    idev->hwpt_id = vdev->hwpt->hwpt_id;
>
>Since it's the default hwpt that VFIO contain allocated, perhaps
>it's better to call it default_hwpt_id, indicating that won't be
>changed by further HWPT attachment.

hiod_iommufd_vfio_realize() is called only once, no one will change idev->hwpt_id.

Thanks
Zhenzhong


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
  2025-05-30 21:05   ` Nicolin Chen
@ 2025-06-03  3:50     ` Duan, Zhenzhong
  0 siblings, 0 replies; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-03  3:50 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: qemu-devel@nongnu.org, alex.williamson@redhat.com, clg@redhat.com,
	eric.auger@redhat.com, mst@redhat.com, jasowang@redhat.com,
	peterx@redhat.com, ddutile@redhat.com, jgg@nvidia.com,
	shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
	clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
	Peng, Chao P



>-----Original Message-----
>From: Nicolin Chen <nicolinc@nvidia.com>
>Subject: Re: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
>
>On Fri, May 30, 2025 at 05:35:12PM +0800, Zhenzhong Duan wrote:
>> Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
>> specific. Save them as raw data in a union supporting different vendors,
>> then vendor IOMMU can query the raw data with its fixed format for
>> capability directly.
>>
>> Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
>> capability related structures with CONFIG_LINUX.
>>
>> Suggested-by: Eric Auger <eric.auger@redhat.com>
>> Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>
>Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
>
>I like this that we eventually moved all vendor stuff back to the
>vendor vIOMMU code.
>
>> +typedef union VendorCaps {
>> +    struct iommu_hw_info_vtd vtd;
>> +    struct iommu_hw_info_arm_smmuv3 smmuv3;
>> +} VendorCaps;
>
>Nit: can it be just:
>
>typedef union {
>	...
>} VendorCaps;
>
>?

Any special reason?
I didn't see anonymous union is preferred over named union in docs/devel/style.rst

Thanks
Zhenzhong

>
>>  typedef struct HostIOMMUDeviceCaps {
>>      uint32_t type;
>>      uint64_t hw_caps;
>> +    VendorCaps vendor_caps;
>>  } HostIOMMUDeviceCaps;
>
>Ditto.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
  2025-05-30  9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
  2025-06-01 13:57   ` Cédric Le Goater
@ 2025-06-03 12:21   ` Eric Auger
  2025-06-04  5:50     ` Duan, Zhenzhong
  1 sibling, 1 reply; 20+ messages in thread
From: Eric Auger @ 2025-06-03 12:21 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, clg, mst, jasowang, peterx, ddutile, jgg,
	nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng

Hi Zhenzhong,

On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
> This helper passes cache invalidation request from guest to invalidate
> stage-1 page table cache in host hardware.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>  include/system/iommufd.h |  4 ++++
>  backends/iommufd.c       | 36 ++++++++++++++++++++++++++++++++++++
>  backends/trace-events    |  1 +
>  3 files changed, 41 insertions(+)
>
> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
> index cbab75bfbf..83ab8e1e4c 100644
> --- a/include/system/iommufd.h
> +++ b/include/system/iommufd.h
> @@ -61,6 +61,10 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
>                                        uint64_t iova, ram_addr_t size,
>                                        uint64_t page_size, uint64_t *data,
>                                        Error **errp);
> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> +                                      uint32_t data_type, uint32_t entry_len,
> +                                      uint32_t *entry_num, void *data,
> +                                      Error **errp);
>  
>  #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
>  #endif
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index b73f75cd0b..8bcdb60fe7 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -311,6 +311,42 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>      return true;
>  }
>  
> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> +                                      uint32_t data_type, uint32_t entry_len,
> +                                      uint32_t *entry_num, void *data,
> +                                      Error **errp)
> +{
> +    int ret, fd = be->fd;
> +    uint32_t total_entries = *entry_num;
> +    struct iommu_hwpt_invalidate cache = {
> +        .size = sizeof(cache),
> +        .hwpt_id = id,
> +        .data_type = data_type,
> +        .entry_len = entry_len,
> +        .entry_num = total_entries,
> +        .data_uptr = (uintptr_t)data,
> +    };
> +
> +    ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
> +    trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
> +                                           total_entries, cache.entry_num,
> +                                           (uintptr_t)data, ret ? errno : 0);
> +    *entry_num = cache.entry_num;
> +
> +    if (ret) {
> +        error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
> +                         " total %d entries, processed %d entries",
> +                         total_entries, cache.entry_num);
> +    } else if (total_entries != cache.entry_num) {
> +        error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with unprocessed"
> +                         " entries: total %d entries, processed %d entries."
> +                         " Kernel BUG?!", total_entries, cache.entry_num);
Can this happen? Isn't it a failure case?

Besides
Reviewed-by: Eric Auger <eric.auger@redhat.com>


Eric
> +        return false;
> +    }
> +
> +    return !ret;
> +}
> +
>  static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
>  {
>      HostIOMMUDeviceCaps *caps = &hiod->caps;
> diff --git a/backends/trace-events b/backends/trace-events
> index 40811a3162..7278214ea5 100644
> --- a/backends/trace-events
> +++ b/backends/trace-events
> @@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_
>  iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
>  iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
>  iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
> +iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
  2025-05-30  9:35 ` [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD Zhenzhong Duan
  2025-05-30 21:00   ` Nicolin Chen
@ 2025-06-03 12:27   ` Eric Auger
  1 sibling, 0 replies; 20+ messages in thread
From: Eric Auger @ 2025-06-03 12:27 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, clg, mst, jasowang, peterx, ddutile, jgg,
	nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng



On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
> Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
> to the iommufd BE + 2 new class functions.
>
> IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
> and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
> re-attach IOMMUFD backed device to its default VFIO sub-system created
> hwpt, i.e., when vIOMMU is disabled by guest. These properties are
> initialized in hiod::realize() after attachment.
>
> 2 new class functions are [at|de]tach_hwpt(). They are used to
> attach/detach hwpt. VFIO and VDPA can have different implementions,
> so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
> e.g., in HostIOMMUDeviceIOMMUFDVFIO.
>
> Add two wrappers host_iommu_device_iommufd_[at|de]tach_hwpt to wrap the
> two functions.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  include/system/iommufd.h | 50 ++++++++++++++++++++++++++++++++++++++++
>  backends/iommufd.c       | 22 ++++++++++++++++++
>  hw/vfio/iommufd.c        |  6 +++++
>  3 files changed, 78 insertions(+)
>
> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
> index 83ab8e1e4c..283861b924 100644
> --- a/include/system/iommufd.h
> +++ b/include/system/iommufd.h
> @@ -67,4 +67,54 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>                                        Error **errp);
>  
>  #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
> +OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD, HostIOMMUDeviceIOMMUFDClass,
> +                    HOST_IOMMU_DEVICE_IOMMUFD)
> +
> +/* Overload of the host IOMMU device for the iommufd backend */
> +struct HostIOMMUDeviceIOMMUFD {
> +    HostIOMMUDevice parent_obj;
> +
> +    IOMMUFDBackend *iommufd;
> +    uint32_t devid;
> +    uint32_t hwpt_id;
> +};
> +
> +struct HostIOMMUDeviceIOMMUFDClass {
> +    HostIOMMUDeviceClass parent_class;
> +
> +    /**
> +     * @attach_hwpt: attach host IOMMU device to IOMMUFD hardware page table.
> +     * VFIO and VDPA device can have different implementation.
> +     *
> +     * Mandatory callback.
> +     *
> +     * @idev: host IOMMU device backed by IOMMUFD backend.
> +     *
> +     * @hwpt_id: ID of IOMMUFD hardware page table.
> +     *
> +     * @errp: pass an Error out when attachment fails.
> +     *
> +     * Returns: true on success, false on failure.
> +     */
> +    bool (*attach_hwpt)(HostIOMMUDeviceIOMMUFD *idev, uint32_t hwpt_id,
> +                        Error **errp);
> +    /**
> +     * @detach_hwpt: detach host IOMMU device from IOMMUFD hardware page table.
> +     * VFIO and VDPA device can have different implementation.
> +     *
> +     * Mandatory callback.
> +     *
> +     * @idev: host IOMMU device backed by IOMMUFD backend.
> +     *
> +     * @errp: pass an Error out when attachment fails.
> +     *
> +     * Returns: true on success, false on failure.
> +     */
> +    bool (*detach_hwpt)(HostIOMMUDeviceIOMMUFD *idev, Error **errp);
> +};
> +
> +bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> +                                           uint32_t hwpt_id, Error **errp);
> +bool host_iommu_device_iommufd_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> +                                           Error **errp);
>  #endif
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index 8bcdb60fe7..c2c47abf7e 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -347,6 +347,26 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>      return !ret;
>  }
>  
> +bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> +                                           uint32_t hwpt_id, Error **errp)
> +{
> +    HostIOMMUDeviceIOMMUFDClass *idevc =
> +        HOST_IOMMU_DEVICE_IOMMUFD_GET_CLASS(idev);
> +
> +    g_assert(idevc->attach_hwpt);
> +    return idevc->attach_hwpt(idev, hwpt_id, errp);
> +}
> +
> +bool host_iommu_device_iommufd_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> +                                           Error **errp)
> +{
> +    HostIOMMUDeviceIOMMUFDClass *idevc =
> +        HOST_IOMMU_DEVICE_IOMMUFD_GET_CLASS(idev);
> +
> +    g_assert(idevc->detach_hwpt);
> +    return idevc->detach_hwpt(idev, errp);
> +}
> +
>  static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
>  {
>      HostIOMMUDeviceCaps *caps = &hiod->caps;
> @@ -385,6 +405,8 @@ static const TypeInfo types[] = {
>      }, {
>          .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
>          .parent = TYPE_HOST_IOMMU_DEVICE,
> +        .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
> +        .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
>          .class_init = hiod_iommufd_class_init,
>          .abstract = true,
>      }
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index af1c7ab10a..5fde2b633a 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -814,6 +814,7 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>                                        Error **errp)
>  {
>      VFIODevice *vdev = opaque;
> +    HostIOMMUDeviceIOMMUFD *idev;
>      HostIOMMUDeviceCaps *caps = &hiod->caps;
>      enum iommu_hw_info_type type;
>      union {
> @@ -833,6 +834,11 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>      caps->type = type;
>      caps->hw_caps = hw_caps;
>  
> +    idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
> +    idev->iommufd = vdev->iommufd;
> +    idev->devid = vdev->devid;
> +    idev->hwpt_id = vdev->hwpt->hwpt_id;
> +
>      return true;
>  }
>  



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
  2025-05-30  9:35 ` [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info Zhenzhong Duan
  2025-05-30 21:05   ` Nicolin Chen
@ 2025-06-03 12:40   ` Eric Auger
  2025-06-04  3:41     ` Duan, Zhenzhong
  1 sibling, 1 reply; 20+ messages in thread
From: Eric Auger @ 2025-06-03 12:40 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, clg, mst, jasowang, peterx, ddutile, jgg,
	nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng

Hi Zhenzhong,

On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
> Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
> specific. Save them as raw data in a union supporting different vendors,
> then vendor IOMMU can query the raw data with its fixed format for
> capability directly.
>
> Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
> capability related structures with CONFIG_LINUX.
>
> Suggested-by: Eric Auger <eric.auger@redhat.com>
> Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>  include/system/host_iommu_device.h | 11 +++++++++++
>  hw/vfio/iommufd.c                  |  8 +++-----
>  2 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/include/system/host_iommu_device.h b/include/system/host_iommu_device.h
> index 809cced4ba..10fccc10be 100644
> --- a/include/system/host_iommu_device.h
> +++ b/include/system/host_iommu_device.h
> @@ -14,6 +14,13 @@
>  
>  #include "qom/object.h"
>  #include "qapi/error.h"
> +#ifdef CONFIG_LINUX
> +#include "linux/iommufd.h"
> +
> +typedef union VendorCaps {
> +    struct iommu_hw_info_vtd vtd;
> +    struct iommu_hw_info_arm_smmuv3 smmuv3;
> +} VendorCaps;
>  
>  /**
>   * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
> @@ -26,7 +33,9 @@
>  typedef struct HostIOMMUDeviceCaps {
>      uint32_t type;
>      uint64_t hw_caps;
> +    VendorCaps vendor_caps;
missing the doc comment update for new field vendor_caps

Otherwise

Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric


>  } HostIOMMUDeviceCaps;
> +#endif
>  
>  #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>  OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
> @@ -38,7 +47,9 @@ struct HostIOMMUDevice {
>      void *agent; /* pointer to agent device, ie. VFIO or VDPA device */
>      PCIBus *aliased_bus;
>      int aliased_devfn;
> +#ifdef CONFIG_LINUX
>      HostIOMMUDeviceCaps caps;
> +#endif
>  };
>  
>  /**
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index d661737c17..fbf47cab09 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -834,16 +834,14 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>      VFIODevice *vdev = opaque;
>      HostIOMMUDeviceIOMMUFD *idev;
>      HostIOMMUDeviceCaps *caps = &hiod->caps;
> +    VendorCaps *vendor_caps = &caps->vendor_caps;
>      enum iommu_hw_info_type type;
> -    union {
> -        struct iommu_hw_info_vtd vtd;
> -    } data;
>      uint64_t hw_caps;
>  
>      hiod->agent = opaque;
>  
> -    if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
> -                                         &type, &data, sizeof(data),
> +    if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid, &type,
> +                                         vendor_caps, sizeof(*vendor_caps),
>                                           &hw_caps, errp)) {
>          return false;
>      }



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
  2025-05-30  9:35 ` [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers Zhenzhong Duan
  2025-05-30 20:31   ` Nicolin Chen via
@ 2025-06-03 17:38   ` Eric Auger
  1 sibling, 0 replies; 20+ messages in thread
From: Eric Auger @ 2025-06-03 17:38 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, clg, mst, jasowang, peterx, ddutile, jgg,
	nicolinc, shameerali.kolothum.thodi, joao.m.martins,
	clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng



On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
> Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
> utilizes them to attach to or detach from hwpt on host side.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  hw/vfio/iommufd.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 5fde2b633a..d661737c17 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -810,6 +810,24 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, const void *data)
>      vioc->query_dirty_bitmap = iommufd_query_dirty_bitmap;
>  };
>  
> +static bool
> +host_iommu_device_iommufd_vfio_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> +                                           uint32_t hwpt_id, Error **errp)
> +{
> +    VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
> +
> +    return !iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt_id, errp);
> +}
> +
> +static bool
> +host_iommu_device_iommufd_vfio_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> +                                           Error **errp)
> +{
> +    VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
> +
> +    return iommufd_cdev_detach_ioas_hwpt(vbasedev, errp);
> +}
> +
>  static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>                                        Error **errp)
>  {
> @@ -864,10 +882,14 @@ hiod_iommufd_vfio_get_page_size_mask(HostIOMMUDevice *hiod)
>  static void hiod_iommufd_vfio_class_init(ObjectClass *oc, const void *data)
>  {
>      HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
> +    HostIOMMUDeviceIOMMUFDClass *idevc = HOST_IOMMU_DEVICE_IOMMUFD_CLASS(oc);
>  
>      hiodc->realize = hiod_iommufd_vfio_realize;
>      hiodc->get_iova_ranges = hiod_iommufd_vfio_get_iova_ranges;
>      hiodc->get_page_size_mask = hiod_iommufd_vfio_get_page_size_mask;
> +
> +    idevc->attach_hwpt = host_iommu_device_iommufd_vfio_attach_hwpt;
> +    idevc->detach_hwpt = host_iommu_device_iommufd_vfio_detach_hwpt;
>  };
>  
>  static const TypeInfo types[] = {



^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
  2025-06-03 12:40   ` Eric Auger
@ 2025-06-04  3:41     ` Duan, Zhenzhong
  0 siblings, 0 replies; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-04  3:41 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-devel@nongnu.org
  Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
	jasowang@redhat.com, peterx@redhat.com, ddutile@redhat.com,
	jgg@nvidia.com, nicolinc@nvidia.com,
	shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
	clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
	Peng, Chao P

Hi Eric,

>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
>
>Hi Zhenzhong,
>
>On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
>> Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
>> specific. Save them as raw data in a union supporting different vendors,
>> then vendor IOMMU can query the raw data with its fixed format for
>> capability directly.
>>
>> Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
>> capability related structures with CONFIG_LINUX.
>>
>> Suggested-by: Eric Auger <eric.auger@redhat.com>
>> Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>  include/system/host_iommu_device.h | 11 +++++++++++
>>  hw/vfio/iommufd.c                  |  8 +++-----
>>  2 files changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/system/host_iommu_device.h
>b/include/system/host_iommu_device.h
>> index 809cced4ba..10fccc10be 100644
>> --- a/include/system/host_iommu_device.h
>> +++ b/include/system/host_iommu_device.h
>> @@ -14,6 +14,13 @@
>>
>>  #include "qom/object.h"
>>  #include "qapi/error.h"
>> +#ifdef CONFIG_LINUX
>> +#include "linux/iommufd.h"
>> +
>> +typedef union VendorCaps {
>> +    struct iommu_hw_info_vtd vtd;
>> +    struct iommu_hw_info_arm_smmuv3 smmuv3;
>> +} VendorCaps;
>>
>>  /**
>>   * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
>> @@ -26,7 +33,9 @@
>>  typedef struct HostIOMMUDeviceCaps {
>>      uint32_t type;
>>      uint64_t hw_caps;
>> +    VendorCaps vendor_caps;
>missing the doc comment update for new field vendor_caps

Good catch, will add. Thanks

Zhenzhong

>
>Otherwise
>
>Reviewed-by: Eric Auger <eric.auger@redhat.com>
>
>Thanks
>
>Eric
>
>
>>  } HostIOMMUDeviceCaps;
>> +#endif
>>
>>  #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>>  OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass,
>HOST_IOMMU_DEVICE)
>> @@ -38,7 +47,9 @@ struct HostIOMMUDevice {
>>      void *agent; /* pointer to agent device, ie. VFIO or VDPA device */
>>      PCIBus *aliased_bus;
>>      int aliased_devfn;
>> +#ifdef CONFIG_LINUX
>>      HostIOMMUDeviceCaps caps;
>> +#endif
>>  };
>>
>>  /**
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index d661737c17..fbf47cab09 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -834,16 +834,14 @@ static bool
>hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>>      VFIODevice *vdev = opaque;
>>      HostIOMMUDeviceIOMMUFD *idev;
>>      HostIOMMUDeviceCaps *caps = &hiod->caps;
>> +    VendorCaps *vendor_caps = &caps->vendor_caps;
>>      enum iommu_hw_info_type type;
>> -    union {
>> -        struct iommu_hw_info_vtd vtd;
>> -    } data;
>>      uint64_t hw_caps;
>>
>>      hiod->agent = opaque;
>>
>> -    if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
>> -                                         &type, &data, sizeof(data),
>> +    if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid, &type,
>> +                                         vendor_caps, sizeof(*vendor_caps),
>>                                           &hw_caps, errp)) {
>>          return false;
>>      }


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
  2025-06-03 12:21   ` Eric Auger
@ 2025-06-04  5:50     ` Duan, Zhenzhong
  2025-06-04 17:26       ` Eric Auger
  0 siblings, 1 reply; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-04  5:50 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-devel@nongnu.org
  Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
	jasowang@redhat.com, peterx@redhat.com, ddutile@redhat.com,
	jgg@nvidia.com, nicolinc@nvidia.com,
	shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
	clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
	Peng, Chao P

Hi Eric,

>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Sent: Tuesday, June 3, 2025 8:21 PM
>To: Duan, Zhenzhong <zhenzhong.duan@intel.com>; qemu-devel@nongnu.org
>Cc: alex.williamson@redhat.com; clg@redhat.com; mst@redhat.com;
>jasowang@redhat.com; peterx@redhat.com; ddutile@redhat.com;
>jgg@nvidia.com; nicolinc@nvidia.com; shameerali.kolothum.thodi@huawei.com;
>joao.m.martins@oracle.com; clement.mathieu--drif@eviden.com; Tian, Kevin
><kevin.tian@intel.com>; Liu, Yi L <yi.l.liu@intel.com>; Peng, Chao P
><chao.p.peng@intel.com>
>Subject: Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-
>managed HWPT
>
>Hi Zhenzhong,
>
>On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
>> This helper passes cache invalidation request from guest to invalidate
>> stage-1 page table cache in host hardware.
>>
>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>  include/system/iommufd.h |  4 ++++
>>  backends/iommufd.c       | 36 ++++++++++++++++++++++++++++++++++++
>>  backends/trace-events    |  1 +
>>  3 files changed, 41 insertions(+)
>>
>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>> index cbab75bfbf..83ab8e1e4c 100644
>> --- a/include/system/iommufd.h
>> +++ b/include/system/iommufd.h
>> @@ -61,6 +61,10 @@ bool
>iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
>>                                        uint64_t iova, ram_addr_t size,
>>                                        uint64_t page_size, uint64_t *data,
>>                                        Error **errp);
>> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>> +                                      uint32_t data_type, uint32_t entry_len,
>> +                                      uint32_t *entry_num, void *data,
>> +                                      Error **errp);
>>
>>  #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>TYPE_HOST_IOMMU_DEVICE "-iommufd"
>>  #endif
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index b73f75cd0b..8bcdb60fe7 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -311,6 +311,42 @@ bool
>iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>      return true;
>>  }
>>
>> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>> +                                      uint32_t data_type, uint32_t entry_len,
>> +                                      uint32_t *entry_num, void *data,
>> +                                      Error **errp)
>> +{
>> +    int ret, fd = be->fd;
>> +    uint32_t total_entries = *entry_num;
>> +    struct iommu_hwpt_invalidate cache = {
>> +        .size = sizeof(cache),
>> +        .hwpt_id = id,
>> +        .data_type = data_type,
>> +        .entry_len = entry_len,
>> +        .entry_num = total_entries,
>> +        .data_uptr = (uintptr_t)data,
>> +    };
>> +
>> +    ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
>> +    trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
>> +                                           total_entries, cache.entry_num,
>> +                                           (uintptr_t)data, ret ? errno : 0);
>> +    *entry_num = cache.entry_num;
>> +
>> +    if (ret) {
>> +        error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
>> +                         " total %d entries, processed %d entries",
>> +                         total_entries, cache.entry_num);
>> +    } else if (total_entries != cache.entry_num) {
>> +        error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with
>unprocessed"
>> +                         " entries: total %d entries, processed %d entries."
>> +                         " Kernel BUG?!", total_entries, cache.entry_num);
>Can this happen? Isn't it a failure case?

It shouldn't happen except kernel has a bug. It's a failure case, so false is returned.

Thanks
Zhenzhong

>
>Besides
>Reviewed-by: Eric Auger <eric.auger@redhat.com>
>
>
>Eric
>> +        return false;
>> +    }
>> +
>> +    return !ret;
>> +}
>> +
>>  static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error
>**errp)
>>  {
>>      HostIOMMUDeviceCaps *caps = &hiod->caps;
>> diff --git a/backends/trace-events b/backends/trace-events
>> index 40811a3162..7278214ea5 100644
>> --- a/backends/trace-events
>> +++ b/backends/trace-events
>> @@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t
>dev_id, uint32_t pt_id, uint32_
>>  iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d
>id=%d (%d)"
>>  iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret)
>" iommufd=%d hwpt=%u enable=%d (%d)"
>>  iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t
>iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u
>iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
>> +iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t
>data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t
>data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u
>entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
  2025-06-04  5:50     ` Duan, Zhenzhong
@ 2025-06-04 17:26       ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2025-06-04 17:26 UTC (permalink / raw)
  To: Duan, Zhenzhong, qemu-devel@nongnu.org
  Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
	jasowang@redhat.com, peterx@redhat.com, ddutile@redhat.com,
	jgg@nvidia.com, nicolinc@nvidia.com,
	shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
	clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
	Peng, Chao P

Hi Zhenzhong,

On 6/4/25 7:50 AM, Duan, Zhenzhong wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Sent: Tuesday, June 3, 2025 8:21 PM
>> To: Duan, Zhenzhong <zhenzhong.duan@intel.com>; qemu-devel@nongnu.org
>> Cc: alex.williamson@redhat.com; clg@redhat.com; mst@redhat.com;
>> jasowang@redhat.com; peterx@redhat.com; ddutile@redhat.com;
>> jgg@nvidia.com; nicolinc@nvidia.com; shameerali.kolothum.thodi@huawei.com;
>> joao.m.martins@oracle.com; clement.mathieu--drif@eviden.com; Tian, Kevin
>> <kevin.tian@intel.com>; Liu, Yi L <yi.l.liu@intel.com>; Peng, Chao P
>> <chao.p.peng@intel.com>
>> Subject: Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-
>> managed HWPT
>>
>> Hi Zhenzhong,
>>
>> On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
>>> This helper passes cache invalidation request from guest to invalidate
>>> stage-1 page table cache in host hardware.
>>>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>>  include/system/iommufd.h |  4 ++++
>>>  backends/iommufd.c       | 36 ++++++++++++++++++++++++++++++++++++
>>>  backends/trace-events    |  1 +
>>>  3 files changed, 41 insertions(+)
>>>
>>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>>> index cbab75bfbf..83ab8e1e4c 100644
>>> --- a/include/system/iommufd.h
>>> +++ b/include/system/iommufd.h
>>> @@ -61,6 +61,10 @@ bool
>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
>>>                                        uint64_t iova, ram_addr_t size,
>>>                                        uint64_t page_size, uint64_t *data,
>>>                                        Error **errp);
>>> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>>> +                                      uint32_t data_type, uint32_t entry_len,
>>> +                                      uint32_t *entry_num, void *data,
>>> +                                      Error **errp);
>>>
>>>  #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>> TYPE_HOST_IOMMU_DEVICE "-iommufd"
>>>  #endif
>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>> index b73f75cd0b..8bcdb60fe7 100644
>>> --- a/backends/iommufd.c
>>> +++ b/backends/iommufd.c
>>> @@ -311,6 +311,42 @@ bool
>> iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>>      return true;
>>>  }
>>>
>>> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>>> +                                      uint32_t data_type, uint32_t entry_len,
>>> +                                      uint32_t *entry_num, void *data,
>>> +                                      Error **errp)
>>> +{
>>> +    int ret, fd = be->fd;
>>> +    uint32_t total_entries = *entry_num;
>>> +    struct iommu_hwpt_invalidate cache = {
>>> +        .size = sizeof(cache),
>>> +        .hwpt_id = id,
>>> +        .data_type = data_type,
>>> +        .entry_len = entry_len,
>>> +        .entry_num = total_entries,
>>> +        .data_uptr = (uintptr_t)data,
>>> +    };
>>> +
>>> +    ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
>>> +    trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
>>> +                                           total_entries, cache.entry_num,
>>> +                                           (uintptr_t)data, ret ? errno : 0);
>>> +    *entry_num = cache.entry_num;
>>> +
>>> +    if (ret) {
>>> +        error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
>>> +                         " total %d entries, processed %d entries",
>>> +                         total_entries, cache.entry_num);
>>> +    } else if (total_entries != cache.entry_num) {
>>> +        error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with
>> unprocessed"
>>> +                         " entries: total %d entries, processed %d entries."
>>> +                         " Kernel BUG?!", total_entries, cache.entry_num);
>> Can this happen? Isn't it a failure case?
> It shouldn't happen except kernel has a bug. It's a failure case, so false is returned.
OK. I missed it was an error case.

Eric
>
> Thanks
> Zhenzhong
>
>> Besides
>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>>
>>
>> Eric
>>> +        return false;
>>> +    }
>>> +
>>> +    return !ret;
>>> +}
>>> +
>>>  static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error
>> **errp)
>>>  {
>>>      HostIOMMUDeviceCaps *caps = &hiod->caps;
>>> diff --git a/backends/trace-events b/backends/trace-events
>>> index 40811a3162..7278214ea5 100644
>>> --- a/backends/trace-events
>>> +++ b/backends/trace-events
>>> @@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t
>> dev_id, uint32_t pt_id, uint32_
>>>  iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d
>> id=%d (%d)"
>>>  iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret)
>> " iommufd=%d hwpt=%u enable=%d (%d)"
>>>  iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t
>> iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u
>> iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
>>> +iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t
>> data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t
>> data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u
>> entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2025-06-04 17:27 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-30  9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
2025-05-30  9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
2025-06-01 13:57   ` Cédric Le Goater
2025-06-03 12:21   ` Eric Auger
2025-06-04  5:50     ` Duan, Zhenzhong
2025-06-04 17:26       ` Eric Auger
2025-05-30  9:35 ` [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD Zhenzhong Duan
2025-05-30 21:00   ` Nicolin Chen
2025-06-03  3:12     ` Duan, Zhenzhong
2025-06-03 12:27   ` Eric Auger
2025-05-30  9:35 ` [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers Zhenzhong Duan
2025-05-30 20:31   ` Nicolin Chen via
2025-06-03  3:03     ` Duan, Zhenzhong
2025-06-03 17:38   ` Eric Auger
2025-05-30  9:35 ` [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info Zhenzhong Duan
2025-05-30 21:05   ` Nicolin Chen
2025-06-03  3:50     ` Duan, Zhenzhong
2025-06-03 12:40   ` Eric Auger
2025-06-04  3:41     ` Duan, Zhenzhong
2025-06-01 18:04 ` [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Cédric Le Goater

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).