* [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
2025-05-30 9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
@ 2025-05-30 9:35 ` Zhenzhong Duan
2025-06-01 13:57 ` Cédric Le Goater
2025-06-03 12:21 ` Eric Auger
2025-05-30 9:35 ` [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD Zhenzhong Duan
` (3 subsequent siblings)
4 siblings, 2 replies; 20+ messages in thread
From: Zhenzhong Duan @ 2025-05-30 9:35 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, jasowang, peterx, ddutile,
jgg, nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng,
Zhenzhong Duan
This helper passes cache invalidation request from guest to invalidate
stage-1 page table cache in host hardware.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/system/iommufd.h | 4 ++++
backends/iommufd.c | 36 ++++++++++++++++++++++++++++++++++++
backends/trace-events | 1 +
3 files changed, 41 insertions(+)
diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index cbab75bfbf..83ab8e1e4c 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -61,6 +61,10 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
uint64_t iova, ram_addr_t size,
uint64_t page_size, uint64_t *data,
Error **errp);
+bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
+ uint32_t data_type, uint32_t entry_len,
+ uint32_t *entry_num, void *data,
+ Error **errp);
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
#endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index b73f75cd0b..8bcdb60fe7 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -311,6 +311,42 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
return true;
}
+bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
+ uint32_t data_type, uint32_t entry_len,
+ uint32_t *entry_num, void *data,
+ Error **errp)
+{
+ int ret, fd = be->fd;
+ uint32_t total_entries = *entry_num;
+ struct iommu_hwpt_invalidate cache = {
+ .size = sizeof(cache),
+ .hwpt_id = id,
+ .data_type = data_type,
+ .entry_len = entry_len,
+ .entry_num = total_entries,
+ .data_uptr = (uintptr_t)data,
+ };
+
+ ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
+ trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
+ total_entries, cache.entry_num,
+ (uintptr_t)data, ret ? errno : 0);
+ *entry_num = cache.entry_num;
+
+ if (ret) {
+ error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
+ " total %d entries, processed %d entries",
+ total_entries, cache.entry_num);
+ } else if (total_entries != cache.entry_num) {
+ error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with unprocessed"
+ " entries: total %d entries, processed %d entries."
+ " Kernel BUG?!", total_entries, cache.entry_num);
+ return false;
+ }
+
+ return !ret;
+}
+
static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
{
HostIOMMUDeviceCaps *caps = &hiod->caps;
diff --git a/backends/trace-events b/backends/trace-events
index 40811a3162..7278214ea5 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_
iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
+iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
2025-05-30 9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
@ 2025-06-01 13:57 ` Cédric Le Goater
2025-06-03 12:21 ` Eric Auger
1 sibling, 0 replies; 20+ messages in thread
From: Cédric Le Goater @ 2025-06-01 13:57 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, mst, jasowang, peterx, ddutile, jgg,
nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng
On 5/30/25 11:35, Zhenzhong Duan wrote:
> This helper passes cache invalidation request from guest to invalidate
> stage-1 page table cache in host hardware.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
> ---
> include/system/iommufd.h | 4 ++++
> backends/iommufd.c | 36 ++++++++++++++++++++++++++++++++++++
> backends/trace-events | 1 +
> 3 files changed, 41 insertions(+)
>
> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
> index cbab75bfbf..83ab8e1e4c 100644
> --- a/include/system/iommufd.h
> +++ b/include/system/iommufd.h
> @@ -61,6 +61,10 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
> uint64_t iova, ram_addr_t size,
> uint64_t page_size, uint64_t *data,
> Error **errp);
> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> + uint32_t data_type, uint32_t entry_len,
> + uint32_t *entry_num, void *data,
> + Error **errp);
>
> #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
> #endif
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index b73f75cd0b..8bcdb60fe7 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -311,6 +311,42 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
> return true;
> }
>
> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> + uint32_t data_type, uint32_t entry_len,
> + uint32_t *entry_num, void *data,
> + Error **errp)
> +{
> + int ret, fd = be->fd;
> + uint32_t total_entries = *entry_num;
> + struct iommu_hwpt_invalidate cache = {
> + .size = sizeof(cache),
> + .hwpt_id = id,
> + .data_type = data_type,
> + .entry_len = entry_len,
> + .entry_num = total_entries,
> + .data_uptr = (uintptr_t)data,
> + };
> +
> + ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
> + trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
> + total_entries, cache.entry_num,
> + (uintptr_t)data, ret ? errno : 0);
> + *entry_num = cache.entry_num;
> +
> + if (ret) {
> + error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
> + " total %d entries, processed %d entries",
> + total_entries, cache.entry_num);
> + } else if (total_entries != cache.entry_num) {
> + error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with unprocessed"
> + " entries: total %d entries, processed %d entries."
> + " Kernel BUG?!", total_entries, cache.entry_num);
> + return false;
> + }
> +
> + return !ret;
> +}
> +
> static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
> {
> HostIOMMUDeviceCaps *caps = &hiod->caps;
> diff --git a/backends/trace-events b/backends/trace-events
> index 40811a3162..7278214ea5 100644
> --- a/backends/trace-events
> +++ b/backends/trace-events
> @@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_
> iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
> iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
> iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
> +iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
2025-05-30 9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
2025-06-01 13:57 ` Cédric Le Goater
@ 2025-06-03 12:21 ` Eric Auger
2025-06-04 5:50 ` Duan, Zhenzhong
1 sibling, 1 reply; 20+ messages in thread
From: Eric Auger @ 2025-06-03 12:21 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, peterx, ddutile, jgg,
nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng
Hi Zhenzhong,
On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
> This helper passes cache invalidation request from guest to invalidate
> stage-1 page table cache in host hardware.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/system/iommufd.h | 4 ++++
> backends/iommufd.c | 36 ++++++++++++++++++++++++++++++++++++
> backends/trace-events | 1 +
> 3 files changed, 41 insertions(+)
>
> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
> index cbab75bfbf..83ab8e1e4c 100644
> --- a/include/system/iommufd.h
> +++ b/include/system/iommufd.h
> @@ -61,6 +61,10 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
> uint64_t iova, ram_addr_t size,
> uint64_t page_size, uint64_t *data,
> Error **errp);
> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> + uint32_t data_type, uint32_t entry_len,
> + uint32_t *entry_num, void *data,
> + Error **errp);
>
> #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
> #endif
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index b73f75cd0b..8bcdb60fe7 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -311,6 +311,42 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
> return true;
> }
>
> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> + uint32_t data_type, uint32_t entry_len,
> + uint32_t *entry_num, void *data,
> + Error **errp)
> +{
> + int ret, fd = be->fd;
> + uint32_t total_entries = *entry_num;
> + struct iommu_hwpt_invalidate cache = {
> + .size = sizeof(cache),
> + .hwpt_id = id,
> + .data_type = data_type,
> + .entry_len = entry_len,
> + .entry_num = total_entries,
> + .data_uptr = (uintptr_t)data,
> + };
> +
> + ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
> + trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
> + total_entries, cache.entry_num,
> + (uintptr_t)data, ret ? errno : 0);
> + *entry_num = cache.entry_num;
> +
> + if (ret) {
> + error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
> + " total %d entries, processed %d entries",
> + total_entries, cache.entry_num);
> + } else if (total_entries != cache.entry_num) {
> + error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with unprocessed"
> + " entries: total %d entries, processed %d entries."
> + " Kernel BUG?!", total_entries, cache.entry_num);
Can this happen? Isn't it a failure case?
Besides
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Eric
> + return false;
> + }
> +
> + return !ret;
> +}
> +
> static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
> {
> HostIOMMUDeviceCaps *caps = &hiod->caps;
> diff --git a/backends/trace-events b/backends/trace-events
> index 40811a3162..7278214ea5 100644
> --- a/backends/trace-events
> +++ b/backends/trace-events
> @@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_
> iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
> iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
> iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
> +iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
2025-06-03 12:21 ` Eric Auger
@ 2025-06-04 5:50 ` Duan, Zhenzhong
2025-06-04 17:26 ` Eric Auger
0 siblings, 1 reply; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-04 5:50 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, peterx@redhat.com, ddutile@redhat.com,
jgg@nvidia.com, nicolinc@nvidia.com,
shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
Hi Eric,
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Sent: Tuesday, June 3, 2025 8:21 PM
>To: Duan, Zhenzhong <zhenzhong.duan@intel.com>; qemu-devel@nongnu.org
>Cc: alex.williamson@redhat.com; clg@redhat.com; mst@redhat.com;
>jasowang@redhat.com; peterx@redhat.com; ddutile@redhat.com;
>jgg@nvidia.com; nicolinc@nvidia.com; shameerali.kolothum.thodi@huawei.com;
>joao.m.martins@oracle.com; clement.mathieu--drif@eviden.com; Tian, Kevin
><kevin.tian@intel.com>; Liu, Yi L <yi.l.liu@intel.com>; Peng, Chao P
><chao.p.peng@intel.com>
>Subject: Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-
>managed HWPT
>
>Hi Zhenzhong,
>
>On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
>> This helper passes cache invalidation request from guest to invalidate
>> stage-1 page table cache in host hardware.
>>
>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> include/system/iommufd.h | 4 ++++
>> backends/iommufd.c | 36 ++++++++++++++++++++++++++++++++++++
>> backends/trace-events | 1 +
>> 3 files changed, 41 insertions(+)
>>
>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>> index cbab75bfbf..83ab8e1e4c 100644
>> --- a/include/system/iommufd.h
>> +++ b/include/system/iommufd.h
>> @@ -61,6 +61,10 @@ bool
>iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
>> uint64_t iova, ram_addr_t size,
>> uint64_t page_size, uint64_t *data,
>> Error **errp);
>> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>> + uint32_t data_type, uint32_t entry_len,
>> + uint32_t *entry_num, void *data,
>> + Error **errp);
>>
>> #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>TYPE_HOST_IOMMU_DEVICE "-iommufd"
>> #endif
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index b73f75cd0b..8bcdb60fe7 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -311,6 +311,42 @@ bool
>iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>> return true;
>> }
>>
>> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>> + uint32_t data_type, uint32_t entry_len,
>> + uint32_t *entry_num, void *data,
>> + Error **errp)
>> +{
>> + int ret, fd = be->fd;
>> + uint32_t total_entries = *entry_num;
>> + struct iommu_hwpt_invalidate cache = {
>> + .size = sizeof(cache),
>> + .hwpt_id = id,
>> + .data_type = data_type,
>> + .entry_len = entry_len,
>> + .entry_num = total_entries,
>> + .data_uptr = (uintptr_t)data,
>> + };
>> +
>> + ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
>> + trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
>> + total_entries, cache.entry_num,
>> + (uintptr_t)data, ret ? errno : 0);
>> + *entry_num = cache.entry_num;
>> +
>> + if (ret) {
>> + error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
>> + " total %d entries, processed %d entries",
>> + total_entries, cache.entry_num);
>> + } else if (total_entries != cache.entry_num) {
>> + error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with
>unprocessed"
>> + " entries: total %d entries, processed %d entries."
>> + " Kernel BUG?!", total_entries, cache.entry_num);
>Can this happen? Isn't it a failure case?
It shouldn't happen except kernel has a bug. It's a failure case, so false is returned.
Thanks
Zhenzhong
>
>Besides
>Reviewed-by: Eric Auger <eric.auger@redhat.com>
>
>
>Eric
>> + return false;
>> + }
>> +
>> + return !ret;
>> +}
>> +
>> static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error
>**errp)
>> {
>> HostIOMMUDeviceCaps *caps = &hiod->caps;
>> diff --git a/backends/trace-events b/backends/trace-events
>> index 40811a3162..7278214ea5 100644
>> --- a/backends/trace-events
>> +++ b/backends/trace-events
>> @@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t
>dev_id, uint32_t pt_id, uint32_
>> iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d
>id=%d (%d)"
>> iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret)
>" iommufd=%d hwpt=%u enable=%d (%d)"
>> iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t
>iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u
>iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
>> +iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t
>data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t
>data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u
>entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT
2025-06-04 5:50 ` Duan, Zhenzhong
@ 2025-06-04 17:26 ` Eric Auger
0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2025-06-04 17:26 UTC (permalink / raw)
To: Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, peterx@redhat.com, ddutile@redhat.com,
jgg@nvidia.com, nicolinc@nvidia.com,
shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
Hi Zhenzhong,
On 6/4/25 7:50 AM, Duan, Zhenzhong wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Sent: Tuesday, June 3, 2025 8:21 PM
>> To: Duan, Zhenzhong <zhenzhong.duan@intel.com>; qemu-devel@nongnu.org
>> Cc: alex.williamson@redhat.com; clg@redhat.com; mst@redhat.com;
>> jasowang@redhat.com; peterx@redhat.com; ddutile@redhat.com;
>> jgg@nvidia.com; nicolinc@nvidia.com; shameerali.kolothum.thodi@huawei.com;
>> joao.m.martins@oracle.com; clement.mathieu--drif@eviden.com; Tian, Kevin
>> <kevin.tian@intel.com>; Liu, Yi L <yi.l.liu@intel.com>; Peng, Chao P
>> <chao.p.peng@intel.com>
>> Subject: Re: [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-
>> managed HWPT
>>
>> Hi Zhenzhong,
>>
>> On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
>>> This helper passes cache invalidation request from guest to invalidate
>>> stage-1 page table cache in host hardware.
>>>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> include/system/iommufd.h | 4 ++++
>>> backends/iommufd.c | 36 ++++++++++++++++++++++++++++++++++++
>>> backends/trace-events | 1 +
>>> 3 files changed, 41 insertions(+)
>>>
>>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>>> index cbab75bfbf..83ab8e1e4c 100644
>>> --- a/include/system/iommufd.h
>>> +++ b/include/system/iommufd.h
>>> @@ -61,6 +61,10 @@ bool
>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
>>> uint64_t iova, ram_addr_t size,
>>> uint64_t page_size, uint64_t *data,
>>> Error **errp);
>>> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>>> + uint32_t data_type, uint32_t entry_len,
>>> + uint32_t *entry_num, void *data,
>>> + Error **errp);
>>>
>>> #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>> TYPE_HOST_IOMMU_DEVICE "-iommufd"
>>> #endif
>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>> index b73f75cd0b..8bcdb60fe7 100644
>>> --- a/backends/iommufd.c
>>> +++ b/backends/iommufd.c
>>> @@ -311,6 +311,42 @@ bool
>> iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>> return true;
>>> }
>>>
>>> +bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>>> + uint32_t data_type, uint32_t entry_len,
>>> + uint32_t *entry_num, void *data,
>>> + Error **errp)
>>> +{
>>> + int ret, fd = be->fd;
>>> + uint32_t total_entries = *entry_num;
>>> + struct iommu_hwpt_invalidate cache = {
>>> + .size = sizeof(cache),
>>> + .hwpt_id = id,
>>> + .data_type = data_type,
>>> + .entry_len = entry_len,
>>> + .entry_num = total_entries,
>>> + .data_uptr = (uintptr_t)data,
>>> + };
>>> +
>>> + ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
>>> + trace_iommufd_backend_invalidate_cache(fd, id, data_type, entry_len,
>>> + total_entries, cache.entry_num,
>>> + (uintptr_t)data, ret ? errno : 0);
>>> + *entry_num = cache.entry_num;
>>> +
>>> + if (ret) {
>>> + error_setg_errno(errp, errno, "IOMMU_HWPT_INVALIDATE failed:"
>>> + " total %d entries, processed %d entries",
>>> + total_entries, cache.entry_num);
>>> + } else if (total_entries != cache.entry_num) {
>>> + error_setg(errp, "IOMMU_HWPT_INVALIDATE succeed but with
>> unprocessed"
>>> + " entries: total %d entries, processed %d entries."
>>> + " Kernel BUG?!", total_entries, cache.entry_num);
>> Can this happen? Isn't it a failure case?
> It shouldn't happen except kernel has a bug. It's a failure case, so false is returned.
OK. I missed it was an error case.
Eric
>
> Thanks
> Zhenzhong
>
>> Besides
>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>>
>>
>> Eric
>>> + return false;
>>> + }
>>> +
>>> + return !ret;
>>> +}
>>> +
>>> static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error
>> **errp)
>>> {
>>> HostIOMMUDeviceCaps *caps = &hiod->caps;
>>> diff --git a/backends/trace-events b/backends/trace-events
>>> index 40811a3162..7278214ea5 100644
>>> --- a/backends/trace-events
>>> +++ b/backends/trace-events
>>> @@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t
>> dev_id, uint32_t pt_id, uint32_
>>> iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d
>> id=%d (%d)"
>>> iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret)
>> " iommufd=%d hwpt=%u enable=%d (%d)"
>>> iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t
>> iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u
>> iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
>>> +iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t
>> data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t
>> data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u
>> entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
2025-05-30 9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
2025-05-30 9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
@ 2025-05-30 9:35 ` Zhenzhong Duan
2025-05-30 21:00 ` Nicolin Chen
2025-06-03 12:27 ` Eric Auger
2025-05-30 9:35 ` [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers Zhenzhong Duan
` (2 subsequent siblings)
4 siblings, 2 replies; 20+ messages in thread
From: Zhenzhong Duan @ 2025-05-30 9:35 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, jasowang, peterx, ddutile,
jgg, nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng,
Zhenzhong Duan
Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
to the iommufd BE + 2 new class functions.
IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
re-attach IOMMUFD backed device to its default VFIO sub-system created
hwpt, i.e., when vIOMMU is disabled by guest. These properties are
initialized in hiod::realize() after attachment.
2 new class functions are [at|de]tach_hwpt(). They are used to
attach/detach hwpt. VFIO and VDPA can have different implementions,
so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
e.g., in HostIOMMUDeviceIOMMUFDVFIO.
Add two wrappers host_iommu_device_iommufd_[at|de]tach_hwpt to wrap the
two functions.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
include/system/iommufd.h | 50 ++++++++++++++++++++++++++++++++++++++++
backends/iommufd.c | 22 ++++++++++++++++++
hw/vfio/iommufd.c | 6 +++++
3 files changed, 78 insertions(+)
diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index 83ab8e1e4c..283861b924 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -67,4 +67,54 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
Error **errp);
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
+OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD, HostIOMMUDeviceIOMMUFDClass,
+ HOST_IOMMU_DEVICE_IOMMUFD)
+
+/* Overload of the host IOMMU device for the iommufd backend */
+struct HostIOMMUDeviceIOMMUFD {
+ HostIOMMUDevice parent_obj;
+
+ IOMMUFDBackend *iommufd;
+ uint32_t devid;
+ uint32_t hwpt_id;
+};
+
+struct HostIOMMUDeviceIOMMUFDClass {
+ HostIOMMUDeviceClass parent_class;
+
+ /**
+ * @attach_hwpt: attach host IOMMU device to IOMMUFD hardware page table.
+ * VFIO and VDPA device can have different implementation.
+ *
+ * Mandatory callback.
+ *
+ * @idev: host IOMMU device backed by IOMMUFD backend.
+ *
+ * @hwpt_id: ID of IOMMUFD hardware page table.
+ *
+ * @errp: pass an Error out when attachment fails.
+ *
+ * Returns: true on success, false on failure.
+ */
+ bool (*attach_hwpt)(HostIOMMUDeviceIOMMUFD *idev, uint32_t hwpt_id,
+ Error **errp);
+ /**
+ * @detach_hwpt: detach host IOMMU device from IOMMUFD hardware page table.
+ * VFIO and VDPA device can have different implementation.
+ *
+ * Mandatory callback.
+ *
+ * @idev: host IOMMU device backed by IOMMUFD backend.
+ *
+ * @errp: pass an Error out when attachment fails.
+ *
+ * Returns: true on success, false on failure.
+ */
+ bool (*detach_hwpt)(HostIOMMUDeviceIOMMUFD *idev, Error **errp);
+};
+
+bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+ uint32_t hwpt_id, Error **errp);
+bool host_iommu_device_iommufd_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+ Error **errp);
#endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 8bcdb60fe7..c2c47abf7e 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -347,6 +347,26 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
return !ret;
}
+bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+ uint32_t hwpt_id, Error **errp)
+{
+ HostIOMMUDeviceIOMMUFDClass *idevc =
+ HOST_IOMMU_DEVICE_IOMMUFD_GET_CLASS(idev);
+
+ g_assert(idevc->attach_hwpt);
+ return idevc->attach_hwpt(idev, hwpt_id, errp);
+}
+
+bool host_iommu_device_iommufd_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+ Error **errp)
+{
+ HostIOMMUDeviceIOMMUFDClass *idevc =
+ HOST_IOMMU_DEVICE_IOMMUFD_GET_CLASS(idev);
+
+ g_assert(idevc->detach_hwpt);
+ return idevc->detach_hwpt(idev, errp);
+}
+
static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
{
HostIOMMUDeviceCaps *caps = &hiod->caps;
@@ -385,6 +405,8 @@ static const TypeInfo types[] = {
}, {
.name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
.parent = TYPE_HOST_IOMMU_DEVICE,
+ .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
+ .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
.class_init = hiod_iommufd_class_init,
.abstract = true,
}
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index af1c7ab10a..5fde2b633a 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -814,6 +814,7 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
Error **errp)
{
VFIODevice *vdev = opaque;
+ HostIOMMUDeviceIOMMUFD *idev;
HostIOMMUDeviceCaps *caps = &hiod->caps;
enum iommu_hw_info_type type;
union {
@@ -833,6 +834,11 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
caps->type = type;
caps->hw_caps = hw_caps;
+ idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
+ idev->iommufd = vdev->iommufd;
+ idev->devid = vdev->devid;
+ idev->hwpt_id = vdev->hwpt->hwpt_id;
+
return true;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
2025-05-30 9:35 ` [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD Zhenzhong Duan
@ 2025-05-30 21:00 ` Nicolin Chen
2025-06-03 3:12 ` Duan, Zhenzhong
2025-06-03 12:27 ` Eric Auger
1 sibling, 1 reply; 20+ messages in thread
From: Nicolin Chen @ 2025-05-30 21:00 UTC (permalink / raw)
To: Zhenzhong Duan
Cc: qemu-devel, alex.williamson, clg, eric.auger, mst, jasowang,
peterx, ddutile, jgg, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng
On Fri, May 30, 2025 at 05:35:10PM +0800, Zhenzhong Duan wrote:
> Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
> to the iommufd BE + 2 new class functions.
>
> IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
> and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
> re-attach IOMMUFD backed device to its default VFIO sub-system created
> hwpt, i.e., when vIOMMU is disabled by guest. These properties are
> initialized in hiod::realize() after attachment.
>
> 2 new class functions are [at|de]tach_hwpt(). They are used to
> attach/detach hwpt. VFIO and VDPA can have different implementions,
> so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
> e.g., in HostIOMMUDeviceIOMMUFDVFIO.
You mean HostIOMMUDeviceVFIO and HostIOMMUDeviceVDPA?
I wonder what are the difference between their implementations.
Main reason for asking this is that we have alloc_hwpt (and will
have alloc_viommu soon) too, and should that/those be a part of
the Class op(s) too?
> @@ -833,6 +834,11 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> caps->type = type;
> caps->hw_caps = hw_caps;
>
> + idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
> + idev->iommufd = vdev->iommufd;
> + idev->devid = vdev->devid;
> + idev->hwpt_id = vdev->hwpt->hwpt_id;
Since it's the default hwpt that VFIO contain allocated, perhaps
it's better to call it default_hwpt_id, indicating that won't be
changed by further HWPT attachment.
Thanks
Nicolin
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
2025-05-30 21:00 ` Nicolin Chen
@ 2025-06-03 3:12 ` Duan, Zhenzhong
0 siblings, 0 replies; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-03 3:12 UTC (permalink / raw)
To: Nicolin Chen
Cc: qemu-devel@nongnu.org, alex.williamson@redhat.com, clg@redhat.com,
eric.auger@redhat.com, mst@redhat.com, jasowang@redhat.com,
peterx@redhat.com, ddutile@redhat.com, jgg@nvidia.com,
shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
>-----Original Message-----
>From: Nicolin Chen <nicolinc@nvidia.com>
>Subject: Re: [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to
>TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>
>On Fri, May 30, 2025 at 05:35:10PM +0800, Zhenzhong Duan wrote:
>> Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
>> to the iommufd BE + 2 new class functions.
>>
>> IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
>> and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
>> re-attach IOMMUFD backed device to its default VFIO sub-system created
>> hwpt, i.e., when vIOMMU is disabled by guest. These properties are
>> initialized in hiod::realize() after attachment.
>>
>> 2 new class functions are [at|de]tach_hwpt(). They are used to
>> attach/detach hwpt. VFIO and VDPA can have different implementions,
>> so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
>> e.g., in HostIOMMUDeviceIOMMUFDVFIO.
>
>You mean HostIOMMUDeviceVFIO and HostIOMMUDeviceVDPA?
HostIOMMUDeviceIOMMUFDVFIO and HostIOMMUDeviceIOMMUFDVDPA.
>
>I wonder what are the difference between their implementations.
They have different way to transform HOST_IOMMU_DEVICE(idev)->agent.
See host_iommu_device_iommufd_vfio_attach_hwpt()
>
>Main reason for asking this is that we have alloc_hwpt (and will
>have alloc_viommu soon) too, and should that/those be a part of
>the Class op(s) too?
I think vIOMMU could call alloc_viommu and alloc_hwpt directly,
they are common IOMMUFD interface no matter VFIO or VDPA backend device.
>
>> @@ -833,6 +834,11 @@ static bool
>hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>> caps->type = type;
>> caps->hw_caps = hw_caps;
>>
>> + idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
>> + idev->iommufd = vdev->iommufd;
>> + idev->devid = vdev->devid;
>> + idev->hwpt_id = vdev->hwpt->hwpt_id;
>
>Since it's the default hwpt that VFIO contain allocated, perhaps
>it's better to call it default_hwpt_id, indicating that won't be
>changed by further HWPT attachment.
hiod_iommufd_vfio_realize() is called only once, no one will change idev->hwpt_id.
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
2025-05-30 9:35 ` [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD Zhenzhong Duan
2025-05-30 21:00 ` Nicolin Chen
@ 2025-06-03 12:27 ` Eric Auger
1 sibling, 0 replies; 20+ messages in thread
From: Eric Auger @ 2025-06-03 12:27 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, peterx, ddutile, jgg,
nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng
On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
> Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
> to the iommufd BE + 2 new class functions.
>
> IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
> and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
> re-attach IOMMUFD backed device to its default VFIO sub-system created
> hwpt, i.e., when vIOMMU is disabled by guest. These properties are
> initialized in hiod::realize() after attachment.
>
> 2 new class functions are [at|de]tach_hwpt(). They are used to
> attach/detach hwpt. VFIO and VDPA can have different implementions,
> so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
> e.g., in HostIOMMUDeviceIOMMUFDVFIO.
>
> Add two wrappers host_iommu_device_iommufd_[at|de]tach_hwpt to wrap the
> two functions.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Eric
> ---
> include/system/iommufd.h | 50 ++++++++++++++++++++++++++++++++++++++++
> backends/iommufd.c | 22 ++++++++++++++++++
> hw/vfio/iommufd.c | 6 +++++
> 3 files changed, 78 insertions(+)
>
> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
> index 83ab8e1e4c..283861b924 100644
> --- a/include/system/iommufd.h
> +++ b/include/system/iommufd.h
> @@ -67,4 +67,54 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> Error **errp);
>
> #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
> +OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD, HostIOMMUDeviceIOMMUFDClass,
> + HOST_IOMMU_DEVICE_IOMMUFD)
> +
> +/* Overload of the host IOMMU device for the iommufd backend */
> +struct HostIOMMUDeviceIOMMUFD {
> + HostIOMMUDevice parent_obj;
> +
> + IOMMUFDBackend *iommufd;
> + uint32_t devid;
> + uint32_t hwpt_id;
> +};
> +
> +struct HostIOMMUDeviceIOMMUFDClass {
> + HostIOMMUDeviceClass parent_class;
> +
> + /**
> + * @attach_hwpt: attach host IOMMU device to IOMMUFD hardware page table.
> + * VFIO and VDPA device can have different implementation.
> + *
> + * Mandatory callback.
> + *
> + * @idev: host IOMMU device backed by IOMMUFD backend.
> + *
> + * @hwpt_id: ID of IOMMUFD hardware page table.
> + *
> + * @errp: pass an Error out when attachment fails.
> + *
> + * Returns: true on success, false on failure.
> + */
> + bool (*attach_hwpt)(HostIOMMUDeviceIOMMUFD *idev, uint32_t hwpt_id,
> + Error **errp);
> + /**
> + * @detach_hwpt: detach host IOMMU device from IOMMUFD hardware page table.
> + * VFIO and VDPA device can have different implementation.
> + *
> + * Mandatory callback.
> + *
> + * @idev: host IOMMU device backed by IOMMUFD backend.
> + *
> + * @errp: pass an Error out when attachment fails.
> + *
> + * Returns: true on success, false on failure.
> + */
> + bool (*detach_hwpt)(HostIOMMUDeviceIOMMUFD *idev, Error **errp);
> +};
> +
> +bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> + uint32_t hwpt_id, Error **errp);
> +bool host_iommu_device_iommufd_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> + Error **errp);
> #endif
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index 8bcdb60fe7..c2c47abf7e 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -347,6 +347,26 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> return !ret;
> }
>
> +bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> + uint32_t hwpt_id, Error **errp)
> +{
> + HostIOMMUDeviceIOMMUFDClass *idevc =
> + HOST_IOMMU_DEVICE_IOMMUFD_GET_CLASS(idev);
> +
> + g_assert(idevc->attach_hwpt);
> + return idevc->attach_hwpt(idev, hwpt_id, errp);
> +}
> +
> +bool host_iommu_device_iommufd_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> + Error **errp)
> +{
> + HostIOMMUDeviceIOMMUFDClass *idevc =
> + HOST_IOMMU_DEVICE_IOMMUFD_GET_CLASS(idev);
> +
> + g_assert(idevc->detach_hwpt);
> + return idevc->detach_hwpt(idev, errp);
> +}
> +
> static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
> {
> HostIOMMUDeviceCaps *caps = &hiod->caps;
> @@ -385,6 +405,8 @@ static const TypeInfo types[] = {
> }, {
> .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
> .parent = TYPE_HOST_IOMMU_DEVICE,
> + .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
> + .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
> .class_init = hiod_iommufd_class_init,
> .abstract = true,
> }
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index af1c7ab10a..5fde2b633a 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -814,6 +814,7 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> Error **errp)
> {
> VFIODevice *vdev = opaque;
> + HostIOMMUDeviceIOMMUFD *idev;
> HostIOMMUDeviceCaps *caps = &hiod->caps;
> enum iommu_hw_info_type type;
> union {
> @@ -833,6 +834,11 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> caps->type = type;
> caps->hw_caps = hw_caps;
>
> + idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
> + idev->iommufd = vdev->iommufd;
> + idev->devid = vdev->devid;
> + idev->hwpt_id = vdev->hwpt->hwpt_id;
> +
> return true;
> }
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
2025-05-30 9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
2025-05-30 9:35 ` [PATCH v2 1/4] backends/iommufd: Add a helper to invalidate user-managed HWPT Zhenzhong Duan
2025-05-30 9:35 ` [PATCH v2 2/4] vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD Zhenzhong Duan
@ 2025-05-30 9:35 ` Zhenzhong Duan
2025-05-30 20:31 ` Nicolin Chen via
2025-06-03 17:38 ` Eric Auger
2025-05-30 9:35 ` [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info Zhenzhong Duan
2025-06-01 18:04 ` [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Cédric Le Goater
4 siblings, 2 replies; 20+ messages in thread
From: Zhenzhong Duan @ 2025-05-30 9:35 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, jasowang, peterx, ddutile,
jgg, nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng,
Zhenzhong Duan
Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
utilizes them to attach to or detach from hwpt on host side.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/iommufd.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 5fde2b633a..d661737c17 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -810,6 +810,24 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, const void *data)
vioc->query_dirty_bitmap = iommufd_query_dirty_bitmap;
};
+static bool
+host_iommu_device_iommufd_vfio_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+ uint32_t hwpt_id, Error **errp)
+{
+ VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
+
+ return !iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt_id, errp);
+}
+
+static bool
+host_iommu_device_iommufd_vfio_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
+ Error **errp)
+{
+ VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
+
+ return iommufd_cdev_detach_ioas_hwpt(vbasedev, errp);
+}
+
static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
Error **errp)
{
@@ -864,10 +882,14 @@ hiod_iommufd_vfio_get_page_size_mask(HostIOMMUDevice *hiod)
static void hiod_iommufd_vfio_class_init(ObjectClass *oc, const void *data)
{
HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
+ HostIOMMUDeviceIOMMUFDClass *idevc = HOST_IOMMU_DEVICE_IOMMUFD_CLASS(oc);
hiodc->realize = hiod_iommufd_vfio_realize;
hiodc->get_iova_ranges = hiod_iommufd_vfio_get_iova_ranges;
hiodc->get_page_size_mask = hiod_iommufd_vfio_get_page_size_mask;
+
+ idevc->attach_hwpt = host_iommu_device_iommufd_vfio_attach_hwpt;
+ idevc->detach_hwpt = host_iommu_device_iommufd_vfio_detach_hwpt;
};
static const TypeInfo types[] = {
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
2025-05-30 9:35 ` [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers Zhenzhong Duan
@ 2025-05-30 20:31 ` Nicolin Chen via
2025-06-03 3:03 ` Duan, Zhenzhong
2025-06-03 17:38 ` Eric Auger
1 sibling, 1 reply; 20+ messages in thread
From: Nicolin Chen via @ 2025-05-30 20:31 UTC (permalink / raw)
To: Zhenzhong Duan
Cc: qemu-devel, alex.williamson, clg, eric.auger, mst, jasowang,
peterx, ddutile, jgg, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng
On Fri, May 30, 2025 at 05:35:11PM +0800, Zhenzhong Duan wrote:
> Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
> utilizes them to attach to or detach from hwpt on host side.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> +static bool
> +host_iommu_device_iommufd_vfio_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> + uint32_t hwpt_id, Error **errp)
> +{
> + VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
> +
> + return !iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt_id, errp);
> +}
> +
> +static bool
> +host_iommu_device_iommufd_vfio_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> + Error **errp)
> +{
> + VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
> +
> + return iommufd_cdev_detach_ioas_hwpt(vbasedev, errp);
> +}
Could be a separate patch though:
So, we have the attach API returning "int" while the detach API
returning "bool". Is errno returned back to the attach caller(s)
so the attach API can be a "bool" type too?
Nicolin
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
2025-05-30 20:31 ` Nicolin Chen via
@ 2025-06-03 3:03 ` Duan, Zhenzhong
0 siblings, 0 replies; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-03 3:03 UTC (permalink / raw)
To: Nicolin Chen
Cc: qemu-devel@nongnu.org, alex.williamson@redhat.com, clg@redhat.com,
eric.auger@redhat.com, mst@redhat.com, jasowang@redhat.com,
peterx@redhat.com, ddutile@redhat.com, jgg@nvidia.com,
shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
>-----Original Message-----
>From: Nicolin Chen <nicolinc@nvidia.com>
>Subject: Re: [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
>
>On Fri, May 30, 2025 at 05:35:11PM +0800, Zhenzhong Duan wrote:
>> Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
>> utilizes them to attach to or detach from hwpt on host side.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> Reviewed-by: Cédric Le Goater <clg@redhat.com>
>
>Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
>
>> +static bool
>>
>+host_iommu_device_iommufd_vfio_attach_hwpt(HostIOMMUDeviceIOMMUFD
>*idev,
>> + uint32_t hwpt_id, Error **errp)
>> +{
>> + VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
>> +
>> + return !iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt_id, errp);
>> +}
>> +
>> +static bool
>>
>+host_iommu_device_iommufd_vfio_detach_hwpt(HostIOMMUDeviceIOMMUF
>D *idev,
>> + Error **errp)
>> +{
>> + VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
>> +
>> + return iommufd_cdev_detach_ioas_hwpt(vbasedev, errp);
>> +}
>
>Could be a separate patch though:
>
>So, we have the attach API returning "int" while the detach API
>returning "bool". Is errno returned back to the attach caller(s)
>so the attach API can be a "bool" type too?
Errno is returned through errp. We didn't check errno here,
just treat all errors as incompatible.
Zhenzhong
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers
2025-05-30 9:35 ` [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers Zhenzhong Duan
2025-05-30 20:31 ` Nicolin Chen via
@ 2025-06-03 17:38 ` Eric Auger
1 sibling, 0 replies; 20+ messages in thread
From: Eric Auger @ 2025-06-03 17:38 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, peterx, ddutile, jgg,
nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng
On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
> Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
> utilizes them to attach to or detach from hwpt on host side.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Eric
> ---
> hw/vfio/iommufd.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 5fde2b633a..d661737c17 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -810,6 +810,24 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, const void *data)
> vioc->query_dirty_bitmap = iommufd_query_dirty_bitmap;
> };
>
> +static bool
> +host_iommu_device_iommufd_vfio_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> + uint32_t hwpt_id, Error **errp)
> +{
> + VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
> +
> + return !iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt_id, errp);
> +}
> +
> +static bool
> +host_iommu_device_iommufd_vfio_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> + Error **errp)
> +{
> + VFIODevice *vbasedev = HOST_IOMMU_DEVICE(idev)->agent;
> +
> + return iommufd_cdev_detach_ioas_hwpt(vbasedev, errp);
> +}
> +
> static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> Error **errp)
> {
> @@ -864,10 +882,14 @@ hiod_iommufd_vfio_get_page_size_mask(HostIOMMUDevice *hiod)
> static void hiod_iommufd_vfio_class_init(ObjectClass *oc, const void *data)
> {
> HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
> + HostIOMMUDeviceIOMMUFDClass *idevc = HOST_IOMMU_DEVICE_IOMMUFD_CLASS(oc);
>
> hiodc->realize = hiod_iommufd_vfio_realize;
> hiodc->get_iova_ranges = hiod_iommufd_vfio_get_iova_ranges;
> hiodc->get_page_size_mask = hiod_iommufd_vfio_get_page_size_mask;
> +
> + idevc->attach_hwpt = host_iommu_device_iommufd_vfio_attach_hwpt;
> + idevc->detach_hwpt = host_iommu_device_iommufd_vfio_detach_hwpt;
> };
>
> static const TypeInfo types[] = {
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
2025-05-30 9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
` (2 preceding siblings ...)
2025-05-30 9:35 ` [PATCH v2 3/4] vfio/iommufd: Implement [at|de]tach_hwpt handlers Zhenzhong Duan
@ 2025-05-30 9:35 ` Zhenzhong Duan
2025-05-30 21:05 ` Nicolin Chen
2025-06-03 12:40 ` Eric Auger
2025-06-01 18:04 ` [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Cédric Le Goater
4 siblings, 2 replies; 20+ messages in thread
From: Zhenzhong Duan @ 2025-05-30 9:35 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, jasowang, peterx, ddutile,
jgg, nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng,
Zhenzhong Duan
Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
specific. Save them as raw data in a union supporting different vendors,
then vendor IOMMU can query the raw data with its fixed format for
capability directly.
Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
capability related structures with CONFIG_LINUX.
Suggested-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/system/host_iommu_device.h | 11 +++++++++++
hw/vfio/iommufd.c | 8 +++-----
2 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/include/system/host_iommu_device.h b/include/system/host_iommu_device.h
index 809cced4ba..10fccc10be 100644
--- a/include/system/host_iommu_device.h
+++ b/include/system/host_iommu_device.h
@@ -14,6 +14,13 @@
#include "qom/object.h"
#include "qapi/error.h"
+#ifdef CONFIG_LINUX
+#include "linux/iommufd.h"
+
+typedef union VendorCaps {
+ struct iommu_hw_info_vtd vtd;
+ struct iommu_hw_info_arm_smmuv3 smmuv3;
+} VendorCaps;
/**
* struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
@@ -26,7 +33,9 @@
typedef struct HostIOMMUDeviceCaps {
uint32_t type;
uint64_t hw_caps;
+ VendorCaps vendor_caps;
} HostIOMMUDeviceCaps;
+#endif
#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
@@ -38,7 +47,9 @@ struct HostIOMMUDevice {
void *agent; /* pointer to agent device, ie. VFIO or VDPA device */
PCIBus *aliased_bus;
int aliased_devfn;
+#ifdef CONFIG_LINUX
HostIOMMUDeviceCaps caps;
+#endif
};
/**
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index d661737c17..fbf47cab09 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -834,16 +834,14 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
VFIODevice *vdev = opaque;
HostIOMMUDeviceIOMMUFD *idev;
HostIOMMUDeviceCaps *caps = &hiod->caps;
+ VendorCaps *vendor_caps = &caps->vendor_caps;
enum iommu_hw_info_type type;
- union {
- struct iommu_hw_info_vtd vtd;
- } data;
uint64_t hw_caps;
hiod->agent = opaque;
- if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
- &type, &data, sizeof(data),
+ if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid, &type,
+ vendor_caps, sizeof(*vendor_caps),
&hw_caps, errp)) {
return false;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
2025-05-30 9:35 ` [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info Zhenzhong Duan
@ 2025-05-30 21:05 ` Nicolin Chen
2025-06-03 3:50 ` Duan, Zhenzhong
2025-06-03 12:40 ` Eric Auger
1 sibling, 1 reply; 20+ messages in thread
From: Nicolin Chen @ 2025-05-30 21:05 UTC (permalink / raw)
To: Zhenzhong Duan
Cc: qemu-devel, alex.williamson, clg, eric.auger, mst, jasowang,
peterx, ddutile, jgg, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng
On Fri, May 30, 2025 at 05:35:12PM +0800, Zhenzhong Duan wrote:
> Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
> specific. Save them as raw data in a union supporting different vendors,
> then vendor IOMMU can query the raw data with its fixed format for
> capability directly.
>
> Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
> capability related structures with CONFIG_LINUX.
>
> Suggested-by: Eric Auger <eric.auger@redhat.com>
> Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
I like this that we eventually moved all vendor stuff back to the
vendor vIOMMU code.
> +typedef union VendorCaps {
> + struct iommu_hw_info_vtd vtd;
> + struct iommu_hw_info_arm_smmuv3 smmuv3;
> +} VendorCaps;
Nit: can it be just:
typedef union {
...
} VendorCaps;
?
> typedef struct HostIOMMUDeviceCaps {
> uint32_t type;
> uint64_t hw_caps;
> + VendorCaps vendor_caps;
> } HostIOMMUDeviceCaps;
Ditto.
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
2025-05-30 21:05 ` Nicolin Chen
@ 2025-06-03 3:50 ` Duan, Zhenzhong
0 siblings, 0 replies; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-03 3:50 UTC (permalink / raw)
To: Nicolin Chen
Cc: qemu-devel@nongnu.org, alex.williamson@redhat.com, clg@redhat.com,
eric.auger@redhat.com, mst@redhat.com, jasowang@redhat.com,
peterx@redhat.com, ddutile@redhat.com, jgg@nvidia.com,
shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
>-----Original Message-----
>From: Nicolin Chen <nicolinc@nvidia.com>
>Subject: Re: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
>
>On Fri, May 30, 2025 at 05:35:12PM +0800, Zhenzhong Duan wrote:
>> Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
>> specific. Save them as raw data in a union supporting different vendors,
>> then vendor IOMMU can query the raw data with its fixed format for
>> capability directly.
>>
>> Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
>> capability related structures with CONFIG_LINUX.
>>
>> Suggested-by: Eric Auger <eric.auger@redhat.com>
>> Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>
>Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
>
>I like this that we eventually moved all vendor stuff back to the
>vendor vIOMMU code.
>
>> +typedef union VendorCaps {
>> + struct iommu_hw_info_vtd vtd;
>> + struct iommu_hw_info_arm_smmuv3 smmuv3;
>> +} VendorCaps;
>
>Nit: can it be just:
>
>typedef union {
> ...
>} VendorCaps;
>
>?
Any special reason?
I didn't see anonymous union is preferred over named union in docs/devel/style.rst
Thanks
Zhenzhong
>
>> typedef struct HostIOMMUDeviceCaps {
>> uint32_t type;
>> uint64_t hw_caps;
>> + VendorCaps vendor_caps;
>> } HostIOMMUDeviceCaps;
>
>Ditto.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
2025-05-30 9:35 ` [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info Zhenzhong Duan
2025-05-30 21:05 ` Nicolin Chen
@ 2025-06-03 12:40 ` Eric Auger
2025-06-04 3:41 ` Duan, Zhenzhong
1 sibling, 1 reply; 20+ messages in thread
From: Eric Auger @ 2025-06-03 12:40 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, peterx, ddutile, jgg,
nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng
Hi Zhenzhong,
On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
> Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
> specific. Save them as raw data in a union supporting different vendors,
> then vendor IOMMU can query the raw data with its fixed format for
> capability directly.
>
> Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
> capability related structures with CONFIG_LINUX.
>
> Suggested-by: Eric Auger <eric.auger@redhat.com>
> Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/system/host_iommu_device.h | 11 +++++++++++
> hw/vfio/iommufd.c | 8 +++-----
> 2 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/include/system/host_iommu_device.h b/include/system/host_iommu_device.h
> index 809cced4ba..10fccc10be 100644
> --- a/include/system/host_iommu_device.h
> +++ b/include/system/host_iommu_device.h
> @@ -14,6 +14,13 @@
>
> #include "qom/object.h"
> #include "qapi/error.h"
> +#ifdef CONFIG_LINUX
> +#include "linux/iommufd.h"
> +
> +typedef union VendorCaps {
> + struct iommu_hw_info_vtd vtd;
> + struct iommu_hw_info_arm_smmuv3 smmuv3;
> +} VendorCaps;
>
> /**
> * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
> @@ -26,7 +33,9 @@
> typedef struct HostIOMMUDeviceCaps {
> uint32_t type;
> uint64_t hw_caps;
> + VendorCaps vendor_caps;
missing the doc comment update for new field vendor_caps
Otherwise
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Thanks
Eric
> } HostIOMMUDeviceCaps;
> +#endif
>
> #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
> OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
> @@ -38,7 +47,9 @@ struct HostIOMMUDevice {
> void *agent; /* pointer to agent device, ie. VFIO or VDPA device */
> PCIBus *aliased_bus;
> int aliased_devfn;
> +#ifdef CONFIG_LINUX
> HostIOMMUDeviceCaps caps;
> +#endif
> };
>
> /**
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index d661737c17..fbf47cab09 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -834,16 +834,14 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> VFIODevice *vdev = opaque;
> HostIOMMUDeviceIOMMUFD *idev;
> HostIOMMUDeviceCaps *caps = &hiod->caps;
> + VendorCaps *vendor_caps = &caps->vendor_caps;
> enum iommu_hw_info_type type;
> - union {
> - struct iommu_hw_info_vtd vtd;
> - } data;
> uint64_t hw_caps;
>
> hiod->agent = opaque;
>
> - if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
> - &type, &data, sizeof(data),
> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid, &type,
> + vendor_caps, sizeof(*vendor_caps),
> &hw_caps, errp)) {
> return false;
> }
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
2025-06-03 12:40 ` Eric Auger
@ 2025-06-04 3:41 ` Duan, Zhenzhong
0 siblings, 0 replies; 20+ messages in thread
From: Duan, Zhenzhong @ 2025-06-04 3:41 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, peterx@redhat.com, ddutile@redhat.com,
jgg@nvidia.com, nicolinc@nvidia.com,
shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
Hi Eric,
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info
>
>Hi Zhenzhong,
>
>On 5/30/25 11:35 AM, Zhenzhong Duan wrote:
>> Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
>> specific. Save them as raw data in a union supporting different vendors,
>> then vendor IOMMU can query the raw data with its fixed format for
>> capability directly.
>>
>> Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
>> capability related structures with CONFIG_LINUX.
>>
>> Suggested-by: Eric Auger <eric.auger@redhat.com>
>> Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> include/system/host_iommu_device.h | 11 +++++++++++
>> hw/vfio/iommufd.c | 8 +++-----
>> 2 files changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/system/host_iommu_device.h
>b/include/system/host_iommu_device.h
>> index 809cced4ba..10fccc10be 100644
>> --- a/include/system/host_iommu_device.h
>> +++ b/include/system/host_iommu_device.h
>> @@ -14,6 +14,13 @@
>>
>> #include "qom/object.h"
>> #include "qapi/error.h"
>> +#ifdef CONFIG_LINUX
>> +#include "linux/iommufd.h"
>> +
>> +typedef union VendorCaps {
>> + struct iommu_hw_info_vtd vtd;
>> + struct iommu_hw_info_arm_smmuv3 smmuv3;
>> +} VendorCaps;
>>
>> /**
>> * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
>> @@ -26,7 +33,9 @@
>> typedef struct HostIOMMUDeviceCaps {
>> uint32_t type;
>> uint64_t hw_caps;
>> + VendorCaps vendor_caps;
>missing the doc comment update for new field vendor_caps
Good catch, will add. Thanks
Zhenzhong
>
>Otherwise
>
>Reviewed-by: Eric Auger <eric.auger@redhat.com>
>
>Thanks
>
>Eric
>
>
>> } HostIOMMUDeviceCaps;
>> +#endif
>>
>> #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>> OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass,
>HOST_IOMMU_DEVICE)
>> @@ -38,7 +47,9 @@ struct HostIOMMUDevice {
>> void *agent; /* pointer to agent device, ie. VFIO or VDPA device */
>> PCIBus *aliased_bus;
>> int aliased_devfn;
>> +#ifdef CONFIG_LINUX
>> HostIOMMUDeviceCaps caps;
>> +#endif
>> };
>>
>> /**
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index d661737c17..fbf47cab09 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -834,16 +834,14 @@ static bool
>hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>> VFIODevice *vdev = opaque;
>> HostIOMMUDeviceIOMMUFD *idev;
>> HostIOMMUDeviceCaps *caps = &hiod->caps;
>> + VendorCaps *vendor_caps = &caps->vendor_caps;
>> enum iommu_hw_info_type type;
>> - union {
>> - struct iommu_hw_info_vtd vtd;
>> - } data;
>> uint64_t hw_caps;
>>
>> hiod->agent = opaque;
>>
>> - if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
>> - &type, &data, sizeof(data),
>> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid, &type,
>> + vendor_caps, sizeof(*vendor_caps),
>> &hw_caps, errp)) {
>> return false;
>> }
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support
2025-05-30 9:35 [PATCH v2 0/4] VFIO and IOMMU prerequisite stuff for IOMMU nesting support Zhenzhong Duan
` (3 preceding siblings ...)
2025-05-30 9:35 ` [PATCH v2 4/4] vfio/iommufd: Save vendor specific device info Zhenzhong Duan
@ 2025-06-01 18:04 ` Cédric Le Goater
4 siblings, 0 replies; 20+ messages in thread
From: Cédric Le Goater @ 2025-06-01 18:04 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, mst, jasowang, peterx, ddutile, jgg,
nicolinc, shameerali.kolothum.thodi, joao.m.martins,
clement.mathieu--drif, kevin.tian, yi.l.liu, chao.p.peng
On 5/30/25 11:35, Zhenzhong Duan wrote:
> Hi,
>
> The first 6 patches of [1] are all VFIO or IOMMUFD related additions.
> Split them out per Cédric and seek for quick acceptance.
>
> I didn't copy changelog from [1] as it's a mix of the whole nesting series.
>
> For who want a quick view of the whole nesting series [2].
>
> Test done:
> - VFIO devices hotplug/unplug
> - build test on Windows
>
> [1] https://lists.gnu.org/archive/html/qemu-devel/2025-05/msg05002.html
> [2] https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting.v1.wip
>
> Thanks
> Zhenzhong
>
> Changelog:
> v2:
> - report kernel BUG as error instead of assert (Cédric)
> - merge patch2 and patch3 (Cédric)
> - handle vendor cap check directly from vtd_check_hiod, so patch6 removed (Cédric)
> - s/data_ptr/data (Cédric)
> - s/totally/total (Donald)
Applied to vfio-next.
Thanks,
C.
>
> v1:
> - changed to save raw data in VendorCaps, so we can keep all vendor structure
> decoding inside the backend and VFIO wouldn't need to care about types nor
> what's inside the data.
>
>
> Zhenzhong Duan (4):
> backends/iommufd: Add a helper to invalidate user-managed HWPT
> vfio/iommufd: Add properties and handlers to
> TYPE_HOST_IOMMU_DEVICE_IOMMUFD
> vfio/iommufd: Implement [at|de]tach_hwpt handlers
> vfio/iommufd: Save vendor specific device info
>
> include/system/host_iommu_device.h | 11 ++++++
> include/system/iommufd.h | 54 ++++++++++++++++++++++++++++
> backends/iommufd.c | 58 ++++++++++++++++++++++++++++++
> hw/vfio/iommufd.c | 36 ++++++++++++++++---
> backends/trace-events | 1 +
> 5 files changed, 155 insertions(+), 5 deletions(-)
>
^ permalink raw reply [flat|nested] 20+ messages in thread