* [PATCH v11 01/18] iommu: Require passing new handles to APIs supporting handle
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 02/18] iommu: Introduce a replace API for device pasid Yi Liu
` (18 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
Add kdoc to highlight that callers of iommu_[attach|replace]_group_handle()
and iommu_attach_device_pasid() should always provide a new handle. This
avoids races with lockless references to the handle, e.g. in
find_fault_handler() and iommu_report_device_fault() in the PRI path.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/iommu.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0f4cc15ded1c..9f1db10645ee 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3365,6 +3365,9 @@ static void __iommu_remove_group_pasid(struct iommu_group *group,
* @pasid: the pasid of the device.
* @handle: the attach handle.
*
+ * Caller should always provide a new handle to avoid race with the paths
+ * that have lockless reference to handle if it intends to pass a valid handle.
+ *
* Return: 0 on success, or an error.
*/
int iommu_attach_device_pasid(struct iommu_domain *domain,
@@ -3525,6 +3528,9 @@ EXPORT_SYMBOL_NS_GPL(iommu_attach_handle_get, "IOMMUFD_INTERNAL");
* This is a variant of iommu_attach_group(). It allows the caller to provide
* an attach handle and use it when the domain is attached. This is currently
* used by IOMMUFD to deliver the I/O page faults.
+ *
+ * Caller should always provide a new handle to avoid race with the paths
+ * that have lockless reference to handle.
*/
int iommu_attach_group_handle(struct iommu_domain *domain,
struct iommu_group *group,
@@ -3594,6 +3600,9 @@ EXPORT_SYMBOL_NS_GPL(iommu_detach_group_handle, "IOMMUFD_INTERNAL");
*
* If the currently attached domain is a core domain (e.g. a default_domain),
* it will act just like the iommu_attach_group_handle().
+ *
+ * Caller should always provide a new handle to avoid race with the paths
+ * that have lockless reference to handle.
*/
int iommu_replace_group_handle(struct iommu_group *group,
struct iommu_domain *new_domain,
--
2.34.1
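The caller-side rule documented by this patch can be sketched in user space.
This is only a toy model under stated assumptions: struct toy_domain,
struct toy_handle, struct toy_slot and toy_replace() are hypothetical names,
not kernel APIs, and no real locking or lockless reader is modelled. It shows
a fresh handle being allocated for every replace, so a reader that still holds
the old pointer never sees that handle's contents change.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct toy_domain { int id; };
struct toy_handle { struct toy_domain *domain; };

/* One attachment slot, standing in for a group or pasid entry. */
struct toy_slot { struct toy_handle *handle; };

/*
 * Point @slot at @domain through a freshly allocated handle.
 *
 * The new handle is fully initialised before it is published, and the
 * previous handle is never written to. It is handed back through @old so
 * the caller can free it once no reader can still be using it.
 */
static int toy_replace(struct toy_slot *slot, struct toy_domain *domain,
		       struct toy_handle **old)
{
	struct toy_handle *fresh = calloc(1, sizeof(*fresh));

	if (!fresh)
		return -ENOMEM;
	fresh->domain = domain;

	*old = slot->handle;
	/* Reusing the installed handle is exactly what the kdoc forbids. */
	assert(*old != fresh);
	slot->handle = fresh;
	return 0;
}
```

A caller would invoke toy_replace() once per attach or replace and free the
returned old handle afterwards, which mirrors how iommufd_hwpt_replace_device()
allocates its handle with kzalloc() before calling into the core.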
* [PATCH v11 02/18] iommu: Introduce a replace API for device pasid
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
2025-03-21 17:19 ` [PATCH v11 01/18] iommu: Require passing new handles to APIs supporting handle Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 03/18] iommufd: Pass @pasid through the device attach/replace path Yi Liu
` (17 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
Provide a high-level API that replaces one domain with another for a
specific pasid of a device. This is similar to
iommu_replace_group_handle() and is expected to be used only by IOMMUFD.
Co-developed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
v10 -> v11: Drop redundant handle check as handle is always valid in
iommu_replace_device_pasid(). No inline helper
---
drivers/iommu/iommu-priv.h | 3 +
drivers/iommu/iommu.c | 115 +++++++++++++++++++++++++++++++++++--
2 files changed, 114 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h
index c74fff25be78..b8528d5dd1e7 100644
--- a/drivers/iommu/iommu-priv.h
+++ b/drivers/iommu/iommu-priv.h
@@ -56,4 +56,7 @@ static inline int iommufd_sw_msi(struct iommu_domain *domain,
}
#endif /* CONFIG_IOMMUFD_DRIVER_CORE && CONFIG_IRQ_MSI_IOMMU */
+int iommu_replace_device_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid,
+ struct iommu_attach_handle *handle);
#endif /* __LINUX_IOMMU_PRIV_H */
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 9f1db10645ee..c8c891f27e45 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -514,6 +514,13 @@ static void iommu_deinit_device(struct device *dev)
dev_iommu_free(dev);
}
+static struct iommu_domain *pasid_array_entry_to_domain(void *entry)
+{
+ if (xa_pointer_tag(entry) == IOMMU_PASID_ARRAY_DOMAIN)
+ return xa_untag_pointer(entry);
+ return ((struct iommu_attach_handle *)xa_untag_pointer(entry))->domain;
+}
+
DEFINE_MUTEX(iommu_probe_device_lock);
static int __iommu_probe_device(struct device *dev, struct list_head *group_list)
@@ -3324,14 +3331,15 @@ static void iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
}
static int __iommu_set_group_pasid(struct iommu_domain *domain,
- struct iommu_group *group, ioasid_t pasid)
+ struct iommu_group *group, ioasid_t pasid,
+ struct iommu_domain *old)
{
struct group_device *device, *last_gdev;
int ret;
for_each_group_device(group, device) {
ret = domain->ops->set_dev_pasid(domain, device->dev,
- pasid, NULL);
+ pasid, old);
if (ret)
goto err_revert;
}
@@ -3343,7 +3351,15 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
for_each_group_device(group, device) {
if (device == last_gdev)
break;
- iommu_remove_dev_pasid(device->dev, pasid, domain);
+ /*
+ * If no old domain, undo the succeeded devices/pasid.
+ * Otherwise, rollback the succeeded devices/pasid to the old
+ * domain. And it is a driver bug to fail attaching with a
+ * previously good domain.
+ */
+ if (!old || WARN_ON(old->ops->set_dev_pasid(old, device->dev,
+ pasid, domain)))
+ iommu_remove_dev_pasid(device->dev, pasid, domain);
}
return ret;
}
@@ -3412,7 +3428,7 @@ int iommu_attach_device_pasid(struct iommu_domain *domain,
if (ret)
goto out_unlock;
- ret = __iommu_set_group_pasid(domain, group, pasid);
+ ret = __iommu_set_group_pasid(domain, group, pasid, NULL);
if (ret) {
xa_release(&group->pasid_array, pasid);
goto out_unlock;
@@ -3433,6 +3449,97 @@ int iommu_attach_device_pasid(struct iommu_domain *domain,
}
EXPORT_SYMBOL_GPL(iommu_attach_device_pasid);
+/**
+ * iommu_replace_device_pasid - Replace the domain that a specific pasid
+ * of the device is attached to
+ * @domain: the new iommu domain
+ * @dev: the attached device.
+ * @pasid: the pasid of the device.
+ * @handle: the attach handle.
+ *
+ * This API allows the pasid to switch domains. The @pasid should have been
+ * attached. Otherwise, this fails. The pasid will keep the old configuration
+ * if replacement failed.
+ *
+ * Caller should always provide a new handle to avoid race with the paths
+ * that have lockless reference to handle if it intends to pass a valid handle.
+ *
+ * Return 0 on success, or an error.
+ */
+int iommu_replace_device_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid,
+ struct iommu_attach_handle *handle)
+{
+ /* Caller must be a probed driver on dev */
+ struct iommu_group *group = dev->iommu_group;
+ struct iommu_attach_handle *entry;
+ struct iommu_domain *curr_domain;
+ void *curr;
+ int ret;
+
+ if (!group)
+ return -ENODEV;
+
+ if (!domain->ops->set_dev_pasid)
+ return -EOPNOTSUPP;
+
+ if (dev_iommu_ops(dev) != domain->owner ||
+ pasid == IOMMU_NO_PASID || !handle)
+ return -EINVAL;
+
+ mutex_lock(&group->mutex);
+ entry = iommu_make_pasid_array_entry(domain, handle);
+ curr = xa_cmpxchg(&group->pasid_array, pasid, NULL,
+ XA_ZERO_ENTRY, GFP_KERNEL);
+ if (xa_is_err(curr)) {
+ ret = xa_err(curr);
+ goto out_unlock;
+ }
+
+ /*
+ * No domain (with or without handle) attached, hence not
+ * a replace case.
+ */
+ if (!curr) {
+ xa_release(&group->pasid_array, pasid);
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ /*
+ * Reusing handle is problematic as there are paths that refers
+ * the handle without lock. To avoid race, reject the callers that
+ * attempt it.
+ */
+ if (curr == entry) {
+ WARN_ON(1);
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ curr_domain = pasid_array_entry_to_domain(curr);
+ ret = 0;
+
+ if (curr_domain != domain) {
+ ret = __iommu_set_group_pasid(domain, group,
+ pasid, curr_domain);
+ if (ret)
+ goto out_unlock;
+ }
+
+ /*
+ * The above xa_cmpxchg() reserved the memory, and the
+ * group->mutex is held, this cannot fail.
+ */
+ WARN_ON(xa_is_err(xa_store(&group->pasid_array,
+ pasid, entry, GFP_KERNEL)));
+
+out_unlock:
+ mutex_unlock(&group->mutex);
+ return ret;
+}
+EXPORT_SYMBOL_NS_GPL(iommu_replace_device_pasid, "IOMMUFD_INTERNAL");
+
/*
* iommu_detach_device_pasid() - Detach the domain from pasid of device
* @domain: the iommu domain.
--
2.34.1
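The replace-with-rollback behaviour of this patch can be illustrated with a
small user-space model. It is a sketch under assumptions, not the kernel
implementation: struct toy_domain, struct toy_group, toy_set_dev() and
toy_replace() are hypothetical names, attach handles and the xarray are left
out, and the revert loop walks backwards where the patch walks forwards. What
it does show is the contract from the kdoc: replace fails if nothing is
attached, and a failed replace leaves every device on the old domain.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NDEV 3

/* Hypothetical domain; fail_on_dev injects a per-device failure. */
struct toy_domain {
	int id;
	int fail_on_dev;	/* device index that rejects this domain, or -1 */
};

struct toy_group {
	const struct toy_domain *dev_domain[TOY_NDEV];	/* per-device state */
	const struct toy_domain *attached;		/* group-wide entry */
};

/* Per-device switch, standing in for ops->set_dev_pasid(). */
static int toy_set_dev(struct toy_group *g, int dev,
		       const struct toy_domain *new_domain)
{
	if (new_domain->fail_on_dev == dev)
		return -EBUSY;
	g->dev_domain[dev] = new_domain;
	return 0;
}

static int toy_replace(struct toy_group *g, const struct toy_domain *new_domain)
{
	const struct toy_domain *old = g->attached;
	int i, ret;

	if (!old)
		return -EINVAL;	/* nothing attached, so not a replace */
	if (old == new_domain)
		return 0;	/* same domain, no per-device work needed */

	for (i = 0; i < TOY_NDEV; i++) {
		ret = toy_set_dev(g, i, new_domain);
		if (ret)
			goto err_revert;
	}
	g->attached = new_domain;
	return 0;

err_revert:
	/* Devices [0, i) were switched already; move them back to @old. */
	while (i--) {
		int rc = toy_set_dev(g, i, old);

		/* Failing to re-attach a previously good domain is a bug. */
		assert(rc == 0);
		(void)rc;
	}
	return ret;
}
```

In the patch itself the "same domain" case still stores the new handle, and a
failed rollback triggers WARN_ON() and falls back to removing the pasid rather
than aborting.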
* [PATCH v11 03/18] iommufd: Pass @pasid through the device attach/replace path
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
2025-03-21 17:19 ` [PATCH v11 01/18] iommu: Require passing new handles to APIs supporting handle Yi Liu
2025-03-21 17:19 ` [PATCH v11 02/18] iommu: Introduce a replace API for device pasid Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 04/18] iommufd/device: Only add reserved_iova in non-pasid path Yi Liu
` (16 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
Most of the core logic that runs before the actual device attach/replace
operation can be shared with pasid attach/replace. So pass @pasid through
the device attach/replace helpers in preparation for adding pasid
attach/replace.
So far @pasid should only be IOMMU_NO_PASID. No functional change.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/iommufd/device.c | 70 +++++++++++++++----------
drivers/iommu/iommufd/hw_pagetable.c | 13 ++---
drivers/iommu/iommufd/iommufd_private.h | 8 +--
3 files changed, 52 insertions(+), 39 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index d18ea9a61522..7051feda2fab 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -368,7 +368,8 @@ static bool iommufd_device_is_attached(struct iommufd_device *idev)
}
static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
- struct iommufd_device *idev)
+ struct iommufd_device *idev,
+ ioasid_t pasid)
{
struct iommufd_attach_handle *handle;
int rc;
@@ -386,6 +387,7 @@ static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
}
handle->idev = idev;
+ WARN_ON(pasid != IOMMU_NO_PASID);
rc = iommu_attach_group_handle(hwpt->domain, idev->igroup->group,
&handle->handle);
if (rc)
@@ -402,25 +404,28 @@ static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
}
static struct iommufd_attach_handle *
-iommufd_device_get_attach_handle(struct iommufd_device *idev)
+iommufd_device_get_attach_handle(struct iommufd_device *idev, ioasid_t pasid)
{
struct iommu_attach_handle *handle;
lockdep_assert_held(&idev->igroup->lock);
handle =
- iommu_attach_handle_get(idev->igroup->group, IOMMU_NO_PASID, 0);
+ iommu_attach_handle_get(idev->igroup->group, pasid, 0);
if (IS_ERR(handle))
return NULL;
return to_iommufd_handle(handle);
}
static void iommufd_hwpt_detach_device(struct iommufd_hw_pagetable *hwpt,
- struct iommufd_device *idev)
+ struct iommufd_device *idev,
+ ioasid_t pasid)
{
struct iommufd_attach_handle *handle;
- handle = iommufd_device_get_attach_handle(idev);
+ WARN_ON(pasid != IOMMU_NO_PASID);
+
+ handle = iommufd_device_get_attach_handle(idev, pasid);
iommu_detach_group_handle(hwpt->domain, idev->igroup->group);
if (hwpt->fault) {
iommufd_auto_response_faults(hwpt, handle);
@@ -430,13 +435,17 @@ static void iommufd_hwpt_detach_device(struct iommufd_hw_pagetable *hwpt,
}
static int iommufd_hwpt_replace_device(struct iommufd_device *idev,
+ ioasid_t pasid,
struct iommufd_hw_pagetable *hwpt,
struct iommufd_hw_pagetable *old)
{
- struct iommufd_attach_handle *handle, *old_handle =
- iommufd_device_get_attach_handle(idev);
+ struct iommufd_attach_handle *handle, *old_handle;
int rc;
+ WARN_ON(pasid != IOMMU_NO_PASID);
+
+ old_handle = iommufd_device_get_attach_handle(idev, pasid);
+
handle = kzalloc(sizeof(*handle), GFP_KERNEL);
if (!handle)
return -ENOMEM;
@@ -471,7 +480,7 @@ static int iommufd_hwpt_replace_device(struct iommufd_device *idev,
}
int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
- struct iommufd_device *idev)
+ struct iommufd_device *idev, ioasid_t pasid)
{
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
int rc;
@@ -497,7 +506,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
* attachment.
*/
if (list_empty(&idev->igroup->device_list)) {
- rc = iommufd_hwpt_attach_device(hwpt, idev);
+ rc = iommufd_hwpt_attach_device(hwpt, idev, pasid);
if (rc)
goto err_unresv;
idev->igroup->hwpt = hwpt;
@@ -515,7 +524,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
}
struct iommufd_hw_pagetable *
-iommufd_hw_pagetable_detach(struct iommufd_device *idev)
+iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid)
{
struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt;
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
@@ -523,7 +532,7 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev)
mutex_lock(&idev->igroup->lock);
list_del(&idev->group_item);
if (list_empty(&idev->igroup->device_list)) {
- iommufd_hwpt_detach_device(hwpt, idev);
+ iommufd_hwpt_detach_device(hwpt, idev, pasid);
idev->igroup->hwpt = NULL;
}
if (hwpt_paging)
@@ -535,12 +544,12 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev)
}
static struct iommufd_hw_pagetable *
-iommufd_device_do_attach(struct iommufd_device *idev,
+iommufd_device_do_attach(struct iommufd_device *idev, ioasid_t pasid,
struct iommufd_hw_pagetable *hwpt)
{
int rc;
- rc = iommufd_hw_pagetable_attach(hwpt, idev);
+ rc = iommufd_hw_pagetable_attach(hwpt, idev, pasid);
if (rc)
return ERR_PTR(rc);
return NULL;
@@ -589,7 +598,7 @@ iommufd_group_do_replace_reserved_iova(struct iommufd_group *igroup,
}
static struct iommufd_hw_pagetable *
-iommufd_device_do_replace(struct iommufd_device *idev,
+iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
struct iommufd_hw_pagetable *hwpt)
{
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
@@ -623,7 +632,7 @@ iommufd_device_do_replace(struct iommufd_device *idev,
goto err_unlock;
}
- rc = iommufd_hwpt_replace_device(idev, hwpt, old_hwpt);
+ rc = iommufd_hwpt_replace_device(idev, pasid, hwpt, old_hwpt);
if (rc)
goto err_unresv;
@@ -656,7 +665,8 @@ iommufd_device_do_replace(struct iommufd_device *idev,
}
typedef struct iommufd_hw_pagetable *(*attach_fn)(
- struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt);
+ struct iommufd_device *idev, ioasid_t pasid,
+ struct iommufd_hw_pagetable *hwpt);
/*
* When automatically managing the domains we search for a compatible domain in
@@ -664,7 +674,7 @@ typedef struct iommufd_hw_pagetable *(*attach_fn)(
* Automatic domain selection will never pick a manually created domain.
*/
static struct iommufd_hw_pagetable *
-iommufd_device_auto_get_domain(struct iommufd_device *idev,
+iommufd_device_auto_get_domain(struct iommufd_device *idev, ioasid_t pasid,
struct iommufd_ioas *ioas, u32 *pt_id,
attach_fn do_attach)
{
@@ -693,7 +703,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev,
hwpt = &hwpt_paging->common;
if (!iommufd_lock_obj(&hwpt->obj))
continue;
- destroy_hwpt = (*do_attach)(idev, hwpt);
+ destroy_hwpt = (*do_attach)(idev, pasid, hwpt);
if (IS_ERR(destroy_hwpt)) {
iommufd_put_object(idev->ictx, &hwpt->obj);
/*
@@ -711,8 +721,8 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev,
goto out_unlock;
}
- hwpt_paging = iommufd_hwpt_paging_alloc(idev->ictx, ioas, idev, 0,
- immediate_attach, NULL);
+ hwpt_paging = iommufd_hwpt_paging_alloc(idev->ictx, ioas, idev, pasid,
+ 0, immediate_attach, NULL);
if (IS_ERR(hwpt_paging)) {
destroy_hwpt = ERR_CAST(hwpt_paging);
goto out_unlock;
@@ -720,7 +730,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev,
hwpt = &hwpt_paging->common;
if (!immediate_attach) {
- destroy_hwpt = (*do_attach)(idev, hwpt);
+ destroy_hwpt = (*do_attach)(idev, pasid, hwpt);
if (IS_ERR(destroy_hwpt))
goto out_abort;
} else {
@@ -741,8 +751,9 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev,
return destroy_hwpt;
}
-static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id,
- attach_fn do_attach)
+static int iommufd_device_change_pt(struct iommufd_device *idev,
+ ioasid_t pasid,
+ u32 *pt_id, attach_fn do_attach)
{
struct iommufd_hw_pagetable *destroy_hwpt;
struct iommufd_object *pt_obj;
@@ -757,7 +768,7 @@ static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id,
struct iommufd_hw_pagetable *hwpt =
container_of(pt_obj, struct iommufd_hw_pagetable, obj);
- destroy_hwpt = (*do_attach)(idev, hwpt);
+ destroy_hwpt = (*do_attach)(idev, pasid, hwpt);
if (IS_ERR(destroy_hwpt))
goto out_put_pt_obj;
break;
@@ -766,8 +777,8 @@ static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id,
struct iommufd_ioas *ioas =
container_of(pt_obj, struct iommufd_ioas, obj);
- destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id,
- do_attach);
+ destroy_hwpt = iommufd_device_auto_get_domain(idev, pasid, ioas,
+ pt_id, do_attach);
if (IS_ERR(destroy_hwpt))
goto out_put_pt_obj;
break;
@@ -804,7 +815,8 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
{
int rc;
- rc = iommufd_device_change_pt(idev, pt_id, &iommufd_device_do_attach);
+ rc = iommufd_device_change_pt(idev, IOMMU_NO_PASID, pt_id,
+ &iommufd_device_do_attach);
if (rc)
return rc;
@@ -834,7 +846,7 @@ EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, "IOMMUFD");
*/
int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id)
{
- return iommufd_device_change_pt(idev, pt_id,
+ return iommufd_device_change_pt(idev, IOMMU_NO_PASID, pt_id,
&iommufd_device_do_replace);
}
EXPORT_SYMBOL_NS_GPL(iommufd_device_replace, "IOMMUFD");
@@ -850,7 +862,7 @@ void iommufd_device_detach(struct iommufd_device *idev)
{
struct iommufd_hw_pagetable *hwpt;
- hwpt = iommufd_hw_pagetable_detach(idev);
+ hwpt = iommufd_hw_pagetable_detach(idev, IOMMU_NO_PASID);
iommufd_hw_pagetable_put(idev->ictx, hwpt);
refcount_dec(&idev->obj.users);
}
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index 8e87ae71e128..bd9dd26a5295 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -90,6 +90,7 @@ iommufd_hwpt_paging_enforce_cc(struct iommufd_hwpt_paging *hwpt_paging)
* @ictx: iommufd context
* @ioas: IOAS to associate the domain with
* @idev: Device to get an iommu_domain for
+ * @pasid: PASID to get an iommu_domain for
* @flags: Flags from userspace
* @immediate_attach: True if idev should be attached to the hwpt
* @user_data: The user provided driver specific data describing the domain to
@@ -105,8 +106,8 @@ iommufd_hwpt_paging_enforce_cc(struct iommufd_hwpt_paging *hwpt_paging)
*/
struct iommufd_hwpt_paging *
iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
- struct iommufd_device *idev, u32 flags,
- bool immediate_attach,
+ struct iommufd_device *idev, ioasid_t pasid,
+ u32 flags, bool immediate_attach,
const struct iommu_user_data *user_data)
{
const u32 valid_flags = IOMMU_HWPT_ALLOC_NEST_PARENT |
@@ -189,7 +190,7 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
* sequence. Once those drivers are fixed this should be removed.
*/
if (immediate_attach) {
- rc = iommufd_hw_pagetable_attach(hwpt, idev);
+ rc = iommufd_hw_pagetable_attach(hwpt, idev, pasid);
if (rc)
goto out_abort;
}
@@ -202,7 +203,7 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
out_detach:
if (immediate_attach)
- iommufd_hw_pagetable_detach(idev);
+ iommufd_hw_pagetable_detach(idev, pasid);
out_abort:
iommufd_object_abort_and_destroy(ictx, &hwpt->obj);
return ERR_PTR(rc);
@@ -364,8 +365,8 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
ioas = container_of(pt_obj, struct iommufd_ioas, obj);
mutex_lock(&ioas->mutex);
hwpt_paging = iommufd_hwpt_paging_alloc(
- ucmd->ictx, ioas, idev, cmd->flags, false,
- user_data.len ? &user_data : NULL);
+ ucmd->ictx, ioas, idev, IOMMU_NO_PASID, cmd->flags,
+ false, user_data.len ? &user_data : NULL);
if (IS_ERR(hwpt_paging)) {
rc = PTR_ERR(hwpt_paging);
goto out_unlock;
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 8c49ca16919a..891800948d1a 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -369,13 +369,13 @@ int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd);
struct iommufd_hwpt_paging *
iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
- struct iommufd_device *idev, u32 flags,
- bool immediate_attach,
+ struct iommufd_device *idev, ioasid_t pasid,
+ u32 flags, bool immediate_attach,
const struct iommu_user_data *user_data);
int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
- struct iommufd_device *idev);
+ struct iommufd_device *idev, ioasid_t pasid);
struct iommufd_hw_pagetable *
-iommufd_hw_pagetable_detach(struct iommufd_device *idev);
+iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid);
void iommufd_hwpt_paging_destroy(struct iommufd_object *obj);
void iommufd_hwpt_paging_abort(struct iommufd_object *obj);
void iommufd_hwpt_nested_destroy(struct iommufd_object *obj);
--
2.34.1
* [PATCH v11 04/18] iommufd/device: Only add reserved_iova in non-pasid path
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (2 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 03/18] iommufd: Pass @pasid through the device attach/replace path Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 05/18] iommufd/device: Replace idev->igroup with local variable Yi Liu
` (15 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
As the pasid is passed through the attach/replace/detach helpers, it is
necessary to ensure only the non-pasid path adds reserved_iova.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/iommufd/device.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 7051feda2fab..4625f084f7d0 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -483,6 +483,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
struct iommufd_device *idev, ioasid_t pasid)
{
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
+ bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID;
int rc;
mutex_lock(&idev->igroup->lock);
@@ -492,7 +493,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
goto err_unlock;
}
- if (hwpt_paging) {
+ if (attach_resv) {
rc = iommufd_device_attach_reserved_iova(idev, hwpt_paging);
if (rc)
goto err_unlock;
@@ -516,7 +517,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
mutex_unlock(&idev->igroup->lock);
return 0;
err_unresv:
- if (hwpt_paging)
+ if (attach_resv)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev);
err_unlock:
mutex_unlock(&idev->igroup->lock);
@@ -535,7 +536,7 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid)
iommufd_hwpt_detach_device(hwpt, idev, pasid);
idev->igroup->hwpt = NULL;
}
- if (hwpt_paging)
+ if (hwpt_paging && pasid == IOMMU_NO_PASID)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev);
mutex_unlock(&idev->igroup->lock);
@@ -602,6 +603,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
struct iommufd_hw_pagetable *hwpt)
{
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
+ bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID;
struct iommufd_hwpt_paging *old_hwpt_paging;
struct iommufd_group *igroup = idev->igroup;
struct iommufd_hw_pagetable *old_hwpt;
@@ -626,7 +628,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
}
old_hwpt = igroup->hwpt;
- if (hwpt_paging) {
+ if (attach_resv) {
rc = iommufd_group_do_replace_reserved_iova(igroup, hwpt_paging);
if (rc)
goto err_unlock;
@@ -637,7 +639,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
goto err_unresv;
old_hwpt_paging = find_hwpt_paging(old_hwpt);
- if (old_hwpt_paging &&
+ if (old_hwpt_paging && pasid == IOMMU_NO_PASID &&
(!hwpt_paging || hwpt_paging->ioas != old_hwpt_paging->ioas))
iommufd_group_remove_reserved_iova(igroup, old_hwpt_paging);
@@ -657,7 +659,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
/* Caller must destroy old_hwpt */
return old_hwpt;
err_unresv:
- if (hwpt_paging)
+ if (attach_resv)
iommufd_group_remove_reserved_iova(igroup, hwpt_paging);
err_unlock:
mutex_unlock(&idev->igroup->lock);
--
2.34.1
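The attach_resv condition introduced by this patch can be modelled in a few
lines. This is a hedged sketch: TOY_NO_PASID stands in for IOMMU_NO_PASID
(assumed here to be 0), and struct toy_hwpt_paging with its resv_refs counter
is a hypothetical replacement for the real reserved-IOVA bookkeeping. The
point is only that attach and detach must apply the same predicate, so the
pasid path never adds or removes reserved regions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NO_PASID 0u	/* assumed value, stands in for IOMMU_NO_PASID */

/* Hypothetical paging hwpt; resv_refs counts reserved-IOVA registrations. */
struct toy_hwpt_paging { int resv_refs; };

/* Same shape as: attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID */
static bool toy_attach_resv(const struct toy_hwpt_paging *paging,
			    unsigned int pasid)
{
	return paging && pasid == TOY_NO_PASID;
}

static void toy_attach(struct toy_hwpt_paging *paging, unsigned int pasid)
{
	if (toy_attach_resv(paging, pasid))
		paging->resv_refs++;
}

static void toy_detach(struct toy_hwpt_paging *paging, unsigned int pasid)
{
	if (toy_attach_resv(paging, pasid)) {
		/* Detach must pair with an earlier non-pasid attach. */
		assert(paging->resv_refs > 0);
		paging->resv_refs--;
	}
}
```

Sharing one predicate between the attach and detach sides keeps the counter
balanced, which is why the patch uses the same pasid == IOMMU_NO_PASID test in
the attach, detach, replace and error-unwind paths.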
* [PATCH v11 05/18] iommufd/device: Replace idev->igroup with local variable
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (3 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 04/18] iommufd/device: Only add reserved_iova in non-pasid path Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 06/18] iommufd/device: Add helper to detect the first attach of a group Yi Liu
` (14 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
As more fields of igroup are used, use a local variable instead of
repeatedly dereferencing idev->igroup.
No functional change expected.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/iommufd/device.c | 43 ++++++++++++++++++----------------
1 file changed, 23 insertions(+), 20 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 4625f084f7d0..15733b316b70 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -333,18 +333,19 @@ static int
iommufd_device_attach_reserved_iova(struct iommufd_device *idev,
struct iommufd_hwpt_paging *hwpt_paging)
{
+ struct iommufd_group *igroup = idev->igroup;
int rc;
- lockdep_assert_held(&idev->igroup->lock);
+ lockdep_assert_held(&igroup->lock);
rc = iopt_table_enforce_dev_resv_regions(&hwpt_paging->ioas->iopt,
idev->dev,
- &idev->igroup->sw_msi_start);
+ &igroup->sw_msi_start);
if (rc)
return rc;
- if (list_empty(&idev->igroup->device_list)) {
- rc = iommufd_group_setup_msi(idev->igroup, hwpt_paging);
+ if (list_empty(&igroup->device_list)) {
+ rc = iommufd_group_setup_msi(igroup, hwpt_paging);
if (rc) {
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt,
idev->dev);
@@ -484,11 +485,12 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
{
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID;
+ struct iommufd_group *igroup = idev->igroup;
int rc;
- mutex_lock(&idev->igroup->lock);
+ mutex_lock(&igroup->lock);
- if (idev->igroup->hwpt != NULL && idev->igroup->hwpt != hwpt) {
+ if (igroup->hwpt && igroup->hwpt != hwpt) {
rc = -EINVAL;
goto err_unlock;
}
@@ -506,39 +508,40 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
* reserved regions are only updated during individual device
* attachment.
*/
- if (list_empty(&idev->igroup->device_list)) {
+ if (list_empty(&igroup->device_list)) {
rc = iommufd_hwpt_attach_device(hwpt, idev, pasid);
if (rc)
goto err_unresv;
- idev->igroup->hwpt = hwpt;
+ igroup->hwpt = hwpt;
}
refcount_inc(&hwpt->obj.users);
- list_add_tail(&idev->group_item, &idev->igroup->device_list);
- mutex_unlock(&idev->igroup->lock);
+ list_add_tail(&idev->group_item, &igroup->device_list);
+ mutex_unlock(&igroup->lock);
return 0;
err_unresv:
if (attach_resv)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev);
err_unlock:
- mutex_unlock(&idev->igroup->lock);
+ mutex_unlock(&igroup->lock);
return rc;
}
struct iommufd_hw_pagetable *
iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid)
{
- struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt;
+ struct iommufd_group *igroup = idev->igroup;
+ struct iommufd_hw_pagetable *hwpt = igroup->hwpt;
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
- mutex_lock(&idev->igroup->lock);
+ mutex_lock(&igroup->lock);
list_del(&idev->group_item);
- if (list_empty(&idev->igroup->device_list)) {
+ if (list_empty(&igroup->device_list)) {
iommufd_hwpt_detach_device(hwpt, idev, pasid);
- idev->igroup->hwpt = NULL;
+ igroup->hwpt = NULL;
}
if (hwpt_paging && pasid == IOMMU_NO_PASID)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev);
- mutex_unlock(&idev->igroup->lock);
+ mutex_unlock(&igroup->lock);
/* Caller must destroy hwpt */
return hwpt;
@@ -610,7 +613,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
unsigned int num_devices;
int rc;
- mutex_lock(&idev->igroup->lock);
+ mutex_lock(&igroup->lock);
if (igroup->hwpt == NULL) {
rc = -EINVAL;
@@ -623,7 +626,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
}
if (hwpt == igroup->hwpt) {
- mutex_unlock(&idev->igroup->lock);
+ mutex_unlock(&igroup->lock);
return NULL;
}
@@ -654,7 +657,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
if (num_devices > 1)
WARN_ON(refcount_sub_and_test(num_devices - 1,
&old_hwpt->obj.users));
- mutex_unlock(&idev->igroup->lock);
+ mutex_unlock(&igroup->lock);
/* Caller must destroy old_hwpt */
return old_hwpt;
@@ -662,7 +665,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
if (attach_resv)
iommufd_group_remove_reserved_iova(igroup, hwpt_paging);
err_unlock:
- mutex_unlock(&idev->igroup->lock);
+ mutex_unlock(&igroup->lock);
return ERR_PTR(rc);
}
--
2.34.1
* [PATCH v11 06/18] iommufd/device: Add helper to detect the first attach of a group
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (4 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 05/18] iommufd/device: Replace idev->igroup with local variable Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 07/18] iommufd/device: Wrap igroup->hwpt and igroup->device_list into attach struct Yi Liu
` (13 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
The existing code detects the first attach by checking
igroup->device_list. However, igroup->hwpt can also be used to detect
the first attach, and future modifications are better served by checking
igroup->hwpt than the device_list. To improve readability and prepare
for further modifications to this code, add a helper for it.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
v10 -> v11: Rename igroup_first_attach() to iommufd_group_first_attach()
and make it a normal helper
---
drivers/iommu/iommufd/device.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 15733b316b70..2cc3c12d301d 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -329,6 +329,13 @@ iommufd_group_setup_msi(struct iommufd_group *igroup,
}
#endif
+static bool
+iommufd_group_first_attach(struct iommufd_group *igroup, ioasid_t pasid)
+{
+ lockdep_assert_held(&igroup->lock);
+ return !igroup->hwpt;
+}
+
static int
iommufd_device_attach_reserved_iova(struct iommufd_device *idev,
struct iommufd_hwpt_paging *hwpt_paging)
@@ -344,7 +351,7 @@ iommufd_device_attach_reserved_iova(struct iommufd_device *idev,
if (rc)
return rc;
- if (list_empty(&igroup->device_list)) {
+ if (iommufd_group_first_attach(igroup, IOMMU_NO_PASID)) {
rc = iommufd_group_setup_msi(igroup, hwpt_paging);
if (rc) {
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt,
@@ -508,7 +515,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
* reserved regions are only updated during individual device
* attachment.
*/
- if (list_empty(&igroup->device_list)) {
+ if (iommufd_group_first_attach(igroup, pasid)) {
rc = iommufd_hwpt_attach_device(hwpt, idev, pasid);
if (rc)
goto err_unresv;
--
2.34.1
* [PATCH v11 07/18] iommufd/device: Wrap igroup->hwpt and igroup->device_list into attach struct
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (5 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 06/18] iommufd/device: Add helper to detect the first attach of a group Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 08/18] iommufd/device: Replace device_list with device_array Yi Liu
` (12 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
igroup->hwpt and igroup->device_list track the hwpt attachment of a group
in the RID path. The coming PASID path needs the same tracking. To prepare
for it, wrap igroup->hwpt and igroup->device_list into an attach struct
that is allocated when the first device of the group is attached and freed
when the last device of the group is detached.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
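Note for readers: the lifetime rule described above (allocate on the first
attach of the group, free on the last detach) can be modelled in user space
roughly as follows. This is only a sketch under stated assumptions: every
model_* name is hypothetical, a plain counter stands in for the device
list, and no locking is shown, whereas the real code holds igroup->lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; these are not the kernel structures. */
struct model_attach {
	const void *hwpt;	/* page table shared by the whole group */
	unsigned int ndevs;	/* stands in for the device list */
};

struct model_group {
	struct model_attach *attach;	/* NULL while nothing is attached */
};

static bool model_first_attach(const struct model_group *g)
{
	return !g->attach;
}

/* Returns 0 on success, -1 on allocation failure or hwpt mismatch. */
static int model_attach_dev(struct model_group *g, const void *hwpt)
{
	struct model_attach *attach = g->attach;

	if (!attach) {
		attach = calloc(1, sizeof(*attach));
		if (!attach)
			return -1;
	}

	/* A fresh attach has hwpt == NULL, so this only hits a live one. */
	if (attach->hwpt && attach->hwpt != hwpt)
		return -1;

	if (model_first_attach(g)) {
		attach->hwpt = hwpt;
		g->attach = attach;	/* publish only once it is usable */
	}
	attach->ndevs++;
	return 0;
}

/* Caller must have attached before; frees the tracking on last detach. */
static void model_detach_dev(struct model_group *g)
{
	struct model_attach *attach = g->attach;

	if (--attach->ndevs == 0) {
		g->attach = NULL;
		free(attach);
	}
}
```

With this model, a second device of the same group can only attach to the
hwpt the group already uses, and "first attach" becomes true again once
every device has detached.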
drivers/iommu/iommufd/device.c | 76 ++++++++++++++++++-------
drivers/iommu/iommufd/iommufd_private.h | 5 +-
2 files changed, 58 insertions(+), 23 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 2cc3c12d301d..6b4764c2d9af 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -17,12 +17,17 @@ MODULE_PARM_DESC(
"Allow IOMMUFD to bind to devices even if the platform cannot isolate "
"the MSI interrupt window. Enabling this is a security weakness.");
+struct iommufd_attach {
+ struct iommufd_hw_pagetable *hwpt;
+ struct list_head device_list;
+};
+
static void iommufd_group_release(struct kref *kref)
{
struct iommufd_group *igroup =
container_of(kref, struct iommufd_group, ref);
- WARN_ON(igroup->hwpt || !list_empty(&igroup->device_list));
+ WARN_ON(igroup->attach);
xa_cmpxchg(&igroup->ictx->groups, iommu_group_id(igroup->group), igroup,
NULL, GFP_KERNEL);
@@ -89,7 +94,6 @@ static struct iommufd_group *iommufd_get_group(struct iommufd_ctx *ictx,
kref_init(&new_igroup->ref);
mutex_init(&new_igroup->lock);
- INIT_LIST_HEAD(&new_igroup->device_list);
new_igroup->sw_msi_start = PHYS_ADDR_MAX;
/* group reference moves into new_igroup */
new_igroup->group = group;
@@ -333,7 +337,7 @@ static bool
iommufd_group_first_attach(struct iommufd_group *igroup, ioasid_t pasid)
{
lockdep_assert_held(&igroup->lock);
- return !igroup->hwpt;
+ return !igroup->attach;
}
static int
@@ -369,7 +373,7 @@ static bool iommufd_device_is_attached(struct iommufd_device *idev)
{
struct iommufd_device *cur;
- list_for_each_entry(cur, &idev->igroup->device_list, group_item)
+ list_for_each_entry(cur, &idev->igroup->attach->device_list, group_item)
if (cur == idev)
return true;
return false;
@@ -493,19 +497,33 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID;
struct iommufd_group *igroup = idev->igroup;
+ struct iommufd_hw_pagetable *old_hwpt;
+ struct iommufd_attach *attach;
int rc;
mutex_lock(&igroup->lock);
- if (igroup->hwpt && igroup->hwpt != hwpt) {
+ attach = igroup->attach;
+ if (!attach) {
+ attach = kzalloc(sizeof(*attach), GFP_KERNEL);
+ if (!attach) {
+ rc = -ENOMEM;
+ goto err_unlock;
+ }
+ INIT_LIST_HEAD(&attach->device_list);
+ }
+
+ old_hwpt = attach->hwpt;
+
+ if (old_hwpt && old_hwpt != hwpt) {
rc = -EINVAL;
- goto err_unlock;
+ goto err_free_attach;
}
if (attach_resv) {
rc = iommufd_device_attach_reserved_iova(idev, hwpt_paging);
if (rc)
- goto err_unlock;
+ goto err_free_attach;
}
/*
@@ -519,15 +537,19 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
rc = iommufd_hwpt_attach_device(hwpt, idev, pasid);
if (rc)
goto err_unresv;
- igroup->hwpt = hwpt;
+ attach->hwpt = hwpt;
+ igroup->attach = attach;
}
refcount_inc(&hwpt->obj.users);
- list_add_tail(&idev->group_item, &igroup->device_list);
+ list_add_tail(&idev->group_item, &attach->device_list);
mutex_unlock(&igroup->lock);
return 0;
err_unresv:
if (attach_resv)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev);
+err_free_attach:
+ if (iommufd_group_first_attach(igroup, pasid))
+ kfree(attach);
err_unlock:
mutex_unlock(&igroup->lock);
return rc;
@@ -537,14 +559,20 @@ struct iommufd_hw_pagetable *
iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid)
{
struct iommufd_group *igroup = idev->igroup;
- struct iommufd_hw_pagetable *hwpt = igroup->hwpt;
- struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
+ struct iommufd_hwpt_paging *hwpt_paging;
+ struct iommufd_hw_pagetable *hwpt;
+ struct iommufd_attach *attach;
mutex_lock(&igroup->lock);
+ attach = igroup->attach;
+ hwpt = attach->hwpt;
+ hwpt_paging = find_hwpt_paging(hwpt);
+
list_del(&idev->group_item);
- if (list_empty(&igroup->device_list)) {
+ if (list_empty(&attach->device_list)) {
iommufd_hwpt_detach_device(hwpt, idev, pasid);
- igroup->hwpt = NULL;
+ igroup->attach = NULL;
+ kfree(attach);
}
if (hwpt_paging && pasid == IOMMU_NO_PASID)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev);
@@ -574,7 +602,7 @@ iommufd_group_remove_reserved_iova(struct iommufd_group *igroup,
lockdep_assert_held(&igroup->lock);
- list_for_each_entry(cur, &igroup->device_list, group_item)
+ list_for_each_entry(cur, &igroup->attach->device_list, group_item)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, cur->dev);
}
@@ -588,9 +616,10 @@ iommufd_group_do_replace_reserved_iova(struct iommufd_group *igroup,
lockdep_assert_held(&igroup->lock);
- old_hwpt_paging = find_hwpt_paging(igroup->hwpt);
+ old_hwpt_paging = find_hwpt_paging(igroup->attach->hwpt);
if (!old_hwpt_paging || hwpt_paging->ioas != old_hwpt_paging->ioas) {
- list_for_each_entry(cur, &igroup->device_list, group_item) {
+ list_for_each_entry(cur,
+ &igroup->attach->device_list, group_item) {
rc = iopt_table_enforce_dev_resv_regions(
&hwpt_paging->ioas->iopt, cur->dev, NULL);
if (rc)
@@ -617,27 +646,32 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
struct iommufd_hwpt_paging *old_hwpt_paging;
struct iommufd_group *igroup = idev->igroup;
struct iommufd_hw_pagetable *old_hwpt;
+ struct iommufd_attach *attach;
unsigned int num_devices;
int rc;
mutex_lock(&igroup->lock);
- if (igroup->hwpt == NULL) {
+ attach = igroup->attach;
+ if (!attach) {
rc = -EINVAL;
goto err_unlock;
}
+ old_hwpt = attach->hwpt;
+
+ WARN_ON(!old_hwpt || list_empty(&attach->device_list));
+
if (!iommufd_device_is_attached(idev)) {
rc = -EINVAL;
goto err_unlock;
}
- if (hwpt == igroup->hwpt) {
+ if (hwpt == old_hwpt) {
mutex_unlock(&igroup->lock);
return NULL;
}
- old_hwpt = igroup->hwpt;
if (attach_resv) {
rc = iommufd_group_do_replace_reserved_iova(igroup, hwpt_paging);
if (rc)
@@ -653,9 +687,9 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
(!hwpt_paging || hwpt_paging->ioas != old_hwpt_paging->ioas))
iommufd_group_remove_reserved_iova(igroup, old_hwpt_paging);
- igroup->hwpt = hwpt;
+ attach->hwpt = hwpt;
- num_devices = list_count_nodes(&igroup->device_list);
+ num_devices = list_count_nodes(&attach->device_list);
/*
* Move the refcounts held by the device_list to the new hwpt. Retain a
* refcount for this thread as the caller will free it.
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 891800948d1a..5b4d8962166b 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -399,13 +399,14 @@ static inline void iommufd_hw_pagetable_put(struct iommufd_ctx *ictx,
refcount_dec(&hwpt->obj.users);
}
+struct iommufd_attach;
+
struct iommufd_group {
struct kref ref;
struct mutex lock;
struct iommufd_ctx *ictx;
struct iommu_group *group;
- struct iommufd_hw_pagetable *hwpt;
- struct list_head device_list;
+ struct iommufd_attach *attach;
struct iommufd_sw_msi_maps required_sw_msi;
phys_addr_t sw_msi_start;
};
--
2.34.1
* [PATCH v11 08/18] iommufd/device: Replace device_list with device_array
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (6 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 07/18] iommufd/device: Wrap igroup->hwpt and igroup->device_list into attach struct Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 09/18] iommufd/device: Add pasid_attach array to track per-PASID attach Yi Liu
` (11 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
igroup->attach->device_list tracks the attached devices of a group in the
RID path. The PASID path needs the same tracking so that it can share code
with the RID path.
However, iommufd_device has only one list_head, so the list cannot work if
a device is attached in both the RID path and the PASID path. To solve
this, replace the device_list with an xarray. Each attached iommufd_device
is stored in the entry indexed by idev->obj.id.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
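Note for readers: the attach path in this patch first reserves the xarray
index with XA_ZERO_ENTRY, stores the real pointer only after the attach has
succeeded, and releases the reservation on the error path. A user-space
sketch of that reserve/commit/release idea is below. It is an illustration
under assumptions, not the xarray implementation: the model_* names, the
fixed-size table and the sentinel pointer are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_SLOTS 8

/* Hypothetical sentinel playing the role of a reserved-but-empty entry. */
static char model_reserved;
#define MODEL_RESERVED ((void *)&model_reserved)

struct model_array {
	void *slot[MODEL_SLOTS];
};

/* Reserve @id; fails with -EBUSY if the slot is reserved or occupied. */
static int model_reserve(struct model_array *a, unsigned int id)
{
	if (id >= MODEL_SLOTS)
		return -EINVAL;
	if (a->slot[id])
		return -EBUSY;
	a->slot[id] = MODEL_RESERVED;
	return 0;
}

/* Commit a real pointer into a slot; out-of-range ids are ignored. */
static void model_store(struct model_array *a, unsigned int id, void *p)
{
	if (id < MODEL_SLOTS)
		a->slot[id] = p;
}

/* Undo a reservation; a committed entry is left alone. */
static void model_release(struct model_array *a, unsigned int id)
{
	if (id < MODEL_SLOTS && a->slot[id] == MODEL_RESERVED)
		a->slot[id] = NULL;
}

/* Reserved slots read back as NULL, just like an absent entry. */
static void *model_load(const struct model_array *a, unsigned int id)
{
	void *p = id < MODEL_SLOTS ? a->slot[id] : NULL;

	return p == MODEL_RESERVED ? NULL : p;
}
```

Reserving up front moves the only step that can fail to the start of the
operation, which is why the later store in the patch is merely wrapped in
a WARN_ON().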
drivers/iommu/iommufd/device.c | 58 +++++++++++++++++++++++-----------
1 file changed, 39 insertions(+), 19 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 6b4764c2d9af..760917f5d764 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -19,7 +19,7 @@ MODULE_PARM_DESC(
struct iommufd_attach {
struct iommufd_hw_pagetable *hwpt;
- struct list_head device_list;
+ struct xarray device_array;
};
static void iommufd_group_release(struct kref *kref)
@@ -297,6 +297,20 @@ u32 iommufd_device_to_id(struct iommufd_device *idev)
}
EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, "IOMMUFD");
+static unsigned int iommufd_group_device_num(struct iommufd_group *igroup)
+{
+ struct iommufd_device *idev;
+ unsigned int count = 0;
+ unsigned long index;
+
+ lockdep_assert_held(&igroup->lock);
+
+ if (igroup->attach)
+ xa_for_each(&igroup->attach->device_array, index, idev)
+ count++;
+ return count;
+}
+
#ifdef CONFIG_IRQ_MSI_IOMMU
static int iommufd_group_setup_msi(struct iommufd_group *igroup,
struct iommufd_hwpt_paging *hwpt_paging)
@@ -371,12 +385,7 @@ iommufd_device_attach_reserved_iova(struct iommufd_device *idev,
/* Check if idev is attached to igroup->hwpt */
static bool iommufd_device_is_attached(struct iommufd_device *idev)
{
- struct iommufd_device *cur;
-
- list_for_each_entry(cur, &idev->igroup->attach->device_list, group_item)
- if (cur == idev)
- return true;
- return false;
+ return xa_load(&idev->igroup->attach->device_array, idev->obj.id);
}
static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
@@ -510,20 +519,27 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
rc = -ENOMEM;
goto err_unlock;
}
- INIT_LIST_HEAD(&attach->device_list);
+ xa_init(&attach->device_array);
}
old_hwpt = attach->hwpt;
+ rc = xa_insert(&attach->device_array, idev->obj.id, XA_ZERO_ENTRY,
+ GFP_KERNEL);
+ if (rc) {
+ WARN_ON(rc == -EBUSY && !old_hwpt);
+ goto err_free_attach;
+ }
+
if (old_hwpt && old_hwpt != hwpt) {
rc = -EINVAL;
- goto err_free_attach;
+ goto err_release_devid;
}
if (attach_resv) {
rc = iommufd_device_attach_reserved_iova(idev, hwpt_paging);
if (rc)
- goto err_free_attach;
+ goto err_release_devid;
}
/*
@@ -541,12 +557,15 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
igroup->attach = attach;
}
refcount_inc(&hwpt->obj.users);
- list_add_tail(&idev->group_item, &attach->device_list);
+ WARN_ON(xa_is_err(xa_store(&attach->device_array, idev->obj.id,
+ idev, GFP_KERNEL)));
mutex_unlock(&igroup->lock);
return 0;
err_unresv:
if (attach_resv)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev);
+err_release_devid:
+ xa_release(&attach->device_array, idev->obj.id);
err_free_attach:
if (iommufd_group_first_attach(igroup, pasid))
kfree(attach);
@@ -568,8 +587,8 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid)
hwpt = attach->hwpt;
hwpt_paging = find_hwpt_paging(hwpt);
- list_del(&idev->group_item);
- if (list_empty(&attach->device_list)) {
+ xa_erase(&attach->device_array, idev->obj.id);
+ if (xa_empty(&attach->device_array)) {
iommufd_hwpt_detach_device(hwpt, idev, pasid);
igroup->attach = NULL;
kfree(attach);
@@ -599,10 +618,11 @@ iommufd_group_remove_reserved_iova(struct iommufd_group *igroup,
struct iommufd_hwpt_paging *hwpt_paging)
{
struct iommufd_device *cur;
+ unsigned long index;
lockdep_assert_held(&igroup->lock);
- list_for_each_entry(cur, &igroup->attach->device_list, group_item)
+ xa_for_each(&igroup->attach->device_array, index, cur)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, cur->dev);
}
@@ -612,14 +632,14 @@ iommufd_group_do_replace_reserved_iova(struct iommufd_group *igroup,
{
struct iommufd_hwpt_paging *old_hwpt_paging;
struct iommufd_device *cur;
+ unsigned long index;
int rc;
lockdep_assert_held(&igroup->lock);
old_hwpt_paging = find_hwpt_paging(igroup->attach->hwpt);
if (!old_hwpt_paging || hwpt_paging->ioas != old_hwpt_paging->ioas) {
- list_for_each_entry(cur,
- &igroup->attach->device_list, group_item) {
+ xa_for_each(&igroup->attach->device_array, index, cur) {
rc = iopt_table_enforce_dev_resv_regions(
&hwpt_paging->ioas->iopt, cur->dev, NULL);
if (rc)
@@ -660,7 +680,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
old_hwpt = attach->hwpt;
- WARN_ON(!old_hwpt || list_empty(&attach->device_list));
+ WARN_ON(!old_hwpt || xa_empty(&attach->device_array));
if (!iommufd_device_is_attached(idev)) {
rc = -EINVAL;
@@ -689,9 +709,9 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
attach->hwpt = hwpt;
- num_devices = list_count_nodes(&attach->device_list);
+ num_devices = iommufd_group_device_num(igroup);
/*
- * Move the refcounts held by the device_list to the new hwpt. Retain a
+ * Move the refcounts held by the device_array to the new hwpt. Retain a
* refcount for this thread as the caller will free it.
*/
refcount_add(num_devices, &hwpt->obj.users);
--
2.34.1
* [PATCH v11 09/18] iommufd/device: Add pasid_attach array to track per-PASID attach
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (7 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 08/18] iommufd/device: Replace device_list with device_array Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 10/18] iommufd: Enforce PASID-compatible domain in PASID path Yi Liu
` (10 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
Each PASID of a PASID-capable device can be attached to a hwpt separately,
so an array indexed by PASID is needed to track per-PASID attachment. The
index IOMMU_NO_PASID is used by the RID path, hence drop igroup->attach.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
v10 -> v11: Use xa_load() in iommufd_device_do_replace()
---
drivers/iommu/iommufd/device.c | 59 +++++++++++++++++--------
drivers/iommu/iommufd/iommufd_private.h | 2 +-
2 files changed, 41 insertions(+), 20 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 760917f5d764..175f3d39baaa 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -27,7 +27,7 @@ static void iommufd_group_release(struct kref *kref)
struct iommufd_group *igroup =
container_of(kref, struct iommufd_group, ref);
- WARN_ON(igroup->attach);
+ WARN_ON(!xa_empty(&igroup->pasid_attach));
xa_cmpxchg(&igroup->ictx->groups, iommu_group_id(igroup->group), igroup,
NULL, GFP_KERNEL);
@@ -94,6 +94,7 @@ static struct iommufd_group *iommufd_get_group(struct iommufd_ctx *ictx,
kref_init(&new_igroup->ref);
mutex_init(&new_igroup->lock);
+ xa_init(&new_igroup->pasid_attach);
new_igroup->sw_msi_start = PHYS_ADDR_MAX;
/* group reference moves into new_igroup */
new_igroup->group = group;
@@ -297,16 +298,19 @@ u32 iommufd_device_to_id(struct iommufd_device *idev)
}
EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, "IOMMUFD");
-static unsigned int iommufd_group_device_num(struct iommufd_group *igroup)
+static unsigned int iommufd_group_device_num(struct iommufd_group *igroup,
+ ioasid_t pasid)
{
+ struct iommufd_attach *attach;
struct iommufd_device *idev;
unsigned int count = 0;
unsigned long index;
lockdep_assert_held(&igroup->lock);
- if (igroup->attach)
- xa_for_each(&igroup->attach->device_array, index, idev)
+ attach = xa_load(&igroup->pasid_attach, pasid);
+ if (attach)
+ xa_for_each(&attach->device_array, index, idev)
count++;
return count;
}
@@ -351,7 +355,7 @@ static bool
iommufd_group_first_attach(struct iommufd_group *igroup, ioasid_t pasid)
{
lockdep_assert_held(&igroup->lock);
- return !igroup->attach;
+ return !xa_load(&igroup->pasid_attach, pasid);
}
static int
@@ -382,10 +386,13 @@ iommufd_device_attach_reserved_iova(struct iommufd_device *idev,
/* The device attach/detach/replace helpers for attach_handle */
-/* Check if idev is attached to igroup->hwpt */
-static bool iommufd_device_is_attached(struct iommufd_device *idev)
+static bool iommufd_device_is_attached(struct iommufd_device *idev,
+ ioasid_t pasid)
{
- return xa_load(&idev->igroup->attach->device_array, idev->obj.id);
+ struct iommufd_attach *attach;
+
+ attach = xa_load(&idev->igroup->pasid_attach, pasid);
+ return xa_load(&attach->device_array, idev->obj.id);
}
static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
@@ -512,12 +519,18 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
mutex_lock(&igroup->lock);
- attach = igroup->attach;
+ attach = xa_cmpxchg(&igroup->pasid_attach, pasid, NULL,
+ XA_ZERO_ENTRY, GFP_KERNEL);
+ if (xa_is_err(attach)) {
+ rc = xa_err(attach);
+ goto err_unlock;
+ }
+
if (!attach) {
attach = kzalloc(sizeof(*attach), GFP_KERNEL);
if (!attach) {
rc = -ENOMEM;
- goto err_unlock;
+ goto err_release_pasid;
}
xa_init(&attach->device_array);
}
@@ -554,7 +567,8 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
if (rc)
goto err_unresv;
attach->hwpt = hwpt;
- igroup->attach = attach;
+ WARN_ON(xa_is_err(xa_store(&igroup->pasid_attach, pasid, attach,
+ GFP_KERNEL)));
}
refcount_inc(&hwpt->obj.users);
WARN_ON(xa_is_err(xa_store(&attach->device_array, idev->obj.id,
@@ -569,6 +583,9 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
err_free_attach:
if (iommufd_group_first_attach(igroup, pasid))
kfree(attach);
+err_release_pasid:
+ if (iommufd_group_first_attach(igroup, pasid))
+ xa_release(&igroup->pasid_attach, pasid);
err_unlock:
mutex_unlock(&igroup->lock);
return rc;
@@ -583,14 +600,14 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid)
struct iommufd_attach *attach;
mutex_lock(&igroup->lock);
- attach = igroup->attach;
+ attach = xa_load(&igroup->pasid_attach, pasid);
hwpt = attach->hwpt;
hwpt_paging = find_hwpt_paging(hwpt);
xa_erase(&attach->device_array, idev->obj.id);
if (xa_empty(&attach->device_array)) {
iommufd_hwpt_detach_device(hwpt, idev, pasid);
- igroup->attach = NULL;
+ xa_erase(&igroup->pasid_attach, pasid);
kfree(attach);
}
if (hwpt_paging && pasid == IOMMU_NO_PASID)
@@ -617,12 +634,14 @@ static void
iommufd_group_remove_reserved_iova(struct iommufd_group *igroup,
struct iommufd_hwpt_paging *hwpt_paging)
{
+ struct iommufd_attach *attach;
struct iommufd_device *cur;
unsigned long index;
lockdep_assert_held(&igroup->lock);
- xa_for_each(&igroup->attach->device_array, index, cur)
+ attach = xa_load(&igroup->pasid_attach, IOMMU_NO_PASID);
+ xa_for_each(&attach->device_array, index, cur)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, cur->dev);
}
@@ -631,15 +650,17 @@ iommufd_group_do_replace_reserved_iova(struct iommufd_group *igroup,
struct iommufd_hwpt_paging *hwpt_paging)
{
struct iommufd_hwpt_paging *old_hwpt_paging;
+ struct iommufd_attach *attach;
struct iommufd_device *cur;
unsigned long index;
int rc;
lockdep_assert_held(&igroup->lock);
- old_hwpt_paging = find_hwpt_paging(igroup->attach->hwpt);
+ attach = xa_load(&igroup->pasid_attach, IOMMU_NO_PASID);
+ old_hwpt_paging = find_hwpt_paging(attach->hwpt);
if (!old_hwpt_paging || hwpt_paging->ioas != old_hwpt_paging->ioas) {
- xa_for_each(&igroup->attach->device_array, index, cur) {
+ xa_for_each(&attach->device_array, index, cur) {
rc = iopt_table_enforce_dev_resv_regions(
&hwpt_paging->ioas->iopt, cur->dev, NULL);
if (rc)
@@ -672,7 +693,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
mutex_lock(&igroup->lock);
- attach = igroup->attach;
+ attach = xa_load(&igroup->pasid_attach, pasid);
if (!attach) {
rc = -EINVAL;
goto err_unlock;
@@ -682,7 +703,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
WARN_ON(!old_hwpt || xa_empty(&attach->device_array));
- if (!iommufd_device_is_attached(idev)) {
+ if (!iommufd_device_is_attached(idev, pasid)) {
rc = -EINVAL;
goto err_unlock;
}
@@ -709,7 +730,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
attach->hwpt = hwpt;
- num_devices = iommufd_group_device_num(igroup);
+ num_devices = iommufd_group_device_num(igroup, pasid);
/*
* Move the refcounts held by the device_array to the new hwpt. Retain a
* refcount for this thread as the caller will free it.
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 5b4d8962166b..85467f53bdb2 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -406,7 +406,7 @@ struct iommufd_group {
struct mutex lock;
struct iommufd_ctx *ictx;
struct iommu_group *group;
- struct iommufd_attach *attach;
+ struct xarray pasid_attach;
struct iommufd_sw_msi_maps required_sw_msi;
phys_addr_t sw_msi_start;
};
--
2.34.1
* [PATCH v11 10/18] iommufd: Enforce PASID-compatible domain in PASID path
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (8 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 09/18] iommufd/device: Add pasid_attach array to track per-PASID attach Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 11/18] iommufd: Support pasid attach/replace Yi Liu
` (9 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
AMD IOMMU requires attaching PASID-compatible domains to PASID-capable
devices. This covers the domains attached to both the RID and the PASIDs.
See the related discussions in [1] and [2]. ARM has the same requirement.
Intel does not need it, but can live with it. Hence, iommufd is going to
enforce this requirement, as it is not harmful to vendors that do not
need it.
Mark the PASID-compatible domains and enforce the check in the PASID path.
[1] https://lore.kernel.org/linux-iommu/20240709182303.GK14050@ziepe.ca/
[2] https://lore.kernel.org/linux-iommu/20240822124433.GD3468552@ziepe.ca/
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
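Note for readers: the rule added here is small enough to model in user
space. The sketch below is illustrative only: the model_* names are
hypothetical, MODEL_ALLOC_PASID uses an arbitrary bit rather than the real
IOMMU_HWPT_ALLOC_PASID value, and MODEL_NO_PASID is assumed to be 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NO_PASID		0u		/* assumed value */
#define MODEL_ALLOC_PASID	(1u << 3)	/* arbitrary, hypothetical bit */

struct model_hwpt {
	bool pasid_compat;
};

/* Record at allocation time whether the domain was requested for PASID. */
static void model_hwpt_init(struct model_hwpt *hwpt, uint32_t flags)
{
	hwpt->pasid_compat = flags & MODEL_ALLOC_PASID;
}

/* Only a PASID attach requires a PASID-compatible page table. */
static int model_pasid_compat(const struct model_hwpt *hwpt, uint32_t pasid)
{
	if (pasid != MODEL_NO_PASID && !hwpt->pasid_compat)
		return -EINVAL;
	return 0;
}
```

In this model a RID attach (MODEL_NO_PASID) passes regardless of the flag,
which matches the scope of this patch: only the PASID path is checked here.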
drivers/iommu/iommufd/device.c | 17 +++++++++++++++++
drivers/iommu/iommufd/hw_pagetable.c | 3 +++
drivers/iommu/iommufd/iommufd_private.h | 1 +
3 files changed, 21 insertions(+)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 175f3d39baaa..ba21b81e43bc 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -395,6 +395,15 @@ static bool iommufd_device_is_attached(struct iommufd_device *idev,
return xa_load(&attach->device_array, idev->obj.id);
}
+static int iommufd_hwpt_pasid_compat(struct iommufd_hw_pagetable *hwpt,
+ struct iommufd_device *idev,
+ ioasid_t pasid)
+{
+ if (pasid != IOMMU_NO_PASID && !hwpt->pasid_compat)
+ return -EINVAL;
+ return 0;
+}
+
static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
struct iommufd_device *idev,
ioasid_t pasid)
@@ -404,6 +413,10 @@ static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
lockdep_assert_held(&idev->igroup->lock);
+ rc = iommufd_hwpt_pasid_compat(hwpt, idev, pasid);
+ if (rc)
+ return rc;
+
handle = kzalloc(sizeof(*handle), GFP_KERNEL);
if (!handle)
return -ENOMEM;
@@ -472,6 +485,10 @@ static int iommufd_hwpt_replace_device(struct iommufd_device *idev,
WARN_ON(pasid != IOMMU_NO_PASID);
+ rc = iommufd_hwpt_pasid_compat(hwpt, idev, pasid);
+ if (rc)
+ return rc;
+
old_handle = iommufd_device_get_attach_handle(idev, pasid);
handle = kzalloc(sizeof(*handle), GFP_KERNEL);
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index bd9dd26a5295..3724533a23c9 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -136,6 +136,7 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
if (IS_ERR(hwpt_paging))
return ERR_CAST(hwpt_paging);
hwpt = &hwpt_paging->common;
+ hwpt->pasid_compat = flags & IOMMU_HWPT_ALLOC_PASID;
INIT_LIST_HEAD(&hwpt_paging->hwpt_item);
/* Pairs with iommufd_hw_pagetable_destroy() */
@@ -244,6 +245,7 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx,
if (IS_ERR(hwpt_nested))
return ERR_CAST(hwpt_nested);
hwpt = &hwpt_nested->common;
+ hwpt->pasid_compat = flags & IOMMU_HWPT_ALLOC_PASID;
refcount_inc(&parent->common.obj.users);
hwpt_nested->parent = parent;
@@ -300,6 +302,7 @@ iommufd_viommu_alloc_hwpt_nested(struct iommufd_viommu *viommu, u32 flags,
if (IS_ERR(hwpt_nested))
return ERR_CAST(hwpt_nested);
hwpt = &hwpt_nested->common;
+ hwpt->pasid_compat = flags & IOMMU_HWPT_ALLOC_PASID;
hwpt_nested->viommu = viommu;
refcount_inc(&viommu->obj.users);
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 85467f53bdb2..80e8c76d25f2 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -299,6 +299,7 @@ struct iommufd_hw_pagetable {
struct iommufd_object obj;
struct iommu_domain *domain;
struct iommufd_fault *fault;
+ bool pasid_compat : 1;
};
struct iommufd_hwpt_paging {
--
2.34.1
* [PATCH v11 11/18] iommufd: Support pasid attach/replace
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (9 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 10/18] iommufd: Enforce PASID-compatible domain in PASID path Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 12/18] iommufd: Enforce PASID-compatible domain for RID Yi Liu
` (8 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
This extends the APIs below to support PASID, so that device drivers can
manage pasid attach/replace/detach.
int iommufd_device_attach(struct iommufd_device *idev,
ioasid_t pasid, u32 *pt_id);
int iommufd_device_replace(struct iommufd_device *idev,
ioasid_t pasid, u32 *pt_id);
void iommufd_device_detach(struct iommufd_device *idev,
ioasid_t pasid);
The pasid operations share the underlying attach/replace/detach
infrastructure with the device operations, but differ in two ways:
- no reserved regions per pasid, otherwise the SVA architecture would
already be broken (a CPU address space does not account for device
reserved regions);
- accordingly, no sw_msi trick.
Cache coherency enforcement still applies to pasid operations, since it
concerns memory accesses after the page table walk (no matter whether the
walk is per RID or per PASID).
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
v10 -> v11: Fix a memleak issue due to the broken order in the
iommufd_hwpt_detach_device()
---
drivers/iommu/iommufd/device.c | 59 ++++++++++++++++++++------------
drivers/iommu/iommufd/selftest.c | 8 ++---
drivers/vfio/iommufd.c | 10 +++---
include/linux/iommufd.h | 9 +++--
4 files changed, 53 insertions(+), 33 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index ba21b81e43bc..4cc6de03f76e 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -428,9 +428,12 @@ static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
}
handle->idev = idev;
- WARN_ON(pasid != IOMMU_NO_PASID);
- rc = iommu_attach_group_handle(hwpt->domain, idev->igroup->group,
- &handle->handle);
+ if (pasid == IOMMU_NO_PASID)
+ rc = iommu_attach_group_handle(hwpt->domain, idev->igroup->group,
+ &handle->handle);
+ else
+ rc = iommu_attach_device_pasid(hwpt->domain, idev->dev, pasid,
+ &handle->handle);
if (rc)
goto out_disable_iopf;
@@ -464,10 +467,12 @@ static void iommufd_hwpt_detach_device(struct iommufd_hw_pagetable *hwpt,
{
struct iommufd_attach_handle *handle;
- WARN_ON(pasid != IOMMU_NO_PASID);
-
handle = iommufd_device_get_attach_handle(idev, pasid);
- iommu_detach_group_handle(hwpt->domain, idev->igroup->group);
+ if (pasid == IOMMU_NO_PASID)
+ iommu_detach_group_handle(hwpt->domain, idev->igroup->group);
+ else
+ iommu_detach_device_pasid(hwpt->domain, idev->dev, pasid);
+
if (hwpt->fault) {
iommufd_auto_response_faults(hwpt, handle);
iommufd_fault_iopf_disable(idev);
@@ -483,8 +488,6 @@ static int iommufd_hwpt_replace_device(struct iommufd_device *idev,
struct iommufd_attach_handle *handle, *old_handle;
int rc;
- WARN_ON(pasid != IOMMU_NO_PASID);
-
rc = iommufd_hwpt_pasid_compat(hwpt, idev, pasid);
if (rc)
return rc;
@@ -502,8 +505,12 @@ static int iommufd_hwpt_replace_device(struct iommufd_device *idev,
}
handle->idev = idev;
- rc = iommu_replace_group_handle(idev->igroup->group, hwpt->domain,
- &handle->handle);
+ if (pasid == IOMMU_NO_PASID)
+ rc = iommu_replace_group_handle(idev->igroup->group,
+ hwpt->domain, &handle->handle);
+ else
+ rc = iommu_replace_device_pasid(hwpt->domain, idev->dev,
+ pasid, &handle->handle);
if (rc)
goto out_disable_iopf;
@@ -904,22 +911,25 @@ static int iommufd_device_change_pt(struct iommufd_device *idev,
}
/**
- * iommufd_device_attach - Connect a device to an iommu_domain
+ * iommufd_device_attach - Connect a device/pasid to an iommu_domain
* @idev: device to attach
+ * @pasid: pasid to attach
* @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HWPT_PAGING
* Output the IOMMUFD_OBJ_HWPT_PAGING ID
*
- * This connects the device to an iommu_domain, either automatically or manually
- * selected. Once this completes the device could do DMA.
+ * This connects the device/pasid to an iommu_domain, either automatically
+ * or manually selected. Once this completes the device could do DMA with
+ * @pasid. @pasid is IOMMU_NO_PASID if this attach is for no pasid usage.
*
* The caller should return the resulting pt_id back to userspace.
* This function is undone by calling iommufd_device_detach().
*/
-int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
+int iommufd_device_attach(struct iommufd_device *idev, ioasid_t pasid,
+ u32 *pt_id)
{
int rc;
- rc = iommufd_device_change_pt(idev, IOMMU_NO_PASID, pt_id,
+ rc = iommufd_device_change_pt(idev, pasid, pt_id,
&iommufd_device_do_attach);
if (rc)
return rc;
@@ -934,8 +944,9 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, "IOMMUFD");
/**
- * iommufd_device_replace - Change the device's iommu_domain
+ * iommufd_device_replace - Change the device/pasid's iommu_domain
* @idev: device to change
+ * @pasid: pasid to change
* @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HWPT_PAGING
* Output the IOMMUFD_OBJ_HWPT_PAGING ID
*
@@ -946,27 +957,31 @@ EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, "IOMMUFD");
*
* If it fails then no change is made to the attachment. The iommu driver may
* implement this so there is no disruption in translation. This can only be
- * called if iommufd_device_attach() has already succeeded.
+ * called if iommufd_device_attach() has already succeeded. @pasid is
+ * IOMMU_NO_PASID for no pasid usage.
*/
-int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id)
+int iommufd_device_replace(struct iommufd_device *idev, ioasid_t pasid,
+ u32 *pt_id)
{
- return iommufd_device_change_pt(idev, IOMMU_NO_PASID, pt_id,
+ return iommufd_device_change_pt(idev, pasid, pt_id,
&iommufd_device_do_replace);
}
EXPORT_SYMBOL_NS_GPL(iommufd_device_replace, "IOMMUFD");
/**
- * iommufd_device_detach - Disconnect a device to an iommu_domain
+ * iommufd_device_detach - Disconnect a device/pasid from an iommu_domain
* @idev: device to detach
+ * @pasid: pasid to detach
*
* Undo iommufd_device_attach(). This disconnects the idev from the previously
* attached pt_id. The device returns back to a blocked DMA translation.
+ * @pasid is IOMMU_NO_PASID for no pasid usage.
*/
-void iommufd_device_detach(struct iommufd_device *idev)
+void iommufd_device_detach(struct iommufd_device *idev, ioasid_t pasid)
{
struct iommufd_hw_pagetable *hwpt;
- hwpt = iommufd_hw_pagetable_detach(idev, IOMMU_NO_PASID);
+ hwpt = iommufd_hw_pagetable_detach(idev, pasid);
iommufd_hw_pagetable_put(idev->ictx, hwpt);
refcount_dec(&idev->obj.users);
}
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index d55dde28e9bc..0b3f5cbf242b 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -945,7 +945,7 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd,
}
sobj->idev.idev = idev;
- rc = iommufd_device_attach(idev, &pt_id);
+ rc = iommufd_device_attach(idev, IOMMU_NO_PASID, &pt_id);
if (rc)
goto out_unbind;
@@ -960,7 +960,7 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd,
return 0;
out_detach:
- iommufd_device_detach(idev);
+ iommufd_device_detach(idev, IOMMU_NO_PASID);
out_unbind:
iommufd_device_unbind(idev);
out_mdev:
@@ -994,7 +994,7 @@ static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd,
goto out_dev_obj;
}
- rc = iommufd_device_replace(sobj->idev.idev, &pt_id);
+ rc = iommufd_device_replace(sobj->idev.idev, IOMMU_NO_PASID, &pt_id);
if (rc)
goto out_dev_obj;
@@ -1655,7 +1655,7 @@ void iommufd_selftest_destroy(struct iommufd_object *obj)
switch (sobj->type) {
case TYPE_IDEV:
- iommufd_device_detach(sobj->idev.idev);
+ iommufd_device_detach(sobj->idev.idev, IOMMU_NO_PASID);
iommufd_device_unbind(sobj->idev.idev);
mock_dev_destroy(sobj->idev.mock_dev);
break;
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index 516294fd901b..37e1efa2c7bf 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -128,7 +128,7 @@ void vfio_iommufd_physical_unbind(struct vfio_device *vdev)
lockdep_assert_held(&vdev->dev_set->lock);
if (vdev->iommufd_attached) {
- iommufd_device_detach(vdev->iommufd_device);
+ iommufd_device_detach(vdev->iommufd_device, IOMMU_NO_PASID);
vdev->iommufd_attached = false;
}
iommufd_device_unbind(vdev->iommufd_device);
@@ -146,9 +146,11 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
return -EINVAL;
if (vdev->iommufd_attached)
- rc = iommufd_device_replace(vdev->iommufd_device, pt_id);
+ rc = iommufd_device_replace(vdev->iommufd_device,
+ IOMMU_NO_PASID, pt_id);
else
- rc = iommufd_device_attach(vdev->iommufd_device, pt_id);
+ rc = iommufd_device_attach(vdev->iommufd_device,
+ IOMMU_NO_PASID, pt_id);
if (rc)
return rc;
vdev->iommufd_attached = true;
@@ -163,7 +165,7 @@ void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev)
if (WARN_ON(!vdev->iommufd_device) || !vdev->iommufd_attached)
return;
- iommufd_device_detach(vdev->iommufd_device);
+ iommufd_device_detach(vdev->iommufd_device, IOMMU_NO_PASID);
vdev->iommufd_attached = false;
}
EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas);
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 60eff9272551..34b6e6ca4bfa 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -8,6 +8,7 @@
#include <linux/err.h>
#include <linux/errno.h>
+#include <linux/iommu.h>
#include <linux/refcount.h>
#include <linux/types.h>
#include <linux/xarray.h>
@@ -54,9 +55,11 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
struct device *dev, u32 *id);
void iommufd_device_unbind(struct iommufd_device *idev);
-int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id);
-int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id);
-void iommufd_device_detach(struct iommufd_device *idev);
+int iommufd_device_attach(struct iommufd_device *idev, ioasid_t pasid,
+ u32 *pt_id);
+int iommufd_device_replace(struct iommufd_device *idev, ioasid_t pasid,
+ u32 *pt_id);
+void iommufd_device_detach(struct iommufd_device *idev, ioasid_t pasid);
struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev);
u32 iommufd_device_to_id(struct iommufd_device *idev);
--
2.34.1
* [PATCH v11 12/18] iommufd: Enforce PASID-compatible domain for RID
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (10 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 11/18] iommufd: Support pasid attach/replace Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 13/18] iommu/vt-d: Add IOMMU_HWPT_ALLOC_PASID support Yi Liu
` (7 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
Per the definition of IOMMU_HWPT_ALLOC_PASID, iommufd needs to enforce
that the RID uses a PASID-compatible domain if a PASID has been attached,
and vice versa. The PASID path already enforces this. This adds the
enforcement in the RID path.
This enforcement requires a lock across the RID and PASID attach paths.
The idev->igroup->lock is used, as both the RID and the PASID paths hold
it.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
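As a reading aid for the rule described above, here is a minimal userspace
sketch of the compatibility check. It is not the kernel code: the model_*
names, the fixed-size table and MODEL_MAX_PASID are assumptions made for
illustration, with slot 0 standing in for the IOMMU_NO_PASID (RID) entry of
the pasid_attach xarray.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NO_PASID  0u /* stands in for IOMMU_NO_PASID */
#define MODEL_MAX_PASID 8u /* tiny table, for illustration only */

/* Hypothetical stand-in for struct iommufd_hw_pagetable */
struct model_hwpt {
	bool pasid_compat;
};

/* Hypothetical stand-in for the per-group attachment table */
struct model_group {
	/* slot 0 is the RID attachment, slots 1.. are PASID attachments */
	const struct model_hwpt *attached[MODEL_MAX_PASID];
};

static bool model_any_pasid_attached(const struct model_group *g)
{
	for (unsigned int i = MODEL_NO_PASID + 1; i < MODEL_MAX_PASID; i++)
		if (g->attached[i])
			return true;
	return false;
}

/* Returns 0 if attaching @hwpt to @pasid is allowed, -EINVAL otherwise */
static int model_pasid_compat(const struct model_group *g,
			      const struct model_hwpt *hwpt,
			      unsigned int pasid)
{
	if (pasid == MODEL_NO_PASID) {
		/* RID path: a plain domain is refused once any PASID is attached */
		if (!hwpt->pasid_compat && model_any_pasid_attached(g))
			return -EINVAL;
	} else {
		const struct model_hwpt *rid = g->attached[MODEL_NO_PASID];

		/* PASID path: the new domain must be PASID-compatible ... */
		if (!hwpt->pasid_compat)
			return -EINVAL;
		/* ... and so must the domain already attached to the RID */
		if (rid && !rid->pasid_compat)
			return -EINVAL;
	}
	return 0;
}
```

In the real code both branches run with idev->igroup->lock held, which is
what keeps the RID and PASID checks from racing with each other.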
drivers/iommu/iommufd/device.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 4cc6de03f76e..1605f6c0e1ee 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -399,8 +399,28 @@ static int iommufd_hwpt_pasid_compat(struct iommufd_hw_pagetable *hwpt,
struct iommufd_device *idev,
ioasid_t pasid)
{
- if (pasid != IOMMU_NO_PASID && !hwpt->pasid_compat)
- return -EINVAL;
+ struct iommufd_group *igroup = idev->igroup;
+
+ lockdep_assert_held(&igroup->lock);
+
+ if (pasid == IOMMU_NO_PASID) {
+ unsigned long start = IOMMU_NO_PASID;
+
+ if (!hwpt->pasid_compat &&
+ xa_find_after(&igroup->pasid_attach,
+ &start, UINT_MAX, XA_PRESENT))
+ return -EINVAL;
+ } else {
+ struct iommufd_attach *attach;
+
+ if (!hwpt->pasid_compat)
+ return -EINVAL;
+
+ attach = xa_load(&igroup->pasid_attach, IOMMU_NO_PASID);
+ if (attach && attach->hwpt && !attach->hwpt->pasid_compat)
+ return -EINVAL;
+ }
+
return 0;
}
@@ -411,8 +431,6 @@ static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
struct iommufd_attach_handle *handle;
int rc;
- lockdep_assert_held(&idev->igroup->lock);
-
rc = iommufd_hwpt_pasid_compat(hwpt, idev, pasid);
if (rc)
return rc;
--
2.34.1
* [PATCH v11 13/18] iommu/vt-d: Add IOMMU_HWPT_ALLOC_PASID support
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (11 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 12/18] iommufd: Enforce PASID-compatible domain for RID Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 14/18] iommufd: Allow allocating PASID-compatible domain Yi Liu
` (6 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
The Intel iommu driver treats IOMMU_HWPT_ALLOC_PASID as a nop, since Intel
VT-d has no special requirements on domains attached to either the PASID or
the RID of a PASID-capable device.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
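The change below only widens the set of accepted allocation flags. A small
standalone sketch of that mask-and-reject pattern follows; the DEMO_* flag
values and demo_check_flags() are hypothetical and do not match the uapi
definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not the uapi values */
#define DEMO_ALLOC_NEST_PARENT    (1u << 0)
#define DEMO_ALLOC_DIRTY_TRACKING (1u << 1)
#define DEMO_ALLOC_PASID          (1u << 2)
#define DEMO_ALLOC_UNKNOWN        (1u << 7)

/* Reject any flag bit that is not in the @allowed mask */
static int demo_check_flags(uint32_t flags, uint32_t allowed)
{
	return (flags & ~allowed) ? -EOPNOTSUPP : 0;
}
```

Adding a flag to the allowed mask, without acting on it afterwards, is what
makes the flag a nop for the driver.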
drivers/iommu/intel/iommu.c | 3 ++-
drivers/iommu/intel/nested.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index cc46098f875b..7bc890609b90 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3338,7 +3338,8 @@ intel_iommu_domain_alloc_paging_flags(struct device *dev, u32 flags,
bool first_stage;
if (flags &
- (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING)))
+ (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
+ IOMMU_HWPT_ALLOC_PASID)))
return ERR_PTR(-EOPNOTSUPP);
if (nested_parent && !nested_supported(iommu))
return ERR_PTR(-EOPNOTSUPP);
diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c
index aba92c00b427..6ac5c534bef4 100644
--- a/drivers/iommu/intel/nested.c
+++ b/drivers/iommu/intel/nested.c
@@ -198,7 +198,7 @@ intel_iommu_domain_alloc_nested(struct device *dev, struct iommu_domain *parent,
struct dmar_domain *domain;
int ret;
- if (!nested_supported(iommu) || flags)
+ if (!nested_supported(iommu) || flags & ~IOMMU_HWPT_ALLOC_PASID)
return ERR_PTR(-EOPNOTSUPP);
/* Must be nested domain */
--
2.34.1
* [PATCH v11 14/18] iommufd: Allow allocating PASID-compatible domain
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (12 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 13/18] iommu/vt-d: Add IOMMU_HWPT_ALLOC_PASID support Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 15/18] iommufd/selftest: Add set_dev_pasid in mock iommu Yi Liu
` (5 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
The underlying infrastructure now supports PASID attach and the related
enforcement required by the IOMMU_HWPT_ALLOC_PASID flag. This extends
iommufd to support PASID-compatible domains requested by userspace.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
v10 -> v11: make auto_hwpt always non-pasid-compat to avoid confusion
between RID and PASID path
---
drivers/iommu/iommufd/hw_pagetable.c | 7 ++++---
include/uapi/linux/iommufd.h | 3 +++
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index 3724533a23c9..487779470261 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -112,7 +112,8 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
{
const u32 valid_flags = IOMMU_HWPT_ALLOC_NEST_PARENT |
IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
- IOMMU_HWPT_FAULT_ID_VALID;
+ IOMMU_HWPT_FAULT_ID_VALID |
+ IOMMU_HWPT_ALLOC_PASID;
const struct iommu_ops *ops = dev_iommu_ops(idev->dev);
struct iommufd_hwpt_paging *hwpt_paging;
struct iommufd_hw_pagetable *hwpt;
@@ -233,7 +234,7 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx,
struct iommufd_hw_pagetable *hwpt;
int rc;
- if ((flags & ~IOMMU_HWPT_FAULT_ID_VALID) ||
+ if ((flags & ~(IOMMU_HWPT_FAULT_ID_VALID | IOMMU_HWPT_ALLOC_PASID)) ||
!user_data->len || !ops->domain_alloc_nested)
return ERR_PTR(-EOPNOTSUPP);
if (parent->auto_domain || !parent->nest_parent ||
@@ -290,7 +291,7 @@ iommufd_viommu_alloc_hwpt_nested(struct iommufd_viommu *viommu, u32 flags,
struct iommufd_hw_pagetable *hwpt;
int rc;
- if (flags & ~IOMMU_HWPT_FAULT_ID_VALID)
+ if (flags & ~(IOMMU_HWPT_FAULT_ID_VALID | IOMMU_HWPT_ALLOC_PASID))
return ERR_PTR(-EOPNOTSUPP);
if (!user_data->len)
return ERR_PTR(-EOPNOTSUPP);
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 8719d4f5d618..6901804ec736 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -393,6 +393,9 @@ struct iommu_vfio_ioas {
* Any domain attached to the non-PASID part of the
* device must also be flagged, otherwise attaching a
* PASID will blocked.
+ * For the user that wants to attach PASID, ioas is
+ * not recommended for both the non-PASID part
+ * and PASID part of the device.
* If IOMMU does not support PASID it will return
* error (-EOPNOTSUPP).
*/
--
2.34.1
* [PATCH v11 15/18] iommufd/selftest: Add set_dev_pasid in mock iommu
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (13 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 14/18] iommufd: Allow allocating PASID-compatible domain Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 16/18] iommufd/selftest: Add a helper to get test device Yi Liu
` (4 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
The callback is needed to make pasid_attach/detach path complete for mock
device. A nop is enough for set_dev_pasid.
A MOCK_FLAGS_DEVICE_PASID is added to indicate a pasid-capable mock device
for the pasid test cases. Other test cases will still create a non-pasid
mock device, while the mock iommu always pretends to be pasid-capable.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
v10 -> v11: Minor tweaks. For details, refer to the discussion in v10 of this patch
---
drivers/iommu/iommufd/iommufd_test.h | 4 +++
drivers/iommu/iommufd/selftest.c | 37 ++++++++++++++++++++++++----
2 files changed, 36 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index 87e9165cea27..1a066feb8697 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -49,6 +49,7 @@ enum {
enum {
MOCK_FLAGS_DEVICE_NO_DIRTY = 1 << 0,
MOCK_FLAGS_DEVICE_HUGE_IOVA = 1 << 1,
+ MOCK_FLAGS_DEVICE_PASID = 1 << 2,
};
enum {
@@ -154,6 +155,9 @@ struct iommu_test_cmd {
};
#define IOMMU_TEST_CMD _IO(IOMMUFD_TYPE, IOMMUFD_CMD_BASE + 32)
+/* Mock device/iommu PASID width */
+#define MOCK_PASID_WIDTH 20
+
/* Mock structs for IOMMU_DEVICE_GET_HW_INFO ioctl */
#define IOMMU_HW_INFO_TYPE_SELFTEST 0xfeedbeef
#define IOMMU_HW_INFO_SELFTEST_REGVAL 0xdeadbeef
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 0b3f5cbf242b..aa3da0adc4e1 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -223,8 +223,16 @@ static int mock_domain_nop_attach(struct iommu_domain *domain,
return 0;
}
+static int mock_domain_set_dev_pasid_nop(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old)
+{
+ return 0;
+}
+
static const struct iommu_domain_ops mock_blocking_ops = {
.attach_dev = mock_domain_nop_attach,
+ .set_dev_pasid = mock_domain_set_dev_pasid_nop
};
static struct iommu_domain mock_blocking_domain = {
@@ -366,7 +374,7 @@ mock_domain_alloc_nested(struct device *dev, struct iommu_domain *parent,
struct mock_iommu_domain_nested *mock_nested;
struct mock_iommu_domain *mock_parent;
- if (flags)
+ if (flags & ~IOMMU_HWPT_ALLOC_PASID)
return ERR_PTR(-EOPNOTSUPP);
if (!parent || parent->ops != mock_ops.default_domain_ops)
return ERR_PTR(-EINVAL);
@@ -388,7 +396,8 @@ mock_domain_alloc_paging_flags(struct device *dev, u32 flags,
{
bool has_dirty_flag = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
- IOMMU_HWPT_ALLOC_NEST_PARENT;
+ IOMMU_HWPT_ALLOC_NEST_PARENT |
+ IOMMU_HWPT_ALLOC_PASID;
struct mock_dev *mdev = to_mock_dev(dev);
bool no_dirty_ops = mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY;
struct mock_iommu_domain *mock;
@@ -608,7 +617,7 @@ mock_viommu_alloc_domain_nested(struct iommufd_viommu *viommu, u32 flags,
struct mock_viommu *mock_viommu = to_mock_viommu(viommu);
struct mock_iommu_domain_nested *mock_nested;
- if (flags)
+ if (flags & ~IOMMU_HWPT_ALLOC_PASID)
return ERR_PTR(-EOPNOTSUPP);
mock_nested = __mock_domain_alloc_nested(user_data);
@@ -743,6 +752,7 @@ static const struct iommu_ops mock_ops = {
.map_pages = mock_domain_map_pages,
.unmap_pages = mock_domain_unmap_pages,
.iova_to_phys = mock_domain_iova_to_phys,
+ .set_dev_pasid = mock_domain_set_dev_pasid_nop,
},
};
@@ -803,6 +813,7 @@ static struct iommu_domain_ops domain_nested_ops = {
.free = mock_domain_free_nested,
.attach_dev = mock_domain_nop_attach,
.cache_invalidate_user = mock_domain_cache_invalidate_user,
+ .set_dev_pasid = mock_domain_set_dev_pasid_nop,
};
static inline struct iommufd_hw_pagetable *
@@ -862,11 +873,17 @@ static void mock_dev_release(struct device *dev)
static struct mock_dev *mock_dev_create(unsigned long dev_flags)
{
+ struct property_entry prop[] = {
+ PROPERTY_ENTRY_U32("pasid-num-bits", 0),
+ {},
+ };
+ const u32 valid_flags = MOCK_FLAGS_DEVICE_NO_DIRTY |
+ MOCK_FLAGS_DEVICE_HUGE_IOVA |
+ MOCK_FLAGS_DEVICE_PASID;
struct mock_dev *mdev;
int rc, i;
- if (dev_flags &
- ~(MOCK_FLAGS_DEVICE_NO_DIRTY | MOCK_FLAGS_DEVICE_HUGE_IOVA))
+ if (dev_flags & ~valid_flags)
return ERR_PTR(-EINVAL);
mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
@@ -890,6 +907,15 @@ static struct mock_dev *mock_dev_create(unsigned long dev_flags)
if (rc)
goto err_put;
+ if (dev_flags & MOCK_FLAGS_DEVICE_PASID)
+ prop[0] = PROPERTY_ENTRY_U32("pasid-num-bits", MOCK_PASID_WIDTH);
+
+ rc = device_create_managed_software_node(&mdev->dev, prop, NULL);
+ if (rc) {
+ dev_err(&mdev->dev, "add pasid-num-bits property failed, rc: %d", rc);
+ goto err_put;
+ }
+
rc = device_add(&mdev->dev);
if (rc)
goto err_put;
@@ -1778,6 +1804,7 @@ int __init iommufd_test_init(void)
init_completion(&mock_iommu.complete);
mock_iommu_iopf_queue = iopf_queue_alloc("mock-iopfq");
+ mock_iommu.iommu_dev.max_pasids = (1 << MOCK_PASID_WIDTH);
return 0;
--
2.34.1
* [PATCH v11 16/18] iommufd/selftest: Add a helper to get test device
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (14 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 15/18] iommufd/selftest: Add set_dev_pasid in mock iommu Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:19 ` [PATCH v11 17/18] iommufd/selftest: Add test ops to test pasid attach/detach Yi Liu
` (3 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
There is a need to get the selftest device (sobj->type == TYPE_IDEV) in
multiple places, so add a helper for it.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/iommufd/selftest.c | 36 ++++++++++++++++++++------------
1 file changed, 23 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index aa3da0adc4e1..04a4b84f5fa1 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -996,39 +996,49 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd,
return rc;
}
-/* Replace the mock domain with a manually allocated hw_pagetable */
-static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd,
- unsigned int device_id, u32 pt_id,
- struct iommu_test_cmd *cmd)
+static struct selftest_obj *
+iommufd_test_get_selftest_obj(struct iommufd_ctx *ictx, u32 id)
{
struct iommufd_object *dev_obj;
struct selftest_obj *sobj;
- int rc;
/*
* Prefer to use the OBJ_SELFTEST because the destroy_rwsem will ensure
* it doesn't race with detach, which is not allowed.
*/
- dev_obj =
- iommufd_get_object(ucmd->ictx, device_id, IOMMUFD_OBJ_SELFTEST);
+ dev_obj = iommufd_get_object(ictx, id, IOMMUFD_OBJ_SELFTEST);
if (IS_ERR(dev_obj))
- return PTR_ERR(dev_obj);
+ return ERR_CAST(dev_obj);
sobj = to_selftest_obj(dev_obj);
if (sobj->type != TYPE_IDEV) {
- rc = -EINVAL;
- goto out_dev_obj;
+ iommufd_put_object(ictx, dev_obj);
+ return ERR_PTR(-EINVAL);
}
+ return sobj;
+}
+
+/* Replace the mock domain with a manually allocated hw_pagetable */
+static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd,
+ unsigned int device_id, u32 pt_id,
+ struct iommu_test_cmd *cmd)
+{
+ struct selftest_obj *sobj;
+ int rc;
+
+ sobj = iommufd_test_get_selftest_obj(ucmd->ictx, device_id);
+ if (IS_ERR(sobj))
+ return PTR_ERR(sobj);
rc = iommufd_device_replace(sobj->idev.idev, IOMMU_NO_PASID, &pt_id);
if (rc)
- goto out_dev_obj;
+ goto out_sobj;
cmd->mock_domain_replace.pt_id = pt_id;
rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
-out_dev_obj:
- iommufd_put_object(ucmd->ictx, dev_obj);
+out_sobj:
+ iommufd_put_object(ucmd->ictx, &sobj->obj);
return rc;
}
--
2.34.1
* [PATCH v11 17/18] iommufd/selftest: Add test ops to test pasid attach/detach
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (15 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 16/18] iommufd/selftest: Add a helper to get test device Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-28 1:00 ` Lai, Yi
2025-03-21 17:19 ` [PATCH v11 18/18] iommufd/selftest: Add coverage for iommufd " Yi Liu
` (2 subsequent siblings)
19 siblings, 1 reply; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
This adds 4 test ops for pasid attach/replace/detach testing. There are
ops to attach, replace and detach a pasid, and an op to check the hwpt
attached to a pasid.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
v10 -> v11: Various tweaks. For details, refer to the discussion in v10 of this patch
---
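The fake-error behaviour that this patch adds to the mock set_dev_pasid op
for pasid 1024 can be read as a one-bit state machine. The sketch below
models it in userspace under that reading; demo_mock_dev,
demo_set_dev_pasid() and the plain bool (in place of the atomic_t) are
assumptions made for illustration only.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_PASID_RESERVED 1024u /* mirrors IOMMU_TEST_PASID_RESERVED */

/* Hypothetical stand-in for struct mock_dev */
struct demo_mock_dev {
	bool fake_error_armed;
};

/*
 * Model of the mock op: for the reserved pasid, the first non-blocking
 * attach arms the flag, the next one fails with -ENOMEM and disarms it,
 * and attaching the blocking domain (detach) always disarms it.
 */
static int demo_set_dev_pasid(struct demo_mock_dev *mdev, unsigned int pasid,
			      bool blocking_domain)
{
	if (pasid != DEMO_PASID_RESERVED)
		return 0;

	if (blocking_domain) {
		mdev->fake_error_armed = false;
		return 0;
	}

	if (mdev->fake_error_armed) {
		/* Second call: fail once so the caller must roll back */
		mdev->fake_error_armed = false;
		return -ENOMEM;
	}

	/* First call: succeed and arm the failure for the next call */
	mdev->fake_error_armed = true;
	return 0;
}
```

Under this model an attach followed by a replace on pasid 1024 makes the
replace fail, which exercises the rollback to the old domain, and a third
call succeeds again.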
drivers/iommu/iommufd/iommufd_test.h | 26 +++++
drivers/iommu/iommufd/selftest.c | 162 +++++++++++++++++++++++++++
2 files changed, 188 insertions(+)
diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index 1a066feb8697..1cd7e8394129 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -25,6 +25,10 @@ enum {
IOMMU_TEST_OP_TRIGGER_IOPF,
IOMMU_TEST_OP_DEV_CHECK_CACHE,
IOMMU_TEST_OP_TRIGGER_VEVENT,
+ IOMMU_TEST_OP_PASID_ATTACH,
+ IOMMU_TEST_OP_PASID_REPLACE,
+ IOMMU_TEST_OP_PASID_DETACH,
+ IOMMU_TEST_OP_PASID_CHECK_HWPT,
};
enum {
@@ -62,6 +66,9 @@ enum {
MOCK_DEV_CACHE_NUM = 4,
};
+/* Reserved for special pasid replace test */
+#define IOMMU_TEST_PASID_RESERVED 1024
+
struct iommu_test_cmd {
__u32 size;
__u32 op;
@@ -150,6 +157,25 @@ struct iommu_test_cmd {
struct {
__u32 dev_id;
} trigger_vevent;
+ struct {
+ __u32 pasid;
+ __u32 pt_id;
+ /* @id is stdev_id */
+ } pasid_attach;
+ struct {
+ __u32 pasid;
+ __u32 pt_id;
+ /* @id is stdev_id */
+ } pasid_replace;
+ struct {
+ __u32 pasid;
+ /* @id is stdev_id */
+ } pasid_detach;
+ struct {
+ __u32 pasid;
+ __u32 hwpt_id;
+ /* @id is stdev_id */
+ } pasid_check;
};
__u32 last;
};
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 04a4b84f5fa1..18d9a216eb30 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -167,6 +167,7 @@ struct mock_dev {
unsigned long vdev_id;
int id;
u32 cache[MOCK_DEV_CACHE_NUM];
+ atomic_t pasid_1024_fake_error;
};
static inline struct mock_dev *to_mock_dev(struct device *dev)
@@ -227,6 +228,34 @@ static int mock_domain_set_dev_pasid_nop(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid,
struct iommu_domain *old)
{
+ struct mock_dev *mdev = to_mock_dev(dev);
+
+ /*
+ * Per the first attach with pasid 1024, set the
+ * mdev->pasid_1024_fake_error. Hence the second call of this op
+ * can fake an error to validate the error path of the core. This
+ * is helpful to test the case in which the iommu core needs to
+ * rollback to the old domain due to driver failure. e.g. replace.
+ * User should be careful about the third call of this op, it shall
+ * succeed since the mdev->pasid_1024_fake_error is cleared in the
+ * second call.
+ */
+ if (pasid == 1024) {
+ if (domain->type == IOMMU_DOMAIN_BLOCKED) {
+ atomic_set(&mdev->pasid_1024_fake_error, 0);
+ } else if (atomic_read(&mdev->pasid_1024_fake_error)) {
+ /*
+ * Clear the flag, and fake an error to fail the
+ * replacement.
+ */
+ atomic_set(&mdev->pasid_1024_fake_error, 0);
+ return -ENOMEM;
+ } else {
+ /* Set the flag to fake an error in next call */
+ atomic_set(&mdev->pasid_1024_fake_error, 1);
+ }
+ }
+
return 0;
}
@@ -1685,6 +1714,131 @@ static int iommufd_test_trigger_vevent(struct iommufd_ucmd *ucmd,
return rc;
}
+static inline struct iommufd_hw_pagetable *
+iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id)
+{
+ struct iommufd_object *pt_obj;
+
+ pt_obj = iommufd_get_object(ucmd->ictx, id, IOMMUFD_OBJ_ANY);
+ if (IS_ERR(pt_obj))
+ return ERR_CAST(pt_obj);
+
+ if (pt_obj->type != IOMMUFD_OBJ_HWPT_NESTED &&
+ pt_obj->type != IOMMUFD_OBJ_HWPT_PAGING) {
+ iommufd_put_object(ucmd->ictx, pt_obj);
+ return ERR_PTR(-EINVAL);
+ }
+
+ return container_of(pt_obj, struct iommufd_hw_pagetable, obj);
+}
+
+static int iommufd_test_pasid_check_hwpt(struct iommufd_ucmd *ucmd,
+ struct iommu_test_cmd *cmd)
+{
+ u32 hwpt_id = cmd->pasid_check.hwpt_id;
+ struct iommu_domain *attached_domain;
+ struct iommu_attach_handle *handle;
+ struct iommufd_hw_pagetable *hwpt;
+ struct selftest_obj *sobj;
+ struct mock_dev *mdev;
+ int rc = 0;
+
+ sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
+ if (IS_ERR(sobj))
+ return PTR_ERR(sobj);
+
+ mdev = sobj->idev.mock_dev;
+
+ handle = iommu_attach_handle_get(mdev->dev.iommu_group,
+ cmd->pasid_check.pasid, 0);
+ if (IS_ERR(handle))
+ attached_domain = NULL;
+ else
+ attached_domain = handle->domain;
+
+ /* hwpt_id == 0 means to check if pasid is detached */
+ if (!hwpt_id) {
+ if (attached_domain)
+ rc = -EINVAL;
+ goto out_sobj;
+ }
+
+ hwpt = iommufd_get_hwpt(ucmd, hwpt_id);
+ if (IS_ERR(hwpt)) {
+ rc = PTR_ERR(hwpt);
+ goto out_sobj;
+ }
+
+ if (attached_domain != hwpt->domain)
+ rc = -EINVAL;
+
+ iommufd_put_object(ucmd->ictx, &hwpt->obj);
+out_sobj:
+ iommufd_put_object(ucmd->ictx, &sobj->obj);
+ return rc;
+}
+
+static int iommufd_test_pasid_attach(struct iommufd_ucmd *ucmd,
+ struct iommu_test_cmd *cmd)
+{
+ struct selftest_obj *sobj;
+ int rc;
+
+ sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
+ if (IS_ERR(sobj))
+ return PTR_ERR(sobj);
+
+ rc = iommufd_device_attach(sobj->idev.idev, cmd->pasid_attach.pasid,
+ &cmd->pasid_attach.pt_id);
+ if (rc)
+ goto out_sobj;
+
+ rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+ if (rc)
+ iommufd_device_detach(sobj->idev.idev,
+ cmd->pasid_attach.pasid);
+
+out_sobj:
+ iommufd_put_object(ucmd->ictx, &sobj->obj);
+ return rc;
+}
+
+static int iommufd_test_pasid_replace(struct iommufd_ucmd *ucmd,
+ struct iommu_test_cmd *cmd)
+{
+ struct selftest_obj *sobj;
+ int rc;
+
+ sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
+ if (IS_ERR(sobj))
+ return PTR_ERR(sobj);
+
+ rc = iommufd_device_replace(sobj->idev.idev, cmd->pasid_replace.pasid,
+ &cmd->pasid_replace.pt_id);
+ if (rc)
+ goto out_sobj;
+
+ rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+
+out_sobj:
+ iommufd_put_object(ucmd->ictx, &sobj->obj);
+ return rc;
+}
+
+static int iommufd_test_pasid_detach(struct iommufd_ucmd *ucmd,
+ struct iommu_test_cmd *cmd)
+{
+ struct selftest_obj *sobj;
+
+ sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
+ if (IS_ERR(sobj))
+ return PTR_ERR(sobj);
+
+ iommufd_device_detach(sobj->idev.idev, cmd->pasid_detach.pasid);
+ iommufd_put_object(ucmd->ictx, &sobj->obj);
+ return 0;
+}
+
void iommufd_selftest_destroy(struct iommufd_object *obj)
{
struct selftest_obj *sobj = to_selftest_obj(obj);
@@ -1768,6 +1922,14 @@ int iommufd_test(struct iommufd_ucmd *ucmd)
return iommufd_test_trigger_iopf(ucmd, cmd);
case IOMMU_TEST_OP_TRIGGER_VEVENT:
return iommufd_test_trigger_vevent(ucmd, cmd);
+ case IOMMU_TEST_OP_PASID_ATTACH:
+ return iommufd_test_pasid_attach(ucmd, cmd);
+ case IOMMU_TEST_OP_PASID_REPLACE:
+ return iommufd_test_pasid_replace(ucmd, cmd);
+ case IOMMU_TEST_OP_PASID_DETACH:
+ return iommufd_test_pasid_detach(ucmd, cmd);
+ case IOMMU_TEST_OP_PASID_CHECK_HWPT:
+ return iommufd_test_pasid_check_hwpt(ucmd, cmd);
default:
return -EOPNOTSUPP;
}
--
2.34.1
* Re: [PATCH v11 17/18] iommufd/selftest: Add test ops to test pasid attach/detach
2025-03-21 17:19 ` [PATCH v11 17/18] iommufd/selftest: Add test ops to test pasid attach/detach Yi Liu
@ 2025-03-28 1:00 ` Lai, Yi
2025-03-28 7:47 ` Yi Liu
0 siblings, 1 reply; 24+ messages in thread
From: Lai, Yi @ 2025-03-28 1:00 UTC (permalink / raw)
To: Yi Liu; +Cc: kevin.tian, jgg, joro, baolu.lu, iommu, nicolinc, yi1.lai
On Fri, Mar 21, 2025 at 10:19:39AM -0700, Yi Liu wrote:
> This adds 4 test ops for pasid attach/replace/detach testing. There are
> ops to attach, replace and detach a pasid, and an op to check the hwpt
> attached to a pasid.
>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
> v10 -> v11: Various tweaks. For details, refer to the discussion in v10 of this patch
> ---
> drivers/iommu/iommufd/iommufd_test.h | 26 +++++
> drivers/iommu/iommufd/selftest.c | 162 +++++++++++++++++++++++++++
> 2 files changed, 188 insertions(+)
>
> diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
> index 1a066feb8697..1cd7e8394129 100644
> --- a/drivers/iommu/iommufd/iommufd_test.h
> +++ b/drivers/iommu/iommufd/iommufd_test.h
> @@ -25,6 +25,10 @@ enum {
> IOMMU_TEST_OP_TRIGGER_IOPF,
> IOMMU_TEST_OP_DEV_CHECK_CACHE,
> IOMMU_TEST_OP_TRIGGER_VEVENT,
> + IOMMU_TEST_OP_PASID_ATTACH,
> + IOMMU_TEST_OP_PASID_REPLACE,
> + IOMMU_TEST_OP_PASID_DETACH,
> + IOMMU_TEST_OP_PASID_CHECK_HWPT,
> };
>
> enum {
> @@ -62,6 +66,9 @@ enum {
> MOCK_DEV_CACHE_NUM = 4,
> };
>
> +/* Reserved for special pasid replace test */
> +#define IOMMU_TEST_PASID_RESERVED 1024
> +
> struct iommu_test_cmd {
> __u32 size;
> __u32 op;
> @@ -150,6 +157,25 @@ struct iommu_test_cmd {
> struct {
> __u32 dev_id;
> } trigger_vevent;
> + struct {
> + __u32 pasid;
> + __u32 pt_id;
> + /* @id is stdev_id */
> + } pasid_attach;
> + struct {
> + __u32 pasid;
> + __u32 pt_id;
> + /* @id is stdev_id */
> + } pasid_replace;
> + struct {
> + __u32 pasid;
> + /* @id is stdev_id */
> + } pasid_detach;
> + struct {
> + __u32 pasid;
> + __u32 hwpt_id;
> + /* @id is stdev_id */
> + } pasid_check;
> };
> __u32 last;
> };
> diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
> index 04a4b84f5fa1..18d9a216eb30 100644
> --- a/drivers/iommu/iommufd/selftest.c
> +++ b/drivers/iommu/iommufd/selftest.c
> @@ -167,6 +167,7 @@ struct mock_dev {
> unsigned long vdev_id;
> int id;
> u32 cache[MOCK_DEV_CACHE_NUM];
> + atomic_t pasid_1024_fake_error;
> };
>
> static inline struct mock_dev *to_mock_dev(struct device *dev)
> @@ -227,6 +228,34 @@ static int mock_domain_set_dev_pasid_nop(struct iommu_domain *domain,
> struct device *dev, ioasid_t pasid,
> struct iommu_domain *old)
> {
> + struct mock_dev *mdev = to_mock_dev(dev);
> +
> + /*
> + * Per the first attach with pasid 1024, set the
> + * mdev->pasid_1024_fake_error. Hence the second call of this op
> + * can fake an error to validate the error path of the core. This
> + * is helpful to test the case in which the iommu core needs to
> + * rollback to the old domain due to driver failure. e.g. replace.
> + * User should be careful about the third call of this op, it shall
> + * succeed since the mdev->pasid_1024_fake_error is cleared in the
> + * second call.
> + */
> + if (pasid == 1024) {
> + if (domain->type == IOMMU_DOMAIN_BLOCKED) {
> + atomic_set(&mdev->pasid_1024_fake_error, 0);
> + } else if (atomic_read(&mdev->pasid_1024_fake_error)) {
> + /*
> + * Clear the flag, and fake an error to fail the
> + * replacement.
> + */
> + atomic_set(&mdev->pasid_1024_fake_error, 0);
> + return -ENOMEM;
> + } else {
> + /* Set the flag to fake an error in next call */
> + atomic_set(&mdev->pasid_1024_fake_error, 1);
> + }
> + }
> +
> return 0;
> }
>
> @@ -1685,6 +1714,131 @@ static int iommufd_test_trigger_vevent(struct iommufd_ucmd *ucmd,
> return rc;
> }
>
> +static inline struct iommufd_hw_pagetable *
> +iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id)
> +{
> + struct iommufd_object *pt_obj;
> +
> + pt_obj = iommufd_get_object(ucmd->ictx, id, IOMMUFD_OBJ_ANY);
> + if (IS_ERR(pt_obj))
> + return ERR_CAST(pt_obj);
> +
> + if (pt_obj->type != IOMMUFD_OBJ_HWPT_NESTED &&
> + pt_obj->type != IOMMUFD_OBJ_HWPT_PAGING) {
> + iommufd_put_object(ucmd->ictx, pt_obj);
> + return ERR_PTR(-EINVAL);
> + }
> +
> + return container_of(pt_obj, struct iommufd_hw_pagetable, obj);
> +}
> +
> +static int iommufd_test_pasid_check_hwpt(struct iommufd_ucmd *ucmd,
> + struct iommu_test_cmd *cmd)
> +{
> + u32 hwpt_id = cmd->pasid_check.hwpt_id;
> + struct iommu_domain *attached_domain;
> + struct iommu_attach_handle *handle;
> + struct iommufd_hw_pagetable *hwpt;
> + struct selftest_obj *sobj;
> + struct mock_dev *mdev;
> + int rc = 0;
> +
> + sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
> + if (IS_ERR(sobj))
> + return PTR_ERR(sobj);
> +
> + mdev = sobj->idev.mock_dev;
> +
> + handle = iommu_attach_handle_get(mdev->dev.iommu_group,
> + cmd->pasid_check.pasid, 0);
> + if (IS_ERR(handle))
> + attached_domain = NULL;
> + else
> + attached_domain = handle->domain;
> +
> + /* hwpt_id == 0 means to check if pasid is detached */
> + if (!hwpt_id) {
> + if (attached_domain)
> + rc = -EINVAL;
> + goto out_sobj;
> + }
> +
> + hwpt = iommufd_get_hwpt(ucmd, hwpt_id);
> + if (IS_ERR(hwpt)) {
> + rc = PTR_ERR(hwpt);
> + goto out_sobj;
> + }
> +
> + if (attached_domain != hwpt->domain)
> + rc = -EINVAL;
> +
> + iommufd_put_object(ucmd->ictx, &hwpt->obj);
> +out_sobj:
> + iommufd_put_object(ucmd->ictx, &sobj->obj);
> + return rc;
> +}
> +
> +static int iommufd_test_pasid_attach(struct iommufd_ucmd *ucmd,
> + struct iommu_test_cmd *cmd)
> +{
> + struct selftest_obj *sobj;
> + int rc;
> +
> + sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
> + if (IS_ERR(sobj))
> + return PTR_ERR(sobj);
> +
> + rc = iommufd_device_attach(sobj->idev.idev, cmd->pasid_attach.pasid,
> + &cmd->pasid_attach.pt_id);
> + if (rc)
> + goto out_sobj;
> +
> + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
> + if (rc)
> + iommufd_device_detach(sobj->idev.idev,
> + cmd->pasid_attach.pasid);
> +
> +out_sobj:
> + iommufd_put_object(ucmd->ictx, &sobj->obj);
> + return rc;
> +}
> +
> +static int iommufd_test_pasid_replace(struct iommufd_ucmd *ucmd,
> + struct iommu_test_cmd *cmd)
> +{
> + struct selftest_obj *sobj;
> + int rc;
> +
> + sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
> + if (IS_ERR(sobj))
> + return PTR_ERR(sobj);
> +
> + rc = iommufd_device_replace(sobj->idev.idev, cmd->pasid_attach.pasid,
> + &cmd->pasid_attach.pt_id);
> + if (rc)
> + goto out_sobj;
> +
> + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
> +
> +out_sobj:
> + iommufd_put_object(ucmd->ictx, &sobj->obj);
> + return rc;
> +}
> +
> +static int iommufd_test_pasid_detach(struct iommufd_ucmd *ucmd,
> + struct iommu_test_cmd *cmd)
> +{
> + struct selftest_obj *sobj;
> +
> + sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
> + if (IS_ERR(sobj))
> + return PTR_ERR(sobj);
> +
> + iommufd_device_detach(sobj->idev.idev, cmd->pasid_detach.pasid);
> + iommufd_put_object(ucmd->ictx, &sobj->obj);
> + return 0;
> +}
> +
> void iommufd_selftest_destroy(struct iommufd_object *obj)
> {
> struct selftest_obj *sobj = to_selftest_obj(obj);
> @@ -1768,6 +1922,14 @@ int iommufd_test(struct iommufd_ucmd *ucmd)
> return iommufd_test_trigger_iopf(ucmd, cmd);
> case IOMMU_TEST_OP_TRIGGER_VEVENT:
> return iommufd_test_trigger_vevent(ucmd, cmd);
> + case IOMMU_TEST_OP_PASID_ATTACH:
> + return iommufd_test_pasid_attach(ucmd, cmd);
> + case IOMMU_TEST_OP_PASID_REPLACE:
> + return iommufd_test_pasid_replace(ucmd, cmd);
> + case IOMMU_TEST_OP_PASID_DETACH:
> + return iommufd_test_pasid_detach(ucmd, cmd);
> + case IOMMU_TEST_OP_PASID_CHECK_HWPT:
> + return iommufd_test_pasid_check_hwpt(ucmd, cmd);
> default:
> return -EOPNOTSUPP;
> }
> --
> 2.34.1
>
Hi Yi Liu,
Greetings!
I used Syzkaller and found a general protection fault in iommufd_hw_pagetable_detach on the linux-next tag next-20250325.
After bisection, the first bad commit is:
"
3d183bab95ea iommufd/selftest: Add test ops to test pasid attach/detach
"
The fault can still be reproduced. You could try the following reproduction binary.
All detailed info can be found at:
https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach
Syzkaller repro code:
https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/repro.c
Syzkaller repro syscall steps:
https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/repro.prog
Syzkaller report:
https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/repro.report
Kconfig(make olddefconfig):
https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/kconfig_origin
Bisect info:
https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/bisect_info.log
bzImage:
https://github.com/laifryiee/syzkaller_logs/raw/refs/heads/main/250327_190630_iommufd_hw_pagetable_detach/bzImage_eb4bc4b07f66f01618d9cb1aa4eaef59b1188415
Issue dmesg:
https://github.com/laifryiee/syzkaller_logs/blob/main/250327_190630_iommufd_hw_pagetable_detach/eb4bc4b07f66f01618d9cb1aa4eaef59b1188415_dmesg.log
"
[ 37.609031] iommufd_mock iommufd_mock0: Adding to iommu group 0
[ 37.611696] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASI
[ 37.613179] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 37.614126] CPU: 1 UID: 0 PID: 668 Comm: repro Not tainted 6.14.0-next-20250325-eb4bc4b07f66 #1 PREEMPT(voluntary)
[ 37.615361] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org4
[ 37.616706] RIP: 0010:iommufd_hw_pagetable_detach+0x8a/0x4d0
[ 37.617468] Code: 00 00 00 44 89 ee 48 89 c7 48 89 75 c8 48 89 45 c0 e8 ca 55 17 02 48 89 c2 49 89 c4 48 b8 00 00 00b
[ 37.619613] RSP: 0018:ffff888021b17b78 EFLAGS: 00010246
[ 37.620256] RAX: dffffc0000000000 RBX: ffff888014b5a000 RCX: ffff888021b17a64
[ 37.621360] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88801dad07fc
[ 37.623597] RBP: ffff888021b17bc8 R08: 0000000000000001 R09: 0000000000000001
[ 37.625915] R10: 0000000000000001 R11: ffff88801dad0e58 R12: 0000000000000000
[ 37.627802] R13: 0000000000000001 R14: ffff888021b17e18 R15: ffff8880132d3008
[ 37.629383] FS: 00007fca52013600(0000) GS:ffff8880e3684000(0000) knlGS:0000000000000000
[ 37.630955] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.631860] CR2: 00000000200006c0 CR3: 00000000112d0005 CR4: 0000000000770ef0
[ 37.632941] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 37.633869] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 37.634740] PKRU: 55555554
[ 37.635093] Call Trace:
[ 37.635417] <TASK>
[ 37.635717] ? show_regs+0x6d/0x80
[ 37.636205] ? die_addr+0x45/0xb0
[ 37.636652] ? exc_general_protection+0x1ad/0x340
[ 37.637305] ? asm_exc_general_protection+0x2b/0x30
[ 37.637939] ? iommufd_hw_pagetable_detach+0x8a/0x4d0
[ 37.638589] ? iommufd_hw_pagetable_detach+0x76/0x4d0
[ 37.639256] iommufd_device_detach+0x2a/0x2e0
[ 37.639832] iommufd_test+0x2f99/0x5cd0
[ 37.640353] ? __pfx_iommufd_test+0x10/0x10
[ 37.640899] ? __might_fault+0x14a/0x1b0
[ 37.641443] ? __this_cpu_preempt_check+0x21/0x30
[ 37.642062] ? lock_release+0x14f/0x2c0
[ 37.642590] ? __might_fault+0xf1/0x1b0
[ 37.643104] ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[ 37.643826] iommufd_fops_ioctl+0x38e/0x520
[ 37.644386] ? __pfx_iommufd_fops_ioctl+0x10/0x10
[ 37.644995] ? __this_cpu_preempt_check+0x21/0x30
[ 37.645598] ? seqcount_lockdep_reader_access.constprop.0+0xb4/0xd0
[ 37.646387] ? lockdep_hardirqs_on+0x89/0x110
[ 37.646954] ? ktime_get_coarse_real_ts64+0xb6/0x100
[ 37.647586] ? __pfx_iommufd_fops_ioctl+0x10/0x10
[ 37.648188] __x64_sys_ioctl+0x1ba/0x220
[ 37.648725] x64_sys_call+0x122e/0x2150
[ 37.649220] do_syscall_64+0x6d/0x150
[ 37.649703] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 37.650343] RIP: 0033:0x7fca51e3ee5d
[ 37.650823] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d8
[ 37.653042] RSP: 002b:00007ffc6ea0e9f8 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
[ 37.653973] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fca51e3ee5d
[ 37.654849] RDX: 0000000020000300 RSI: 0000000000003ba0 RDI: 0000000000000003
[ 37.655725] RBP: 00007ffc6ea0ea10 R08: 0000000000000800 R09: 0000000000000800
[ 37.656605] R10: 0000000000000800 R11: 0000000000000213 R12: 00007ffc6ea0eb28
[ 37.657479] R13: 0000000000401136 R14: 0000000000403e08 R15: 00007fca5205c000
[ 37.658381] </TASK>
[ 37.658683] Modules linked in:
[ 37.659218] ---[ end trace 0000000000000000 ]---
[ 37.659818] RIP: 0010:iommufd_hw_pagetable_detach+0x8a/0x4d0
[ 37.660556] Code: 00 00 00 44 89 ee 48 89 c7 48 89 75 c8 48 89 45 c0 e8 ca 55 17 02 48 89 c2 49 89 c4 48 b8 00 00 00b
[ 37.662822] RSP: 0018:ffff888021b17b78 EFLAGS: 00010246
[ 37.663481] RAX: dffffc0000000000 RBX: ffff888014b5a000 RCX: ffff888021b17a64
[ 37.664360] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88801dad07fc
[ 37.665236] RBP: ffff888021b17bc8 R08: 0000000000000001 R09: 0000000000000001
[ 37.666123] R10: 0000000000000001 R11: ffff88801dad0e58 R12: 0000000000000000
[ 37.666997] R13: 0000000000000001 R14: ffff888021b17e18 R15: ffff8880132d3008
[ 37.667866] FS: 00007fca52013600(0000) GS:ffff8880e3684000(0000) knlGS:0000000000000000
[ 37.668857] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.669601] CR2: 00000000200006c0 CR3: 00000000112d0005 CR4: 0000000000770ef0
[ 37.670482] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 37.671356] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 37.672228] PKRU: 55555554
[ 37.673445] ------------[ cut here ]------------
[ 37.674088] WARNING: CPU: 1 PID: 668 at drivers/iommu/iommufd/main.c:265 iommufd_fops_release+0x386/0x420
[ 37.675253] Modules linked in:
[ 37.675658] CPU: 1 UID: 0 PID: 668 Comm: repro Tainted: G D 6.14.0-next-20250325-eb4bc4b07f66 #1 PR
[ 37.677106] Tainted: [D]=DIE
[ 37.677944] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org4
[ 37.679312] RIP: 0010:iommufd_fops_release+0x386/0x420
[ 37.679982] Code: 8b 45 d0 65 48 2b 05 f1 59 3a 05 75 76 48 81 c4 88 00 00 00 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3f
[ 37.682261] RSP: 0018:ffff888021b17c08 EFLAGS: 00010293
[ 37.682913] RAX: 0000000000000000 RBX: ffff88801b347808 RCX: ffffffff83afe4ca
[ 37.683777] RDX: ffff88801dad0000 RSI: ffffffff83afe636 RDI: 0000000000000005
[ 37.684644] RBP: ffff888021b17cb8 R08: 0000000000000000 R09: 0000000000000000
[ 37.685213] R10: 0000000000000000 R11: ffff888017ef2130 R12: 0000000000000000
[ 37.685802] R13: 0000000000000000 R14: ffff888021b17c50 R15: 0000000000000000
[ 37.686378] FS: 0000000000000000(0000) GS:ffff8880e3684000(0000) knlGS:0000000000000000
[ 37.686923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.687320] CR2: 00000000200006c0 CR3: 0000000007086006 CR4: 0000000000770ef0
[ 37.687812] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 37.688294] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 37.688849] PKRU: 55555554
[ 37.689075] Call Trace:
[ 37.689255] <TASK>
[ 37.689427] ? show_regs+0x6d/0x80
[ 37.689738] ? __warn+0xf3/0x380
[ 37.690011] ? report_bug+0x25e/0x4b0
[ 37.690311] ? iommufd_fops_release+0x386/0x420
[ 37.690672] ? report_bug+0x2cb/0x4b0
[ 37.690971] ? iommufd_fops_release+0x386/0x420
[ 37.691316] ? iommufd_fops_release+0x386/0x420
[ 37.691664] ? handle_bug+0x2cd/0x510
[ 37.691964] ? iommufd_fops_release+0x388/0x420
[ 37.692323] ? exc_invalid_op+0x3c/0x80
[ 37.692661] ? asm_exc_invalid_op+0x1f/0x30
[ 37.693007] ? iommufd_fops_release+0x21a/0x420
[ 37.693420] ? iommufd_fops_release+0x386/0x420
[ 37.693845] ? iommufd_fops_release+0x386/0x420
[ 37.694230] ? iommufd_fops_release+0x386/0x420
[ 37.694609] ? locks_remove_file+0x3b4/0x5d0
[ 37.694987] ? __pfx_iommufd_fops_release+0x10/0x10
[ 37.695372] ? __memcg_slab_free_hook+0xc1/0x540
[ 37.695758] ? __sanitizer_cov_trace_const_cmp2+0x1c/0x30
[ 37.696170] ? evm_file_release+0x141/0x220
[ 37.696526] ? __pfx_iommufd_fops_release+0x10/0x10
[ 37.696914] __fput+0x41c/0xb70
[ 37.697172] ____fput+0x22/0x30
[ 37.697423] task_work_run+0x19b/0x2b0
[ 37.697758] ? __pfx_task_work_run+0x10/0x10
[ 37.698115] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[ 37.698589] ? switch_task_namespaces+0xc6/0x110
[ 37.699065] do_exit+0xb0e/0x29d0
[ 37.699350] ? ktime_get_coarse_real_ts64+0xb6/0x100
[ 37.699743] ? __pfx_do_exit+0x10/0x10
[ 37.700072] ? __pfx_iommufd_fops_ioctl+0x10/0x10
[ 37.700454] ? __x64_sys_ioctl+0x1ba/0x220
[ 37.700795] make_task_dead+0x181/0x3c0
[ 37.701118] rewind_stack_and_make_dead+0x16/0x20
[ 37.701560] RIP: 0033:0x7fca51e3ee5d
[ 37.701886] Code: Unable to access opcode bytes at 0x7fca51e3ee33.
[ 37.702389] RSP: 002b:00007ffc6ea0e9f8 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
[ 37.703008] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fca51e3ee5d
[ 37.703594] RDX: 0000000020000300 RSI: 0000000000003ba0 RDI: 0000000000000003
[ 37.704188] RBP: 00007ffc6ea0ea10 R08: 0000000000000800 R09: 0000000000000800
[ 37.704787] R10: 0000000000000800 R11: 0000000000000213 R12: 00007ffc6ea0eb28
[ 37.705380] R13: 0000000000401136 R14: 0000000000403e08 R15: 00007fca5205c000
[ 37.705993] </TASK>
[ 37.706205] irq event stamp: 3011
[ 37.706495] hardirqs last enabled at (3011): [<ffffffff812e194b>] cond_local_irq_enable.isra.0+0x3b/0x50
[ 37.707270] hardirqs last disabled at (3010): [<ffffffff85c6ecc6>] exc_general_protection+0x36/0x340
[ 37.708008] softirqs last enabled at (2704): [<ffffffff8149141e>] __irq_exit_rcu+0x10e/0x170
[ 37.708728] softirqs last disabled at (2685): [<ffffffff8149141e>] __irq_exit_rcu+0x10e/0x170
[ 37.709420] ---[ end trace 0000000000000000 ]---
[ 37.709935] ------------[ cut here ]------------
[ 37.710326] WARNING: CPU: 1 PID: 668 at drivers/iommu/iommufd/main.c:268 iommufd_fops_release+0x392/0x420
[ 37.711053] Modules linked in:
[ 37.711300] CPU: 1 UID: 0 PID: 668 Comm: repro Tainted: G D W 6.14.0-next-20250325-eb4bc4b07f66 #1 PR
[ 37.712216] Tainted: [D]=DIE, [W]=WARN
[ 37.712560] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org4
[ 37.713475] RIP: 0010:iommufd_fops_release+0x392/0x420
[ 37.713926] Code: 76 48 81 c4 88 00 00 00 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 2a d6 d7 fd 0f 0b e9f
[ 37.715439] RSP: 0018:ffff888021b17c08 EFLAGS: 00010293
[ 37.715864] RAX: 0000000000000000 RBX: ffff88801b347808 RCX: ffffffff83afe4ca
[ 37.716455] RDX: ffff88801dad0000 RSI: ffffffff83afe642 RDI: ffff88801b3478a0
[ 37.717024] RBP: ffff888021b17cb8 R08: 0000000000000000 R09: 0000000000000000
[ 37.717622] R10: 0000000000000000 R11: ffff888017ef2130 R12: 0000000000000000
[ 37.718191] R13: 0000000000000000 R14: ffff888021b17c50 R15: 0000000000000000
[ 37.718752] FS: 0000000000000000(0000) GS:ffff8880e3684000(0000) knlGS:0000000000000000
[ 37.719393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.719849] CR2: 00000000200006c0 CR3: 0000000007086006 CR4: 0000000000770ef0
[ 37.720409] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 37.721007] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 37.721600] PKRU: 55555554
[ 37.721847] Call Trace:
[ 37.722061] <TASK>
[ 37.722260] ? show_regs+0x6d/0x80
[ 37.722568] ? __warn+0xf3/0x380
[ 37.722851] ? report_bug+0x25e/0x4b0
[ 37.723151] ? iommufd_fops_release+0x392/0x420
[ 37.723525] ? report_bug+0x2cb/0x4b0
[ 37.723835] ? iommufd_fops_release+0x392/0x420
[ 37.724203] ? iommufd_fops_release+0x392/0x420
[ 37.724586] ? handle_bug+0x2cd/0x510
[ 37.724880] ? iommufd_fops_release+0x394/0x420
[ 37.725253] ? exc_invalid_op+0x3c/0x80
[ 37.725576] ? asm_exc_invalid_op+0x1f/0x30
[ 37.725904] ? iommufd_fops_release+0x21a/0x420
[ 37.726249] ? iommufd_fops_release+0x392/0x420
[ 37.726607] ? iommufd_fops_release+0x392/0x420
[ 37.726969] ? iommufd_fops_release+0x392/0x420
[ 37.727392] ? locks_remove_file+0x3b4/0x5d0
[ 37.727736] ? __pfx_iommufd_fops_release+0x10/0x10
[ 37.728113] ? __memcg_slab_free_hook+0xc1/0x540
[ 37.728521] ? __sanitizer_cov_trace_const_cmp2+0x1c/0x30
[ 37.728974] ? evm_file_release+0x141/0x220
[ 37.729313] ? __pfx_iommufd_fops_release+0x10/0x10
[ 37.729708] __fput+0x41c/0xb70
[ 37.729991] ____fput+0x22/0x30
[ 37.730240] task_work_run+0x19b/0x2b0
[ 37.730579] ? __pfx_task_work_run+0x10/0x10
[ 37.730960] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[ 37.731415] ? switch_task_namespaces+0xc6/0x110
[ 37.731800] do_exit+0xb0e/0x29d0
[ 37.732100] ? ktime_get_coarse_real_ts64+0xb6/0x100
[ 37.732521] ? __pfx_do_exit+0x10/0x10
[ 37.732831] ? __pfx_iommufd_fops_ioctl+0x10/0x10
[ 37.733222] ? __x64_sys_ioctl+0x1ba/0x220
[ 37.733598] make_task_dead+0x181/0x3c0
[ 37.733917] rewind_stack_and_make_dead+0x16/0x20
[ 37.734306] RIP: 0033:0x7fca51e3ee5d
[ 37.734605] Code: Unable to access opcode bytes at 0x7fca51e3ee33.
[ 37.735106] RSP: 002b:00007ffc6ea0e9f8 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
[ 37.735694] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fca51e3ee5d
[ 37.736276] RDX: 0000000020000300 RSI: 0000000000003ba0 RDI: 0000000000000003
[ 37.736848] RBP: 00007ffc6ea0ea10 R08: 0000000000000800 R09: 0000000000000800
[ 37.737425] R10: 0000000000000800 R11: 0000000000000213 R12: 00007ffc6ea0eb28
[ 37.738025] R13: 0000000000401136 R14: 0000000000403e08 R15: 00007fca5205c000
[ 37.738624] </TASK>
[ 37.738815] irq event stamp: 3011
[ 37.739103] hardirqs last enabled at (3011): [<ffffffff812e194b>] cond_local_irq_enable.isra.0+0x3b/0x50
[ 37.739855] hardirqs last disabled at (3010): [<ffffffff85c6ecc6>] exc_general_protection+0x36/0x340
[ 37.740563] softirqs last enabled at (2704): [<ffffffff8149141e>] __irq_exit_rcu+0x10e/0x170
[ 37.741219] softirqs last disabled at (2685): [<ffffffff8149141e>] __irq_exit_rcu+0x10e/0x170
[ 37.741868] ---[ end trace 0000000000000000 ]---
"
Hope this could be insightful to you.
Regards,
Yi Lai
---
If you don't need the following environment to reproduce the problem, or if you
already have a reproduction environment, please ignore the following information.
How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh // it needs qemu-system-x86_64 and I used v7.1.0
// start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
// You could change the bzImage_xxx as you want
// Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different qemu version
You can use the command below to log in; there is no password for root.
ssh -p 10023 root@localhost
After logging in to the VM (virtual machine) successfully, you can transfer the reproducer
binary to the VM as follows, and reproduce the problem in the VM:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/
Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage //x should be equal to or less than the number of CPUs your PC has
Put the bzImage file name into the start3.sh above to load the target kernel in the VM.
Tips:
If you already have qemu-system-x86_64, please ignore the info below.
If you want to install qemu v7.1.0:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v11 17/18] iommufd/selftest: Add test ops to test pasid attach/detach
2025-03-28 1:00 ` Lai, Yi
@ 2025-03-28 7:47 ` Yi Liu
0 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-28 7:47 UTC (permalink / raw)
To: Lai, Yi; +Cc: kevin.tian, jgg, joro, baolu.lu, iommu, nicolinc, yi1.lai
On 2025/3/28 09:00, Lai, Yi wrote:
> On Fri, Mar 21, 2025 at 10:19:39AM -0700, Yi Liu wrote:
>> This adds 4 test ops for pasid attach/replace/detach testing. There are
>> ops to attach/detach pasid, and also op to check the attached hwpt of a
>> pasid.
>>
>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> ---
>> v10 -> v11: Various tweaks. Detail refer to discussion in v10 of this patch
>> ---
>> drivers/iommu/iommufd/iommufd_test.h | 26 +++++
>> drivers/iommu/iommufd/selftest.c | 162 +++++++++++++++++++++++++++
>> 2 files changed, 188 insertions(+)
>>
>> diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
>> index 1a066feb8697..1cd7e8394129 100644
>> --- a/drivers/iommu/iommufd/iommufd_test.h
>> +++ b/drivers/iommu/iommufd/iommufd_test.h
>> @@ -25,6 +25,10 @@ enum {
>> IOMMU_TEST_OP_TRIGGER_IOPF,
>> IOMMU_TEST_OP_DEV_CHECK_CACHE,
>> IOMMU_TEST_OP_TRIGGER_VEVENT,
>> + IOMMU_TEST_OP_PASID_ATTACH,
>> + IOMMU_TEST_OP_PASID_REPLACE,
>> + IOMMU_TEST_OP_PASID_DETACH,
>> + IOMMU_TEST_OP_PASID_CHECK_HWPT,
>> };
>>
>> enum {
>> @@ -62,6 +66,9 @@ enum {
>> MOCK_DEV_CACHE_NUM = 4,
>> };
>>
>> +/* Reserved for special pasid replace test */
>> +#define IOMMU_TEST_PASID_RESERVED 1024
>> +
>> struct iommu_test_cmd {
>> __u32 size;
>> __u32 op;
>> @@ -150,6 +157,25 @@ struct iommu_test_cmd {
>> struct {
>> __u32 dev_id;
>> } trigger_vevent;
>> + struct {
>> + __u32 pasid;
>> + __u32 pt_id;
>> + /* @id is stdev_id */
>> + } pasid_attach;
>> + struct {
>> + __u32 pasid;
>> + __u32 pt_id;
>> + /* @id is stdev_id */
>> + } pasid_replace;
>> + struct {
>> + __u32 pasid;
>> + /* @id is stdev_id */
>> + } pasid_detach;
>> + struct {
>> + __u32 pasid;
>> + __u32 hwpt_id;
>> + /* @id is stdev_id */
>> + } pasid_check;
>> };
>> __u32 last;
>> };
>> diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
>> index 04a4b84f5fa1..18d9a216eb30 100644
>> --- a/drivers/iommu/iommufd/selftest.c
>> +++ b/drivers/iommu/iommufd/selftest.c
>> @@ -167,6 +167,7 @@ struct mock_dev {
>> unsigned long vdev_id;
>> int id;
>> u32 cache[MOCK_DEV_CACHE_NUM];
>> + atomic_t pasid_1024_fake_error;
>> };
>>
>> static inline struct mock_dev *to_mock_dev(struct device *dev)
>> @@ -227,6 +228,34 @@ static int mock_domain_set_dev_pasid_nop(struct iommu_domain *domain,
>> struct device *dev, ioasid_t pasid,
>> struct iommu_domain *old)
>> {
>> + struct mock_dev *mdev = to_mock_dev(dev);
>> +
>> + /*
>> + * Per the first attach with pasid 1024, set the
>> + * mdev->pasid_1024_fake_error. Hence the second call of this op
>> + * can fake an error to validate the error path of the core. This
>> + * is helpful to test the case in which the iommu core needs to
>> + * rollback to the old domain due to driver failure. e.g. replace.
>> + * User should be careful about the third call of this op, it shall
>> + * succeed since the mdev->pasid_1024_fake_error is cleared in the
>> + * second call.
>> + */
>> + if (pasid == 1024) {
>> + if (domain->type == IOMMU_DOMAIN_BLOCKED) {
>> + atomic_set(&mdev->pasid_1024_fake_error, 0);
>> + } else if (atomic_read(&mdev->pasid_1024_fake_error)) {
>> + /*
>> + * Clear the flag, and fake an error to fail the
>> + * replacement.
>> + */
>> + atomic_set(&mdev->pasid_1024_fake_error, 0);
>> + return -ENOMEM;
>> + } else {
>> + /* Set the flag to fake an error in next call */
>> + atomic_set(&mdev->pasid_1024_fake_error, 1);
>> + }
>> + }
>> +
>> return 0;
>> }
>>
>> @@ -1685,6 +1714,131 @@ static int iommufd_test_trigger_vevent(struct iommufd_ucmd *ucmd,
>> return rc;
>> }
>>
>> +static inline struct iommufd_hw_pagetable *
>> +iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id)
>> +{
>> + struct iommufd_object *pt_obj;
>> +
>> + pt_obj = iommufd_get_object(ucmd->ictx, id, IOMMUFD_OBJ_ANY);
>> + if (IS_ERR(pt_obj))
>> + return ERR_CAST(pt_obj);
>> +
>> + if (pt_obj->type != IOMMUFD_OBJ_HWPT_NESTED &&
>> + pt_obj->type != IOMMUFD_OBJ_HWPT_PAGING) {
>> + iommufd_put_object(ucmd->ictx, pt_obj);
>> + return ERR_PTR(-EINVAL);
>> + }
>> +
>> + return container_of(pt_obj, struct iommufd_hw_pagetable, obj);
>> +}
>> +
>> +static int iommufd_test_pasid_check_hwpt(struct iommufd_ucmd *ucmd,
>> + struct iommu_test_cmd *cmd)
>> +{
>> + u32 hwpt_id = cmd->pasid_check.hwpt_id;
>> + struct iommu_domain *attached_domain;
>> + struct iommu_attach_handle *handle;
>> + struct iommufd_hw_pagetable *hwpt;
>> + struct selftest_obj *sobj;
>> + struct mock_dev *mdev;
>> + int rc = 0;
>> +
>> + sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
>> + if (IS_ERR(sobj))
>> + return PTR_ERR(sobj);
>> +
>> + mdev = sobj->idev.mock_dev;
>> +
>> + handle = iommu_attach_handle_get(mdev->dev.iommu_group,
>> + cmd->pasid_check.pasid, 0);
>> + if (IS_ERR(handle))
>> + attached_domain = NULL;
>> + else
>> + attached_domain = handle->domain;
>> +
>> + /* hwpt_id == 0 means to check if pasid is detached */
>> + if (!hwpt_id) {
>> + if (attached_domain)
>> + rc = -EINVAL;
>> + goto out_sobj;
>> + }
>> +
>> + hwpt = iommufd_get_hwpt(ucmd, hwpt_id);
>> + if (IS_ERR(hwpt)) {
>> + rc = PTR_ERR(hwpt);
>> + goto out_sobj;
>> + }
>> +
>> + if (attached_domain != hwpt->domain)
>> + rc = -EINVAL;
>> +
>> + iommufd_put_object(ucmd->ictx, &hwpt->obj);
>> +out_sobj:
>> + iommufd_put_object(ucmd->ictx, &sobj->obj);
>> + return rc;
>> +}
>> +
>> +static int iommufd_test_pasid_attach(struct iommufd_ucmd *ucmd,
>> + struct iommu_test_cmd *cmd)
>> +{
>> + struct selftest_obj *sobj;
>> + int rc;
>> +
>> + sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
>> + if (IS_ERR(sobj))
>> + return PTR_ERR(sobj);
>> +
>> + rc = iommufd_device_attach(sobj->idev.idev, cmd->pasid_attach.pasid,
>> + &cmd->pasid_attach.pt_id);
>> + if (rc)
>> + goto out_sobj;
>> +
>> + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
>> + if (rc)
>> + iommufd_device_detach(sobj->idev.idev,
>> + cmd->pasid_attach.pasid);
>> +
>> +out_sobj:
>> + iommufd_put_object(ucmd->ictx, &sobj->obj);
>> + return rc;
>> +}
>> +
>> +static int iommufd_test_pasid_replace(struct iommufd_ucmd *ucmd,
>> + struct iommu_test_cmd *cmd)
>> +{
>> + struct selftest_obj *sobj;
>> + int rc;
>> +
>> + sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
>> + if (IS_ERR(sobj))
>> + return PTR_ERR(sobj);
>> +
>> + rc = iommufd_device_replace(sobj->idev.idev, cmd->pasid_attach.pasid,
>> + &cmd->pasid_attach.pt_id);
>> + if (rc)
>> + goto out_sobj;
>> +
>> + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
>> +
>> +out_sobj:
>> + iommufd_put_object(ucmd->ictx, &sobj->obj);
>> + return rc;
>> +}
>> +
>> +static int iommufd_test_pasid_detach(struct iommufd_ucmd *ucmd,
>> + struct iommu_test_cmd *cmd)
>> +{
>> + struct selftest_obj *sobj;
>> +
>> + sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
>> + if (IS_ERR(sobj))
>> + return PTR_ERR(sobj);
>> +
>> + iommufd_device_detach(sobj->idev.idev, cmd->pasid_detach.pasid);
>> + iommufd_put_object(ucmd->ictx, &sobj->obj);
>> + return 0;
>> +}
>> +
>> void iommufd_selftest_destroy(struct iommufd_object *obj)
>> {
>> struct selftest_obj *sobj = to_selftest_obj(obj);
>> @@ -1768,6 +1922,14 @@ int iommufd_test(struct iommufd_ucmd *ucmd)
>> return iommufd_test_trigger_iopf(ucmd, cmd);
>> case IOMMU_TEST_OP_TRIGGER_VEVENT:
>> return iommufd_test_trigger_vevent(ucmd, cmd);
>> + case IOMMU_TEST_OP_PASID_ATTACH:
>> + return iommufd_test_pasid_attach(ucmd, cmd);
>> + case IOMMU_TEST_OP_PASID_REPLACE:
>> + return iommufd_test_pasid_replace(ucmd, cmd);
>> + case IOMMU_TEST_OP_PASID_DETACH:
>> + return iommufd_test_pasid_detach(ucmd, cmd);
>> + case IOMMU_TEST_OP_PASID_CHECK_HWPT:
>> + return iommufd_test_pasid_check_hwpt(ucmd, cmd);
>> default:
>> return -EOPNOTSUPP;
>> }
>> --
>> 2.34.1
>>
> Hi Yi Liu,
>
> Greetings!
>
> I used Syzkaller and found a general protection fault in iommufd_hw_pagetable_detach on the linux-next tag next-20250325.
>
> After bisection, the first bad commit is:
> "
> 3d183bab95ea iommufd/selftest: Add test ops to test pasid attach/detach
> "
>
> The fault can still be reproduced. You could try the following reproduction binary.
>
> All detailed info can be found at:
> https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach
> Syzkaller repro code:
> https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/repro.c
> Syzkaller repro syscall steps:
> https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/repro.prog
> Syzkaller report:
> https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/repro.report
> Kconfig(make olddefconfig):
> https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/kconfig_origin
> Bisect info:
> https://github.com/laifryiee/syzkaller_logs/tree/main/250327_190630_iommufd_hw_pagetable_detach/bisect_info.log
> bzImage:
> https://github.com/laifryiee/syzkaller_logs/raw/refs/heads/main/250327_190630_iommufd_hw_pagetable_detach/bzImage_eb4bc4b07f66f01618d9cb1aa4eaef59b1188415
> Issue dmesg:
> https://github.com/laifryiee/syzkaller_logs/blob/main/250327_190630_iommufd_hw_pagetable_detach/eb4bc4b07f66f01618d9cb1aa4eaef59b1188415_dmesg.log
>
> "
> [ 37.609031] iommufd_mock iommufd_mock0: Adding to iommu group 0
> [ 37.611696] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASI
Thanks. I think this is due to detaching without attaching first. Normally,
this should not happen, but it is better to check the attach before using it.
mutex_lock(&igroup->lock);
attach = xa_load(&igroup->pasid_attach, pasid);
hwpt = attach->hwpt;
hwpt_paging = find_hwpt_paging(hwpt);
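The check suggested above can be sketched in userspace C as follows. This is
only an illustration, not the actual fix: the lookup table, the mock_* names
and the -ENOENT return value are all hypothetical. The one grounded fact is
that xa_load() returns NULL when nothing is stored at the index, which is what
happens when a PASID is detached without having been attached.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_MAX_PASID 4096 /* hypothetical table size, illustration only */

struct mock_attach {
	int hwpt_id;
};

/* Hypothetical stand-in for igroup->pasid_attach; NULL means "not attached". */
static struct mock_attach *mock_pasid_attach[MOCK_MAX_PASID];

/* Stand-in for xa_load(): returns NULL when nothing is stored at the index. */
static struct mock_attach *mock_load(unsigned int pasid)
{
	assert(pasid < MOCK_MAX_PASID);
	return mock_pasid_attach[pasid];
}

/*
 * Detach that tolerates a PASID that was never attached.
 * Returns the hwpt id that was detached, or -ENOENT (an assumed
 * error code) instead of dereferencing a NULL attach.
 */
static int mock_detach(unsigned int pasid)
{
	struct mock_attach *attach = mock_load(pasid);
	int hwpt_id;

	if (!attach)
		return -ENOENT;

	hwpt_id = attach->hwpt_id;
	mock_pasid_attach[pasid] = NULL;
	return hwpt_id;
}
```

With this shape, calling mock_detach() on a PASID that was never attached
returns an error rather than faulting, which is the behaviour the reproducer
would need.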
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v11 18/18] iommufd/selftest: Add coverage for iommufd pasid attach/detach
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (16 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 17/18] iommufd/selftest: Add test ops to test pasid attach/detach Yi Liu
@ 2025-03-21 17:19 ` Yi Liu
2025-03-21 17:30 ` [PATCH v11 00/18] iommufd support pasid attach/replace Nicolin Chen
2025-03-25 13:24 ` Jason Gunthorpe
19 siblings, 0 replies; 24+ messages in thread
From: Yi Liu @ 2025-03-21 17:19 UTC (permalink / raw)
To: kevin.tian, jgg; +Cc: joro, baolu.lu, yi.l.liu, iommu, nicolinc
This tests iommufd pasid attach/replace/detach.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
v10 -> v11: Various tweaks; for details, refer to the discussion on v10 of
this patch. Also added the iopf trigger test in the pasid path.
---
tools/testing/selftests/iommu/iommufd.c | 301 ++++++++++++++++++
.../selftests/iommu/iommufd_fail_nth.c | 48 ++-
tools/testing/selftests/iommu/iommufd_utils.h | 97 +++++-
3 files changed, 437 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 156c74da53cd..c39222b9869b 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -2996,4 +2996,305 @@ TEST_F(iommufd_viommu, vdevice_cache)
}
}
+FIXTURE(iommufd_device_pasid)
+{
+ int fd;
+ uint32_t ioas_id;
+ uint32_t hwpt_id;
+ uint32_t stdev_id;
+ uint32_t device_id;
+ uint32_t no_pasid_stdev_id;
+ uint32_t no_pasid_device_id;
+};
+
+FIXTURE_VARIANT(iommufd_device_pasid)
+{
+ bool pasid_capable;
+};
+
+FIXTURE_SETUP(iommufd_device_pasid)
+{
+ self->fd = open("/dev/iommu", O_RDWR);
+ ASSERT_NE(-1, self->fd);
+ test_ioctl_ioas_alloc(&self->ioas_id);
+
+ test_cmd_mock_domain_flags(self->ioas_id,
+ MOCK_FLAGS_DEVICE_PASID,
+ &self->stdev_id, &self->hwpt_id,
+ &self->device_id);
+ if (!variant->pasid_capable)
+ test_cmd_mock_domain_flags(self->ioas_id, 0,
+ &self->no_pasid_stdev_id, NULL,
+ &self->no_pasid_device_id);
+}
+
+FIXTURE_TEARDOWN(iommufd_device_pasid)
+{
+ teardown_iommufd(self->fd, _metadata);
+}
+
+FIXTURE_VARIANT_ADD(iommufd_device_pasid, no_pasid)
+{
+ .pasid_capable = false,
+};
+
+FIXTURE_VARIANT_ADD(iommufd_device_pasid, has_pasid)
+{
+ .pasid_capable = true,
+};
+
+TEST_F(iommufd_device_pasid, pasid_attach)
+{
+ struct iommu_hwpt_selftest data = {
+ .iotlb = IOMMU_TEST_IOTLB_DEFAULT,
+ };
+ uint32_t nested_hwpt_id[3] = {};
+ uint32_t parent_hwpt_id = 0;
+ uint32_t fault_id, fault_fd;
+ uint32_t s2_hwpt_id = 0;
+ uint32_t iopf_hwpt_id;
+ uint32_t pasid = 100;
+ uint32_t viommu_id;
+
+ /* Allocate two nested hwpts sharing one common parent hwpt */
+ test_cmd_hwpt_alloc(self->device_id, self->ioas_id,
+ IOMMU_HWPT_ALLOC_NEST_PARENT,
+ &parent_hwpt_id);
+ test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id,
+ IOMMU_HWPT_ALLOC_PASID,
+ &nested_hwpt_id[0],
+ IOMMU_HWPT_DATA_SELFTEST,
+ &data, sizeof(data));
+ test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id,
+ IOMMU_HWPT_ALLOC_PASID,
+ &nested_hwpt_id[1],
+ IOMMU_HWPT_DATA_SELFTEST,
+ &data, sizeof(data));
+
+ /* Fault related preparation */
+ test_ioctl_fault_alloc(&fault_id, &fault_fd);
+ test_cmd_hwpt_alloc_iopf(self->device_id, parent_hwpt_id, fault_id,
+ IOMMU_HWPT_FAULT_ID_VALID | IOMMU_HWPT_ALLOC_PASID,
+ &iopf_hwpt_id,
+ IOMMU_HWPT_DATA_SELFTEST, &data,
+ sizeof(data));
+
+ /* Allocate a regular nested hwpt based on viommu */
+ test_cmd_viommu_alloc(self->device_id, parent_hwpt_id,
+ IOMMU_VIOMMU_TYPE_SELFTEST,
+ &viommu_id);
+ test_cmd_hwpt_alloc_nested(self->device_id, viommu_id,
+ IOMMU_HWPT_ALLOC_PASID,
+ &nested_hwpt_id[2],
+ IOMMU_HWPT_DATA_SELFTEST, &data,
+ sizeof(data));
+
+ test_cmd_hwpt_alloc(self->device_id, self->ioas_id,
+ IOMMU_HWPT_ALLOC_PASID,
+ &s2_hwpt_id);
+
+ /* Attach RID to non-pasid compat domain, */
+ test_cmd_mock_domain_replace(self->stdev_id, parent_hwpt_id);
+ /* then attach to pasid should fail */
+ test_err_pasid_attach(EINVAL, pasid, s2_hwpt_id);
+
+ /* Attach RID to pasid compat domain, */
+ test_cmd_mock_domain_replace(self->stdev_id, s2_hwpt_id);
+ /* then attach to pasid should succeed, */
+ test_cmd_pasid_attach(pasid, nested_hwpt_id[0]);
+ /* but attach RID to non-pasid compat domain should fail now. */
+ test_err_mock_domain_replace(EINVAL, self->stdev_id, parent_hwpt_id);
+ /*
+ * Detach hwpt from pasid 100, and check if the pasid 100
+ * has null domain.
+ */
+ test_cmd_pasid_detach(pasid);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, 0));
+ /* RID is attached to pasid-compat domain, pasid path is not used */
+
+ if (!variant->pasid_capable) {
+ /*
+ * PASID-compatible domain can be used by non-PASID-capable
+ * device.
+ */
+ test_cmd_mock_domain_replace(self->no_pasid_stdev_id, nested_hwpt_id[0]);
+ test_cmd_mock_domain_replace(self->no_pasid_stdev_id, self->ioas_id);
+ /*
+ * Attach hwpt to pasid 100 of non-PASID-capable device,
+ * should fail, no matter domain is pasid-compat or not.
+ */
+ EXPECT_ERRNO(EINVAL,
+ _test_cmd_pasid_attach(self->fd, self->no_pasid_stdev_id,
+ pasid, parent_hwpt_id));
+ EXPECT_ERRNO(EINVAL,
+ _test_cmd_pasid_attach(self->fd, self->no_pasid_stdev_id,
+ pasid, s2_hwpt_id));
+ }
+
+ /*
+ * Attach non pasid compat hwpt to pasid-capable device, should
+ * fail, and have null domain.
+ */
+ test_err_pasid_attach(EINVAL, pasid, parent_hwpt_id);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, 0));
+
+ /*
+ * Attach ioas to pasid 100, should fail, domain should
+ * be null.
+ */
+ test_err_pasid_attach(EINVAL, pasid, self->ioas_id);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, 0));
+
+ /*
+ * Attach the s2_hwpt to pasid 100, should succeed, domain should
+ * be valid.
+ */
+ test_cmd_pasid_attach(pasid, s2_hwpt_id);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, s2_hwpt_id));
+
+ /*
+ * Try attach pasid 100 with another hwpt, should FAIL
+ * as attach does not allow overwrite, use REPLACE instead.
+ */
+ test_err_pasid_attach(EBUSY, pasid, nested_hwpt_id[0]);
+
+ /*
+ * Detach hwpt from pasid 100 for next test, should succeed,
+ * and have null domain.
+ */
+ test_cmd_pasid_detach(pasid);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, 0));
+
+ /*
+ * Attach nested hwpt to pasid 100, should succeed, domain
+ * should be valid.
+ */
+ test_cmd_pasid_attach(pasid, nested_hwpt_id[0]);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, nested_hwpt_id[0]));
+
+ /* Attach to pasid 100 which has been attached, should fail. */
+ test_err_pasid_attach(EBUSY, pasid, nested_hwpt_id[0]);
+
+ /* cleanup pasid 100 */
+ test_cmd_pasid_detach(pasid);
+
+ /* Replace tests */
+
+ pasid = 200;
+ /*
+ * Replace pasid 200 without attaching it, should fail
+ * with -EINVAL.
+ */
+ test_err_pasid_replace(EINVAL, pasid, s2_hwpt_id);
+
+ /*
+ * Attach the s2 hwpt to pasid 200, should succeed, domain should
+ * be valid.
+ */
+ test_cmd_pasid_attach(pasid, s2_hwpt_id);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, s2_hwpt_id));
+
+ /*
+ * Replace pasid 200 with self->ioas_id, should fail
+ * and domain should be the prior s2 hwpt.
+ */
+ test_err_pasid_replace(EINVAL, pasid, self->ioas_id);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, s2_hwpt_id));
+
+ /*
+ * Replace a nested hwpt for pasid 200, should succeed,
+ * and have valid domain.
+ */
+ test_cmd_pasid_replace(pasid, nested_hwpt_id[0]);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, nested_hwpt_id[0]));
+
+ /*
+ * Replace with another nested hwpt for pasid 200, should
+ * succeed, and have valid domain.
+ */
+ test_cmd_pasid_replace(pasid, nested_hwpt_id[1]);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, nested_hwpt_id[1]));
+
+ /* cleanup pasid 200 */
+ test_cmd_pasid_detach(pasid);
+
+ /* Negative Tests for pasid replace, use pasid 1024 */
+
+ /*
+ * Attach the s2 hwpt to pasid 1024, should succeed, domain should
+ * be valid.
+ */
+ pasid = 1024;
+ test_cmd_pasid_attach(pasid, s2_hwpt_id);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, s2_hwpt_id));
+
+ /*
+ * Replace pasid 1024 with nested_hwpt_id[0], should fail,
+ * but have the old valid domain. This is a designed
+ * negative case. Normally, this shall succeed.
+ */
+ test_err_pasid_replace(ENOMEM, pasid, nested_hwpt_id[0]);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, s2_hwpt_id));
+
+ /* cleanup pasid 1024 */
+ test_cmd_pasid_detach(pasid);
+
+ /* Attach to iopf-capable hwpt */
+
+ /*
+ * Attach an iopf hwpt to pasid 2048, should succeed, domain should
+ * be valid.
+ */
+ pasid = 2048;
+ test_cmd_pasid_attach(pasid, iopf_hwpt_id);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, iopf_hwpt_id));
+
+ test_cmd_trigger_iopf_pasid(self->device_id, pasid, fault_fd);
+
+ /*
+ * Replace with s2_hwpt_id for pasid 2048, should
+ * succeed, and have valid domain.
+ */
+ test_cmd_pasid_replace(pasid, s2_hwpt_id);
+ ASSERT_EQ(0,
+ test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
+ pasid, s2_hwpt_id));
+
+ /* cleanup pasid 2048 */
+ test_cmd_pasid_detach(pasid);
+
+ test_ioctl_destroy(iopf_hwpt_id);
+ close(fault_fd);
+ test_ioctl_destroy(fault_id);
+
+ /* Detach the s2_hwpt_id from RID */
+ test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id);
+}
+
TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index 99a7f7897bb2..f182e96eccc1 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -209,12 +209,16 @@ FIXTURE(basic_fail_nth)
{
int fd;
uint32_t access_id;
+ uint32_t stdev_id;
+ uint32_t pasid;
};
FIXTURE_SETUP(basic_fail_nth)
{
self->fd = -1;
self->access_id = 0;
+ self->stdev_id = 0;
+ self->pasid = 0; //test should use a non-zero value
}
FIXTURE_TEARDOWN(basic_fail_nth)
@@ -226,6 +230,8 @@ FIXTURE_TEARDOWN(basic_fail_nth)
rc = _test_cmd_destroy_access(self->access_id);
assert(rc == 0);
}
+ if (self->pasid && self->stdev_id)
+ _test_cmd_pasid_detach(self->fd, self->stdev_id, self->pasid);
teardown_iommufd(self->fd, _metadata);
}
@@ -622,9 +628,9 @@ TEST_FAIL_NTH(basic_fail_nth, device)
uint32_t fault_id, fault_fd;
uint32_t veventq_id, veventq_fd;
uint32_t fault_hwpt_id;
+ uint32_t test_hwpt_id;
uint32_t ioas_id;
uint32_t ioas_id2;
- uint32_t stdev_id;
uint32_t idev_id;
uint32_t hwpt_id;
uint32_t viommu_id;
@@ -655,25 +661,29 @@ TEST_FAIL_NTH(basic_fail_nth, device)
fail_nth_enable();
- if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, NULL,
- &idev_id))
+ if (_test_cmd_mock_domain_flags(self->fd, ioas_id,
+ MOCK_FLAGS_DEVICE_PASID,
+ &self->stdev_id, NULL, &idev_id))
return -1;
if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info), NULL))
return -1;
- if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, 0, &hwpt_id,
+ if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0,
+ IOMMU_HWPT_ALLOC_PASID, &hwpt_id,
IOMMU_HWPT_DATA_NONE, 0, 0))
return -1;
- if (_test_cmd_mock_domain_replace(self->fd, stdev_id, ioas_id2, NULL))
+ if (_test_cmd_mock_domain_replace(self->fd, self->stdev_id, ioas_id2, NULL))
return -1;
- if (_test_cmd_mock_domain_replace(self->fd, stdev_id, hwpt_id, NULL))
+ if (_test_cmd_mock_domain_replace(self->fd, self->stdev_id, hwpt_id, NULL))
return -1;
if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0,
- IOMMU_HWPT_ALLOC_NEST_PARENT, &hwpt_id,
+ IOMMU_HWPT_ALLOC_NEST_PARENT |
+ IOMMU_HWPT_ALLOC_PASID,
+ &hwpt_id,
IOMMU_HWPT_DATA_NONE, 0, 0))
return -1;
@@ -699,6 +709,31 @@ TEST_FAIL_NTH(basic_fail_nth, device)
return -1;
close(veventq_fd);
+ if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0,
+ IOMMU_HWPT_ALLOC_PASID,
+ &test_hwpt_id,
+ IOMMU_HWPT_DATA_NONE, 0, 0))
+ return -1;
+
+ /* Tests for pasid attach/replace/detach */
+
+ self->pasid = 200;
+
+ if (_test_cmd_pasid_attach(self->fd, self->stdev_id,
+ self->pasid, hwpt_id)) {
+ self->pasid = 0;
+ return -1;
+ }
+
+ if (_test_cmd_pasid_replace(self->fd, self->stdev_id,
+ self->pasid, test_hwpt_id))
+ return -1;
+
+ if (_test_cmd_pasid_detach(self->fd, self->stdev_id, self->pasid))
+ return -1;
+
+ self->pasid = 0;
+
return 0;
}
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 6f2ba2fa8f76..27794b6f58fc 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -843,14 +843,15 @@ static int _test_ioctl_fault_alloc(int fd, __u32 *fault_id, __u32 *fault_fd)
ASSERT_NE(0, *(fault_fd)); \
})
-static int _test_cmd_trigger_iopf(int fd, __u32 device_id, __u32 fault_fd)
+static int _test_cmd_trigger_iopf(int fd, __u32 device_id, __u32 pasid,
+ __u32 fault_fd)
{
struct iommu_test_cmd trigger_iopf_cmd = {
.size = sizeof(trigger_iopf_cmd),
.op = IOMMU_TEST_OP_TRIGGER_IOPF,
.trigger_iopf = {
.dev_id = device_id,
- .pasid = 0x1,
+ .pasid = pasid,
.grpid = 0x2,
.perm = IOMMU_PGFAULT_PERM_READ | IOMMU_PGFAULT_PERM_WRITE,
.addr = 0xdeadbeaf,
@@ -881,7 +882,10 @@ static int _test_cmd_trigger_iopf(int fd, __u32 device_id, __u32 fault_fd)
}
#define test_cmd_trigger_iopf(device_id, fault_fd) \
- ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, fault_fd))
+ ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, 0x1, fault_fd))
+#define test_cmd_trigger_iopf_pasid(device_id, pasid, fault_fd) \
+ ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, \
+ pasid, fault_fd))
static int _test_cmd_viommu_alloc(int fd, __u32 device_id, __u32 hwpt_id,
__u32 type, __u32 flags, __u32 *viommu_id)
@@ -1051,3 +1055,90 @@ static int _test_cmd_read_vevents(int fd, __u32 event_fd, __u32 nvevents,
EXPECT_ERRNO(_errno, \
_test_cmd_read_vevents(self->fd, event_fd, nvevents, \
virt_id, prev_seq))
+
+static int _test_cmd_pasid_attach(int fd, __u32 stdev_id, __u32 pasid,
+ __u32 pt_id)
+{
+ struct iommu_test_cmd test_attach = {
+ .size = sizeof(test_attach),
+ .op = IOMMU_TEST_OP_PASID_ATTACH,
+ .id = stdev_id,
+ .pasid_attach = {
+ .pasid = pasid,
+ .pt_id = pt_id,
+ },
+ };
+
+ return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_ATTACH),
+ &test_attach);
+}
+
+#define test_cmd_pasid_attach(pasid, hwpt_id) \
+ ASSERT_EQ(0, _test_cmd_pasid_attach(self->fd, self->stdev_id, \
+ pasid, hwpt_id))
+
+#define test_err_pasid_attach(_errno, pasid, hwpt_id) \
+ EXPECT_ERRNO(_errno, \
+ _test_cmd_pasid_attach(self->fd, self->stdev_id, \
+ pasid, hwpt_id))
+
+static int _test_cmd_pasid_replace(int fd, __u32 stdev_id, __u32 pasid,
+ __u32 pt_id)
+{
+ struct iommu_test_cmd test_replace = {
+ .size = sizeof(test_replace),
+ .op = IOMMU_TEST_OP_PASID_REPLACE,
+ .id = stdev_id,
+ .pasid_replace = {
+ .pasid = pasid,
+ .pt_id = pt_id,
+ },
+ };
+
+ return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_REPLACE),
+ &test_replace);
+}
+
+#define test_cmd_pasid_replace(pasid, hwpt_id) \
+ ASSERT_EQ(0, _test_cmd_pasid_replace(self->fd, self->stdev_id, \
+ pasid, hwpt_id))
+
+#define test_err_pasid_replace(_errno, pasid, hwpt_id) \
+ EXPECT_ERRNO(_errno, \
+ _test_cmd_pasid_replace(self->fd, self->stdev_id, \
+ pasid, hwpt_id))
+
+static int _test_cmd_pasid_detach(int fd, __u32 stdev_id, __u32 pasid)
+{
+ struct iommu_test_cmd test_detach = {
+ .size = sizeof(test_detach),
+ .op = IOMMU_TEST_OP_PASID_DETACH,
+ .id = stdev_id,
+ .pasid_detach = {
+ .pasid = pasid,
+ },
+ };
+
+ return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_DETACH),
+ &test_detach);
+}
+
+#define test_cmd_pasid_detach(pasid) \
+ ASSERT_EQ(0, _test_cmd_pasid_detach(self->fd, self->stdev_id, pasid))
+
+static int test_cmd_pasid_check_hwpt(int fd, __u32 stdev_id, __u32 pasid,
+ __u32 hwpt_id)
+{
+ struct iommu_test_cmd test_pasid_check = {
+ .size = sizeof(test_pasid_check),
+ .op = IOMMU_TEST_OP_PASID_CHECK_HWPT,
+ .id = stdev_id,
+ .pasid_check = {
+ .pasid = pasid,
+ .hwpt_id = hwpt_id,
+ },
+ };
+
+ return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_CHECK_HWPT),
+ &test_pasid_check);
+}
--
2.34.1
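
The attach/replace/detach semantics that the selftest above exercises can be
modelled in a few lines of userspace C. This is only a sketch under stated
assumptions: the fixed-size table and the model_* names are hypothetical, and
the functions return negative errno values directly rather than going through
ioctl(). Only the expected error codes are taken from the test itself: EBUSY
when attaching to a PASID that already has a hwpt, EINVAL when replacing on a
PASID that was never attached, and a hwpt id of 0 meaning "no domain".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_MAX_PASID 4096 /* hypothetical bound, illustration only */

/* 0 means "no hwpt attached", matching the selftest's check against 0. */
static uint32_t model_hwpt[MODEL_MAX_PASID];

/* Attach never overwrites: an occupied PASID yields -EBUSY. */
static int model_attach(uint32_t pasid, uint32_t hwpt_id)
{
	if (pasid >= MODEL_MAX_PASID || !hwpt_id)
		return -EINVAL;
	if (model_hwpt[pasid])
		return -EBUSY;
	model_hwpt[pasid] = hwpt_id;
	return 0;
}

/* Replace requires an existing attachment: otherwise -EINVAL. */
static int model_replace(uint32_t pasid, uint32_t hwpt_id)
{
	if (pasid >= MODEL_MAX_PASID || !hwpt_id)
		return -EINVAL;
	if (!model_hwpt[pasid])
		return -EINVAL;
	model_hwpt[pasid] = hwpt_id;
	return 0;
}

/* Detach clears the slot; detaching an unattached PASID is a no-op here. */
static void model_detach(uint32_t pasid)
{
	if (pasid < MODEL_MAX_PASID)
		model_hwpt[pasid] = 0;
}

/* Returns the hwpt currently attached to the PASID, or 0 if none. */
static uint32_t model_current(uint32_t pasid)
{
	assert(pasid < MODEL_MAX_PASID);
	return model_hwpt[pasid];
}
```

The "Replace tests" part of the selftest maps onto this model directly:
replace on pasid 200 fails before any attach, succeeds after one, and a
failed operation leaves the previously attached hwpt in place.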
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v11 00/18] iommufd support pasid attach/replace
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (17 preceding siblings ...)
2025-03-21 17:19 ` [PATCH v11 18/18] iommufd/selftest: Add coverage for iommufd " Yi Liu
@ 2025-03-21 17:30 ` Nicolin Chen
2025-03-21 19:24 ` Nicolin Chen
2025-03-25 13:24 ` Jason Gunthorpe
19 siblings, 1 reply; 24+ messages in thread
From: Nicolin Chen @ 2025-03-21 17:30 UTC (permalink / raw)
To: Yi Liu; +Cc: kevin.tian, jgg, joro, baolu.lu, iommu
On Fri, Mar 21, 2025 at 10:19:22AM -0700, Yi Liu wrote:
> PASID (Process Address Space ID) is a PCIe extension that tags the DMA
> transactions from a physical device. Most modern IOMMU hardware supports
> PASID-granular address translation. This allows a PASID-capable device
> to be attached to multiple hardware page tables (hwpts, also known as
> domains), with each attachment tagged by a PASID.
>
> This series builds on previous series [1]. It begins by adding a missing
> IOMMU API to replace the domain for a PASID. Utilizing the IOMMU PASID
> attach/replace/detach APIs, this series introduces iommufd APIs for device
> drivers to attach, replace, or detach PASIDs to/from hwpts at the request
> of userspace. It also enforces PASID compatibility with domain requirements,
> allocates PASID-compatible hwpts in iommufd, and includes self-tests to
> validate the iommufd APIs.
>
> The complete code is available at the following link [2]. Please note that
> the existing iommufd self-test was broken, and a temporary fix patch is at
> the top of the branch [2]. If you wish to run the iommufd self-test, please
> apply that fix. We apologize for any inconvenience.
>
> The series is based on Jason's for-next branch.
>
> https://web.git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/commit/?h=for-next&id=e009e088d88e8402539f9595b10c0014125a70c1
>
> [1] https://lore.kernel.org/linux-iommu/20250226011849.5102-1-yi.l.liu@intel.com/
> [2] https://github.com/yiliu1765/iommufd/tree/iommufd_pasid
> [3] https://lore.kernel.org/linux-iommu/20250306034842.5950-1-yi.l.liu@intel.com/
>
> Change log:
>
> v11:
> - Handle is always valid for the replace API, hence drop some meaningless check (Nic/Baolu)
> - Avoid inline helpers in .c file patch 02 and 06 of v10 (Baolu)
> - Use xa_load() instead of xa_cmpxchg() in iommufd_device_do_replace() (Jason/Nic)
> - Fix a memleak in patch 11 of v10; it was due to a broken ordering (Nic/Jason)
> - Make the auto_hwpts always be non-pasid-compat to avoid confusion between
> the RID and PASID path
> - Add pasid-num-bit as 0 for non-pasid-capable mock device (Nic)
> - Misc tweaks to the patch 17 and 18 (Nic)
> - More r-b tags
I am running some sanity tests on this v11 using the iommufd_pasid
branch, and will give a Tested-by in the next couple of hours
so long as everything works fine.
I sent a couple of replies today but I think they are trivial
and can be cleaned up later if we really want to.
Thanks
Nicolin
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v11 00/18] iommufd support pasid attach/replace
2025-03-21 17:30 ` [PATCH v11 00/18] iommufd support pasid attach/replace Nicolin Chen
@ 2025-03-21 19:24 ` Nicolin Chen
0 siblings, 0 replies; 24+ messages in thread
From: Nicolin Chen @ 2025-03-21 19:24 UTC (permalink / raw)
To: Yi Liu; +Cc: kevin.tian, jgg, joro, baolu.lu, iommu
On Fri, Mar 21, 2025 at 10:30:12AM -0700, Nicolin Chen wrote:
> On Fri, Mar 21, 2025 at 10:19:22AM -0700, Yi Liu wrote:
> > [2] https://github.com/yiliu1765/iommufd/tree/iommufd_pasid
> >
> > Change log:
> >
> > v11:
> > - Handle is always valid for the replace API, hence drop some meaningless check (Nic/Baolu)
> > - Avoid inline helpers in .c file patch 02 and 06 of v10 (Baolu)
> > - Use xa_load() instead of xa_cmpxchg() in iommufd_device_do_replace() (Jason/Nic)
> > - Fix a memleak in patch 11 of v10; it was due to a broken ordering (Nic/Jason)
> > - Make the auto_hwpts always be non-pasid-compat to avoid confusion between
> > the RID and PASID path
> > - Add pasid-num-bit as 0 for non-pasid-capable mock device (Nic)
> > - Misc tweaks to the patch 17 and 18 (Nic)
> > - More r-b tags
>
> I am running some sanity tests on this v11 using the iommufd_pasid
> branch, and will give a Tested-by in the next couple of hours
> so long as everything works fine.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Ran selftest and nested SMMU on ARM64.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v11 00/18] iommufd support pasid attach/replace
2025-03-21 17:19 [PATCH v11 00/18] iommufd support pasid attach/replace Yi Liu
` (18 preceding siblings ...)
2025-03-21 17:30 ` [PATCH v11 00/18] iommufd support pasid attach/replace Nicolin Chen
@ 2025-03-25 13:24 ` Jason Gunthorpe
19 siblings, 0 replies; 24+ messages in thread
From: Jason Gunthorpe @ 2025-03-25 13:24 UTC (permalink / raw)
To: Yi Liu; +Cc: kevin.tian, joro, baolu.lu, iommu, nicolinc
On Fri, Mar 21, 2025 at 10:19:22AM -0700, Yi Liu wrote:
> Yi Liu (18):
> iommu: Require passing new handles to APIs supporting handle
> iommu: Introduce a replace API for device pasid
> iommufd: Pass @pasid through the device attach/replace path
> iommufd/device: Only add reserved_iova in non-pasid path
> iommufd/device: Replace idev->igroup with local variable
> iommufd/device: Add helper to detect the first attach of a group
> iommufd/device: Wrap igroup->hwpt and igroup->device_list into attach
> struct
> iommufd/device: Replace device_list with device_array
> iommufd/device: Add pasid_attach array to track per-PASID attach
> iommufd: Enforce PASID-compatible domain in PASID path
> iommufd: Support pasid attach/replace
> iommufd: Enforce PASID-compatible domain for RID
> iommu/vt-d: Add IOMMU_HWPT_ALLOC_PASID support
> iommufd: Allow allocating PASID-compatible domain
> iommufd/selftest: Add set_dev_pasid in mock iommu
> iommufd/selftest: Add a helper to get test device
> iommufd/selftest: Add test ops to test pasid attach/detach
> iommufd/selftest: Add coverage for iommufd pasid attach/detach
Applied to iommufd - it has been in linux-next already for four days
Thanks,
Jason
^ permalink raw reply [flat|nested] 24+ messages in thread