linux-kernel.vger.kernel.org archive mirror
* [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind
@ 2025-07-15  6:32 Xu Yilun
  2025-07-15  6:32 ` [PATCH v5 1/8] iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice Xu Yilun
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Xu Yilun @ 2025-07-15  6:32 UTC (permalink / raw)
  To: jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: iommu, linux-kernel, joro, robin.murphy, shuah, nicolinc, aik,
	dan.j.williams, baolu.lu, yilun.xu

This series solves the lifecycle issue that a vdevice may outlive its
idevice. It is a prerequisite for TIO: it ensures that extra secure
configurations (e.g. TSM Bind/Unbind) against the vdevice can be rolled
back on idevice unbind, so that VFIO can still work on the physical
device without surprises.

Changelog:
v5:
 - Further rebase to iommufd for-next 601b1d0d9395
 - Keep the xa_empty() check in iommufd_fops_release(), update comments
 - Move the *idev next to *viommu for struct iommufd_vdevice
 - Update the description about IOMMUFD_CMD_VDEVICE_ALLOC for lifecycle
 - Remove Baolu's tag for patch 4 because of big changes since v3
 - Add changelog about idev->destroying
 - Adjust line wrappings for tools/testing/selftests/iommu/iommufd.c
 - Clarify that there is no testing for tombstoned ID repurposing.
 - Add review tags.

v4: https://lore.kernel.org/linux-iommu/20250709040234.1773573-1-yilun.xu@linux.intel.com/
 - Rebase to iommufd for-next.
 - A new patch to roll back iommufd_object_alloc_ucmd() for vdevice.
 - Fix the mistake trying to xa_destroy ictx->groups on
   iommufd_fops_release().
 - Move 'empty' flag inside destroy loop for iommufd_fops_release().
 - Refactor vdev/idev destroy syncing.
   - Drop the iommufd_vdevice_abort() reentrant idea.
   - A new patch that adds pre_destroy() op.
   - Hold short term reference during the whole vdevice's lifecycle.
   - Wait on short term reference on idev's pre_destroy().
   - Add a 'destroying' flag for idev to prevent new reference after
     pre_destroy().
 - Rephrase/fix some comments.
 - Add review tags.

v3: https://lore.kernel.org/linux-iommu/20250627033809.1730752-1-yilun.xu@linux.intel.com/
 - Don't bother cleaning each tombstone in iommufd_fops_release().
 - Drop vdev->ictx initialization fix patch.
 - Optimize control flow in iommufd_device_remove_vdev().
 - Make iommufd_vdevice_abort() reentrant.
 - Call iommufd_vdevice_abort() directly instead of waiting for it.
 - Rephrase/fix some comments.
 - A new patch to remove vdev->dev.
 - A new patch to explicitly skip existing viommu tests for no_iommu.
 - Also skip vdevice tombstone test for no_iommu.
 - Allow me to add SoB from Aneesh.

v2: https://lore.kernel.org/linux-iommu/20250623094946.1714996-1-yilun.xu@linux.intel.com/

v1/rfc: https://lore.kernel.org/linux-iommu/20250610065146.1321816-1-aneesh.kumar@kernel.org/

The series is based on iommufd for-next


Xu Yilun (8):
  iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice
  iommufd: Add iommufd_object_tombstone_user() helper
  iommufd: Add a pre_destroy() op for objects
  iommufd: Destroy vdevice on idevice destroy
  iommufd/vdevice: Remove struct device reference from struct vdevice
  iommufd/selftest: Explicitly skip tests for inapplicable variant
  iommufd/selftest: Add coverage for vdevice tombstone
  iommufd: Rename some shortterm-related identifiers

 .../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c    |   2 +-
 drivers/iommu/iommufd/device.c                |  51 +++
 drivers/iommu/iommufd/driver.c                |   4 +-
 drivers/iommu/iommufd/iommufd_private.h       |  49 ++-
 drivers/iommu/iommufd/main.c                  |  69 +++-
 drivers/iommu/iommufd/viommu.c                |  69 +++-
 include/linux/iommufd.h                       |  10 +-
 include/uapi/linux/iommufd.h                  |   5 +
 tools/testing/selftests/iommu/iommufd.c       | 379 +++++++++---------
 9 files changed, 407 insertions(+), 231 deletions(-)


base-commit: 601b1d0d9395c711383452bd0d47037afbbb4bcf
-- 
2.25.1



* [PATCH v5 1/8] iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice
  2025-07-15  6:32 [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Xu Yilun
@ 2025-07-15  6:32 ` Xu Yilun
  2025-07-15  6:32 ` [PATCH v5 2/8] iommufd: Add iommufd_object_tombstone_user() helper Xu Yilun
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Xu Yilun @ 2025-07-15  6:32 UTC (permalink / raw)
  To: jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: iommu, linux-kernel, joro, robin.murphy, shuah, nicolinc, aik,
	dan.j.williams, baolu.lu, yilun.xu

To solve the vdevice lifecycle issue, later patches in this series
protect the vdevice allocation with a lock, which makes
_iommufd_object_alloc_ucmd() unsuitable for vdevice. In preparation,
roll back to _iommufd_object_alloc().

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 drivers/iommu/iommufd/viommu.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 91339f799916..dcf8a85b9f6e 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -167,8 +167,8 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 		vdev_size = viommu->ops->vdevice_size;
 	}
 
-	vdev = (struct iommufd_vdevice *)_iommufd_object_alloc_ucmd(
-		ucmd, vdev_size, IOMMUFD_OBJ_VDEVICE);
+	vdev = (struct iommufd_vdevice *)_iommufd_object_alloc(
+		ucmd->ictx, vdev_size, IOMMUFD_OBJ_VDEVICE);
 	if (IS_ERR(vdev)) {
 		rc = PTR_ERR(vdev);
 		goto out_put_idev;
@@ -183,18 +183,24 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 	curr = xa_cmpxchg(&viommu->vdevs, virt_id, NULL, vdev, GFP_KERNEL);
 	if (curr) {
 		rc = xa_err(curr) ?: -EEXIST;
-		goto out_put_idev;
+		goto out_abort;
 	}
 
 	if (viommu->ops && viommu->ops->vdevice_init) {
 		rc = viommu->ops->vdevice_init(vdev);
 		if (rc)
-			goto out_put_idev;
+			goto out_abort;
 	}
 
 	cmd->out_vdevice_id = vdev->obj.id;
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+	if (rc)
+		goto out_abort;
+	iommufd_object_finalize(ucmd->ictx, &vdev->obj);
+	goto out_put_idev;
 
+out_abort:
+	iommufd_object_abort_and_destroy(ucmd->ictx, &vdev->obj);
 out_put_idev:
 	iommufd_put_object(ucmd->ictx, &idev->obj);
 out_put_viommu:
-- 
2.25.1



* [PATCH v5 2/8] iommufd: Add iommufd_object_tombstone_user() helper
  2025-07-15  6:32 [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Xu Yilun
  2025-07-15  6:32 ` [PATCH v5 1/8] iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice Xu Yilun
@ 2025-07-15  6:32 ` Xu Yilun
  2025-07-15  6:32 ` [PATCH v5 3/8] iommufd: Add a pre_destroy() op for objects Xu Yilun
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Xu Yilun @ 2025-07-15  6:32 UTC (permalink / raw)
  To: jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: iommu, linux-kernel, joro, robin.murphy, shuah, nicolinc, aik,
	dan.j.williams, baolu.lu, yilun.xu

Add the iommufd_object_tombstone_user() helper, which allows the caller
to destroy an iommufd object created by userspace.

This is useful on some destroy paths when the kernel caller finds that
the object should have been removed by userspace but is still alive.
With this helper, the caller destroys the object but leaves the object
ID reserved (a so-called tombstone). The tombstone prevents repurposing
the object ID without awareness of the original user.

Since this only happens on abnormal userspace behavior, for simplicity
the tombstoned object ID is leaked until iommufd_fops_release(), i.e.
the original user gets an error when calling ioctl(IOMMU_DESTROY) on
that ID.

The first use case would be to ensure the iommufd_vdevice can't outlive
the associated iommufd_device.
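
For readers new to the idea, the tombstone behavior can be sketched in
plain userspace C. This is only a simplified model, not the iommufd
code: the fixed-size table stands in for the xarray, and id_table,
id_alloc(), id_lookup(), id_remove() and TOMBSTONE are all hypothetical
names made up for illustration.

```c
#include <stddef.h>

#define TABLE_SIZE 8

/* Hypothetical sentinel marking a reserved (tombstoned) ID. */
static int tombstone_marker;
#define TOMBSTONE ((void *)&tombstone_marker)

struct id_table {
	void *slot[TABLE_SIZE];
};

/* Store obj in the first free slot; tombstoned slots are never reused. */
static int id_alloc(struct id_table *t, void *obj)
{
	for (int id = 0; id < TABLE_SIZE; id++) {
		if (!t->slot[id]) {
			t->slot[id] = obj;
			return id;
		}
	}
	return -1;
}

/* Look up a live object; a tombstone reads back as "not found". */
static void *id_lookup(const struct id_table *t, int id)
{
	if (id < 0 || id >= TABLE_SIZE || t->slot[id] == TOMBSTONE)
		return NULL;
	return t->slot[id];
}

/* Remove a live object, optionally leaving its ID reserved. */
static int id_remove(struct id_table *t, int id, int tombstone)
{
	if (!id_lookup(t, id))
		return -1;
	t->slot[id] = tombstone ? TOMBSTONE : NULL;
	return 0;
}
```

In this model, once an ID is removed with the tombstone flag set, a
later id_remove() on the same ID fails and id_alloc() hands out a
different ID, which mirrors the intent described above: the original
user sees an error instead of silently operating on a repurposed ID.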

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Co-developed-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 drivers/iommu/iommufd/iommufd_private.h | 23 ++++++++++++++++++++++-
 drivers/iommu/iommufd/main.c            | 24 +++++++++++++++++++++++-
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index cd14163abdd1..149545060029 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -202,7 +202,8 @@ void iommufd_object_finalize(struct iommufd_ctx *ictx,
 			     struct iommufd_object *obj);
 
 enum {
-	REMOVE_WAIT_SHORTTERM = 1,
+	REMOVE_WAIT_SHORTTERM	= BIT(0),
+	REMOVE_OBJ_TOMBSTONE	= BIT(1),
 };
 int iommufd_object_remove(struct iommufd_ctx *ictx,
 			  struct iommufd_object *to_destroy, u32 id,
@@ -228,6 +229,26 @@ static inline void iommufd_object_destroy_user(struct iommufd_ctx *ictx,
 	WARN_ON(ret);
 }
 
+/*
+ * Similar to iommufd_object_destroy_user(), except that the object ID is left
+ * reserved/tombstoned.
+ */
+static inline void iommufd_object_tombstone_user(struct iommufd_ctx *ictx,
+						 struct iommufd_object *obj)
+{
+	int ret;
+
+	ret = iommufd_object_remove(ictx, obj, obj->id,
+				    REMOVE_WAIT_SHORTTERM | REMOVE_OBJ_TOMBSTONE);
+
+	/*
+	 * If there is a bug and we couldn't destroy the object then we did put
+	 * back the caller's users refcount and will eventually try to free it
+	 * again during close.
+	 */
+	WARN_ON(ret);
+}
+
 /*
  * The HWPT allocated by autodomains is used in possibly many devices and
  * is automatically destroyed when its refcount reaches zero.
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 69c2195e77ca..71135f0ec72d 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -225,7 +225,7 @@ int iommufd_object_remove(struct iommufd_ctx *ictx,
 		goto err_xa;
 	}
 
-	xas_store(&xas, NULL);
+	xas_store(&xas, (flags & REMOVE_OBJ_TOMBSTONE) ? XA_ZERO_ENTRY : NULL);
 	if (ictx->vfio_ioas == container_of(obj, struct iommufd_ioas, obj))
 		ictx->vfio_ioas = NULL;
 	xa_unlock(&ictx->objects);
@@ -311,19 +311,41 @@ static int iommufd_fops_release(struct inode *inode, struct file *filp)
 	while (!xa_empty(&ictx->objects)) {
 		unsigned int destroyed = 0;
 		unsigned long index;
+		bool empty = true;
 
+		/*
+		 * We can't use xa_empty() to end the loop as the tombstones
+		 * are stored as XA_ZERO_ENTRY in the xarray. However
+		 * xa_for_each() automatically converts them to NULL and skips
+		 * them causing xa_empty() to be kept false. Thus once
+		 * xa_for_each() finds no further !NULL entries the loop is
+		 * done.
+		 */
 		xa_for_each(&ictx->objects, index, obj) {
+			empty = false;
 			if (!refcount_dec_if_one(&obj->users))
 				continue;
+
 			destroyed++;
 			xa_erase(&ictx->objects, index);
 			iommufd_object_ops[obj->type].destroy(obj);
 			kfree(obj);
 		}
+
+		if (empty)
+			break;
+
 		/* Bug related to users refcount */
 		if (WARN_ON(!destroyed))
 			break;
 	}
+
+	/*
+	 * There may be some tombstones left over from
+	 * iommufd_object_tombstone_user()
+	 */
+	xa_destroy(&ictx->objects);
+
 	WARN_ON(!xa_empty(&ictx->groups));
 
 	mutex_destroy(&ictx->sw_msi_lock);
-- 
2.25.1



* [PATCH v5 3/8] iommufd: Add a pre_destroy() op for objects
  2025-07-15  6:32 [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Xu Yilun
  2025-07-15  6:32 ` [PATCH v5 1/8] iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice Xu Yilun
  2025-07-15  6:32 ` [PATCH v5 2/8] iommufd: Add iommufd_object_tombstone_user() helper Xu Yilun
@ 2025-07-15  6:32 ` Xu Yilun
  2025-07-15 13:19   ` Jason Gunthorpe
  2025-07-15  6:32 ` [PATCH v5 4/8] iommufd: Destroy vdevice on idevice destroy Xu Yilun
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Xu Yilun @ 2025-07-15  6:32 UTC (permalink / raw)
  To: jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: iommu, linux-kernel, joro, robin.murphy, shuah, nicolinc, aik,
	dan.j.williams, baolu.lu, yilun.xu

Add a pre_destroy() op which gives objects a chance to clear their
short term users references before destruction. This op is intended for
objects created by external drivers (e.g. idev), which require
deterministic destruction.

In order to manage the lifecycle of interrelated objects as well as the
deterministic destruction (e.g. vdev can't outlive idev, and idev
destruction can't fail), short term users references are allowed to
outlive an ioctl execution. An immediate use case: vdev holds idev's
short term users reference until vdev destruction completes, and idev
leverages the existing wait_shortterm mechanism to ensure it is
destroyed after vdev.

With this extended usage, the referenced object can no longer just wait
for its references to go away. It needs to actively trigger the
reference removal and prevent new references before waiting. This work
should be implemented in pre_destroy().
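
The ordering can be illustrated with a small single-threaded model in
userspace C. This is a sketch under simplifying assumptions, not the
kernel implementation: struct parent, struct child and the three
functions are hypothetical names, plain ints replace refcount_t, and
the timed wait is replaced by a single check.

```c
#include <errno.h>
#include <stddef.h>

struct parent;

struct child {
	/* Holds one short term reference on the parent while alive. */
	struct parent *parent;
};

struct parent {
	int shortterm_users;
	struct child *child;
	void (*pre_destroy)(struct parent *p);	/* optional hook */
};

/* Destroying the child releases its short term reference. */
static void child_destroy(struct child *c)
{
	c->parent->child = NULL;
	c->parent->shortterm_users--;
}

/* Actively trigger removal of references that would not go away alone. */
static void parent_pre_destroy(struct parent *p)
{
	if (p->child)
		child_destroy(p->child);
}

static int parent_dec_wait_shortterm(struct parent *p)
{
	/* Drop the caller's own reference; done if it was the last one. */
	if (--p->shortterm_users == 0)
		return 0;

	if (p->pre_destroy)
		p->pre_destroy(p);

	/* The real code waits with a timeout here; the model only checks. */
	return p->shortterm_users == 0 ? 0 : -EBUSY;
}
```

Without the hook, the parent in this model would report -EBUSY forever
because nothing else drops the child's reference. With the hook, the
reference is released before the wait, so the destruction succeeds.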

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 drivers/iommu/iommufd/main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 71135f0ec72d..53085d24ce4a 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -23,6 +23,7 @@
 #include "iommufd_test.h"
 
 struct iommufd_object_ops {
+	void (*pre_destroy)(struct iommufd_object *obj);
 	void (*destroy)(struct iommufd_object *obj);
 	void (*abort)(struct iommufd_object *obj);
 };
@@ -160,6 +161,9 @@ static int iommufd_object_dec_wait_shortterm(struct iommufd_ctx *ictx,
 	if (refcount_dec_and_test(&to_destroy->shortterm_users))
 		return 0;
 
+	if (iommufd_object_ops[to_destroy->type].pre_destroy)
+		iommufd_object_ops[to_destroy->type].pre_destroy(to_destroy);
+
 	if (wait_event_timeout(ictx->destroy_wait,
 			       refcount_read(&to_destroy->shortterm_users) == 0,
 			       msecs_to_jiffies(60000)))
-- 
2.25.1



* [PATCH v5 4/8] iommufd: Destroy vdevice on idevice destroy
  2025-07-15  6:32 [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Xu Yilun
                   ` (2 preceding siblings ...)
  2025-07-15  6:32 ` [PATCH v5 3/8] iommufd: Add a pre_destroy() op for objects Xu Yilun
@ 2025-07-15  6:32 ` Xu Yilun
  2025-07-15 13:37   ` Jason Gunthorpe
  2025-07-15  6:32 ` [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice Xu Yilun
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Xu Yilun @ 2025-07-15  6:32 UTC (permalink / raw)
  To: jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: iommu, linux-kernel, joro, robin.murphy, shuah, nicolinc, aik,
	dan.j.williams, baolu.lu, yilun.xu

Destroy iommufd_vdevice (vdev) on iommufd_idevice (idev) destruction so
that vdev can't outlive idev.

idev represents the physical device bound to iommufd, while the vdev
represents the virtual instance of the physical device in the VM. The
lifecycle of the vdev should not be longer than that of the idev. This
doesn't cause a real problem in existing use cases because vdev doesn't
impact the physical device and only provides virtualization
information. But to extend vdev for Confidential Computing (CC), secure
configuration is needed for the vdev, e.g. TSM Bind/Unbind. These
configurations should be rolled back on idev destroy, or the external
driver (VFIO) functionality may be impacted.

The idev is created by an external driver so its destruction can't fail.
The idev implements pre_destroy() op to actively remove its associated
vdev before destroying itself. There are 3 cases on idev pre_destroy():

  1. vdev is already destroyed by userspace. No extra handling needed.
  2. vdev is still alive. Use iommufd_object_tombstone_user() to
     destroy vdev and tombstone the vdev ID.
  3. vdev is being destroyed by userspace. The vdev ID is already
     freed, but the vdev destroy handler has not completed. This
     requires synchronization across threads: vdev holds idev's short
     term users reference until vdev destruction completes, and idev
     leverages the existing wait_shortterm mechanism for syncing.

idev should also block any new reference to it after pre_destroy(),
otherwise the subsequent wait on short term users would time out.
Introduce a 'destroying' flag and set it to true in idev pre_destroy().
Any attempt to reference idev should honor this flag under the
protection of idev->igroup->lock.
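
The flag check can be modeled in userspace C roughly as follows. This is
a minimal sketch, assuming a pthread mutex as a stand-in for
idev->igroup->lock. The types idev_model and vdev_model and the
functions vdev_attach() and idev_begin_destroy() are hypothetical names
for illustration; only the -ENOENT and -EEXIST return codes are taken
from the patch below.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct vdev_model;

struct idev_model {
	pthread_mutex_t lock;	/* stands in for idev->igroup->lock */
	bool destroying;
	struct vdev_model *vdev;
};

struct vdev_model {
	struct idev_model *idev;
};

/* Link vdev to idev unless idev is going away or already has a vdev. */
static int vdev_attach(struct idev_model *idev, struct vdev_model *vdev)
{
	int rc = 0;

	pthread_mutex_lock(&idev->lock);
	if (idev->destroying) {
		rc = -ENOENT;
	} else if (idev->vdev) {
		rc = -EEXIST;
	} else {
		vdev->idev = idev;
		idev->vdev = vdev;
	}
	pthread_mutex_unlock(&idev->lock);
	return rc;
}

/* After this returns, no new vdev can take a reference on idev. */
static void idev_begin_destroy(struct idev_model *idev)
{
	pthread_mutex_lock(&idev->lock);
	idev->destroying = true;
	pthread_mutex_unlock(&idev->lock);
}
```

Because the flag is both set and tested under the same lock, an attach
either completes before destruction starts (and is then cleaned up by
it) or observes the flag and fails. There is no window in which a new
reference is taken after the flag is set.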

Originally-by: Nicolin Chen <nicolinc@nvidia.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 drivers/iommu/iommufd/device.c          | 51 ++++++++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h | 12 ++++++
 drivers/iommu/iommufd/main.c            |  2 +
 drivers/iommu/iommufd/viommu.c          | 52 +++++++++++++++++++++++--
 include/linux/iommufd.h                 |  1 +
 include/uapi/linux/iommufd.h            |  5 +++
 6 files changed, 119 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index e2ba21c43ad2..ee6ff4caf398 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -137,6 +137,57 @@ static struct iommufd_group *iommufd_get_group(struct iommufd_ctx *ictx,
 	}
 }
 
+static void iommufd_device_remove_vdev(struct iommufd_device *idev)
+{
+	struct iommufd_vdevice *vdev;
+
+	mutex_lock(&idev->igroup->lock);
+	/* prevent new references from vdev */
+	idev->destroying = true;
+	/* vdev has been completely destroyed by userspace */
+	if (!idev->vdev)
+		goto out_unlock;
+
+	vdev = iommufd_get_vdevice(idev->ictx, idev->vdev->obj.id);
+	/*
+	 * An ongoing vdev destroy ioctl has removed the vdev from the object
+	 * xarray, but has not finished iommufd_vdevice_destroy() yet as it
+	 * needs the same mutex. We exit the locking then wait on short term
+	 * users for the vdev destruction.
+	 */
+	if (IS_ERR(vdev))
+		goto out_unlock;
+
+	/* Should never happen */
+	if (WARN_ON(vdev != idev->vdev)) {
+		iommufd_put_object(idev->ictx, &vdev->obj);
+		goto out_unlock;
+	}
+
+	/*
+	 * vdev is still alive. Hold a users refcount to prevent racing with
+	 * userspace destruction, then use iommufd_object_tombstone_user() to
+	 * destroy it and leave a tombstone.
+	 */
+	refcount_inc(&vdev->obj.users);
+	iommufd_put_object(idev->ictx, &vdev->obj);
+	mutex_unlock(&idev->igroup->lock);
+	iommufd_object_tombstone_user(idev->ictx, &vdev->obj);
+	return;
+
+out_unlock:
+	mutex_unlock(&idev->igroup->lock);
+}
+
+void iommufd_device_pre_destroy(struct iommufd_object *obj)
+{
+	struct iommufd_device *idev =
+		container_of(obj, struct iommufd_device, obj);
+
+	/* Release the short term users on this */
+	iommufd_device_remove_vdev(idev);
+}
+
 void iommufd_device_destroy(struct iommufd_object *obj)
 {
 	struct iommufd_device *idev =
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 149545060029..5d6ea5395cfe 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -489,6 +489,8 @@ struct iommufd_device {
 	/* always the physical device */
 	struct device *dev;
 	bool enforce_cache_coherency;
+	struct iommufd_vdevice *vdev;
+	bool destroying;
 };
 
 static inline struct iommufd_device *
@@ -499,6 +501,7 @@ iommufd_get_device(struct iommufd_ucmd *ucmd, u32 id)
 			    struct iommufd_device, obj);
 }
 
+void iommufd_device_pre_destroy(struct iommufd_object *obj);
 void iommufd_device_destroy(struct iommufd_object *obj);
 int iommufd_get_hw_info(struct iommufd_ucmd *ucmd);
 
@@ -687,9 +690,18 @@ int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd);
 void iommufd_viommu_destroy(struct iommufd_object *obj);
 int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd);
 void iommufd_vdevice_destroy(struct iommufd_object *obj);
+void iommufd_vdevice_abort(struct iommufd_object *obj);
 int iommufd_hw_queue_alloc_ioctl(struct iommufd_ucmd *ucmd);
 void iommufd_hw_queue_destroy(struct iommufd_object *obj);
 
+static inline struct iommufd_vdevice *
+iommufd_get_vdevice(struct iommufd_ctx *ictx, u32 id)
+{
+	return container_of(iommufd_get_object(ictx, id,
+					       IOMMUFD_OBJ_VDEVICE),
+			    struct iommufd_vdevice, obj);
+}
+
 #ifdef CONFIG_IOMMUFD_TEST
 int iommufd_test(struct iommufd_ucmd *ucmd);
 void iommufd_selftest_destroy(struct iommufd_object *obj);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 53085d24ce4a..99c1aab3d396 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -655,6 +655,7 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
 		.destroy = iommufd_access_destroy_object,
 	},
 	[IOMMUFD_OBJ_DEVICE] = {
+		.pre_destroy = iommufd_device_pre_destroy,
 		.destroy = iommufd_device_destroy,
 	},
 	[IOMMUFD_OBJ_FAULT] = {
@@ -676,6 +677,7 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
 	},
 	[IOMMUFD_OBJ_VDEVICE] = {
 		.destroy = iommufd_vdevice_destroy,
+		.abort = iommufd_vdevice_abort,
 	},
 	[IOMMUFD_OBJ_VEVENTQ] = {
 		.destroy = iommufd_veventq_destroy,
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index dcf8a85b9f6e..ecbae5091ffe 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -110,20 +110,37 @@ int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd)
 	return rc;
 }
 
-void iommufd_vdevice_destroy(struct iommufd_object *obj)
+void iommufd_vdevice_abort(struct iommufd_object *obj)
 {
 	struct iommufd_vdevice *vdev =
 		container_of(obj, struct iommufd_vdevice, obj);
 	struct iommufd_viommu *viommu = vdev->viommu;
+	struct iommufd_device *idev = vdev->idev;
+
+	lockdep_assert_held(&idev->igroup->lock);
 
 	if (vdev->destroy)
 		vdev->destroy(vdev);
 	/* xa_cmpxchg is okay to fail if alloc failed xa_cmpxchg previously */
 	xa_cmpxchg(&viommu->vdevs, vdev->virt_id, vdev, NULL, GFP_KERNEL);
 	refcount_dec(&viommu->obj.users);
+	idev->vdev = NULL;
 	put_device(vdev->dev);
 }
 
+void iommufd_vdevice_destroy(struct iommufd_object *obj)
+{
+	struct iommufd_vdevice *vdev =
+		container_of(obj, struct iommufd_vdevice, obj);
+	struct iommufd_device *idev = vdev->idev;
+	struct iommufd_ctx *ictx = idev->ictx;
+
+	mutex_lock(&idev->igroup->lock);
+	iommufd_vdevice_abort(obj);
+	mutex_unlock(&idev->igroup->lock);
+	iommufd_put_object(ictx, &idev->obj);
+}
+
 int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 {
 	struct iommu_vdevice_alloc *cmd = ucmd->cmd;
@@ -153,6 +170,17 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 		goto out_put_idev;
 	}
 
+	mutex_lock(&idev->igroup->lock);
+	if (idev->destroying) {
+		rc = -ENOENT;
+		goto out_unlock_igroup;
+	}
+
+	if (idev->vdev) {
+		rc = -EEXIST;
+		goto out_unlock_igroup;
+	}
+
 	if (viommu->ops && viommu->ops->vdevice_size) {
 		/*
 		 * It is a driver bug for:
@@ -171,7 +199,7 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 		ucmd->ictx, vdev_size, IOMMUFD_OBJ_VDEVICE);
 	if (IS_ERR(vdev)) {
 		rc = PTR_ERR(vdev);
-		goto out_put_idev;
+		goto out_unlock_igroup;
 	}
 
 	vdev->virt_id = virt_id;
@@ -179,6 +207,19 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 	get_device(idev->dev);
 	vdev->viommu = viommu;
 	refcount_inc(&viommu->obj.users);
+	/*
+	 * A short term users reference is held on the idev so long as we have
+	 * the pointer. iommufd_device_pre_destroy() will revoke it before the
+	 * idev real destruction.
+	 */
+	vdev->idev = idev;
+
+	/*
+	 * iommufd_device_destroy() delays until idev->vdev is NULL before
+	 * freeing the idev, which only happens once the vdev is finished
+	 * destruction.
+	 */
+	idev->vdev = vdev;
 
 	curr = xa_cmpxchg(&viommu->vdevs, virt_id, NULL, vdev, GFP_KERNEL);
 	if (curr) {
@@ -197,12 +238,15 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 	if (rc)
 		goto out_abort;
 	iommufd_object_finalize(ucmd->ictx, &vdev->obj);
-	goto out_put_idev;
+	goto out_unlock_igroup;
 
 out_abort:
 	iommufd_object_abort_and_destroy(ucmd->ictx, &vdev->obj);
+out_unlock_igroup:
+	mutex_unlock(&idev->igroup->lock);
 out_put_idev:
-	iommufd_put_object(ucmd->ictx, &idev->obj);
+	if (rc)
+		iommufd_put_object(ucmd->ictx, &idev->obj);
 out_put_viommu:
 	iommufd_put_object(ucmd->ictx, &viommu->obj);
 	return rc;
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index e3a0cd47384d..b88911026bc4 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -108,6 +108,7 @@ struct iommufd_viommu {
 struct iommufd_vdevice {
 	struct iommufd_object obj;
 	struct iommufd_viommu *viommu;
+	struct iommufd_device *idev;
 	struct device *dev;
 
 	/*
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 554aacf89ea7..c218c89e0e2e 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -1070,6 +1070,11 @@ struct iommu_viommu_alloc {
  *
  * Allocate a virtual device instance (for a physical device) against a vIOMMU.
  * This instance holds the device's information (related to its vIOMMU) in a VM.
+ * User should use IOMMU_DESTROY to destroy the virtual device before
+ * destroying the physical device (by closing vfio_cdev fd). Otherwise the
+ * virtual device would be forcibly destroyed on physical device destruction,
+ * its vdevice_id would be permanently leaked (unremovable & unreusable) until
+ * iommu fd closed.
  */
 struct iommu_vdevice_alloc {
 	__u32 size;
-- 
2.25.1



* [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice
  2025-07-15  6:32 [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Xu Yilun
                   ` (3 preceding siblings ...)
  2025-07-15  6:32 ` [PATCH v5 4/8] iommufd: Destroy vdevice on idevice destroy Xu Yilun
@ 2025-07-15  6:32 ` Xu Yilun
  2025-07-15 13:38   ` Jason Gunthorpe
                     ` (2 more replies)
  2025-07-15  6:32 ` [PATCH v5 6/8] iommufd/selftest: Explicitly skip tests for inapplicable variant Xu Yilun
                   ` (3 subsequent siblings)
  8 siblings, 3 replies; 21+ messages in thread
From: Xu Yilun @ 2025-07-15  6:32 UTC (permalink / raw)
  To: jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: iommu, linux-kernel, joro, robin.murphy, shuah, nicolinc, aik,
	dan.j.williams, baolu.lu, yilun.xu

Remove struct device *dev from struct vdevice.

The dev pointer was the Plan B for vdevice to reference the physical
device. Now that vdev->idev has been added without refcounting
concerns, just use vdev->idev->dev when needed.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 2 +-
 drivers/iommu/iommufd/driver.c                 | 4 ++--
 drivers/iommu/iommufd/viommu.c                 | 3 ---
 include/linux/iommufd.h                        | 1 -
 4 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index eb90af5093d8..8a515987b948 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -1218,7 +1218,7 @@ static void tegra241_vintf_destroy_vsid(struct iommufd_vdevice *vdev)
 
 static int tegra241_vintf_init_vsid(struct iommufd_vdevice *vdev)
 {
-	struct arm_smmu_master *master = dev_iommu_priv_get(vdev->dev);
+	struct arm_smmu_master *master = dev_iommu_priv_get(vdev->idev->dev);
 	struct tegra241_vintf *vintf = viommu_to_vintf(vdev->viommu);
 	struct tegra241_vintf_sid *vsid = vdev_to_vsid(vdev);
 	struct arm_smmu_stream *stream = &master->streams[0];
diff --git a/drivers/iommu/iommufd/driver.c b/drivers/iommu/iommufd/driver.c
index e4eae20bcd4e..df25db6d2eaf 100644
--- a/drivers/iommu/iommufd/driver.c
+++ b/drivers/iommu/iommufd/driver.c
@@ -92,7 +92,7 @@ struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
 	lockdep_assert_held(&viommu->vdevs.xa_lock);
 
 	vdev = xa_load(&viommu->vdevs, vdev_id);
-	return vdev ? vdev->dev : NULL;
+	return vdev ? vdev->idev->dev : NULL;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_viommu_find_dev, "IOMMUFD");
 
@@ -109,7 +109,7 @@ int iommufd_viommu_get_vdev_id(struct iommufd_viommu *viommu,
 
 	xa_lock(&viommu->vdevs);
 	xa_for_each(&viommu->vdevs, index, vdev) {
-		if (vdev->dev == dev) {
+		if (vdev->idev->dev == dev) {
 			*vdev_id = vdev->virt_id;
 			rc = 0;
 			break;
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index ecbae5091ffe..6cf0bd5d8f08 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -125,7 +125,6 @@ void iommufd_vdevice_abort(struct iommufd_object *obj)
 	xa_cmpxchg(&viommu->vdevs, vdev->virt_id, vdev, NULL, GFP_KERNEL);
 	refcount_dec(&viommu->obj.users);
 	idev->vdev = NULL;
-	put_device(vdev->dev);
 }
 
 void iommufd_vdevice_destroy(struct iommufd_object *obj)
@@ -203,8 +202,6 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 	}
 
 	vdev->virt_id = virt_id;
-	vdev->dev = idev->dev;
-	get_device(idev->dev);
 	vdev->viommu = viommu;
 	refcount_inc(&viommu->obj.users);
 	/*
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index b88911026bc4..ecb0c8abd251 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -109,7 +109,6 @@ struct iommufd_vdevice {
 	struct iommufd_object obj;
 	struct iommufd_viommu *viommu;
 	struct iommufd_device *idev;
-	struct device *dev;
 
 	/*
 	 * Virtual device ID per vIOMMU, e.g. vSID of ARM SMMUv3, vDeviceID of
-- 
2.25.1



* [PATCH v5 6/8] iommufd/selftest: Explicitly skip tests for inapplicable variant
  2025-07-15  6:32 [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Xu Yilun
                   ` (4 preceding siblings ...)
  2025-07-15  6:32 ` [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice Xu Yilun
@ 2025-07-15  6:32 ` Xu Yilun
  2025-07-15 19:13   ` Nicolin Chen
  2025-07-15  6:32 ` [PATCH v5 7/8] iommufd/selftest: Add coverage for vdevice tombstone Xu Yilun
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Xu Yilun @ 2025-07-15  6:32 UTC (permalink / raw)
  To: jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: iommu, linux-kernel, joro, robin.murphy, shuah, nicolinc, aik,
	dan.j.williams, baolu.lu, yilun.xu

The no_viommu variant is not applicable to some viommu/vdevice tests.
Explicitly report the skip instead of skipping silently.

Opportunistically adjust the line wrappings after the indentation
changes using git clang-format.

Only add the prints. No functional change intended.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 tools/testing/selftests/iommu/iommufd.c | 365 ++++++++++++------------
 1 file changed, 178 insertions(+), 187 deletions(-)

diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index d59d48022a24..16ea10ea1dbf 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -2779,35 +2779,34 @@ TEST_F(iommufd_viommu, viommu_alloc_nested_iopf)
 	uint32_t fault_fd;
 	uint32_t vdev_id;
 
-	if (self->device_id) {
-		test_ioctl_fault_alloc(&fault_id, &fault_fd);
-		test_err_hwpt_alloc_iopf(
-			ENOENT, dev_id, viommu_id, UINT32_MAX,
-			IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
-			IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data));
-		test_err_hwpt_alloc_iopf(
-			EOPNOTSUPP, dev_id, viommu_id, fault_id,
-			IOMMU_HWPT_FAULT_ID_VALID | (1 << 31), &iopf_hwpt_id,
-			IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data));
-		test_cmd_hwpt_alloc_iopf(
-			dev_id, viommu_id, fault_id, IOMMU_HWPT_FAULT_ID_VALID,
-			&iopf_hwpt_id, IOMMU_HWPT_DATA_SELFTEST, &data,
-			sizeof(data));
+	if (!dev_id)
+		SKIP(return, "Skipping test for variant no_viommu");
 
-		/* Must allocate vdevice before attaching to a nested hwpt */
-		test_err_mock_domain_replace(ENOENT, self->stdev_id,
-					     iopf_hwpt_id);
-		test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
-		test_cmd_mock_domain_replace(self->stdev_id, iopf_hwpt_id);
-		EXPECT_ERRNO(EBUSY,
-			     _test_ioctl_destroy(self->fd, iopf_hwpt_id));
-		test_cmd_trigger_iopf(dev_id, fault_fd);
+	test_ioctl_fault_alloc(&fault_id, &fault_fd);
+	test_err_hwpt_alloc_iopf(ENOENT, dev_id, viommu_id, UINT32_MAX,
+				 IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
+				 IOMMU_HWPT_DATA_SELFTEST, &data,
+				 sizeof(data));
+	test_err_hwpt_alloc_iopf(EOPNOTSUPP, dev_id, viommu_id, fault_id,
+				 IOMMU_HWPT_FAULT_ID_VALID | (1 << 31),
+				 &iopf_hwpt_id, IOMMU_HWPT_DATA_SELFTEST, &data,
+				 sizeof(data));
+	test_cmd_hwpt_alloc_iopf(dev_id, viommu_id, fault_id,
+				 IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
+				 IOMMU_HWPT_DATA_SELFTEST, &data,
+				 sizeof(data));
 
-		test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id);
-		test_ioctl_destroy(iopf_hwpt_id);
-		close(fault_fd);
-		test_ioctl_destroy(fault_id);
-	}
+	/* Must allocate vdevice before attaching to a nested hwpt */
+	test_err_mock_domain_replace(ENOENT, self->stdev_id, iopf_hwpt_id);
+	test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
+	test_cmd_mock_domain_replace(self->stdev_id, iopf_hwpt_id);
+	EXPECT_ERRNO(EBUSY, _test_ioctl_destroy(self->fd, iopf_hwpt_id));
+	test_cmd_trigger_iopf(dev_id, fault_fd);
+
+	test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id);
+	test_ioctl_destroy(iopf_hwpt_id);
+	close(fault_fd);
+	test_ioctl_destroy(fault_id);
 }
 
 TEST_F(iommufd_viommu, viommu_alloc_with_data)
@@ -2902,169 +2901,161 @@ TEST_F(iommufd_viommu, vdevice_cache)
 	uint32_t vdev_id = 0;
 	uint32_t num_inv;
 
-	if (dev_id) {
-		test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
-
-		test_cmd_dev_check_cache_all(dev_id,
-					     IOMMU_TEST_DEV_CACHE_DEFAULT);
-
-		/* Check data_type by passing zero-length array */
-		num_inv = 0;
-		test_cmd_viommu_invalidate(viommu_id, inv_reqs,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
-
-		/* Negative test: Invalid data_type */
-		num_inv = 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST_INVALID,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
-
-		/* Negative test: structure size sanity */
-		num_inv = 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs) + 1, &num_inv);
-		assert(!num_inv);
-
-		num_inv = 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   1, &num_inv);
-		assert(!num_inv);
-
-		/* Negative test: invalid flag is passed */
-		num_inv = 1;
-		inv_reqs[0].flags = 0xffffffff;
-		inv_reqs[0].vdev_id = 0x99;
-		test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
-
-		/* Negative test: invalid data_uptr when array is not empty */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		test_err_viommu_invalidate(EINVAL, viommu_id, NULL,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
-
-		/* Negative test: invalid entry_len when array is not empty */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   0, &num_inv);
-		assert(!num_inv);
-
-		/* Negative test: invalid cache_id */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = MOCK_DEV_CACHE_ID_MAX + 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
+	if (!dev_id)
+		SKIP(return, "Skipping test for variant no_viommu");
 
-		/* Negative test: invalid vdev_id */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x9;
-		inv_reqs[0].cache_id = 0;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
+	test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
+
+	test_cmd_dev_check_cache_all(dev_id, IOMMU_TEST_DEV_CACHE_DEFAULT);
+
+	/* Check data_type by passing zero-length array */
+	num_inv = 0;
+	test_cmd_viommu_invalidate(viommu_id, inv_reqs, sizeof(*inv_reqs),
+				   &num_inv);
+	assert(!num_inv);
+
+	/* Negative test: Invalid data_type */
+	num_inv = 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST_INVALID,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
+
+	/* Negative test: structure size sanity */
+	num_inv = 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs) + 1, &num_inv);
+	assert(!num_inv);
+
+	num_inv = 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, 1,
+				   &num_inv);
+	assert(!num_inv);
+
+	/* Negative test: invalid flag is passed */
+	num_inv = 1;
+	inv_reqs[0].flags = 0xffffffff;
+	inv_reqs[0].vdev_id = 0x99;
+	test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
+
+	/* Negative test: invalid data_uptr when array is not empty */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	test_err_viommu_invalidate(EINVAL, viommu_id, NULL,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
+
+	/* Negative test: invalid entry_len when array is not empty */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, 0,
+				   &num_inv);
+	assert(!num_inv);
+
+	/* Negative test: invalid cache_id */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = MOCK_DEV_CACHE_ID_MAX + 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
+
+	/* Negative test: invalid vdev_id */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x9;
+	inv_reqs[0].cache_id = 0;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
 
-		/*
-		 * Invalidate the 1st cache entry but fail the 2nd request
-		 * due to invalid flags configuration in the 2nd request.
-		 */
-		num_inv = 2;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = 0;
-		inv_reqs[1].flags = 0xffffffff;
-		inv_reqs[1].vdev_id = 0x99;
-		inv_reqs[1].cache_id = 1;
-		test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 1);
-		test_cmd_dev_check_cache(dev_id, 0, 0);
-		test_cmd_dev_check_cache(dev_id, 1,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 2,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 3,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-
-		/*
-		 * Invalidate the 1st cache entry but fail the 2nd request
-		 * due to invalid cache_id configuration in the 2nd request.
-		 */
-		num_inv = 2;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = 0;
-		inv_reqs[1].flags = 0;
-		inv_reqs[1].vdev_id = 0x99;
-		inv_reqs[1].cache_id = MOCK_DEV_CACHE_ID_MAX + 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 1);
-		test_cmd_dev_check_cache(dev_id, 0, 0);
-		test_cmd_dev_check_cache(dev_id, 1,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 2,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 3,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-
-		/* Invalidate the 2nd cache entry and verify */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = 1;
-		test_cmd_viommu_invalidate(viommu_id, inv_reqs,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 1);
-		test_cmd_dev_check_cache(dev_id, 0, 0);
-		test_cmd_dev_check_cache(dev_id, 1, 0);
-		test_cmd_dev_check_cache(dev_id, 2,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 3,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-
-		/* Invalidate the 3rd and 4th cache entries and verify */
-		num_inv = 2;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = 2;
-		inv_reqs[1].flags = 0;
-		inv_reqs[1].vdev_id = 0x99;
-		inv_reqs[1].cache_id = 3;
-		test_cmd_viommu_invalidate(viommu_id, inv_reqs,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 2);
-		test_cmd_dev_check_cache_all(dev_id, 0);
+	/*
+	 * Invalidate the 1st cache entry but fail the 2nd request
+	 * due to invalid flags configuration in the 2nd request.
+	 */
+	num_inv = 2;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = 0;
+	inv_reqs[1].flags = 0xffffffff;
+	inv_reqs[1].vdev_id = 0x99;
+	inv_reqs[1].cache_id = 1;
+	test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(num_inv == 1);
+	test_cmd_dev_check_cache(dev_id, 0, 0);
+	test_cmd_dev_check_cache(dev_id, 1, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 2, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 3, IOMMU_TEST_DEV_CACHE_DEFAULT);
 
-		/* Invalidate all cache entries for nested_dev_id[1] and verify */
-		num_inv = 1;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].flags = IOMMU_TEST_INVALIDATE_FLAG_ALL;
-		test_cmd_viommu_invalidate(viommu_id, inv_reqs,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 1);
-		test_cmd_dev_check_cache_all(dev_id, 0);
-		test_ioctl_destroy(vdev_id);
-	}
+	/*
+	 * Invalidate the 1st cache entry but fail the 2nd request
+	 * due to invalid cache_id configuration in the 2nd request.
+	 */
+	num_inv = 2;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = 0;
+	inv_reqs[1].flags = 0;
+	inv_reqs[1].vdev_id = 0x99;
+	inv_reqs[1].cache_id = MOCK_DEV_CACHE_ID_MAX + 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(num_inv == 1);
+	test_cmd_dev_check_cache(dev_id, 0, 0);
+	test_cmd_dev_check_cache(dev_id, 1, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 2, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 3, IOMMU_TEST_DEV_CACHE_DEFAULT);
+
+	/* Invalidate the 2nd cache entry and verify */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = 1;
+	test_cmd_viommu_invalidate(viommu_id, inv_reqs, sizeof(*inv_reqs),
+				   &num_inv);
+	assert(num_inv == 1);
+	test_cmd_dev_check_cache(dev_id, 0, 0);
+	test_cmd_dev_check_cache(dev_id, 1, 0);
+	test_cmd_dev_check_cache(dev_id, 2, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 3, IOMMU_TEST_DEV_CACHE_DEFAULT);
+
+	/* Invalidate the 3rd and 4th cache entries and verify */
+	num_inv = 2;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = 2;
+	inv_reqs[1].flags = 0;
+	inv_reqs[1].vdev_id = 0x99;
+	inv_reqs[1].cache_id = 3;
+	test_cmd_viommu_invalidate(viommu_id, inv_reqs, sizeof(*inv_reqs),
+				   &num_inv);
+	assert(num_inv == 2);
+	test_cmd_dev_check_cache_all(dev_id, 0);
+
+	/* Invalidate all cache entries for nested_dev_id[1] and verify */
+	num_inv = 1;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].flags = IOMMU_TEST_INVALIDATE_FLAG_ALL;
+	test_cmd_viommu_invalidate(viommu_id, inv_reqs, sizeof(*inv_reqs),
+				   &num_inv);
+	assert(num_inv == 1);
+	test_cmd_dev_check_cache_all(dev_id, 0);
+	test_ioctl_destroy(vdev_id);
 }
 
 TEST_F(iommufd_viommu, hw_queue)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread
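
The difference between the old nested "if (dev_id) { ... }" shape and the
early SKIP() in the patch above can be sketched outside the kselftest
harness. This is only an illustration: enum test_result and both
vdevice_test_*() functions are hypothetical stand-ins, not part of
tools/testing/selftests, and the counter stands in for the real assertions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result codes; the real harness tracks these internally. */
enum test_result { TEST_PASS, TEST_SKIP };

/*
 * Old shape: the whole body sits inside if (dev_id), so a variant with
 * dev_id == 0 runs no checks and is still reported as a pass.
 */
static enum test_result vdevice_test_silent(uint32_t dev_id, int *checks_run)
{
	if (dev_id) {
		(*checks_run)++;	/* stands in for the real assertions */
	}
	return TEST_PASS;
}

/*
 * New shape: bail out first and say why, which is roughly what
 * SKIP(return, "...") does in the patch.
 */
static enum test_result vdevice_test_explicit(uint32_t dev_id, int *checks_run)
{
	if (!dev_id) {
		fprintf(stderr, "SKIP: not applicable to variant no_viommu\n");
		return TEST_SKIP;
	}
	(*checks_run)++;		/* stands in for the real assertions */
	return TEST_PASS;
}
```

With dev_id == 0 the first function reports a pass without running any
check, while the second reports a skip; with a non-zero dev_id both run
the body. The early return also removes one indentation level, which is
where most of the churn in the diff comes from.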

* [PATCH v5 7/8] iommufd/selftest: Add coverage for vdevice tombstone
  2025-07-15  6:32 [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Xu Yilun
                   ` (5 preceding siblings ...)
  2025-07-15  6:32 ` [PATCH v5 6/8] iommufd/selftest: Explicitly skip tests for inapplicable variant Xu Yilun
@ 2025-07-15  6:32 ` Xu Yilun
  2025-07-15 19:03   ` Nicolin Chen
  2025-07-15  6:32 ` [PATCH v5 8/8] iommufd: Rename some shortterm-related identifiers Xu Yilun
  2025-07-15 19:33 ` [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Nicolin Chen
  8 siblings, 1 reply; 21+ messages in thread
From: Xu Yilun @ 2025-07-15  6:32 UTC (permalink / raw)
  To: jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: iommu, linux-kernel, joro, robin.murphy, shuah, nicolinc, aik,
	dan.j.williams, baolu.lu, yilun.xu

Test the flow that tombstones the vdevice when the idevice is unbound
before the vdevice is destroyed. The expected results of the tombstone
are:

 - The vdevice ID can't be reused anymore (not tested in this patch).
 - Even ioctl(IOMMU_DESTROY) can't free the vdevice ID.
 - iommufd_fops_release() can still free everything.

Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 tools/testing/selftests/iommu/iommufd.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 16ea10ea1dbf..6084396f7111 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -3117,6 +3117,20 @@ TEST_F(iommufd_viommu, hw_queue)
 	test_ioctl_ioas_unmap(iova, PAGE_SIZE);
 }
 
+TEST_F(iommufd_viommu, vdevice_tombstone)
+{
+	uint32_t viommu_id = self->viommu_id;
+	uint32_t dev_id = self->device_id;
+	uint32_t vdev_id = 0;
+
+	if (!dev_id)
+		SKIP(return, "Skipping test for variant no_viommu");
+
+	test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
+	test_ioctl_destroy(self->stdev_id);
+	EXPECT_ERRNO(ENOENT, _test_ioctl_destroy(self->fd, vdev_id));
+}
+
 FIXTURE(iommufd_device_pasid)
 {
 	int fd;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread
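
The three expected results listed in the commit message above can be
modelled with a small ID table. This is a toy sketch under stated
assumptions: struct toy_ctx and the toy_*() helpers are hypothetical and
use a plain array, whereas iommufd keeps its objects in an xarray. It is
meant only to show the semantics the selftest checks, not how the kernel
implements them.

```c
#include <assert.h>
#include <errno.h>

#define TOY_MAX_IDS 8

/* TOY_FREE is 0 so a zero-initialized table starts out empty. */
enum toy_slot { TOY_FREE, TOY_LIVE, TOY_TOMBSTONE };

struct toy_ctx {
	enum toy_slot slots[TOY_MAX_IDS];
};

/* Hand out the lowest free ID; tombstoned IDs are never handed out. */
static int toy_alloc(struct toy_ctx *ctx)
{
	for (int id = 0; id < TOY_MAX_IDS; id++) {
		if (ctx->slots[id] == TOY_FREE) {
			ctx->slots[id] = TOY_LIVE;
			return id;
		}
	}
	return -ENOSPC;
}

/* User-requested destroy: only a live object can be found and freed. */
static int toy_destroy(struct toy_ctx *ctx, int id)
{
	if (id < 0 || id >= TOY_MAX_IDS || ctx->slots[id] != TOY_LIVE)
		return -ENOENT;
	ctx->slots[id] = TOY_FREE;
	return 0;
}

/* Kernel-side removal that keeps the ID reserved. */
static void toy_tombstone(struct toy_ctx *ctx, int id)
{
	if (id >= 0 && id < TOY_MAX_IDS && ctx->slots[id] == TOY_LIVE)
		ctx->slots[id] = TOY_TOMBSTONE;
}

/* Closing the context frees everything, tombstones included. */
static void toy_release(struct toy_ctx *ctx)
{
	for (int id = 0; id < TOY_MAX_IDS; id++)
		ctx->slots[id] = TOY_FREE;
}
```

In this model a destroy request on a tombstoned ID returns -ENOENT, the
next allocation skips the tombstoned ID, and only toy_release() makes it
available again. The selftest in the patch checks the -ENOENT case and
leaves ID reuse untested, as the commit message notes.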

* [PATCH v5 8/8] iommufd: Rename some shortterm-related identifiers
  2025-07-15  6:32 [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Xu Yilun
                   ` (6 preceding siblings ...)
  2025-07-15  6:32 ` [PATCH v5 7/8] iommufd/selftest: Add coverage for vdevice tombstone Xu Yilun
@ 2025-07-15  6:32 ` Xu Yilun
  2025-07-15 13:39   ` Jason Gunthorpe
  2025-07-15 19:13   ` Nicolin Chen
  2025-07-15 19:33 ` [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Nicolin Chen
  8 siblings, 2 replies; 21+ messages in thread
From: Xu Yilun @ 2025-07-15  6:32 UTC (permalink / raw)
  To: jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: iommu, linux-kernel, joro, robin.murphy, shuah, nicolinc, aik,
	dan.j.williams, baolu.lu, yilun.xu

Rename the shortterm-related identifiers to wait-related.

The shortterm_users refcount has outgrown its name. It is now also
used for references that live longer than an ioctl execution, e.g. the
vdev takes the idev's shortterm_users reference at vdev allocation and
releases it during the idev's pre_destroy(). Rename the refcount to
wait_cnt, since it is always used to synchronize referencing and
destruction of the object by waiting for the count to drop to zero.

List all changed identifiers:

  iommufd_object::shortterm_users -> iommufd_object::wait_cnt
  REMOVE_WAIT_SHORTTERM -> REMOVE_WAIT
  iommufd_object_dec_wait_shortterm() -> iommufd_object_dec_wait()
  zerod_shortterm -> zerod_wait_cnt

No functional change intended.

Suggested-by: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 drivers/iommu/iommufd/device.c          |  6 ++--
 drivers/iommu/iommufd/iommufd_private.h | 18 ++++++------
 drivers/iommu/iommufd/main.c            | 39 +++++++++++++------------
 drivers/iommu/iommufd/viommu.c          |  4 +--
 include/linux/iommufd.h                 |  8 ++++-
 5 files changed, 41 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index ee6ff4caf398..65fbd098f9e9 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -152,8 +152,8 @@ static void iommufd_device_remove_vdev(struct iommufd_device *idev)
 	/*
 	 * An ongoing vdev destroy ioctl has removed the vdev from the object
 	 * xarray, but has not finished iommufd_vdevice_destroy() yet as it
-	 * needs the same mutex. We exit the locking then wait on short term
-	 * users for the vdev destruction.
+	 * needs the same mutex. We exit the locking then wait on wait_cnt
+	 * reference for the vdev destruction.
 	 */
 	if (IS_ERR(vdev))
 		goto out_unlock;
@@ -184,7 +184,7 @@ void iommufd_device_pre_destroy(struct iommufd_object *obj)
 	struct iommufd_device *idev =
 		container_of(obj, struct iommufd_device, obj);
 
-	/* Release the short term users on this */
+	/* Release the wait_cnt reference on this */
 	iommufd_device_remove_vdev(idev);
 }
 
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 5d6ea5395cfe..0da2a81eedfa 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -169,7 +169,7 @@ static inline bool iommufd_lock_obj(struct iommufd_object *obj)
 {
 	if (!refcount_inc_not_zero(&obj->users))
 		return false;
-	if (!refcount_inc_not_zero(&obj->shortterm_users)) {
+	if (!refcount_inc_not_zero(&obj->wait_cnt)) {
 		/*
 		 * If the caller doesn't already have a ref on obj this must be
 		 * called under the xa_lock. Otherwise the caller is holding a
@@ -187,11 +187,11 @@ static inline void iommufd_put_object(struct iommufd_ctx *ictx,
 				      struct iommufd_object *obj)
 {
 	/*
-	 * Users first, then shortterm so that REMOVE_WAIT_SHORTTERM never sees
-	 * a spurious !0 users with a 0 shortterm_users.
+	 * Users first, then wait_cnt so that REMOVE_WAIT never sees a spurious
+	 * !0 users with a 0 wait_cnt.
 	 */
 	refcount_dec(&obj->users);
-	if (refcount_dec_and_test(&obj->shortterm_users))
+	if (refcount_dec_and_test(&obj->wait_cnt))
 		wake_up_interruptible_all(&ictx->destroy_wait);
 }
 
@@ -202,7 +202,7 @@ void iommufd_object_finalize(struct iommufd_ctx *ictx,
 			     struct iommufd_object *obj);
 
 enum {
-	REMOVE_WAIT_SHORTTERM	= BIT(0),
+	REMOVE_WAIT		= BIT(0),
 	REMOVE_OBJ_TOMBSTONE	= BIT(1),
 };
 int iommufd_object_remove(struct iommufd_ctx *ictx,
@@ -211,15 +211,15 @@ int iommufd_object_remove(struct iommufd_ctx *ictx,
 
 /*
  * The caller holds a users refcount and wants to destroy the object. At this
- * point the caller has no shortterm_users reference and at least the xarray
- * will be holding one.
+ * point the caller has no wait_cnt reference and at least the xarray will be
+ * holding one.
  */
 static inline void iommufd_object_destroy_user(struct iommufd_ctx *ictx,
 					       struct iommufd_object *obj)
 {
 	int ret;
 
-	ret = iommufd_object_remove(ictx, obj, obj->id, REMOVE_WAIT_SHORTTERM);
+	ret = iommufd_object_remove(ictx, obj, obj->id, REMOVE_WAIT);
 
 	/*
 	 * If there is a bug and we couldn't destroy the object then we did put
@@ -239,7 +239,7 @@ static inline void iommufd_object_tombstone_user(struct iommufd_ctx *ictx,
 	int ret;
 
 	ret = iommufd_object_remove(ictx, obj, obj->id,
-				    REMOVE_WAIT_SHORTTERM | REMOVE_OBJ_TOMBSTONE);
+				    REMOVE_WAIT | REMOVE_OBJ_TOMBSTONE);
 
 	/*
 	 * If there is a bug and we couldn't destroy the object then we did put
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 99c1aab3d396..15af7ced0501 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -42,7 +42,7 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
 		return ERR_PTR(-ENOMEM);
 	obj->type = type;
 	/* Starts out bias'd by 1 until it is removed from the xarray */
-	refcount_set(&obj->shortterm_users, 1);
+	refcount_set(&obj->wait_cnt, 1);
 	refcount_set(&obj->users, 1);
 
 	/*
@@ -155,22 +155,22 @@ struct iommufd_object *iommufd_get_object(struct iommufd_ctx *ictx, u32 id,
 	return obj;
 }
 
-static int iommufd_object_dec_wait_shortterm(struct iommufd_ctx *ictx,
-					     struct iommufd_object *to_destroy)
+static int iommufd_object_dec_wait(struct iommufd_ctx *ictx,
+				   struct iommufd_object *to_destroy)
 {
-	if (refcount_dec_and_test(&to_destroy->shortterm_users))
+	if (refcount_dec_and_test(&to_destroy->wait_cnt))
 		return 0;
 
 	if (iommufd_object_ops[to_destroy->type].pre_destroy)
 		iommufd_object_ops[to_destroy->type].pre_destroy(to_destroy);
 
 	if (wait_event_timeout(ictx->destroy_wait,
-			       refcount_read(&to_destroy->shortterm_users) == 0,
+			       refcount_read(&to_destroy->wait_cnt) == 0,
 			       msecs_to_jiffies(60000)))
 		return 0;
 
 	pr_crit("Time out waiting for iommufd object to become free\n");
-	refcount_inc(&to_destroy->shortterm_users);
+	refcount_inc(&to_destroy->wait_cnt);
 	return -EBUSY;
 }
 
@@ -184,17 +184,18 @@ int iommufd_object_remove(struct iommufd_ctx *ictx,
 {
 	struct iommufd_object *obj;
 	XA_STATE(xas, &ictx->objects, id);
-	bool zerod_shortterm = false;
+	bool zerod_wait_cnt = false;
 	int ret;
 
 	/*
-	 * The purpose of the shortterm_users is to ensure deterministic
-	 * destruction of objects used by external drivers and destroyed by this
-	 * function. Any temporary increment of the refcount must increment
-	 * shortterm_users, such as during ioctl execution.
+	 * The purpose of the wait_cnt is to ensure deterministic destruction
+	 * of objects used by external drivers and destroyed by this function.
+	 * Incrementing this wait_cnt should either be short lived, such as
+	 * during ioctl execution, or be revoked and blocked during
+	 * pre_destroy(), such as vdev holding the idev's refcount.
 	 */
-	if (flags & REMOVE_WAIT_SHORTTERM) {
-		ret = iommufd_object_dec_wait_shortterm(ictx, to_destroy);
+	if (flags & REMOVE_WAIT) {
+		ret = iommufd_object_dec_wait(ictx, to_destroy);
 		if (ret) {
 			/*
 			 * We have a bug. Put back the callers reference and
@@ -203,7 +204,7 @@ int iommufd_object_remove(struct iommufd_ctx *ictx,
 			refcount_dec(&to_destroy->users);
 			return ret;
 		}
-		zerod_shortterm = true;
+		zerod_wait_cnt = true;
 	}
 
 	xa_lock(&ictx->objects);
@@ -235,11 +236,11 @@ int iommufd_object_remove(struct iommufd_ctx *ictx,
 	xa_unlock(&ictx->objects);
 
 	/*
-	 * Since users is zero any positive users_shortterm must be racing
+	 * Since users is zero any positive wait_cnt must be racing
 	 * iommufd_put_object(), or we have a bug.
 	 */
-	if (!zerod_shortterm) {
-		ret = iommufd_object_dec_wait_shortterm(ictx, obj);
+	if (!zerod_wait_cnt) {
+		ret = iommufd_object_dec_wait(ictx, obj);
 		if (WARN_ON(ret))
 			return ret;
 	}
@@ -249,9 +250,9 @@ int iommufd_object_remove(struct iommufd_ctx *ictx,
 	return 0;
 
 err_xa:
-	if (zerod_shortterm) {
+	if (zerod_wait_cnt) {
 		/* Restore the xarray owned reference */
-		refcount_set(&obj->shortterm_users, 1);
+		refcount_set(&obj->wait_cnt, 1);
 	}
 	xa_unlock(&ictx->objects);
 
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 6cf0bd5d8f08..2ca5809b238b 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -205,8 +205,8 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 	vdev->viommu = viommu;
 	refcount_inc(&viommu->obj.users);
 	/*
-	 * A short term users reference is held on the idev so long as we have
-	 * the pointer. iommufd_device_pre_destroy() will revoke it before the
+	 * A wait_cnt reference is held on the idev so long as we have the
+	 * pointer. iommufd_device_pre_destroy() will revoke it before the
 	 * idev real destruction.
 	 */
 	vdev->idev = idev;
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index ecb0c8abd251..61410a78cbce 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -46,7 +46,13 @@ enum iommufd_object_type {
 
 /* Base struct for all objects with a userspace ID handle. */
 struct iommufd_object {
-	refcount_t shortterm_users;
+	/*
+	 * Destroy will sleep and wait for wait_cnt to go to zero. This allows
+	 * concurrent users of the ID to reliably avoid causing a spurious
+	 * destroy failure. Incrementing this count should either be short
+	 * lived or be revoked and blocked during pre_destroy().
+	 */
+	refcount_t wait_cnt;
 	refcount_t users;
 	enum iommufd_object_type type;
 	unsigned int id;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread
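
The counting rules behind wait_cnt in the patch above (start biased by 1,
take a reference only while the count is non-zero, let destroy drop the
bias and wait for zero) can be sketched in userspace with C11 atomics.
This is an illustration under assumptions: struct toy_obj and the
toy_obj_*() helpers are hypothetical, the kernel uses refcount_t and a
wait queue instead, and the actual sleeping is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_obj {
	atomic_uint wait_cnt;
};

/* Starts out biased by 1, like the reference owned by the xarray. */
static void toy_obj_init(struct toy_obj *obj)
{
	atomic_init(&obj->wait_cnt, 1);
}

/* Take a reference unless the count already hit zero (inc-not-zero). */
static bool toy_obj_get(struct toy_obj *obj)
{
	unsigned int old = atomic_load(&obj->wait_cnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&obj->wait_cnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; true means the count reached zero (wake the waiter). */
static bool toy_obj_put(struct toy_obj *obj)
{
	return atomic_fetch_sub(&obj->wait_cnt, 1) == 1;
}

/*
 * Destroy side: drop the initial bias. true means nobody else holds a
 * reference; false means the destroyer has to wait for the holders.
 */
static bool toy_obj_begin_destroy(struct toy_obj *obj)
{
	return toy_obj_put(obj);
}
```

While a holder exists, toy_obj_begin_destroy() returns false and the
destroyer would have to wait. The last toy_obj_put() then returns true,
and any later toy_obj_get() fails. Whether the holder is an ioctl in
flight or a long-lived vdev makes no difference to the counting, which is
the motivation given for dropping "shortterm" from the names.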

* Re: [PATCH v5 3/8] iommufd: Add a pre_destroy() op for objects
  2025-07-15  6:32 ` [PATCH v5 3/8] iommufd: Add a pre_destroy() op for objects Xu Yilun
@ 2025-07-15 13:19   ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2025-07-15 13:19 UTC (permalink / raw)
  To: Xu Yilun
  Cc: kevin.tian, will, aneesh.kumar, iommu, linux-kernel, joro,
	robin.murphy, shuah, nicolinc, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 02:32:40PM +0800, Xu Yilun wrote:
> Add a pre_destroy() op which gives objects a chance to clear their
> short term users references before destruction. This op is intended for
> external driver created objects (e.g. idev) which does deterministic
> destruction.
> 
> In order to manage the lifecycle of interrelated objects as well as the
> deterministic destruction (e.g. vdev can't outlive idev, and idev
> destruction can't fail), short term users references are allowed to
> live out of an ioctl execution. An immediate use case is, vdev holds
> idev's short term user reference until vdev destruction completes, idev
> leverages existing wait_shortterm mechanism to ensure it is destroyed
> after vdev.
> 
> This extended usage makes the referenced object unable to just wait for
> its reference gone. It needs to actively trigger the reference removal,
> as well as prevent new references before wait. Should implement these
> work in pre_destroy().
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
> ---
>  drivers/iommu/iommufd/main.c | 4 ++++
>  1 file changed, 4 insertions(+)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

>  struct iommufd_object_ops {
> +	void (*pre_destroy)(struct iommufd_object *obj);

I would capture the key points of the commit message in a comment
right here though..

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread
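
The ordering described in the quoted commit message (drop the bias, give
the object a chance to actively revoke long-lived references, then check
for zero) can be sketched as below. All names are hypothetical and the
model is single-threaded: struct toy_object, has_dependent and the
toy_*() functions do not exist in the kernel, and where the kernel waits
with a timeout this sketch simply checks the counter once.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_object;

struct toy_object_ops {
	/* Optional: revoke long-lived references before destroy waits. */
	void (*pre_destroy)(struct toy_object *obj);
};

struct toy_object {
	const struct toy_object_ops *ops;
	unsigned int wait_cnt;	/* plain counter, single-threaded model */
	int has_dependent;	/* 1 while a dependent (think vdev) holds a ref */
};

/* Actively removes the dependent and the reference it was holding. */
static void toy_idev_pre_destroy(struct toy_object *obj)
{
	if (obj->has_dependent) {
		obj->has_dependent = 0;
		obj->wait_cnt--;
	}
}

static const struct toy_object_ops toy_idev_ops = {
	.pre_destroy = toy_idev_pre_destroy,
};

static const struct toy_object_ops toy_plain_ops = {
	.pre_destroy = NULL,
};

static int toy_object_destroy(struct toy_object *obj)
{
	/* Drop the initial bias; zero means nobody else holds a reference. */
	if (--obj->wait_cnt == 0)
		return 0;

	if (obj->ops->pre_destroy)
		obj->ops->pre_destroy(obj);

	if (obj->wait_cnt == 0)
		return 0;

	/* Stand-in for the timeout path: restore the bias and give up. */
	obj->wait_cnt++;
	return -EBUSY;
}
```

An object whose ops provide pre_destroy() can be destroyed even while a
dependent holds a reference, because the callback revokes that reference
first. Without the callback the same situation ends in -EBUSY, which is
why the commit message says waiting alone is no longer enough.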

* Re: [PATCH v5 4/8] iommufd: Destroy vdevice on idevice destroy
  2025-07-15  6:32 ` [PATCH v5 4/8] iommufd: Destroy vdevice on idevice destroy Xu Yilun
@ 2025-07-15 13:37   ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2025-07-15 13:37 UTC (permalink / raw)
  To: Xu Yilun
  Cc: kevin.tian, will, aneesh.kumar, iommu, linux-kernel, joro,
	robin.murphy, shuah, nicolinc, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 02:32:41PM +0800, Xu Yilun wrote:
> Destroy iommufd_vdevice (vdev) on iommufd_idevice (idev) destruction so
> that vdev can't outlive idev.
> 
> idev represents the physical device bound to iommufd, while the vdev
> represents the virtual instance of the physical device in the VM. The
> lifecycle of the vdev should not be longer than idev. This doesn't
> cause real problem on existing use cases cause vdev doesn't impact the
> physical device, only provides virtualization information. But to
> extend vdev for Confidential Computing (CC), there are needs to do
> secure configuration for the vdev, e.g. TSM Bind/Unbind. These
> configurations should be rolled back on idev destroy, or the external
> driver (VFIO) functionality may be impact.
> 
> The idev is created by external driver so its destruction can't fail.
> The idev implements pre_destroy() op to actively remove its associated
> vdev before destroying itself. There are 3 cases on idev pre_destroy():
> 
>   1. vdev is already destroyed by userspace. No extra handling needed.
>   2. vdev is still alive. Use iommufd_object_tombstone_user() to
>      destroy vdev and tombstone the vdev ID.
>   3. vdev is being destroyed by userspace. The vdev ID is already
>      freed, but vdev destroy handler is not completed. This requires
>      multi-threads syncing - vdev holds idev's short term users
>      reference until vdev destruction completes, idev leverages
>      existing wait_shortterm mechanism for syncing.
> 
> idev should also block any new reference to it after pre_destroy(),
> or the following wait shortterm would timeout. Introduce a 'destroying'
> flag, set it to true on idev pre_destroy(). Any attempt to reference
> idev should honor this flag under the protection of
> idev->igroup->lock.
> 
> Originally-by: Nicolin Chen <nicolinc@nvidia.com>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Co-developed-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
> Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
> ---
>  drivers/iommu/iommufd/device.c          | 51 ++++++++++++++++++++++++
>  drivers/iommu/iommufd/iommufd_private.h | 12 ++++++
>  drivers/iommu/iommufd/main.c            |  2 +
>  drivers/iommu/iommufd/viommu.c          | 52 +++++++++++++++++++++++--
>  include/linux/iommufd.h                 |  1 +
>  include/uapi/linux/iommufd.h            |  5 +++
>  6 files changed, 119 insertions(+), 4 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice
  2025-07-15  6:32 ` [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice Xu Yilun
@ 2025-07-15 13:38   ` Jason Gunthorpe
  2025-07-15 18:56   ` Nicolin Chen
  2025-07-15 20:44   ` kernel test robot
  2 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2025-07-15 13:38 UTC (permalink / raw)
  To: Xu Yilun
  Cc: kevin.tian, will, aneesh.kumar, iommu, linux-kernel, joro,
	robin.murphy, shuah, nicolinc, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 02:32:42PM +0800, Xu Yilun wrote:
> Remove struct device *dev from struct vdevice.
> 
> The dev pointer is the Plan B for vdevice to reference the physical
> device. As now vdev->idev is added without refcounting concern, just
> use vdev->idev->dev when needed.
> 
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 2 +-
>  drivers/iommu/iommufd/driver.c                 | 4 ++--
>  drivers/iommu/iommufd/viommu.c                 | 3 ---
>  include/linux/iommufd.h                        | 1 -
>  4 files changed, 3 insertions(+), 7 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


* Re: [PATCH v5 8/8] iommufd: Rename some shortterm-related identifiers
  2025-07-15  6:32 ` [PATCH v5 8/8] iommufd: Rename some shortterm-related identifiers Xu Yilun
@ 2025-07-15 13:39   ` Jason Gunthorpe
  2025-07-15 19:13   ` Nicolin Chen
  1 sibling, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2025-07-15 13:39 UTC (permalink / raw)
  To: Xu Yilun
  Cc: kevin.tian, will, aneesh.kumar, iommu, linux-kernel, joro,
	robin.murphy, shuah, nicolinc, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 02:32:45PM +0800, Xu Yilun wrote:
> Rename the shortterm-related identifiers to wait-related.
> 
> The usage of shortterm_users refcount is now beyond its name.  It is
> also used for references which live longer than an ioctl execution.
> E.g. vdev holds idev's shortterm_users refcount on vdev allocation,
> releases it during idev's pre_destroy(). Rename the refcount as
> wait_cnt, since it is always used to sync the referencing & the
> destruction of the object by waiting for it to go to zero.
> 
> List all changed identifiers:
> 
>   iommufd_object::shortterm_users -> iommufd_object::wait_cnt
>   REMOVE_WAIT_SHORTTERM -> REMOVE_WAIT
>   iommufd_object_dec_wait_shortterm() -> iommufd_object_dec_wait()
>   zerod_shortterm -> zerod_wait_cnt
> 
> No functional change intended.
> 
> Suggested-by: Kevin Tian <kevin.tian@intel.com>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
> ---
>  drivers/iommu/iommufd/device.c          |  6 ++--
>  drivers/iommu/iommufd/iommufd_private.h | 18 ++++++------
>  drivers/iommu/iommufd/main.c            | 39 +++++++++++++------------
>  drivers/iommu/iommufd/viommu.c          |  4 +--
>  include/linux/iommufd.h                 |  8 ++++-
>  5 files changed, 41 insertions(+), 34 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason
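
The reason "wait" describes the counter better than "shortterm" can be
shown with a toy model. It is a sketch only: struct model_obj, struct
model_vdev and the functions below are hypothetical, and where the real
code waits for the count to reach zero, this model simply refuses to
destroy so that it stays single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object with a wait count, loosely modelled on wait_cnt. */
struct model_obj {
	unsigned int wait_cnt;
	bool destroyed;
};

/* Hypothetical vdevice that pins its idevice for its whole lifetime. */
struct model_vdev {
	struct model_obj *idev;
};

/* The reference is taken at allocation, so it outlives any single call. */
static void model_vdev_alloc(struct model_vdev *vdev, struct model_obj *idev)
{
	idev->wait_cnt++;
	vdev->idev = idev;
}

/* Destroying the vdevice is what finally releases the idevice. */
static void model_vdev_destroy(struct model_vdev *vdev)
{
	if (!vdev->idev)
		return;
	assert(vdev->idev->wait_cnt > 0);
	vdev->idev->wait_cnt--;
	vdev->idev = NULL;
}

/* Destruction may only proceed once nobody is left to wait for. */
static bool model_obj_try_destroy(struct model_obj *obj)
{
	if (obj->wait_cnt)
		return false;
	obj->destroyed = true;
	return true;
}
```

The count here is held from model_vdev_alloc() until model_vdev_destroy(),
which is much longer than one call, yet the destroy path still only cares
about it reaching zero.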


* Re: [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice
  2025-07-15  6:32 ` [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice Xu Yilun
  2025-07-15 13:38   ` Jason Gunthorpe
@ 2025-07-15 18:56   ` Nicolin Chen
  2025-07-16  6:09     ` Xu Yilun
  2025-07-15 20:44   ` kernel test robot
  2 siblings, 1 reply; 21+ messages in thread
From: Nicolin Chen @ 2025-07-15 18:56 UTC (permalink / raw)
  To: Xu Yilun
  Cc: jgg, jgg, kevin.tian, will, aneesh.kumar, iommu, linux-kernel,
	joro, robin.murphy, shuah, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 02:32:42PM +0800, Xu Yilun wrote:
> diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> index eb90af5093d8..8a515987b948 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> @@ -1218,7 +1218,7 @@ static void tegra241_vintf_destroy_vsid(struct iommufd_vdevice *vdev)
>  
>  static int tegra241_vintf_init_vsid(struct iommufd_vdevice *vdev)
>  {
> -	struct arm_smmu_master *master = dev_iommu_priv_get(vdev->dev);
> +	struct arm_smmu_master *master = dev_iommu_priv_get(vdev->idev->dev);

Hmm, this breaks :(

drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c: In function 'tegra241_vintf_init_vsid':
drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c:1230:71: error: invalid use of undefined type 'struct iommufd_device'
 1230 |         struct arm_smmu_master *master = dev_iommu_priv_get(vdev->idev->dev);

Unfortunately the iommufd_device structure is defined in the
private header that's not shared with any IOMMU driver.

So we need a new helper in driver.c that converts a vdev
pointer to a dev. Something like:

diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index ff6bbd2137146..fd6b083535271 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -1227,7 +1227,8 @@ static void tegra241_vintf_destroy_vsid(struct iommufd_vdevice *vdev)
 
 static int tegra241_vintf_init_vsid(struct iommufd_vdevice *vdev)
 {
-	struct arm_smmu_master *master = dev_iommu_priv_get(vdev->idev->dev);
+	struct device *dev = iommufd_vdevice_to_device(vdev);
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
 	struct tegra241_vintf *vintf = viommu_to_vintf(vdev->viommu);
 	struct tegra241_vintf_sid *vsid = vdev_to_vsid(vdev);
 	struct arm_smmu_stream *stream = &master->streams[0];
diff --git a/drivers/iommu/iommufd/driver.c b/drivers/iommu/iommufd/driver.c
index df25db6d2eafc..6f1010da221c9 100644
--- a/drivers/iommu/iommufd/driver.c
+++ b/drivers/iommu/iommufd/driver.c
@@ -83,6 +83,12 @@ void _iommufd_destroy_mmap(struct iommufd_ctx *ictx,
 }
 EXPORT_SYMBOL_NS_GPL(_iommufd_destroy_mmap, "IOMMUFD");
 
+struct device *iommufd_vdevice_to_device(struct iommufd_vdevice *vdev)
+{
+	return vdev->idev->dev;
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_vdevice_to_device, "IOMMUFD");
+
 /* Caller should xa_lock(&viommu->vdevs) to protect the return value */
 struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
 				       unsigned long vdev_id)
@@ -92,7 +98,7 @@ struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
 	lockdep_assert_held(&viommu->vdevs.xa_lock);
 
 	vdev = xa_load(&viommu->vdevs, vdev_id);
-	return vdev ? vdev->idev->dev : NULL;
+	return vdev ? iommufd_vdevice_to_device(vdev) : NULL;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_viommu_find_dev, "IOMMUFD");
 
@@ -109,7 +115,7 @@ int iommufd_viommu_get_vdev_id(struct iommufd_viommu *viommu,
 
 	xa_lock(&viommu->vdevs);
 	xa_for_each(&viommu->vdevs, index, vdev) {
-		if (vdev->idev->dev == dev) {
+		if (iommufd_vdevice_to_device(vdev) == dev) {
 			*vdev_id = vdev->virt_id;
 			rc = 0;
 			break;
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 61410a78cbce7..ee88e90021870 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -266,6 +266,7 @@ int _iommufd_alloc_mmap(struct iommufd_ctx *ictx, struct iommufd_object *owner,
 			unsigned long *offset);
 void _iommufd_destroy_mmap(struct iommufd_ctx *ictx,
 			   struct iommufd_object *owner, unsigned long offset);
+struct device *iommufd_vdevice_to_device(struct iommufd_vdevice *vdev);
 struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
 				       unsigned long vdev_id);
 int iommufd_viommu_get_vdev_id(struct iommufd_viommu *viommu,
@@ -300,6 +301,12 @@ static inline void _iommufd_destroy_mmap(struct iommufd_ctx *ictx,
 {
 }
 
+static inline struct device *
+iommufd_vdevice_to_device(struct iommufd_vdevice *vdev)
+{
+	return ERR_PTR(-ENODEV);
+}
+
 static inline struct device *
 iommufd_viommu_find_dev(struct iommufd_viommu *viommu, unsigned long vdev_id)
 {

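The build break and the accessor fix can be reproduced outside the kernel.
The sketch below squeezes the "public header", "driver", "private header"
and "core" roles into one file; struct model_idevice, struct model_vdevice,
model_vdevice_to_device() and driver_lookup() are made-up names, and
struct device is a trivial stand-in for the kernel structure.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct device. */
struct device {
	const char *name;
};

/* "Public header": drivers only ever see a forward declaration. */
struct model_idevice;

struct model_vdevice {
	struct model_idevice *idev;
};

struct device *model_vdevice_to_device(struct model_vdevice *vdev);

/*
 * "Driver": writing vdev->idev->dev here would fail to compile, because
 * struct model_idevice is still an incomplete type at this point. Going
 * through the accessor needs no knowledge of the layout.
 */
static struct device *driver_lookup(struct model_vdevice *vdev)
{
	return model_vdevice_to_device(vdev);
}

/* "Private header": the full definition, visible to the core only. */
struct model_idevice {
	struct device *dev;
};

/* "Core": the one place that may look inside struct model_idevice. */
struct device *model_vdevice_to_device(struct model_vdevice *vdev)
{
	assert(vdev->idev);
	return vdev->idev->dev;
}
```

Moving the dereference behind a function keeps the structure private while
still letting a driver reach the device pointer.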


* Re: [PATCH v5 7/8] iommufd/selftest: Add coverage for vdevice tombstone
  2025-07-15  6:32 ` [PATCH v5 7/8] iommufd/selftest: Add coverage for vdevice tombstone Xu Yilun
@ 2025-07-15 19:03   ` Nicolin Chen
  0 siblings, 0 replies; 21+ messages in thread
From: Nicolin Chen @ 2025-07-15 19:03 UTC (permalink / raw)
  To: Xu Yilun
  Cc: jgg, jgg, kevin.tian, will, aneesh.kumar, iommu, linux-kernel,
	joro, robin.murphy, shuah, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 02:32:44PM +0800, Xu Yilun wrote:
> This tests the flow to tombstone vdevice when idevice is to be unbound
> before vdevice destruction. The expected results of the tombstone are:
> 
>  - The vdevice ID can't be reused anymore (not tested in this patch).
>  - Even ioctl(IOMMU_DESTROY) can't free the vdevice ID.
>  - iommufd_fops_release() can still free everything.
> 
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
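
The three expected results listed in the commit message can be modelled
with a small ID table. This is a sketch only, not the iommufd xarray code:
struct model_ctx, the slot states and the model_*() functions are
hypothetical, the table size is arbitrary, and the -ENOENT and -ENOSPC
return values are assumptions made for the model.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_NR_IDS 4

enum model_slot { MODEL_FREE = 0, MODEL_USED, MODEL_TOMBSTONE };

/* Hypothetical ID table standing in for the context's object IDs. */
struct model_ctx {
	enum model_slot slots[MODEL_NR_IDS];
};

/* Hand out the lowest free ID; tombstoned slots are skipped like used ones. */
static int model_alloc_id(struct model_ctx *ctx)
{
	for (int id = 0; id < MODEL_NR_IDS; id++) {
		if (ctx->slots[id] == MODEL_FREE) {
			ctx->slots[id] = MODEL_USED;
			return id;
		}
	}
	return -ENOSPC;
}

/* The object goes away but its ID stays reserved. */
static int model_tombstone(struct model_ctx *ctx, int id)
{
	assert(id >= 0 && id < MODEL_NR_IDS);
	if (ctx->slots[id] != MODEL_USED)
		return -ENOENT;
	ctx->slots[id] = MODEL_TOMBSTONE;
	return 0;
}

/* User-requested destroy: only a live object can be freed this way. */
static int model_user_destroy(struct model_ctx *ctx, int id)
{
	assert(id >= 0 && id < MODEL_NR_IDS);
	if (ctx->slots[id] != MODEL_USED)
		return -ENOENT;
	ctx->slots[id] = MODEL_FREE;
	return 0;
}

/* Closing the context frees everything, tombstones included. */
static void model_release(struct model_ctx *ctx)
{
	for (int id = 0; id < MODEL_NR_IDS; id++)
		ctx->slots[id] = MODEL_FREE;
}
```

In this model a tombstoned ID is never handed out again and cannot be
freed by a user destroy, but a full release clears it.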


* Re: [PATCH v5 6/8] iommufd/selftest: Explicitly skip tests for inapplicable variant
  2025-07-15  6:32 ` [PATCH v5 6/8] iommufd/selftest: Explicitly skip tests for inapplicable variant Xu Yilun
@ 2025-07-15 19:13   ` Nicolin Chen
  2025-07-16  6:23     ` Xu Yilun
  0 siblings, 1 reply; 21+ messages in thread
From: Nicolin Chen @ 2025-07-15 19:13 UTC (permalink / raw)
  To: Xu Yilun
  Cc: jgg, jgg, kevin.tian, will, aneesh.kumar, iommu, linux-kernel,
	joro, robin.murphy, shuah, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 02:32:43PM +0800, Xu Yilun wrote:

Two nits:

> +	test_ioctl_fault_alloc(&fault_id, &fault_fd);
> +	test_err_hwpt_alloc_iopf(ENOENT, dev_id, viommu_id, UINT32_MAX,
> +				 IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
> +				 IOMMU_HWPT_DATA_SELFTEST, &data,
> +				 sizeof(data));

sizeof(data) could fit into previous line.

> +	test_cmd_hwpt_alloc_iopf(dev_id, viommu_id, fault_id,
> +				 IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
> +				 IOMMU_HWPT_DATA_SELFTEST, &data,
> +				 sizeof(data));

Ditto

Thanks
Nicolin


* Re: [PATCH v5 8/8] iommufd: Rename some shortterm-related identifiers
  2025-07-15  6:32 ` [PATCH v5 8/8] iommufd: Rename some shortterm-related identifiers Xu Yilun
  2025-07-15 13:39   ` Jason Gunthorpe
@ 2025-07-15 19:13   ` Nicolin Chen
  1 sibling, 0 replies; 21+ messages in thread
From: Nicolin Chen @ 2025-07-15 19:13 UTC (permalink / raw)
  To: Xu Yilun
  Cc: jgg, jgg, kevin.tian, will, aneesh.kumar, iommu, linux-kernel,
	joro, robin.murphy, shuah, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 02:32:45PM +0800, Xu Yilun wrote:
> Rename the shortterm-related identifiers to wait-related.
> 
> The usage of shortterm_users refcount is now beyond its name.  It is
> also used for references which live longer than an ioctl execution.
> E.g. vdev holds idev's shortterm_users refcount on vdev allocation,
> releases it during idev's pre_destroy(). Rename the refcount as
> wait_cnt, since it is always used to sync the referencing & the
> destruction of the object by waiting for it to go to zero.
> 
> List all changed identifiers:
> 
>   iommufd_object::shortterm_users -> iommufd_object::wait_cnt
>   REMOVE_WAIT_SHORTTERM -> REMOVE_WAIT
>   iommufd_object_dec_wait_shortterm() -> iommufd_object_dec_wait()
>   zerod_shortterm -> zerod_wait_cnt
> 
> No functional change intended.
> 
> Suggested-by: Kevin Tian <kevin.tian@intel.com>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
 
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>


* Re: [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind
  2025-07-15  6:32 [PATCH v5 0/8] iommufd: Destroy vdevice on device unbind Xu Yilun
                   ` (7 preceding siblings ...)
  2025-07-15  6:32 ` [PATCH v5 8/8] iommufd: Rename some shortterm-related identifiers Xu Yilun
@ 2025-07-15 19:33 ` Nicolin Chen
  8 siblings, 0 replies; 21+ messages in thread
From: Nicolin Chen @ 2025-07-15 19:33 UTC (permalink / raw)
  To: Xu Yilun
  Cc: jgg, jgg, kevin.tian, will, aneesh.kumar, iommu, linux-kernel,
	joro, robin.murphy, shuah, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 02:32:37PM +0800, Xu Yilun wrote:
> It is to solve the lifecycle issue that vdevice may outlive idevice. It
> is a prerequisite for TIO, to ensure extra secure configurations (e.g.
> TSM Bind/Unbind) against vdevice could be rolled back on idevice unbind,
> so that VFIO could still work on the physical device without surprise.
> 
> Changelog:
> v5:
>  - Further rebase to iommufd for-next 601b1d0d9395
>  - Keep the xa_empty() check in iommufd_fops_release(), update comments
>  - Move the *idev next to *viommu for struct iommufd_vdevice
>  - Update the description about IOMMUFD_CMD_VDEVICE_ALLOC for lifecycle
>  - Remove Baolu's tag for patch 4 because of big changes since v3
>  - Add changelog about idev->destroying
>  - Adjust line wrappings for tools/testing/selftests/iommu/iommufd.c
>  - Clarify that no testing for tombstoned ID repurposing.
>  - Add review tags.

With the patch that I attached in my reply to PATCH-5, sanity testing
passes with iommufd's selftest and with tegra241-cmdqv in a VM.

So, upon fixing the build break in PATCH-5 (maybe for a v6),

Tested-by: Nicolin Chen <nicolinc@nvidia.com>


* Re: [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice
  2025-07-15  6:32 ` [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice Xu Yilun
  2025-07-15 13:38   ` Jason Gunthorpe
  2025-07-15 18:56   ` Nicolin Chen
@ 2025-07-15 20:44   ` kernel test robot
  2 siblings, 0 replies; 21+ messages in thread
From: kernel test robot @ 2025-07-15 20:44 UTC (permalink / raw)
  To: Xu Yilun, jgg, jgg, kevin.tian, will, aneesh.kumar
  Cc: llvm, oe-kbuild-all, iommu, linux-kernel, joro, robin.murphy,
	shuah, nicolinc, aik, dan.j.williams, baolu.lu, yilun.xu

Hi Xu,

kernel test robot noticed the following build errors:

[auto build test ERROR on 601b1d0d9395c711383452bd0d47037afbbb4bcf]

url:    https://github.com/intel-lab-lkp/linux/commits/Xu-Yilun/iommufd-viommu-Roll-back-to-use-iommufd_object_alloc-for-vdevice/20250715-144326
base:   601b1d0d9395c711383452bd0d47037afbbb4bcf
patch link:    https://lore.kernel.org/r/20250715063245.1799534-6-yilun.xu%40linux.intel.com
patch subject: [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice
config: arm64-allmodconfig (https://download.01.org/0day-ci/archive/20250716/202507160404.4hMp40iv-lkp@intel.com/config)
compiler: clang version 19.1.7 (https://github.com/llvm/llvm-project cd708029e0b2869e80abe31ddb175f7c35361f90)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250716/202507160404.4hMp40iv-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202507160404.4hMp40iv-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c:1221:64: error: incomplete definition of type 'struct iommufd_device'
    1221 |         struct arm_smmu_master *master = dev_iommu_priv_get(vdev->idev->dev);
         |                                                             ~~~~~~~~~~^
   include/linux/iommufd.h:24:8: note: forward declaration of 'struct iommufd_device'
      24 | struct iommufd_device;
         |        ^
   1 error generated.


vim +1221 drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c

  1218	
  1219	static int tegra241_vintf_init_vsid(struct iommufd_vdevice *vdev)
  1220	{
> 1221		struct arm_smmu_master *master = dev_iommu_priv_get(vdev->idev->dev);
  1222		struct tegra241_vintf *vintf = viommu_to_vintf(vdev->viommu);
  1223		struct tegra241_vintf_sid *vsid = vdev_to_vsid(vdev);
  1224		struct arm_smmu_stream *stream = &master->streams[0];
  1225		u64 virt_sid = vdev->virt_id;
  1226		int sidx;
  1227	
  1228		if (virt_sid > UINT_MAX)
  1229			return -EINVAL;
  1230	
  1231		WARN_ON_ONCE(master->num_streams != 1);
  1232	
  1233		/* Find an empty pair of SID_REPLACE and SID_MATCH */
  1234		sidx = ida_alloc_max(&vintf->sids, vintf->cmdqv->num_sids_per_vintf - 1,
  1235				     GFP_KERNEL);
  1236		if (sidx < 0)
  1237			return sidx;
  1238	
  1239		writel(stream->id, REG_VINTF(vintf, SID_REPLACE(sidx)));
  1240		writel(virt_sid << 1 | 0x1, REG_VINTF(vintf, SID_MATCH(sidx)));
  1241		dev_dbg(vintf->cmdqv->dev,
  1242			"VINTF%u: allocated SID_REPLACE%d for pSID=%x, vSID=%x\n",
  1243			vintf->idx, sidx, stream->id, (u32)virt_sid);
  1244	
  1245		vsid->idx = sidx;
  1246		vsid->vintf = vintf;
  1247		vsid->sid = stream->id;
  1248	
  1249		vdev->destroy = &tegra241_vintf_destroy_vsid;
  1250		return 0;
  1251	}
  1252	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v5 5/8] iommufd/vdevice: Remove struct device reference from struct vdevice
  2025-07-15 18:56   ` Nicolin Chen
@ 2025-07-16  6:09     ` Xu Yilun
  0 siblings, 0 replies; 21+ messages in thread
From: Xu Yilun @ 2025-07-16  6:09 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: jgg, jgg, kevin.tian, will, aneesh.kumar, iommu, linux-kernel,
	joro, robin.murphy, shuah, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 11:56:31AM -0700, Nicolin Chen wrote:
> On Tue, Jul 15, 2025 at 02:32:42PM +0800, Xu Yilun wrote:
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> > index eb90af5093d8..8a515987b948 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> > @@ -1218,7 +1218,7 @@ static void tegra241_vintf_destroy_vsid(struct iommufd_vdevice *vdev)
> >  
> >  static int tegra241_vintf_init_vsid(struct iommufd_vdevice *vdev)
> >  {
> > -	struct arm_smmu_master *master = dev_iommu_priv_get(vdev->dev);
> > +	struct arm_smmu_master *master = dev_iommu_priv_get(vdev->idev->dev);
> 
> Hmm, this breaks :(
> 
> drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c: In function 'tegra241_vintf_init_vsid':
> drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c:1230:71: error: invalid use of undefined type 'struct iommufd_device'
>  1230 |         struct arm_smmu_master *master = dev_iommu_priv_get(vdev->idev->dev);
> 
> Unfortunately the iommufd_device structure is defined in the
> private header that's not shared with any IOMMU driver.
> 
> So, we need in the driver.c a new helper that converts a vdev
> pointer to dev. Something like:
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> index ff6bbd2137146..fd6b083535271 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> @@ -1227,7 +1227,8 @@ static void tegra241_vintf_destroy_vsid(struct iommufd_vdevice *vdev)
>  
>  static int tegra241_vintf_init_vsid(struct iommufd_vdevice *vdev)
>  {
> -	struct arm_smmu_master *master = dev_iommu_priv_get(vdev->idev->dev);
> +	struct device *dev = iommufd_vdevice_to_device(vdev);
> +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
>  	struct tegra241_vintf *vintf = viommu_to_vintf(vdev->viommu);
>  	struct tegra241_vintf_sid *vsid = vdev_to_vsid(vdev);
>  	struct arm_smmu_stream *stream = &master->streams[0];
> diff --git a/drivers/iommu/iommufd/driver.c b/drivers/iommu/iommufd/driver.c
> index df25db6d2eafc..6f1010da221c9 100644
> --- a/drivers/iommu/iommufd/driver.c
> +++ b/drivers/iommu/iommufd/driver.c
> @@ -83,6 +83,12 @@ void _iommufd_destroy_mmap(struct iommufd_ctx *ictx,
>  }
>  EXPORT_SYMBOL_NS_GPL(_iommufd_destroy_mmap, "IOMMUFD");
>  
> +struct device *iommufd_vdevice_to_device(struct iommufd_vdevice *vdev)
> +{
> +	return vdev->idev->dev;
> +}
> +EXPORT_SYMBOL_NS_GPL(iommufd_vdevice_to_device, "IOMMUFD");
> +
>  /* Caller should xa_lock(&viommu->vdevs) to protect the return value */
>  struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
>  				       unsigned long vdev_id)
> @@ -92,7 +98,7 @@ struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
>  	lockdep_assert_held(&viommu->vdevs.xa_lock);
>  
>  	vdev = xa_load(&viommu->vdevs, vdev_id);
> -	return vdev ? vdev->idev->dev : NULL;
> +	return vdev ? iommufd_vdevice_to_device(vdev) : NULL;
>  }
>  EXPORT_SYMBOL_NS_GPL(iommufd_viommu_find_dev, "IOMMUFD");
>  
> @@ -109,7 +115,7 @@ int iommufd_viommu_get_vdev_id(struct iommufd_viommu *viommu,
>  
>  	xa_lock(&viommu->vdevs);
>  	xa_for_each(&viommu->vdevs, index, vdev) {
> -		if (vdev->idev->dev == dev) {
> +		if (iommufd_vdevice_to_device(vdev) == dev) {
>  			*vdev_id = vdev->virt_id;
>  			rc = 0;
>  			break;
> diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
> index 61410a78cbce7..ee88e90021870 100644
> --- a/include/linux/iommufd.h
> +++ b/include/linux/iommufd.h
> @@ -266,6 +266,7 @@ int _iommufd_alloc_mmap(struct iommufd_ctx *ictx, struct iommufd_object *owner,
>  			unsigned long *offset);
>  void _iommufd_destroy_mmap(struct iommufd_ctx *ictx,
>  			   struct iommufd_object *owner, unsigned long offset);
> +struct device *iommufd_vdevice_to_device(struct iommufd_vdevice *vdev);
>  struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
>  				       unsigned long vdev_id);
>  int iommufd_viommu_get_vdev_id(struct iommufd_viommu *viommu,
> @@ -300,6 +301,12 @@ static inline void _iommufd_destroy_mmap(struct iommufd_ctx *ictx,
>  {
>  }
>  
> +static inline struct device *
> +iommufd_vdevice_to_device(struct iommufd_vdevice *vdev)
> +{
> +	return ERR_PTR(-ENODEV);

I'd prefer returning NULL, which is consistent with iommufd_viommu_find_dev().

The rest looks good to me, and thanks for the fix.

> +}
> +
>  static inline struct device *
>  iommufd_viommu_find_dev(struct iommufd_viommu *viommu, unsigned long vdev_id)
>  {
> 
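
Why the stub's return convention matters can be shown with a userspace
sketch. The helpers below only imitate the kernel's ERR_PTR() and IS_ERR()
idea; model_err_ptr(), model_is_err(), both stubs and
caller_thinks_valid() are made-up names, and struct device is a trivial
stand-in.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Stand-in for the kernel's struct device. */
struct device {
	int id;
};

/* Encode a negative errno value in a pointer, in the style of ERR_PTR(). */
static inline void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Detect such an encoded pointer, in the style of IS_ERR(). */
static inline bool model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Two possible stubs for a configuration without the feature built in. */
static struct device *stub_returning_err(void)
{
	return model_err_ptr(-ENODEV);
}

static struct device *stub_returning_null(void)
{
	return NULL;
}

/* A caller written against the NULL convention. */
static bool caller_thinks_valid(struct device *dev)
{
	return dev != NULL;
}
```

A caller that only tests for NULL treats the error-pointer stub's result
as a usable device, which is the inconsistency that returning NULL from
both helpers avoids.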


* Re: [PATCH v5 6/8] iommufd/selftest: Explicitly skip tests for inapplicable variant
  2025-07-15 19:13   ` Nicolin Chen
@ 2025-07-16  6:23     ` Xu Yilun
  0 siblings, 0 replies; 21+ messages in thread
From: Xu Yilun @ 2025-07-16  6:23 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: jgg, jgg, kevin.tian, will, aneesh.kumar, iommu, linux-kernel,
	joro, robin.murphy, shuah, aik, dan.j.williams, baolu.lu,
	yilun.xu

On Tue, Jul 15, 2025 at 12:13:08PM -0700, Nicolin Chen wrote:
> On Tue, Jul 15, 2025 at 02:32:43PM +0800, Xu Yilun wrote:
> 
> Two nits:
> 
> > +	test_ioctl_fault_alloc(&fault_id, &fault_fd);
> > +	test_err_hwpt_alloc_iopf(ENOENT, dev_id, viommu_id, UINT32_MAX,
> > +				 IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
> > +				 IOMMU_HWPT_DATA_SELFTEST, &data,
> > +				 sizeof(data));
> 
> sizeof(data) could fit into previous line.
> 
> > +	test_cmd_hwpt_alloc_iopf(dev_id, viommu_id, fault_id,
> > +				 IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
> > +				 IOMMU_HWPT_DATA_SELFTEST, &data,
> > +				 sizeof(data));
> 
> Ditto

Yes, clang-format does make this change, but I didn't want the lines to
exceed 80 columns.

Anyway, I've applied your changes.

Thanks,
Yilun

> 
> Thanks
> Nicolin


