All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jacob Pan <jacob.pan@linux.microsoft.com>
To: Yi Liu <yi.l.liu@intel.com>
Cc: <linux-kernel@vger.kernel.org>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Alex Williamson <alex@shazbot.org>,
	Joerg Roedel <joro@8bytes.org>,
	Mostafa Saleh <smostafa@google.com>,
	David Matlack <dmatlack@google.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	"Tian, Kevin" <kevin.tian@intel.com>, <skhawaja@google.com>,
	<pasha.tatashin@soleen.com>, Will Deacon <will@kernel.org>,
	Baolu Lu <baolu.lu@linux.intel.com>,
	jacob.pan@linux.microsoft.com
Subject: Re: [PATCH V4 03/10] iommufd: Allow binding to a noiommu device
Date: Wed, 6 May 2026 11:51:16 -0700	[thread overview]
Message-ID: <20260506115116.00001251@linux.microsoft.com> (raw)
In-Reply-To: <dbdd2289-8a0e-40f9-a0b5-842fd861acba@intel.com>

Hi Yi,

On Tue, 28 Apr 2026 15:45:56 +0800
Yi Liu <yi.l.liu@intel.com> wrote:

> On 4/15/26 05:14, Jacob Pan wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > 
> > Allow iommufd to bind devices without an IOMMU (noiommu mode) by
> > creating a dummy IOMMU group for such devices and skipping hwpt
> > operations.
> > 
> > This enables noiommu devices to operate through the same iommufd
> > API as IOMMU- capable devices.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
> > 
> > ---
> > v4:
> >     - handle partially initialized idev w/o igroup (Sashiko)
> > v3:
> >     - handle new error cases iommufd_device_bind. (Mostafa)
> > ---
> >   drivers/iommu/iommufd/device.c | 132
> > +++++++++++++++++++++++---------- 1 file changed, 93 insertions(+),
> > 39 deletions(-)
> > 
> > diff --git a/drivers/iommu/iommufd/device.c
> > b/drivers/iommu/iommufd/device.c index 2f30ea069f66..0283ac39be55
> > 100644 --- a/drivers/iommu/iommufd/device.c
> > +++ b/drivers/iommu/iommufd/device.c
> > @@ -23,6 +23,11 @@ struct iommufd_attach {
> >   	struct xarray device_array;
> >   };
> >   
> > +static bool is_vfio_noiommu(struct iommufd_device *idev)  
> 
> perhaps iommufd_device_is_noiommu().
> 
sounds good, remove vfio.

> > +{
> > +	return !device_iommu_mapped(idev->dev) ||
> > !idev->dev->iommu; +}
> > +
> >   static void iommufd_group_release(struct kref *kref)
> >   {
> >   	struct iommufd_group *igroup =
> > @@ -30,9 +35,11 @@ static void iommufd_group_release(struct kref
> > *kref) 
> >   	WARN_ON(!xa_empty(&igroup->pasid_attach));
> >   
> > -	xa_cmpxchg(&igroup->ictx->groups,
> > iommu_group_id(igroup->group), igroup,
> > -		   NULL, GFP_KERNEL);
> > -	iommu_group_put(igroup->group);
> > +	if (igroup->group) {
> > +		xa_cmpxchg(&igroup->ictx->groups,
> > iommu_group_id(igroup->group),
> > +			   igroup, NULL, GFP_KERNEL);
> > +		iommu_group_put(igroup->group);
> > +	}
> >   	mutex_destroy(&igroup->lock);
> >   	kfree(igroup);
> >   }
> > @@ -204,32 +211,19 @@ void iommufd_device_destroy(struct
> > iommufd_object *obj) struct iommufd_device *idev =
> >   		container_of(obj, struct iommufd_device, obj);
> >   
> > -	iommu_device_release_dma_owner(idev->dev);
> > +	if (!idev->igroup)
> > +		return;
> > +	if (!is_vfio_noiommu(idev))
> > +		iommu_device_release_dma_owner(idev->dev);
> >   	iommufd_put_group(idev->igroup);
> >   	if (!iommufd_selftest_is_mock_dev(idev->dev))
> >   		iommufd_ctx_put(idev->ictx);
> >   }
> >   
> > -/**
> > - * iommufd_device_bind - Bind a physical device to an iommu fd
> > - * @ictx: iommufd file descriptor
> > - * @dev: Pointer to a physical device struct
> > - * @id: Output ID number to return to userspace for this device
> > - *
> > - * A successful bind establishes an ownership over the device and
> > returns
> > - * struct iommufd_device pointer, otherwise returns error pointer.
> > - *
> > - * A driver using this API must set driver_managed_dma and must
> > not touch
> > - * the device until this routine succeeds and establishes
> > ownership.
> > - *
> > - * Binding a PCI device places the entire RID under iommufd
> > control.
> > - *
> > - * The caller must undo this with iommufd_device_unbind()
> > - */
> > -struct iommufd_device *iommufd_device_bind(struct iommufd_ctx
> > *ictx,
> > -					   struct device *dev, u32
> > *id) +static int iommufd_bind_iommu(struct iommufd_device *idev)
> >   {
> > -	struct iommufd_device *idev;
> > +	struct iommufd_ctx *ictx = idev->ictx;
> > +	struct device *dev = idev->dev;
> >   	struct iommufd_group *igroup;
> >   	int rc;
> >   
> > @@ -238,11 +232,11 @@ struct iommufd_device
> > *iommufd_device_bind(struct iommufd_ctx *ictx,
> >   	 * to restore cache coherency.
> >   	 */
> >   	if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY))
> > -		return ERR_PTR(-EINVAL);
> > +		return -EINVAL;
> >   
> > -	igroup = iommufd_get_group(ictx, dev);
> > +	igroup = iommufd_get_group(idev->ictx, dev);  
> 
> no need as you already have a local ictx.
will do.

> 
> >   	if (IS_ERR(igroup))
> > -		return ERR_CAST(igroup);
> > +		return PTR_ERR(igroup);
> >   
> >   	/*
> >   	 * For historical compat with VFIO the insecure interrupt
> > path is @@ -268,21 +262,69 @@ struct iommufd_device
> > *iommufd_device_bind(struct iommufd_ctx *ictx, if (rc)
> >   		goto out_group_put;
> >   
> > +	/* igroup refcount moves into iommufd_device */
> > +	idev->igroup = igroup;
> > +	return 0;
> > +
> > +out_group_put:
> > +	iommufd_put_group(igroup);
> > +	return rc;
> > +}
> > +
> > +/**
> > + * iommufd_device_bind - Bind a physical device to an iommu fd
> > + * @ictx: iommufd file descriptor
> > + * @dev: Pointer to a physical device struct
> > + * @id: Output ID number to return to userspace for this device
> > + *
> > + * A successful bind establishes an ownership over the device and
> > returns
> > + * struct iommufd_device pointer, otherwise returns error pointer.
> > + *
> > + * A driver using this API must set driver_managed_dma and must
> > not touch
> > + * the device until this routine succeeds and establishes
> > ownership.
> > + *
> > + * Binding a PCI device places the entire RID under iommufd
> > control.
> > + *
> > + * The caller must undo this with iommufd_device_unbind()
> > + */
> > +struct iommufd_device *iommufd_device_bind(struct iommufd_ctx
> > *ictx,
> > +					   struct device *dev, u32
> > *id) +{
> > +	struct iommufd_device *idev;
> > +	int rc;
> > +
> >   	idev = iommufd_object_alloc(ictx, idev,
> > IOMMUFD_OBJ_DEVICE);
> > -	if (IS_ERR(idev)) {
> > -		rc = PTR_ERR(idev);
> > -		goto out_release_owner;
> > -	}
> > +	if (IS_ERR(idev))
> > +		return idev;
> > +
> >   	idev->ictx = ictx;
> > -	if (!iommufd_selftest_is_mock_dev(dev))
> > -		iommufd_ctx_get(ictx);
> >   	idev->dev = dev;
> >   	idev->enforce_cache_coherency =
> >   		device_iommu_capable(dev,
> > IOMMU_CAP_ENFORCE_CACHE_COHERENCY);  
> 
> Functioanlly, it is harmless since device_iommu_capable() returns
> false for noiommu device. But conceptually, it is nice to move it to 
> iommufd_bind_iommu().
make sense, will move it.
> 
> > +
> > +	if (!is_vfio_noiommu(idev)) {
> > +		rc = iommufd_bind_iommu(idev);
> > +		if (rc)
> > +			goto err_out;
> > +	} else {
> > +		struct iommufd_group *igroup;
> > +
> > +		/*
> > +		 * Create a dummy igroup, lots of stuff expects
> > ths igroup to be
> > +		 * present, but a NULL igroup->group is OK
> > +		 */
> > +		igroup = iommufd_alloc_group(ictx, NULL);
> > +		if (IS_ERR(igroup)) {
> > +			rc = PTR_ERR(igroup);
> > +			goto err_out;
> > +		}
> > +		idev->igroup = igroup;
> > +	}
> > +
> > +	if (!iommufd_selftest_is_mock_dev(dev))
> > +		iommufd_ctx_get(ictx);
> >   	/* The calling driver is a user until
> > iommufd_device_unbind() */ refcount_inc(&idev->obj.users);
> > -	/* igroup refcount moves into iommufd_device */
> > -	idev->igroup = igroup;
> >   
> >   	/*
> >   	 * If the caller fails after this success it must call
> > @@ -294,11 +336,14 @@ struct iommufd_device
> > *iommufd_device_bind(struct iommufd_ctx *ictx, *id = idev->obj.id;
> >   	return idev;
> >   
> > -out_release_owner:
> > -	iommu_device_release_dma_owner(dev);
> > -out_group_put:
> > -	iommufd_put_group(igroup);
> > +err_out:
> > +	/*
> > +	 * Be careful that iommufd_device_destroy() can handle
> > partial
> > +	 * initialization.
> > +	 */  
> 
> no native speaker. This is a bit strange to me. I think the comment 
> wants to highligh that iommufd_device_destroy() can handle partial
> initialization. But the above comment seems to warn something...
> how about below?
> 
yes, that is better, This is not a warning, just explaining.

> /*
>   * iommufd_device_destroy() handles partially initialized idev,
>   * so iommufd_object_abort_and_destroy() is safe to call here.
>   */
> 
> > +	iommufd_object_abort_and_destroy(ictx, &idev->obj);
> >   	return ERR_PTR(rc);
> > +
> >   }
> >   EXPORT_SYMBOL_NS_GPL(iommufd_device_bind, "IOMMUFD");
> >   
> > @@ -512,6 +557,9 @@ static int iommufd_hwpt_attach_device(struct
> > iommufd_hw_pagetable *hwpt, struct iommufd_attach_handle *handle;
> >   	int rc;
> >   
> > +	if (is_vfio_noiommu(idev))
> > +		return 0;
> > +
> >   	if (!iommufd_hwpt_compatible_device(hwpt, idev))
> >   		return -EINVAL;
> >   
> > @@ -559,6 +607,9 @@ static void iommufd_hwpt_detach_device(struct
> > iommufd_hw_pagetable *hwpt, {
> >   	struct iommufd_attach_handle *handle;
> >   
> > +	if (is_vfio_noiommu(idev))
> > +		return;
> > +
> >   	handle = iommufd_device_get_attach_handle(idev, pasid);
> >   	if (pasid == IOMMU_NO_PASID)
> >   		iommu_detach_group_handle(hwpt->domain,
> > idev->igroup->group); @@ -577,6 +628,9 @@ static int
> > iommufd_hwpt_replace_device(struct iommufd_device *idev, struct
> > iommufd_attach_handle *handle, *old_handle; int rc;
> >   
> > +	if (is_vfio_noiommu(idev))
> > +		return 0;
> > +
> >   	if (!iommufd_hwpt_compatible_device(hwpt, idev))
> >   		return -EINVAL;
> >   
> > @@ -652,7 +706,7 @@ int iommufd_hw_pagetable_attach(struct
> > iommufd_hw_pagetable *hwpt, goto err_release_devid;
> >   	}
> >   
> > -	if (attach_resv) {
> > +	if (attach_resv && !is_vfio_noiommu(idev)) {  
> 
> how about the iommufd_hw_pagetable_detach() path? Does it also need to
> check noiommu device?
I don't think it is needed. iopt_remove_reserved_iova() is safe to call
when nothing was added — it's a no-op. But I can add the guard for
symmetry if preferred.

> 
> >   		rc = iommufd_device_attach_reserved_iova(idev,
> > hwpt_paging); if (rc)
> >   			goto err_release_devid;  
> 
> Regards,
> Yi Liu


  reply	other threads:[~2026-05-06 18:51 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-14 21:14 [PATCH V4 00/10] iommufd: Enable noiommu mode for cdev Jacob Pan
2026-04-14 21:14 ` [PATCH V4 01/10] iommufd: Support a HWPT without an iommu driver for noiommu Jacob Pan
2026-04-16  7:25   ` Tian, Kevin
2026-04-17 21:59     ` Jacob Pan
2026-04-22  8:12       ` Tian, Kevin
2026-04-22 16:03         ` Jason Gunthorpe
2026-04-23  7:26           ` Tian, Kevin
2026-04-23 14:51             ` Jason Gunthorpe
2026-04-24  6:31               ` Tian, Kevin
2026-04-28  7:45   ` Yi Liu
2026-04-28 16:42     ` Jacob Pan
2026-04-14 21:14 ` [PATCH V4 02/10] iommufd: Move igroup allocation to a function Jacob Pan
2026-04-16  7:48   ` Tian, Kevin
2026-05-05 21:32     ` Jacob Pan
2026-04-28  7:45   ` Yi Liu
2026-04-14 21:14 ` [PATCH V4 03/10] iommufd: Allow binding to a noiommu device Jacob Pan
2026-04-16  7:56   ` Tian, Kevin
2026-05-05 23:04     ` Jacob Pan
2026-04-28  7:45   ` Yi Liu
2026-05-06 18:51     ` Jacob Pan [this message]
2026-04-14 21:14 ` [PATCH V4 04/10] iommufd: Add an ioctl IOMMU_IOAS_GET_PA to query PA from IOVA Jacob Pan
2026-04-16  8:02   ` Tian, Kevin
2026-05-04 23:03     ` Jacob Pan
2026-04-16 19:32   ` Alex Williamson
2026-05-04 22:30     ` Jacob Pan
2026-04-14 21:14 ` [PATCH V4 05/10] vfio: Allow null group for noiommu without containers Jacob Pan
2026-04-16  8:13   ` Tian, Kevin
2026-04-16 21:33     ` Jacob Pan
2026-04-16 20:06   ` Alex Williamson
2026-04-17 17:06     ` Jacob Pan
2026-04-17 23:04       ` Alex Williamson
2026-04-14 21:14 ` [PATCH V4 06/10] vfio: Introduce and set noiommu flag on vfio_device Jacob Pan
2026-04-14 21:14 ` [PATCH V4 07/10] vfio: Enable cdev noiommu mode under iommufd Jacob Pan
2026-04-16 20:49   ` Alex Williamson
2026-04-30 23:31     ` Jacob Pan
2026-04-14 21:14 ` [PATCH V4 08/10] vfio:selftest: Handle VFIO noiommu cdev Jacob Pan
2026-04-14 21:14 ` [PATCH V4 09/10] selftests/vfio: Add iommufd noiommu mode selftest for cdev Jacob Pan
2026-04-14 21:14 ` [PATCH V4 10/10] Documentation: Update VFIO NOIOMMU mode Jacob Pan
2026-04-28  7:46   ` Yi Liu
2026-04-30 23:41     ` Jacob Pan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260506115116.00001251@linux.microsoft.com \
    --to=jacob.pan@linux.microsoft.com \
    --cc=alex@shazbot.org \
    --cc=baolu.lu@linux.intel.com \
    --cc=dmatlack@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nicolinc@nvidia.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=robin.murphy@arm.com \
    --cc=skhawaja@google.com \
    --cc=smostafa@google.com \
    --cc=will@kernel.org \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.