public inbox for linux-arm-kernel@lists.infradead.org
From: Nicolin Chen <nicolinc@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: <kevin.tian@intel.com>, <will@kernel.org>, <joro@8bytes.org>,
	<suravee.suthikulpanit@amd.com>, <robin.murphy@arm.com>,
	<dwmw2@infradead.org>, <baolu.lu@linux.intel.com>,
	<shuah@kernel.org>, <linux-kernel@vger.kernel.org>,
	<iommu@lists.linux.dev>, <linux-arm-kernel@lists.infradead.org>,
	<linux-kselftest@vger.kernel.org>, <eric.auger@redhat.com>,
	<jean-philippe@linaro.org>, <mdf@kernel.org>,
	<mshavit@google.com>, <shameerali.kolothum.thodi@huawei.com>,
	<smostafa@google.com>, <yi.l.liu@intel.com>, <aik@amd.com>,
	<zhangfei.gao@linaro.org>, <patches@lists.linux.dev>
Subject: Re: [PATCH v5 01/13] iommufd/viommu: Add IOMMUFD_OBJ_VDEVICE and IOMMU_VDEVICE_ALLOC ioctl
Date: Tue, 29 Oct 2024 12:30:00 -0700
Message-ID: <ZyE3uGyVx9ivJeHI@Asurada-Nvidia>
In-Reply-To: <20241029184801.GW6956@nvidia.com>

On Tue, Oct 29, 2024 at 03:48:01PM -0300, Jason Gunthorpe wrote:
> On Tue, Oct 29, 2024 at 10:29:56AM -0700, Nicolin Chen wrote:
> > On Tue, Oct 29, 2024 at 12:58:24PM -0300, Jason Gunthorpe wrote:
> > > On Fri, Oct 25, 2024 at 04:50:30PM -0700, Nicolin Chen wrote:
> > > > diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
> > > > index 5fd3dd420290..e50113305a9c 100644
> > > > --- a/drivers/iommu/iommufd/device.c
> > > > +++ b/drivers/iommu/iommufd/device.c
> > > > @@ -277,6 +277,17 @@ EXPORT_SYMBOL_NS_GPL(iommufd_ctx_has_group, IOMMUFD);
> > > >   */
> > > >  void iommufd_device_unbind(struct iommufd_device *idev)
> > > >  {
> > > > +	u32 vdev_id = 0;
> > > > +
> > > > +	/* idev->vdev object should be destroyed prior, yet just in case.. */
> > > > +	mutex_lock(&idev->igroup->lock);
> > > > +	if (idev->vdev)
> > > 
> > > Then should it have a WARN_ON here?
> > 
> > It'd be a user space mistake that forgot to call the destroy ioctl
> > to the object, in which case I recall kernel shouldn't WARN_ON?
> 
> But you can't get here because:
> 
>  	refcount_inc(&idev->obj.users);
> 
> And kernel doesn't destroy objects with elevated ref counts?

Hmm, this is not a ->destroy() but iommufd_device_unbind(), which is
called by VFIO. We actually hit this path when QEMU didn't destroy
the vdev, so I added this chunk.

The iommufd_object_remove(vdev_id) here would destroy the vdev,
whose destroy() does refcount_dec(&idev->obj.users). Then the
following iommufd_object_destroy_user(.., &idev->obj) will
succeed.

That said, let's just mandate that userspace destroy the vdev.
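
To make the ordering above concrete, here is a minimal sketch of it as
a standalone toy model. It is not the iommufd code: the toy_* names are
hypothetical, plain ints stand in for refcount_t, and there is no
locking. It only shows that destroying the idev fails while a vdev
holds a reference on it, and succeeds once the vdev is destroyed first.

```c
#include <assert.h>
#include <errno.h>

/* Toy model only: plain ints stand in for refcount_t, no locking. */
struct toy_obj {
	int users;	/* 1 means only the base reference is held */
	int destroyed;
};

struct toy_vdev {
	struct toy_obj obj;
	struct toy_obj *idev;	/* device object that this vdev pins */
};

static void toy_obj_init(struct toy_obj *obj)
{
	obj->users = 1;
	obj->destroyed = 0;
}

/* Destroy succeeds only when nothing but the base reference is held. */
static int toy_obj_destroy(struct toy_obj *obj)
{
	if (obj->users != 1)
		return -EBUSY;
	obj->users = 0;
	obj->destroyed = 1;
	return 0;
}

/* Allocating a vdev takes a long-term reference on the idev. */
static void toy_vdev_alloc(struct toy_vdev *vdev, struct toy_obj *idev)
{
	assert(idev->users >= 1);
	toy_obj_init(&vdev->obj);
	vdev->idev = idev;
	idev->users++;
}

/* Destroying the vdev drops the reference it took on the idev. */
static int toy_vdev_destroy(struct toy_vdev *vdev)
{
	int ret = toy_obj_destroy(&vdev->obj);

	if (!ret)
		vdev->idev->users--;
	return ret;
}

/* Returns 0 if the ordering behaves as described above. */
static int toy_unbind_order_demo(void)
{
	struct toy_obj idev;
	struct toy_vdev vdev;

	toy_obj_init(&idev);
	toy_vdev_alloc(&vdev, &idev);

	/* The idev cannot go away while the vdev still pins it. */
	if (toy_obj_destroy(&idev) != -EBUSY)
		return -1;
	/* Destroy the vdev first, which drops the idev reference... */
	if (toy_vdev_destroy(&vdev))
		return -1;
	/* ...and now destroying the idev succeeds. */
	return toy_obj_destroy(&idev);
}
```

In this model the only open question is who calls toy_vdev_destroy():
the unbind path (what the chunk did) or userspace (what we settled on).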

> > > > +		vdev_id = idev->vdev->obj.id;
> > > > +	mutex_unlock(&idev->igroup->lock);
> > > > +	/* Relying on xa_lock against a race with iommufd_destroy() */
> > > > +	if (vdev_id)
> > > > +		iommufd_object_remove(idev->ictx, NULL, vdev_id, 0);
> > > 
> > > That doesn't seem right, iommufd_object_remove() should never be used
> > > to destroy an object that userspace created with an IOCTL, in fact
> > > that just isn't allowed.
> > 
> > It was for our auto destroy feature. 
> 
> auto domains are "hidden" hwpts that are kernel managed. They are not
> "userspace created".
> 
> "Userspace created" objects are ones that userspace is expected to call
> destroy on.

OK. I misunderstood that.

> If you destroy them behind the scenes in the kernel then the object ID
> can be reallocated for something else and when userspace does DESTROY
> on the ID it thought was still allocated it will malfunction.
> 
> So, only userspace can destroy objects that userspace created.

I see. That makes sense.
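
For anyone following along, the ID reuse hazard can be sketched with a
toy allocator. Everything here is an assumption made for illustration:
the toy_* names are hypothetical, the table is a fixed array rather
than the iommufd xarray, and the allocator always hands out the lowest
free ID.

```c
#include <assert.h>

/* Toy model only: a fixed table stands in for the object ID space. */
enum toy_type { TOY_NONE = 0, TOY_VDEVICE, TOY_HWPT };

#define TOY_MAX_IDS 8

struct toy_ctx {
	enum toy_type slots[TOY_MAX_IDS];	/* ID 0 is treated as invalid */
};

/* Allocate the lowest free ID (>= 1); returns -1 if the table is full. */
static int toy_alloc(struct toy_ctx *ctx, enum toy_type type)
{
	for (int id = 1; id < TOY_MAX_IDS; id++) {
		if (ctx->slots[id] == TOY_NONE) {
			ctx->slots[id] = type;
			return id;
		}
	}
	return -1;
}

/* Destroy whatever lives at the ID and report what type it was. */
static enum toy_type toy_destroy(struct toy_ctx *ctx, int id)
{
	enum toy_type old;

	if (id <= 0 || id >= TOY_MAX_IDS)
		return TOY_NONE;
	old = ctx->slots[id];
	ctx->slots[id] = TOY_NONE;
	return old;
}

/* Returns the type that the late userspace DESTROY actually hit. */
static enum toy_type toy_stale_destroy_demo(void)
{
	struct toy_ctx ctx = { { TOY_NONE } };
	int vdev_id, hwpt_id;

	/* Userspace creates a vDEVICE and remembers its ID. */
	vdev_id = toy_alloc(&ctx, TOY_VDEVICE);
	/* The "kernel" destroys it behind userspace's back. */
	toy_destroy(&ctx, vdev_id);
	/* A later allocation recycles the same ID for another object. */
	hwpt_id = toy_alloc(&ctx, TOY_HWPT);
	assert(hwpt_id == vdev_id);
	/* Userspace's DESTROY on the stale ID now hits the wrong object. */
	return toy_destroy(&ctx, vdev_id);
}
```

In the toy, the stale DESTROY tears down the unrelated object, which is
the malfunction described above.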

> > If user space forgot to destroy the object while trying to unplug
> > the device from VM. This saves the day.
> 
> No, it should/does fail destroy of the VIOMMU object because the users
> refcount is elevated.

The vIOMMU object also gets its refcount_dec from unbind() calling
remove(). Anyway, we've agreed that userspace should destroy the
vdev explicitly.

> > > Ugh, there is worse here, we can't hold a long term reference on a
> > > kernel owned object:
> > > 
> > > 	idev->vdev = vdev;
> > > 	refcount_inc(&idev->obj.users);
> > > 
> > > As it prevents the kernel from disconnecting it.
> > 
> > Hmm, mind elaborating? I think the iommufd_fops_release() would
> > xa_for_each the object list that destroys the vdev object first
> > then this idev (and viommu too)?
> 
> iommufd_device_unbind() can't fail, and if the object can't be
> destroyed because it has an elevated long term refcount it WARN's:
> 
> 
> 	ret = iommufd_object_remove(ictx, obj, obj->id, REMOVE_WAIT_SHORTTERM);
> 
> 	/*
> 	 * If there is a bug and we couldn't destroy the object then we did put
> 	 * back the caller's users refcount and will eventually try to free it
> 	 * again during close.
> 	 */
> 	WARN_ON(ret);
> 
> So you cannot take long term references on kernel owned objects. Only
> userspace owned objects.

OK. I think I had gotten this part. Gao ran into this WARN_ON in v3,
so I added iommufd_object_remove(vdev_id) in unbind(), prior to
the iommufd_object_destroy_user(idev->ictx, &idev->obj) call.

> > OK. If user space forgot to destroy its vdev while unplugging the
> > device, it would not be allowed to hotplug another device (or the
> > same device) back to the same slot having the same RID, since the
> > RID on the vIOMMU would be occupied by the undestroyed vdev.
> 
> Yes, that seems correct and obvious to me. Until the vdev is
> explicitly destroyed the ID is in-use.
> 
> Good userspace should destroy the iommufd vDEVICE object before
> closing the VFIO file descriptor.
> 
> If it doesn't, then the VDEVICE object remains even though the VFIO it
> was linked to is gone.

I see.

Thanks
Nicolin

