Re: [PATCH rc v6] iommu: Fix nested pci_dev_reset_iommu_prepare/done()

From: Nicolin Chen <nicolinc@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "joro@8bytes.org" <joro@8bytes.org>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"will@kernel.org" <will@kernel.org>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"xueshuai@linux.alibaba.com" <xueshuai@linux.alibaba.com>
Subject: Re: [PATCH rc v6] iommu: Fix nested pci_dev_reset_iommu_prepare/done()
Date: Fri, 17 Apr 2026 21:56:42 -0700
Message-ID: <aeMO0whi7pPKE7FH@nvidia.com>
In-Reply-To: <aeKpxLiykT8O4k1X@Asurada-Nvidia>

On Fri, Apr 17, 2026 at 02:44:41PM -0700, Nicolin Chen wrote:
> On Fri, Apr 17, 2026 at 08:24:27AM +0000, Tian, Kevin wrote:
> > one is that iommu_detach_device_pasid() is not blocked which can trigger
> > devtlb invalidation in middle of reset. but it cannot fail. so the right fix is
> > to skip the blocked device in __iommu_remove_group_pasid().
> 
> Yea, squashing this:
> @@ -3556,3 +3559,4 @@ static void __iommu_remove_group_pasid(struct iommu_group *group,
>         for_each_group_device(group, device) {
> -               if (device->dev->iommu->max_pasids > 0)
> +               /* Device might be already detached for a device recovery */
> +               if (!device->blocked && device->dev->iommu->max_pasids > 0)
>                         iommu_remove_dev_pasid(device->dev, pasid, domain);
> 
> > another is a use-after-free concern upon iommu_detach_device() in
> > middle of reset. In my thinking it will trigger WARN_ON before any UAF:
> > 
> > static void __iommu_group_set_domain_nofail(struct iommu_group *group,
> >                                             struct iommu_domain *new_domain)
> > {
> >         WARN_ON(__iommu_group_set_domain_internal(
> >                 group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED));
> > }
> 
> Yes.
> 
> > but I haven't got time to think about the fix carefully. 
> 
> I think we could squash this:
> 
> @@ -2469,9 +2469,2 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
> 
> -       /*
> -        * This is a concurrent attach during device recovery. Reject it until
> -        * pci_dev_reset_iommu_done() attaches the device to group->domain.
> -        */
> -       if (group->recovery_cnt)
> -               return -EBUSY;
> -

On second thought, we probably can't simply drop this check. IIRC,
we added it specifically to fence the case where gdevs share the
same RID, or some corner case like that.

To be conservative, we can still reject a concurrent attach while
allowing the detach case:

+	/*
+	 * This is a concurrent attach during device recovery. Reject it until
+	 * pci_dev_reset_iommu_done() attaches the device to group->domain.
+	 *
+	 * Note: still allow MUST_SUCCEED callers (detach/teardown) through to
+	 * avoid UAF on domain release paths.
+	 */
+	if (group->recovery_cnt && !(flags & IOMMU_SET_DOMAIN_MUST_SUCCEED))
+		return -EBUSY;
+

In the detach path, the call is no longer rejected: it moves forward,
skips every gdev that has gdev->blocked set inside the
for_each_group_device() loop, and defers the re-attach of those devices
to pci_dev_reset_iommu_done().
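
To make the intended behavior concrete, here is a minimal user-space
sketch of the gating described above. It is not the kernel code: the
model_* types and helpers, SET_DOMAIN_MUST_SUCCEED, and the integer
domain ids are hypothetical stand-ins, and the model does not cover
nested prepare/done on the same device or locking.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IOMMU_SET_DOMAIN_MUST_SUCCEED. */
enum { SET_DOMAIN_MUST_SUCCEED = 1 << 0 };

/* Hypothetical id for the blocking domain; real domains are pointers. */
#define MODEL_BLOCKING_DOMAIN (-1)

/* Simplified model of a group device (assumption, not struct group_device). */
struct model_gdev {
	bool blocked;	/* under reset, parked on the blocking domain */
	int domain;	/* domain id currently programmed for the device */
};

/* Simplified model of an iommu_group (assumption). */
struct model_group {
	unsigned int recovery_cnt;	/* devices currently in reset */
	int domain;			/* models group->domain */
	struct model_gdev *devs;
	size_t ndevs;
};

/* Models the gate plus the per-device skip in the set-domain loop. */
static int model_group_set_domain(struct model_group *group, int new_domain,
				  unsigned int flags)
{
	size_t i;

	/* Reject a concurrent attach, but let MUST_SUCCEED (detach) through. */
	if (group->recovery_cnt && !(flags & SET_DOMAIN_MUST_SUCCEED))
		return -EBUSY;

	for (i = 0; i < group->ndevs; i++) {
		/* Devices under recovery stay on the blocking domain. */
		if (group->devs[i].blocked)
			continue;
		group->devs[i].domain = new_domain;
	}
	group->domain = new_domain;
	return 0;
}

/* Models the prepare step: park the device on the blocking domain. */
static void model_reset_prepare(struct model_group *group, size_t idx)
{
	group->devs[idx].blocked = true;
	group->devs[idx].domain = MODEL_BLOCKING_DOMAIN;
	group->recovery_cnt++;
}

/* Models the done step: re-attach to whatever group->domain is by now. */
static void model_reset_done(struct model_group *group, size_t idx)
{
	group->devs[idx].domain = group->domain;
	group->devs[idx].blocked = false;
	group->recovery_cnt--;
}
```

In this model, a plain attach issued between prepare and done returns
-EBUSY and changes nothing, while a MUST_SUCCEED call updates the group
domain and the non-blocked devices only. The blocked device picks up the
updated domain when model_reset_done() runs.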

Thanks
Nicolin

> @@ -2484,2 +2477,10 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
>         for_each_group_device(group, gdev) {
> +               /*
> +                * Skip devices under recovery: they are already attached to
> +                * group->blocking_domain at the hardware level. When their
> +                * reset completes, pci_dev_reset_iommu_done() will re-attach
> +                * them to the updated group->domain.
> +                */
> +               if (gdev->blocked)
> +                       continue;
>                 ret = __iommu_device_set_domain(group, gdev->dev, new_domain,
> @@ -2513,2 +2514,4 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
>                         break;
> +               if (gdev->blocked)
> +                       continue;
>                 /*
> 
> 
> > the last one is trivial that goto and guard() shouldn't be mixed in one
> > function according to the cleanup guidelines.
> 
> I don't think this is mixing. The guard is protecting the entire
> routine including those goto paths. So there isn't any goto path
> that is outside the mutex.
> 
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> 
> Thanks!
> Nicolin


Thread overview: 6+ messages
2026-04-07 19:46 [PATCH rc v6] iommu: Fix nested pci_dev_reset_iommu_prepare/done() Nicolin Chen
2026-04-14 14:20 ` Jason Gunthorpe
2026-04-16  7:48 ` Shuai Xue
2026-04-17  8:24 ` Tian, Kevin
2026-04-17 21:44   ` Nicolin Chen
2026-04-18  4:56     ` Nicolin Chen [this message]
