From: Nicolin Chen <nicolinc@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "joro@8bytes.org" <joro@8bytes.org>,
"jgg@nvidia.com" <jgg@nvidia.com>,
"will@kernel.org" <will@kernel.org>,
"robin.murphy@arm.com" <robin.murphy@arm.com>,
"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"xueshuai@linux.alibaba.com" <xueshuai@linux.alibaba.com>
Subject: Re: [PATCH rc v6] iommu: Fix nested pci_dev_reset_iommu_prepare/done()
Date: Fri, 17 Apr 2026 21:56:42 -0700
Message-ID: <aeMO0whi7pPKE7FH@nvidia.com>
In-Reply-To: <aeKpxLiykT8O4k1X@Asurada-Nvidia>

On Fri, Apr 17, 2026 at 02:44:41PM -0700, Nicolin Chen wrote:
> On Fri, Apr 17, 2026 at 08:24:27AM +0000, Tian, Kevin wrote:
> > one is that iommu_detach_device_pasid() is not blocked which can trigger
> > devtlb invalidation in middle of reset. but it cannot fail. so the right fix is
> > to skip the blocked device in __iommu_remove_group_pasid().
>
> Yea, squashing this:
> @@ -3556,3 +3559,4 @@ static void __iommu_remove_group_pasid(struct iommu_group *group,
> for_each_group_device(group, device) {
> - if (device->dev->iommu->max_pasids > 0)
> + /* Device might be already detached for a device recovery */
> + if (!device->blocked && device->dev->iommu->max_pasids > 0)
> iommu_remove_dev_pasid(device->dev, pasid, domain);
>
> > another is a use-after-free concern upon iommu_detach_device() in
> > middle of reset. In my thinking it will trigger WARN_ON before any UAF:
> >
> > static void __iommu_group_set_domain_nofail(struct iommu_group *group,
> > struct iommu_domain *new_domain)
> > {
> > WARN_ON(__iommu_group_set_domain_internal(
> > group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED));
> > }
>
> Yes.
>
> > but I haven't got time to think about the fix carefully.
>
> I think we could squash this:
>
> @@ -2469,9 +2469,2 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
>
> - /*
> - * This is a concurrent attach during device recovery. Reject it until
> - * pci_dev_reset_iommu_done() attaches the device to group->domain.
> - */
> - if (group->recovery_cnt)
> - return -EBUSY;
> -

On second thought, we probably shouldn't simply drop this: IIRC, we
added it specifically to fence a case where gdevs share the same RID,
or some corner case like that?

To be conservative, we can still reject a concurrent attach while
allowing the detach case:

+ /*
+ * This is a concurrent attach during device recovery. Reject it until
+ * pci_dev_reset_iommu_done() attaches the device to group->domain.
+ *
+ * Note: still allow MUST_SUCCEED callers (detach/teardown) through to
+ * avoid UAF on domain release paths.
+ */
+ if (group->recovery_cnt && !(flags & IOMMU_SET_DOMAIN_MUST_SUCCEED))
+ return -EBUSY;
+

In the detach path, it will proceed, skip every device that has
gdev->blocked set inside for_each_group_device(), and defer the attach
to pci_dev_reset_iommu_done().
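
To illustrate the intended behavior, here is a minimal userspace sketch
of the two checks above. It is not kernel code: the model_* structs and
functions are hypothetical stand-ins, domains are plain integers (0
plays the role of the blocking domain), and model_reset_done() only
assumes what pci_dev_reset_iommu_done() does, namely clearing the
blocked state and attaching the device to the current group domain.

```c
/* Userspace model only; names mirror the patch but all types are stand-ins. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define IOMMU_SET_DOMAIN_MUST_SUCCEED (1u << 0)

struct model_gdev {
	bool blocked;	/* device is mid-reset, parked on the blocking domain */
	int domain;	/* stand-in for the domain the device is attached to */
};

struct model_group {
	unsigned int recovery_cnt;	/* devices currently under recovery */
	int domain;			/* stand-in for group->domain */
	struct model_gdev *devs;
	size_t ndevs;
};

static int model_group_set_domain(struct model_group *group, int new_domain,
				  unsigned int flags)
{
	size_t i;

	/* Reject a concurrent attach, but let MUST_SUCCEED callers through. */
	if (group->recovery_cnt && !(flags & IOMMU_SET_DOMAIN_MUST_SUCCEED))
		return -EBUSY;

	for (i = 0; i < group->ndevs; i++) {
		/* Devices under recovery are skipped; done() handles them. */
		if (group->devs[i].blocked)
			continue;
		group->devs[i].domain = new_domain;
	}
	group->domain = new_domain;
	return 0;
}

/* Assumed behavior of pci_dev_reset_iommu_done() in this model. */
static void model_reset_done(struct model_group *group,
			     struct model_gdev *gdev)
{
	gdev->blocked = false;
	group->recovery_cnt--;
	gdev->domain = group->domain;
}

/* Walk through: attach is fenced, detach proceeds, done() catches up. */
void model_demo(void)
{
	struct model_gdev devs[2] = {
		{ .blocked = true,  .domain = 0 },	/* under reset */
		{ .blocked = false, .domain = 1 },
	};
	struct model_group grp = {
		.recovery_cnt = 1, .domain = 1, .devs = devs, .ndevs = 2,
	};

	assert(model_group_set_domain(&grp, 2, 0) == -EBUSY);
	assert(!model_group_set_domain(&grp, 3,
				       IOMMU_SET_DOMAIN_MUST_SUCCEED));
	assert(devs[0].domain == 0 && devs[1].domain == 3);

	model_reset_done(&grp, &devs[0]);
	assert(devs[0].domain == 3);
}
```

In this model the blocked device never sees the intermediate domain
change; it only picks up the final group domain once its reset is done.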
Thanks
Nicolin
> @@ -2484,2 +2477,10 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
> for_each_group_device(group, gdev) {
> + /*
> + * Skip devices under recovery: they are already attached to
> + * group->blocking_domain at the hardware level. When their
> + * reset completes, pci_dev_reset_iommu_done() will re-attach
> + * them to the updated group->domain.
> + */
> + if (gdev->blocked)
> + continue;
> ret = __iommu_device_set_domain(group, gdev->dev, new_domain,
> @@ -2513,2 +2514,4 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
> break;
> + if (gdev->blocked)
> + continue;
> /*
>
>
> > the last one is trivial that goto and guard() shouldn't be mixed in one
> > function according to the cleanup guidelines.
>
> I don't think this is mixing. The guard is protecting the entire
> routine including those goto paths. So there isn't any goto path
> that is outside the mutex.
>
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>
> Thanks!
> Nicolin