* [PATCH] iommu/tegra241-cmdqv: Skip CMD_SYNC flush during remove
@ 2026-05-29 9:10 Shameer Kolothum
2026-05-29 11:53 ` Jason Gunthorpe
0 siblings, 1 reply; 3+ messages in thread
From: Shameer Kolothum @ 2026-05-29 9:10 UTC (permalink / raw)
To: iommu, linux-kernel, linux-arm-kernel
Cc: nicolinc, jgg, joro, will, robin.murphy, nathanc, mochs,
skolothumtho
tegra241_vcmdq_hw_deinit() unconditionally issues a CMD_SYNC on
smmu->cmdq via tegra241_vcmdq_hw_flush_timeout(). When the SMMU is
being torn down (eg: probe failure), this CMD_SYNC hits freed
memory and UAFs. Observed during testing with a QEMU hack that
fails tegra241_cmdqv_setup_vcmdq(), so the guest sees the VCMDQ
enable as failed:
platform NVDA200C:00: tegra241_cmdqv: VINTF0: VCMDQ0/LVCMDQ0: failed to enable, STATUS=0x00000000
platform NVDA200C:00: tegra241_cmdqv: VINTF0: VCMDQ0/LVCMDQ0: GERRORN=0x0, GERROR=0x4, CONS=0x0
platform NVDA200C:00: tegra241_cmdqv: VINTF0: VCMDQ0/LVCMDQ0: uncleared error detected, resetting
arm-smmu-v3 arm-smmu-v3.0.auto: failed to reset impl
arm-smmu-v3 arm-smmu-v3.0.auto: probe with driver arm-smmu-v3 failed with error -110
Unable to handle kernel paging request at virtual address ffff8000891e0098
...
Internal error: Oops: 0000000096000047 [#1] SMP
...
Call trace:
arm_smmu_cmdq_issue_cmdlist+0x320/0x6fc (P)
tegra241_vcmdq_hw_deinit+0x98/0x168
tegra241_vintf_hw_deinit+0x5c/0x1b0
tegra241_cmdqv_remove_vintf+0x34/0xec
tegra241_cmdqv_remove+0x40/0x9c
arm_smmu_impl_remove+0x20/0x30
devm_action_release+0x14/0x20
devres_release_all+0xa8/0x110
device_unbind_cleanup+0x18/0x84
really_probe+0x1f0/0x29c
The reason is the devres ordering. arm_smmu_impl_probe() registers
arm_smmu_impl_remove as a devres action before arm_smmu_init_queues()
does dmam_alloc_coherent() for smmu->cmdq.q.base.
devres unwinds LIFO, so q.base is released first, then
arm_smmu_impl_remove()
tegra241_cmdqv_remove()
tegra241_vintf_hw_deinit()
tegra241_vcmdq_hw_deinit()
hw_flush_timeout() on a freed q.base.
The flush exists to drain a guest-owned VCMDQ's pending ATC_INV
TIMEOUT before the VCMDQ is handed to the next VM (see the comment
above tegra241_vcmdq_hw_flush_timeout()). impl-remove is not a
handover: no VM is taking the VCMDQ here. The next time a VM is
assigned a VCMDQ via the IOMMU_HW_QUEUE_ALLOC ioctl, this host
kernel driver's tegra241_vcmdq_hw_init_user() runs hw_deinit() ->
hw_flush_timeout() before the hw_queue is returned to userspace,
so any pending TIMEOUT is drained by the host before any guest
sees the VCMDQ.
Mark the cmdqv as removing at the top of tegra241_cmdqv_remove() and
skip the flush in tegra241_vcmdq_hw_deinit() when the flag is set.
All other hw_deinit callers (in-kernel hw_init reset, vintf_hw_init
failure unwind, user-mode lvcmdq destroy) run while smmu->cmdq is
still valid and continue to issue the flush as before.
Fixes: 4dc0d12474f9 ("iommu/tegra241-cmdqv: Add user-space use support")
Cc: stable@vger.kernel.org # v6.17+
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index 4915ed96baca..b336b0bffe96 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -208,6 +208,7 @@ struct tegra241_vintf_sid {
* @num_sids_per_vintf: Total number of SID mappings per VINTF
* @vintf_ids: VINTF id allocator
* @vintfs: List of VINTFs
+ * @removing: Set while the device is being torn down via impl_remove
*/
struct tegra241_cmdqv {
struct arm_smmu_device smmu;
@@ -226,6 +227,8 @@ struct tegra241_cmdqv {
struct ida vintf_ids;
struct tegra241_vintf **vintfs;
+
+ bool removing;
};
/* Config and Polling Helpers */
@@ -452,7 +455,9 @@ static void tegra241_vcmdq_hw_deinit(struct tegra241_vcmdq *vcmdq)
readl_relaxed(REG_VCMDQ_PAGE0(vcmdq, GERROR)),
readl_relaxed(REG_VCMDQ_PAGE0(vcmdq, CONS)));
}
- tegra241_vcmdq_hw_flush_timeout(vcmdq);
+ /* In the removing path, smmu->cmdq.q.base is freed by the devres */
+ if (!vcmdq->cmdqv->removing)
+ tegra241_vcmdq_hw_flush_timeout(vcmdq);
writel_relaxed(0, REG_VCMDQ_PAGE0(vcmdq, PROD));
writel_relaxed(0, REG_VCMDQ_PAGE0(vcmdq, CONS));
@@ -789,6 +794,13 @@ static void tegra241_cmdqv_remove(struct arm_smmu_device *smmu)
container_of(smmu, struct tegra241_cmdqv, smmu);
u16 idx;
+ /*
+ * tegra241_cmdqv_remove() is added to devres at the very beginning. So,
+ * at this point, devres has already freed the SMMU resources that this
+ * path must not access to avoid a UAF.
+ */
+ cmdqv->removing = true;
+
/* Remove VINTF resources */
for (idx = 0; idx < cmdqv->num_vintfs; idx++) {
if (cmdqv->vintfs[idx]) {
@@ -932,6 +944,7 @@ __tegra241_cmdqv_probe(struct arm_smmu_device *smmu, struct resource *res,
cmdqv->base = base;
cmdqv->dev = smmu->impl_dev;
cmdqv->base_phys = res->start;
+ cmdqv->removing = false;
if (cmdqv->irq > 0) {
ret = request_threaded_irq(irq, NULL, tegra241_cmdqv_isr,
--
2.43.0
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] iommu/tegra241-cmdqv: Skip CMD_SYNC flush during remove
2026-05-29 9:10 [PATCH] iommu/tegra241-cmdqv: Skip CMD_SYNC flush during remove Shameer Kolothum
@ 2026-05-29 11:53 ` Jason Gunthorpe
2026-05-29 12:50 ` Shameer Kolothum Thodi
0 siblings, 1 reply; 3+ messages in thread
From: Jason Gunthorpe @ 2026-05-29 11:53 UTC (permalink / raw)
To: Shameer Kolothum
Cc: iommu, linux-kernel, linux-arm-kernel, nicolinc, joro, will,
robin.murphy, nathanc, mochs
On Fri, May 29, 2026 at 10:10:52AM +0100, Shameer Kolothum wrote:
> @@ -789,6 +794,13 @@ static void tegra241_cmdqv_remove(struct arm_smmu_device *smmu)
> container_of(smmu, struct tegra241_cmdqv, smmu);
> u16 idx;
>
> + /*
> + * tegra241_cmdqv_remove() is added to devres at the very beginning. So,
> + * at this point, devres has already freed the SMMU resources that this
> + * path must not access to avoid a UAF.
> + */
> + cmdqv->removing = true;
Please don't add random bools to avoid devm bugs. Just don't use devm,
or maybe fix the devm ordering.. These ordering bugs are a well known
problem with using devm :(
Jason
^ permalink raw reply [flat|nested] 3+ messages in thread
* RE: [PATCH] iommu/tegra241-cmdqv: Skip CMD_SYNC flush during remove
2026-05-29 11:53 ` Jason Gunthorpe
@ 2026-05-29 12:50 ` Shameer Kolothum Thodi
0 siblings, 0 replies; 3+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-29 12:50 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, Nicolin Chen,
joro@8bytes.org, will@kernel.org, robin.murphy@arm.com,
Nathan Chen, Matt Ochs
> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: 29 May 2026 12:54
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>
> Cc: iommu@lists.linux.dev; linux-kernel@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; Nicolin Chen <nicolinc@nvidia.com>;
> joro@8bytes.org; will@kernel.org; robin.murphy@arm.com; Nathan Chen
> <nathanc@nvidia.com>; Matt Ochs <mochs@nvidia.com>
> Subject: Re: [PATCH] iommu/tegra241-cmdqv: Skip CMD_SYNC flush during
> remove
>
> External email: Use caution opening links or attachments
>
>
> On Fri, May 29, 2026 at 10:10:52AM +0100, Shameer Kolothum wrote:
>
> > @@ -789,6 +794,13 @@ static void tegra241_cmdqv_remove(struct
> arm_smmu_device *smmu)
> > container_of(smmu, struct tegra241_cmdqv, smmu);
> > u16 idx;
> >
> > + /*
> > + * tegra241_cmdqv_remove() is added to devres at the very beginning.
> So,
> > + * at this point, devres has already freed the SMMU resources that this
> > + * path must not access to avoid a UAF.
> > + */
> > + cmdqv->removing = true;
>
> Please don't add random bools to avoid devm bugs. Just don't use devm,
> or maybe fix the devm ordering.. These ordering bugs are a well known
> problem with using devm :(
Ok. Let me take another look at this.
Thanks,
Shameer
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-05-29 12:50 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-29 9:10 [PATCH] iommu/tegra241-cmdqv: Skip CMD_SYNC flush during remove Shameer Kolothum
2026-05-29 11:53 ` Jason Gunthorpe
2026-05-29 12:50 ` Shameer Kolothum Thodi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox