* [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices
@ 2026-03-23 13:09 Tudor Ambarus
2026-03-23 13:54 ` Jason Gunthorpe
2026-03-24 11:40 ` Robin Murphy
0 siblings, 2 replies; 9+ messages in thread
From: Tudor Ambarus @ 2026-03-23 13:09 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Lorenzo Pieralisi,
Jason Gunthorpe, Rob Herring (Arm)
Cc: Joerg Roedel, Bjorn Helgaas, iommu, linux-kernel, peter.griffin,
andre.draszik, willmcvicker, jyescas, kernel-team, stable,
Tudor Ambarus
Commit da33e87bd2bf ("iommu: Handle yet another race around
registration") introduced a readiness check in `iommu_fwspec_init()` to
prevent client drivers from configuring their IOMMUs before
`bus_iommu_probe()` has completed.
To optimize the replay path, the readiness check was conditionally
gated behind `!dev->iommu`:
if (!dev->iommu && !READ_ONCE(iommu->ready))
return -EPROBE_DEFER;
However, this assumption breaks down for devices that map to multiple
IOMMU instances. During the initialization loop over multiple IOMMUs in
`of_iommu_configure_device()`, the first IOMMU successfully allocates
`dev->iommu`. When `iommu_fwspec_init()` is called for the second
IOMMU, `!dev->iommu` evaluates to false, short-circuiting the logic and
entirely bypassing the `iommu->ready` check.
If the second IOMMU is still executing its `bus_iommu_probe()`
concurrently, this allows the client driver to proceed prematurely,
resulting in a late IOMMU probe warning:
dev: late IOMMU probe at driver bind, something fishy here!
WARNING: drivers/iommu/iommu.c:645 at __iommu_probe_device
Fix this by making the `iommu->ready` check unconditional, ensuring
that a device will defer its probe until *all* of its required IOMMUs
are fully registered and ready.
Cc: stable@vger.kernel.org
Fixes: da33e87bd2bf ("iommu: Handle yet another race around registration")
Fixes: bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
The warning was observed using an Android 6.19 tree, using downstream
drivers (exynos-decon and samsung-sysmmu-v9).
---
drivers/iommu/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 78756c3f3c40..e61927b4d41f 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3042,7 +3042,7 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode)
if (!iommu)
return driver_deferred_probe_check_state(dev);
- if (!dev->iommu && !READ_ONCE(iommu->ready))
+ if (!READ_ONCE(iommu->ready))
return -EPROBE_DEFER;
if (fwspec)
---
base-commit: ca3bbc9287400c1274d87ee57a16e3126ba2969a
change-id: 20260320-iommu-ready-check-4976863957c2
Best regards,
--
Tudor Ambarus <tudor.ambarus@linaro.org>
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices 2026-03-23 13:09 [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices Tudor Ambarus @ 2026-03-23 13:54 ` Jason Gunthorpe 2026-03-23 16:46 ` Tudor Ambarus 2026-03-24 11:40 ` Robin Murphy 1 sibling, 1 reply; 9+ messages in thread From: Jason Gunthorpe @ 2026-03-23 13:54 UTC (permalink / raw) To: Tudor Ambarus Cc: Joerg Roedel, Will Deacon, Robin Murphy, Lorenzo Pieralisi, Rob Herring (Arm), Joerg Roedel, Bjorn Helgaas, iommu, linux-kernel, peter.griffin, andre.draszik, willmcvicker, jyescas, kernel-team, stable On Mon, Mar 23, 2026 at 01:09:27PM +0000, Tudor Ambarus wrote: > Commit da33e87bd2bf ("iommu: Handle yet another race around > registration") introduced a readiness check in `iommu_fwspec_init()` to > prevent client drivers from configuring their IOMMUs before > `bus_iommu_probe()` has completed. > > To optimize the replay path, the readiness check was conditionally > gated behind `!dev->iommu`: > if (!dev->iommu && !READ_ONCE(iommu->ready)) > return -EPROBE_DEFER; > > However, this assumption breaks down for devices that map to multiple > IOMMU instances. ?? We don't directly support "multiple IOMMU instances". There is only one dev->iommu. AFAIK if some drivers need to support multiple different instances of the same IOMMU driver they must deal with this fully internally and present to the core a "single instance" view. So, your explanation doesn't make sense to me. If dev->iommu is set then the driver must be ready, including any multi-instances it has. If it is not ready then this is really an iommu driver bug, not a core bug? Jason ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices 2026-03-23 13:54 ` Jason Gunthorpe @ 2026-03-23 16:46 ` Tudor Ambarus 2026-03-23 17:31 ` Jason Gunthorpe 0 siblings, 1 reply; 9+ messages in thread From: Tudor Ambarus @ 2026-03-23 16:46 UTC (permalink / raw) To: Jason Gunthorpe Cc: Joerg Roedel, Will Deacon, Robin Murphy, Lorenzo Pieralisi, Rob Herring (Arm), Joerg Roedel, Bjorn Helgaas, iommu, linux-kernel, peter.griffin, andre.draszik, willmcvicker, jyescas, kernel-team, stable Hi, Jason, On 3/23/26 3:54 PM, Jason Gunthorpe wrote: > On Mon, Mar 23, 2026 at 01:09:27PM +0000, Tudor Ambarus wrote: >> Commit da33e87bd2bf ("iommu: Handle yet another race around >> registration") introduced a readiness check in `iommu_fwspec_init()` to >> prevent client drivers from configuring their IOMMUs before >> `bus_iommu_probe()` has completed. >> >> To optimize the replay path, the readiness check was conditionally >> gated behind `!dev->iommu`: >> if (!dev->iommu && !READ_ONCE(iommu->ready)) >> return -EPROBE_DEFER; >> >> However, this assumption breaks down for devices that map to multiple >> IOMMU instances. > > ?? We don't directly support "multiple IOMMU instances". There is only > one dev->iommu. > > AFAIK if some drivers need to support multiple different instances of > the same IOMMU driver they must deal with this fully internally and > present to the core a "single instance" view. Thanks for the quick answer. I may miss a few things, I should have marked this as an RFC. Would you please help me understand a little bit more on this topic? Downstream we have a display controller that's using: iommus = <&sysmmu_19840000>, <&sysmmu_19c40000>; These are 2 distinct platform devices, they probe independently, they each call iommu_device_register() independently. If I understood you correctly, the downstream driver shall model its architecture and call iommu_device_register() only once after both devices are configured. My downstream reality is different. Here's what I'm encountering: 1/ sysmmu_19840000: dev->iommu is NULL. iommu_fwspec_init() correctly evaluates !READ_ONCE(sysmmu_19840000->ready). Assuming it is ready, it allocates dev->iommu. 2/ dev->iommu is now NOT NULL. iommu_fwspec_init() is called for the second physical instance. 3/ Because of the !dev->iommu gate, the evaluation of !READ_ONCE(sysmmu_19c40000->ready) is short-circuited and skipped entirely. But sysmmu_19c40000 is not ready, its specific bus_iommu_probe() is executing asynchronously on another CPU. If the core's intent is to strictly enforce a single IOMMU instance, shouldn't iommu_fwspec_init() be checking fwspec->iommu_fwnode == iommu_fwnode instead of matching the ops? Because the core currently matches on ops, it permits aggregating multiple physical instances with the same ops into one fwspec. Thanks a ton! ta --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2940,7 +2940,7 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode) return -EPROBE_DEFER; if (fwspec) - return iommu->ops == iommu_fwspec_ops(fwspec) ? 0 : -EINVAL; + return fwspec->iommu_fwnode == iommu_fwnode ? 0 : -EINVAL; if (!dev_iommu_get(dev)) return -ENOMEM; ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices 2026-03-23 16:46 ` Tudor Ambarus @ 2026-03-23 17:31 ` Jason Gunthorpe 2026-04-02 11:25 ` Tudor Ambarus 0 siblings, 1 reply; 9+ messages in thread From: Jason Gunthorpe @ 2026-03-23 17:31 UTC (permalink / raw) To: Tudor Ambarus Cc: Joerg Roedel, Will Deacon, Robin Murphy, Lorenzo Pieralisi, Rob Herring (Arm), Joerg Roedel, Bjorn Helgaas, iommu, linux-kernel, peter.griffin, andre.draszik, willmcvicker, jyescas, kernel-team, stable On Mon, Mar 23, 2026 at 06:46:39PM +0200, Tudor Ambarus wrote: > Downstream we have a display controller that's using: > iommus = <&sysmmu_19840000>, <&sysmmu_19c40000>; > > These are 2 distinct platform devices, they probe independently, they > each call iommu_device_register() independently. Sure, I guessed that is what you ment.. Do you have an example of this in an upstream DTS file? > If I understood you correctly, the downstream driver shall model its > architecture and call iommu_device_register() only once after both > devices are configured. No.. I'm not being so perscriptive, I'm just saying that once iommu->ops->probe_device() returns then the device is fully setup and dev->iommu will operate all of the iommus described in iommus=<..> probe_device() cannot return some half setup device with only some of the iommu instances working. We don't have any core idea of a half setup result from probe_device() today. > If the core's intent is to strictly enforce a single IOMMU instance, > shouldn't iommu_fwspec_init() be checking > fwspec->iommu_fwnode == iommu_fwnode > instead of matching the ops? Because the core currently matches on > ops, it permits aggregating multiple physical instances with the > same ops into one fwspec. The driver is responsible to handle this, not the core. It has to hide this mess under its covers, not rely on multiple calls to of_xlate or however it has been hacked up. Probably it means something like of_xlate/probe_device has to EPROBE_DEFER if all the instances listed in iommus don't exist. Jason ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices 2026-03-23 17:31 ` Jason Gunthorpe @ 2026-04-02 11:25 ` Tudor Ambarus 2026-04-02 11:59 ` Jason Gunthorpe 0 siblings, 1 reply; 9+ messages in thread From: Tudor Ambarus @ 2026-04-02 11:25 UTC (permalink / raw) To: Jason Gunthorpe Cc: Joerg Roedel, Will Deacon, Robin Murphy, Lorenzo Pieralisi, Rob Herring (Arm), Joerg Roedel, Bjorn Helgaas, iommu, linux-kernel, peter.griffin, andre.draszik, willmcvicker, jyescas, kernel-team, stable Hi, Jason, On 3/23/26 7:31 PM, Jason Gunthorpe wrote: > On Mon, Mar 23, 2026 at 06:46:39PM +0200, Tudor Ambarus wrote: > >> Downstream we have a display controller that's using: >> iommus = <&sysmmu_19840000>, <&sysmmu_19c40000>; >> >> These are 2 distinct platform devices, they probe independently, they >> each call iommu_device_register() independently. > > Sure, I guessed that is what you ment.. > > Do you have an example of this in an upstream DTS file? Yes, Exynos multimedia blocks use this upstream For example, in arch/arm64/boot/dts/exynos/exynos5433.dtsi, the `decon` and `decon_tv` nodes route through multiple sysmmus: iommus = <&sysmmu_decon0x>, <&sysmmu_decon1x>; Looking at the upstream exynos-iommu.c driver, it doesn't return -EPROBE_DEFER if all the instances listed in iommus doesn't exist. It seems it survives the race though, but only because of the core_initcall ordering. In downstream the IOMMU is forced to be a module which exposes this gap. > >> If I understood you correctly, the downstream driver shall model its >> architecture and call iommu_device_register() only once after both >> devices are configured. > > No.. I'm not being so perscriptive, I'm just saying that once > iommu->ops->probe_device() returns then the device is fully setup and > dev->iommu will operate all of the iommus described in iommus=<..> > > probe_device() cannot return some half setup device with only some of > the iommu instances working. > > We don't have any core idea of a half setup result from > probe_device() today. > >> If the core's intent is to strictly enforce a single IOMMU instance, >> shouldn't iommu_fwspec_init() be checking >> fwspec->iommu_fwnode == iommu_fwnode >> instead of matching the ops? Because the core currently matches on >> ops, it permits aggregating multiple physical instances with the >> same ops into one fwspec. > > The driver is responsible to handle this, not the core. It has to hide > this mess under its covers, not rely on multiple calls to of_xlate or > however it has been hacked up. > > Probably it means something like of_xlate/probe_device has to > EPROBE_DEFER if all the instances listed in iommus don't exist. > I can probably track whether all instances are ready, and defer if any is not ready, but then I'll force the iommu clients to use the sketchy replay path, which seems like a bad idea, according to Robin's feedback. I haven't seen functional problems with the races, just the "something fishy" dev_WARN. Maybe we shall downgrade that to dev_info. Thanks! ta ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices 2026-04-02 11:25 ` Tudor Ambarus @ 2026-04-02 11:59 ` Jason Gunthorpe 2026-04-02 14:20 ` Robin Murphy 0 siblings, 1 reply; 9+ messages in thread From: Jason Gunthorpe @ 2026-04-02 11:59 UTC (permalink / raw) To: Tudor Ambarus Cc: Joerg Roedel, Will Deacon, Robin Murphy, Lorenzo Pieralisi, Rob Herring (Arm), Joerg Roedel, Bjorn Helgaas, iommu, linux-kernel, peter.griffin, andre.draszik, willmcvicker, jyescas, kernel-team, stable On Thu, Apr 02, 2026 at 02:25:54PM +0300, Tudor Ambarus wrote: > I can probably track whether all instances are ready, and defer if any > is not ready, but then I'll force the iommu clients to use the sketchy > replay path, which seems like a bad idea, according to Robin's feedback. I didn't think that was sketchy, it is part of the boot ordering system to ensure that the iommu driver(s) is probed before the client devices. Half operating a device is definately going to get things into trouble with broken/incomplete domain attachments at least. Jason ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices 2026-04-02 11:59 ` Jason Gunthorpe @ 2026-04-02 14:20 ` Robin Murphy 2026-04-14 13:04 ` Tudor Ambarus 0 siblings, 1 reply; 9+ messages in thread From: Robin Murphy @ 2026-04-02 14:20 UTC (permalink / raw) To: Jason Gunthorpe, Tudor Ambarus Cc: Joerg Roedel, Will Deacon, Lorenzo Pieralisi, Rob Herring (Arm), Joerg Roedel, Bjorn Helgaas, iommu, linux-kernel, peter.griffin, andre.draszik, willmcvicker, jyescas, kernel-team, stable On 2026-04-02 12:59 pm, Jason Gunthorpe wrote: > On Thu, Apr 02, 2026 at 02:25:54PM +0300, Tudor Ambarus wrote: > >> I can probably track whether all instances are ready, and defer if any >> is not ready, but then I'll force the iommu clients to use the sketchy >> replay path, which seems like a bad idea, according to Robin's feedback. > > I didn't think that was sketchy, it is part of the boot ordering > system to ensure that the iommu driver(s) is probed before the client > devices. > > Half operating a device is definately going to get things into trouble > with broken/incomplete domain attachments at least. The Exynos driver itself is actually fine, and doing everything right. We'll never have a "half-configured" client device in IOMMU API terms currently - only once both instances are registered such that both of_xlate calls can succeed (one for each specifier in the client device's "iommus" property) will we proceed to calling probe_device, which will then work as normal. The issue here is purely in the race-avoidance scheme within of_iommu_configure() itself, which hasn't accounted for the fact that when it's looping over multiple specifiers, they don't necessarily all target the same IOMMU node. And it's only during a window where the instance targeted by the first specifier happens to be registered already, and the second is currently in the middle of registering. Thanks, Robin. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices 2026-04-02 14:20 ` Robin Murphy @ 2026-04-14 13:04 ` Tudor Ambarus 0 siblings, 0 replies; 9+ messages in thread From: Tudor Ambarus @ 2026-04-14 13:04 UTC (permalink / raw) To: Robin Murphy, Jason Gunthorpe Cc: Joerg Roedel, Will Deacon, Lorenzo Pieralisi, Rob Herring (Arm), Joerg Roedel, Bjorn Helgaas, iommu, linux-kernel, peter.griffin, andre.draszik, willmcvicker, jyescas, kernel-team, stable [-- Attachment #1: Type: text/plain, Size: 8309 bytes --] Hi, Robin, Thanks for the educative answers. On 4/2/26 5:20 PM, Robin Murphy wrote: > On 2026-04-02 12:59 pm, Jason Gunthorpe wrote: >> On Thu, Apr 02, 2026 at 02:25:54PM +0300, Tudor Ambarus wrote: >> >>> I can probably track whether all instances are ready, and defer if any >>> is not ready, but then I'll force the iommu clients to use the sketchy >>> replay path, which seems like a bad idea, according to Robin's feedback. >> >> I didn't think that was sketchy, it is part of the boot ordering >> system to ensure that the iommu driver(s) is probed before the client >> devices. >> >> Half operating a device is definately going to get things into trouble >> with broken/incomplete domain attachments at least. > > The Exynos driver itself is actually fine, and doing everything right. We'll never have a "half-configured" client device in IOMMU API terms currently - only once both instances are registered such that both of_xlate calls can succeed (one for each specifier in the client device's "iommus" property) will we proceed to calling probe_device, which will then work as normal. I assume this is true because of core_initcall(exynos_iommu_init); Or is it something else that guarantees that both the IOMMU instances are registered at exynos_iommu_of_xlate() time? The instance's drvdata is set before the instance registers, and the rest of the code in exynos_iommu_of_xlate constructs the client's list of iommu instances. > > The issue here is purely in the race-avoidance scheme within of_iommu_configure() itself, which hasn't accounted for the fact that when it's looping over multiple specifiers, they don't necessarily all target the same IOMMU node. And it's only during a window where the instance targeted by the first specifier happens to be registered already, and the second is currently in the middle of registering. Allow me to give an example. I added some extra prints to catch things. I attached a diff if you want to see exactly where. I stripped a bit the log so that it's easier to follow. I added comments. 19471000.drmdecon describe 19840000.sysmmu and 19c40000.sysmmu in its "iommus" dt property. [ 2.308076][ T1] samsung-sysmmu-v9 19840000.sysmmu: tudor: samsung_sysmmu_device_probe: after iommu_device_register ^ first instance registered [ 2.310095][ T86] platform 19471000.drmdecon: tudor: really_probe: enter for driver exynos-decon ^ client in really_probe() [ 2.311634][ T1] samsung-sysmmu-v9 19c40000.sysmmu: tudor: samsung_sysmmu_device_probe: before iommu_device_register ^ second instance right before calling iommu_device_register() [ 2.312358][ T86] exynos-decon 19471000.drmdecon: tudor: really_probe: after setting dev->driver exynos-decon ^ client set dev->driver! [ 2.348787][ T86] exynos-decon 19471000.drmdecon: tudor: of_iommu_configure: of_iommu_configure_device err = 0, dev_iommu_present = 0 ^ of_xlate succeeded for both instances We're in the client's really_probe() path. Shall of_xlate fail if the iommu instance is not registered yet? [ 2.349277][ T86] exynos-decon 19471000.drmdecon: tudor: of_iommu_configure: call iommu_probe_device new dev_iommu_present = 1, dev->iommu = 000000007deaca98 ^ initial dev->iommu was not set, we prepare to call iommu_probe_device() even if dev->iommu was set in the of_iommu_configure_device() call The thread is going to sleep now, it can't acquire iommu_probe_device_lock [ 2.358261][ T1] exynos-decon 19471000.drmdecon: tudor: iommu_init_device: bus->dma_configure not called dev->iommu = 000000007deaca98, dev->iommu_group = 0000000000000000, dev->iommu->fwspec = 00000000896797d4 dev->driver = 000000008821e7b2 ^ inside iommu_init_device(), dev->bus->dma_configure(dev) is NOT called. Then it calls ops->probe_device() and ops->device_group(), succeeding. [ 2.360561][ T1] ------------[ cut here ]------------ [ 2.360729][ T1] exynos-decon 19471000.drmdecon: late IOMMU probe at driver bind, something fishy here! [ 2.361153][ T1] WARNING: drivers/iommu/iommu.c:650 at __iommu_probe_device+0x3a0/0x608, CPU#7: init/1 [ 2.369032][ T1] CPU: 7 UID: 0 PID: 1 Comm: init Tainted: G S O 6.19.0-mainline-g2887d9327e73-dirty #1 PREEMPT 12ccb00dc0e1a023f9ae52bf43dc85a3c6f0cd0b [ 2.369592][ T1] Tainted: [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE [ 2.369848][ T1] Hardware name: ZUMA PRO KOMODO EVT 1.0 Broadcom GNSS board based on ZUMA PRO (DT) [ 2.370205][ T1] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 2.370499][ T1] pc : __iommu_probe_device+0x3a0/0x608 [ 2.370681][ T1] lr : __iommu_probe_device+0x3a0/0x608 [ 2.370915][ T1] sp : ffffc0008005b4c0 [ 2.371043][ T1] x29: ffffc0008005b4c0 x28: ffff80001d791290 x27: 0000000000000000 [ 2.371372][ T1] x26: 000000000000001d x25: ffffc0008005b598 x24: ffffe42571f2dd08 [ 2.371674][ T1] x23: ffff80001e00e490 x22: ffff800005744e00 x21: ffffe4256fd687e0 [ 2.371976][ T1] x20: ffffc0008005b598 x19: ffff800005b79810 x18: ffffe425731e13c0 [ 2.372277][ T1] x17: 000000008c623181 x16: 000000008c623181 x15: 0000000000000000 [ 2.372553][ T1] x14: 0000000000000003 x13: ffff800a7eef8000 x12: 0000000000000003 [ 2.372881][ T1] x11: 0000000000000003 x10: 00000000ffff7fff x9 : 90cf001dbb0ee800 [ 2.373184][ T1] x8 : 90cf001dbb0ee800 x7 : ffffe42571d610b4 x6 : 0000000000000000 [ 2.373486][ T1] x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffffc0008005b210 [ 2.373788][ T1] x2 : 0000000000000dba x1 : 0000000000000000 x0 : 0000000000000056 [ 2.374089][ T1] Call trace: [ 2.374185][ T1] __iommu_probe_device+0x3a0/0x608 (P) [ 2.374419][ T1] probe_iommu_group+0x3c/0x68 [ 2.374570][ T1] bus_for_each_dev+0x104/0x160 [ 2.374752][ T1] iommu_device_register+0xe0/0x274 [ 2.374947][ T1] samsung_sysmmu_device_probe+0x720/0x938 [samsung_iommu_v9 756cd85ee7308d7c98df18fc7d1a25a4db1266e9] [ 2.375392][ T1] platform_probe+0x74/0xb8 [ 2.375533][ T1] really_probe+0x1a4/0x50c [ 2.375702][ T1] __driver_probe_device+0x98/0xd4 [ 2.375892][ T1] driver_probe_device+0x3c/0x220 [ 2.376080][ T1] __driver_attach+0x118/0x1f8 [ 2.376258][ T1] bus_for_each_dev+0x104/0x160 [ 2.376440][ T1] driver_attach+0x24/0x34 [ 2.376603][ T1] bus_add_driver+0x140/0x2ac [ 2.376779][ T1] driver_register+0x68/0x104 [ 2.376953][ T1] __platform_driver_register+0x20/0x30 [ 2.377186][ T1] init_module+0x20/0x3fd8 [samsung_iommu_v9 756cd85ee7308d7c98df18fc7d1a25a4db1266e9] [ 2.377524][ T1] do_one_initcall+0x100/0x3ac [ 2.377702][ T1] do_init_module+0x58/0x264 [ 2.377872][ T1] load_module+0x1244/0x1290 [ 2.378047][ T1] __arm64_sys_finit_module+0x248/0x334 [ 2.378280][ T1] invoke_syscall+0x58/0xe4 [ 2.378423][ T1] el0_svc_common+0x8c/0xd8 [ 2.378590][ T1] do_el0_svc+0x1c/0x28 [ 2.378745][ T1] el0_svc+0x54/0x1c4 [ 2.378892][ T1] el0t_64_sync_handler+0x84/0x12c [ 2.379084][ T1] el0t_64_sync+0x1c4/0x1c8 [ 2.379252][ T1] irq event stamp: 822284 [ 2.379413][ T1] hardirqs last enabled at (822283): [<ffffe42570bc7dc0>] __console_unlock+0x64/0xbc [ 2.379800][ T1] hardirqs last disabled at (822284): [<ffffe42571d52748>] el1_brk64+0x20/0x54 [ 2.380139][ T1] softirqs last enabled at (822092): [<ffffe42570b02efc>] handle_softirqs+0x484/0x504 [ 2.380505][ T1] softirqs last disabled at (822085): [<ffffe42570a10518>] __do_softirq+0x14/0x20 [ 2.380854][ T1] ---[ end trace 0000000000000000 ]--- [ 2.381058][ T1] exynos-decon 19471000.drmdecon: Adding to iommu group 1 [ 2.382126][ T86] exynos-decon 19471000.drmdecon: tudor: __iommu_probe_device: enter [ 2.382420][ T86] exynos-decon 19471000.drmdecon: tudor: of_iommu_configure: after iommu_probe_device err = 0 [ 2.382734][ T86] exynos-decon 19471000.drmdecon: tudor: iommu_device_use_default_domain: we hit !group->default_domain ^ returns -EPROBE_DEFER [ 2.383125][ T86] exynos-decon 19471000.drmdecon: tudor: really_probe: after dma_configure exynos-decon ret = -517 [ 2.383445][ T86] exynos-decon 19471000.drmdecon: tudor: really_probe: clear dev->driver exynos-decon [ [-- Attachment #2: diff --] [-- Type: text/plain, Size: 6417 bytes --] diff --git a/drivers/base/dd.c b/drivers/base/dd.c index bea8da5f8a3a..efa1634214e6 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -609,6 +609,9 @@ static int really_probe(struct device *dev, const struct device_driver *drv) bool test_remove = IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE) && !drv->suppress_bind_attrs; int ret, link_ret; + bool is_decon = (dev_name(dev) && strstr(dev_name(dev), "drmdecon")); + + dev_err(dev, "tudor: %s enter for driver %s\n", __func__, drv->name); if (defer_all_probes) { /* @@ -616,11 +619,14 @@ static int really_probe(struct device *dev, const struct device_driver *drv) * device_block_probing() which, in turn, will call * wait_for_device_probe() right after that to avoid any races. */ - dev_dbg(dev, "Driver %s force probe deferral\n", drv->name); + dev_err(dev, "Driver %s force probe deferral\n", drv->name); return -EPROBE_DEFER; } link_ret = device_links_check_suppliers(dev); + if (is_decon) + dev_err(dev, "tudor: %s device_links_check_suppliers() ret %d\n", + __func__, link_ret); if (link_ret == -EPROBE_DEFER) return link_ret; @@ -633,7 +639,13 @@ static int really_probe(struct device *dev, const struct device_driver *drv) } re_probe: + if (is_decon) + dev_err(dev, "tudor: %s before setting dev->driver %s\n", + __func__, drv->name); device_set_driver(dev, drv); + if (is_decon) + dev_err(dev, "tudor: %s after setting dev->driver %s\n", + __func__, drv->name); /* If using pinctrl, bind pins now before probing */ ret = pinctrl_bind_pins(dev); @@ -641,7 +653,13 @@ static int really_probe(struct device *dev, const struct device_driver *drv) goto pinctrl_bind_failed; if (dev->bus->dma_configure) { + if (is_decon) + dev_err(dev, "tudor: %s before dma_configure %s\n", + __func__, drv->name); ret = dev->bus->dma_configure(dev); + if (is_decon) + dev_err(dev, "tudor: %s after dma_configure %s ret = %d\n", + __func__, drv->name, ret); if (ret) goto pinctrl_bind_failed; } @@ -723,6 +741,10 @@ static int really_probe(struct device *dev, const struct device_driver *drv) if (dev->bus && dev->bus->dma_cleanup) dev->bus->dma_cleanup(dev); pinctrl_bind_failed: + + if (is_decon) + dev_err(dev, "tudor: %s clear dev->driver %s\n", + __func__, drv->name); device_links_no_driver(dev); device_unbind_cleanup(dev); done: diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 4926a43118e6..35042965b0f7 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -450,6 +450,11 @@ static int iommu_init_device(struct device *dev) struct iommu_device *iommu_dev; struct iommu_group *group; int ret; + bool is_decon = (dev_name(dev) && strstr(dev_name(dev), "drmdecon")); + + if (is_decon) + dev_err(dev, "tudor: %s, dev->iommu = %p, dev->iommu_group = %p\n", + __func__, dev->iommu, dev->iommu_group); if (!dev_iommu_get(dev)) return -ENOMEM; @@ -464,10 +469,19 @@ static int iommu_init_device(struct device *dev) mutex_unlock(&iommu_probe_device_lock); dev->bus->dma_configure(dev); mutex_lock(&iommu_probe_device_lock); + if (is_decon) + dev_err(dev, "tudor: %s, after bus->dma_configure(): dev->iommu = %p, dev->iommu_group = %p\n", + __func__, dev->iommu, dev->iommu_group); /* If another instance finished the job for us, skip it */ if (!dev->iommu || dev->iommu_group) return -ENODEV; + } else { + if (is_decon) + dev_err(dev, "tudor: %s, bus->dma_configure not called dev->iommu = %p, dev->iommu_group = %p, dev->iommu->fwspec = %p dev->driver = %p\n", + __func__, dev->iommu, dev->iommu_group, dev->iommu->fwspec, dev->driver); } + + /* * At this point, relevant devices either now have a fwspec which will * match ops registered with a non-NULL fwnode, or we can reasonably @@ -608,7 +622,10 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list struct iommu_group *group; struct group_device *gdev; int ret; + bool is_decon = (dev_name(dev) && strstr(dev_name(dev), "drmdecon")); + if (is_decon) + dev_err(dev, "tudor: %s enter\n", __func__); /* * Serialise to avoid races between IOMMU drivers registering in * parallel and/or the "replay" calls from ACPI/OF code via client @@ -3241,6 +3258,7 @@ int iommu_device_use_default_domain(struct device *dev) /* Caller is the driver core during the pre-probe path */ struct iommu_group *group = dev->iommu_group; int ret = 0; + bool is_decon = (dev_name(dev) && strstr(dev_name(dev), "drmdecon")); if (!group) return 0; @@ -3248,6 +3266,9 @@ int iommu_device_use_default_domain(struct device *dev) mutex_lock(&group->mutex); /* We may race against bus_iommu_probe() finalising groups here */ if (!group->default_domain) { + if (is_decon) + dev_err(dev, "tudor: %s, we hit !group->default_domain\n", + __func__); ret = -EPROBE_DEFER; goto unlock_out; } diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c index 6b989a62def2..af89f1c874dc 100644 --- a/drivers/iommu/of_iommu.c +++ b/drivers/iommu/of_iommu.c @@ -118,6 +118,7 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np, { bool dev_iommu_present; int err; + bool is_decon = (dev_name(dev) && strstr(dev_name(dev), "drmdecon")); if (!master_np) return -ENODEV; @@ -155,13 +156,25 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np, dev_iommu_free(dev); mutex_unlock(&iommu_probe_device_lock); + if (is_decon) + dev_err(dev, "tudor: %s: of_iommu_configure_device err = %d, dev_iommu_present = %d\n", + __func__, err, dev_iommu_present); /* * If we're not on the iommu_probe_device() path (as indicated by the * initial dev->iommu) then try to simulate it. This should no longer * happen unless of_dma_configure() is being misused outside bus code. */ - if (!err && dev->bus && !dev_iommu_present) + if (!err && dev->bus && !dev_iommu_present) { + if (is_decon) { + dev_iommu_present = dev->iommu; + dev_err(dev, "tudor: %s call iommu_probe_device new dev_iommu_present = %d, dev->iommu = %p\n", + __func__, dev_iommu_present, dev->iommu); + } err = iommu_probe_device(dev); + if (is_decon) + dev_err(dev, "tudor: %s after iommu_probe_device err = %d\n", + __func__, err); + } if (err && err != -EPROBE_DEFER) dev_dbg(dev, "Adding to IOMMU failed: %d\n", err); ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices 2026-03-23 13:09 [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices Tudor Ambarus 2026-03-23 13:54 ` Jason Gunthorpe @ 2026-03-24 11:40 ` Robin Murphy 1 sibling, 0 replies; 9+ messages in thread From: Robin Murphy @ 2026-03-24 11:40 UTC (permalink / raw) To: Tudor Ambarus, Joerg Roedel, Will Deacon, Lorenzo Pieralisi, Jason Gunthorpe, Rob Herring (Arm) Cc: Joerg Roedel, Bjorn Helgaas, iommu, linux-kernel, peter.griffin, andre.draszik, willmcvicker, jyescas, kernel-team, stable On 2026-03-23 1:09 pm, Tudor Ambarus wrote: > Commit da33e87bd2bf ("iommu: Handle yet another race around > registration") introduced a readiness check in `iommu_fwspec_init()` to > prevent client drivers from configuring their IOMMUs before > `bus_iommu_probe()` has completed. > > To optimize the replay path, the readiness check was conditionally > gated behind `!dev->iommu`: > if (!dev->iommu && !READ_ONCE(iommu->ready)) > return -EPROBE_DEFER; > > However, this assumption breaks down for devices that map to multiple > IOMMU instances. During the initialization loop over multiple IOMMUs in > `of_iommu_configure_device()`, the first IOMMU successfully allocates > `dev->iommu`. When `iommu_fwspec_init()` is called for the second > IOMMU, `!dev->iommu` evaluates to false, short-circuiting the logic and > entirely bypassing the `iommu->ready` check. > > If the second IOMMU is still executing its `bus_iommu_probe()` > concurrently, this allows the client driver to proceed prematurely, > resulting in a late IOMMU probe warning: > dev: late IOMMU probe at driver bind, something fishy here! > WARNING: drivers/iommu/iommu.c:645 at __iommu_probe_device > > Fix this by making the `iommu->ready` check unconditional, ensuring > that a device will defer its probe until *all* of its required IOMMUs > are fully registered and ready. ...which is obviously wrong, since the whole point is that we *do* want of_iommu_xlate() to succeed in the bus_iommu_probe() path before the IOMMU driver has finished registering, because that's how we get probe to actually happen correctly in the expected order. This change would effectively undo the whole thing, except leaving ACPI systems without IOMMU functionality at all - it'll only be sort-of-working for you because DT still has the sketchy iommu_probe_device() replay in the driver bind path. Honestly we just need to get rid of that replay call, which is the root of almost all of the problems, but I don't know what to do about all the remaining of_dma_configure() abusers that are relying on it... :( Thanks, Robin. > Cc: stable@vger.kernel.org > Fixes: da33e87bd2bf ("iommu: Handle yet another race around registration") > Fixes: bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path") > Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> > --- > The warning was observed using an Android 6.19 tree, using downstream > drivers (exynos-decon and samsung-sysmmu-v9). > --- > drivers/iommu/iommu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 78756c3f3c40..e61927b4d41f 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -3042,7 +3042,7 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode) > > if (!iommu) > return driver_deferred_probe_check_state(dev); > - if (!dev->iommu && !READ_ONCE(iommu->ready)) > + if (!READ_ONCE(iommu->ready)) > return -EPROBE_DEFER; > > if (fwspec) > > --- > base-commit: ca3bbc9287400c1274d87ee57a16e3126ba2969a > change-id: 20260320-iommu-ready-check-4976863957c2 > > Best regards, ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-04-14 13:04 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-03-23 13:09 [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices Tudor Ambarus 2026-03-23 13:54 ` Jason Gunthorpe 2026-03-23 16:46 ` Tudor Ambarus 2026-03-23 17:31 ` Jason Gunthorpe 2026-04-02 11:25 ` Tudor Ambarus 2026-04-02 11:59 ` Jason Gunthorpe 2026-04-02 14:20 ` Robin Murphy 2026-04-14 13:04 ` Tudor Ambarus 2026-03-24 11:40 ` Robin Murphy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox