From: Robin Murphy <robin.murphy@arm.com>
To: Johan Hovold <johan@kernel.org>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>,
Hanjun Guo <guohanjun@huawei.com>,
Sudeep Holla <sudeep.holla@arm.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Len Brown <lenb@kernel.org>, Russell King <linux@armlinux.org.uk>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Danilo Krummrich <dakr@kernel.org>,
Stuart Yoder <stuyoder@gmail.com>,
Laurentiu Tudor <laurentiu.tudor@nxp.com>,
Nipun Gupta <nipun.gupta@amd.com>,
Nikhil Agarwal <nikhil.agarwal@amd.com>,
Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
Rob Herring <robh@kernel.org>,
Saravana Kannan <saravanak@google.com>,
Bjorn Helgaas <bhelgaas@google.com>,
linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
devicetree@vger.kernel.org, linux-pci@vger.kernel.org,
Charan Teja Kalla <quic_charante@quicinc.com>
Subject: Re: [PATCH v2 4/4] iommu: Get DT/ACPI parsing into the proper probe path
Date: Mon, 14 Apr 2025 16:37:59 +0100
Message-ID: <50a06ba8-0a99-40d2-8601-778ebf451f6a@arm.com>
In-Reply-To: <Z_jMiC1uj_MJpKVj@hovoldconsulting.com>

On 2025-04-11 9:02 am, Johan Hovold wrote:
> Hi Robin,
>
> On Fri, Feb 28, 2025 at 03:46:33PM +0000, Robin Murphy wrote:
>> In hindsight, there were some crucial subtleties overlooked when moving
>> {of,acpi}_dma_configure() to driver probe time to allow waiting for
>> IOMMU drivers with -EPROBE_DEFER, and these have become an
>> ever-increasing source of problems. The IOMMU API has some fundamental
>> assumptions that iommu_probe_device() is called for every device added
>> to the system, in the order in which they are added. Calling it in a
>> random order or not at all dependent on driver binding leads to
>> malformed groups, a potential lack of isolation for devices with no
>> driver, and all manner of unexpected concurrency and race conditions.
>> We've attempted to mitigate the latter with point-fix bodges like
>> iommu_probe_device_lock, but it's a losing battle and the time has come
>> to bite the bullet and address the true source of the problem instead.
>
>> @@ -426,6 +438,12 @@ static int iommu_init_device(struct device *dev)
>>  		ret = -ENODEV;
>>  		goto err_free;
>>  	}
>> +	/*
>> +	 * And if we do now see any replay calls, they would indicate someone
>> +	 * misusing the dma_configure path outside bus code.
>> +	 */
>> +	if (dev->driver)
>> +		dev_WARN(dev, "late IOMMU probe at driver bind, something fishy here!\n");
>>
>>  	if (!try_module_get(ops->owner)) {
>>  		ret = -EINVAL;
>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>> index e10a68b5ffde..6b989a62def2 100644
>> --- a/drivers/iommu/of_iommu.c
>> +++ b/drivers/iommu/of_iommu.c
>> @@ -155,7 +155,12 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np,
>>  		dev_iommu_free(dev);
>>  	mutex_unlock(&iommu_probe_device_lock);
>>
>> -	if (!err && dev->bus)
>> +	/*
>> +	 * If we're not on the iommu_probe_device() path (as indicated by the
>> +	 * initial dev->iommu) then try to simulate it. This should no longer
>> +	 * happen unless of_dma_configure() is being misused outside bus code.
>> +	 */
>
> This assumption does not hold as there is nothing preventing iommu
> driver probe from racing with a client driver probe.

Not sure I follow - *this* assumption is that if we arrived here with
dev->iommu already allocated then __iommu_probe_device() is already in
progress for this device, either in the current callchain or on another
thread, and so we can (and should) skip calling into it again. There's
no ambiguity about that.

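The check described above can be sketched in userspace C. This is only a model of the quoted hunk, not kernel code: `struct model_device`, `model_iommu_probe_device()` and `model_of_iommu_configure()` are hypothetical stand-ins, and the `err` argument stands for whatever the firmware parsing step returned.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; not the kernel definition. */
struct model_device {
	void *iommu;       /* models dev->iommu: non-NULL once a probe has begun */
	bool  has_bus;     /* models dev->bus != NULL */
	int   probe_calls; /* counts simulated iommu_probe_device() calls */
};

/* Hypothetical stand-in for iommu_probe_device(); only records the call. */
static int model_iommu_probe_device(struct model_device *dev)
{
	dev->probe_calls++;
	return 0;
}

/*
 * Models the tail of of_iommu_configure() from the quoted hunk: sample
 * dev->iommu on entry, and replay the probe only if it was not yet set.
 */
static int model_of_iommu_configure(struct model_device *dev, int err)
{
	bool dev_iommu_present = dev->iommu != NULL;

	if (!err && dev->has_bus && !dev_iommu_present)
		err = model_iommu_probe_device(dev);

	return err;
}
```

With `iommu` initially NULL the model replays the probe once; with `iommu` already set it skips the call, which is the behaviour the reply describes.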
>> +	if (!err && dev->bus && !dev_iommu_present)
>>  		err = iommu_probe_device(dev);
>>
>>  	if (err && err != -EPROBE_DEFER)
>
> I hit the (now moved) dev_WARN() on the ThinkPad T14s where the GPU SMMU
> is probed late due to a clock dependency and can end up probing in
> parallel with the GPU driver.

And what *should* happen is that the GPU driver probe waits for the
IOMMU driver probe to finish. Do you have fw_devlink enabled?

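The intended ordering can be illustrated with a small userspace model. It is a sketch under assumptions, not the driver core: `struct model_supplier` and `model_client_probe()` are hypothetical names, and the model covers only the "defer until the supplier is ready" behaviour, not the window where the IOMMU driver is still registering.

```c
#include <stdbool.h>

/* Same numeric value as the kernel's EPROBE_DEFER; renamed for this model. */
#define MODEL_EPROBE_DEFER 517

/* Hypothetical stand-in for an IOMMU instance the client depends on. */
struct model_supplier {
	bool registered; /* true once the supplier's probe has fully finished */
};

/*
 * Models a client driver probe: it asks to be retried later while its
 * supplier is not ready, and succeeds once the supplier has registered.
 */
static int model_client_probe(const struct model_supplier *iommu)
{
	if (!iommu->registered)
		return -MODEL_EPROBE_DEFER;
	return 0;
}
```

In this model a client probed before the supplier is registered gets `-MODEL_EPROBE_DEFER`, and a later retry after registration returns 0.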
> [ 3.805282] arm-smmu 3da0000.iommu: probing hardware configuration...
> [ 3.806007] arm-smmu 3da0000.iommu: SMMUv2 with:
> [ 3.806843] arm-smmu 3da0000.iommu: stage 1 translation
> [ 3.807562] arm-smmu 3da0000.iommu: coherent table walk
> [ 3.808253] arm-smmu 3da0000.iommu: stream matching with 24 register groups
> [ 3.808957] arm-smmu 3da0000.iommu: 22 context banks (0 stage-2 only)
> [ 3.809651] arm-smmu 3da0000.iommu: Supported page sizes: 0x61311000
> [ 3.810339] arm-smmu 3da0000.iommu: Stage-1: 48-bit VA -> 40-bit IPA
> [ 3.811130] arm-smmu 3da0000.iommu: preserved 0 boot mappings
>
> [ 3.829042] platform 3d6a000.gmu: Adding to iommu group 8
>
> [ 3.992050] ------------[ cut here ]------------
> [ 3.993045] adreno 3d00000.gpu: late IOMMU probe at driver bind, something fishy here!
> [ 3.994058] WARNING: CPU: 9 PID: 343 at drivers/iommu/iommu.c:579 __iommu_probe_device+0x2b0/0x4ac
>
> [ 4.003272] CPU: 9 UID: 0 PID: 343 Comm: kworker/u50:2 Not tainted 6.15.0-rc1 #109 PREEMPT
> [ 4.003276] Hardware name: LENOVO 21N2ZC5PUS/21N2ZC5PUS, BIOS N42ET83W (2.13 ) 10/04/2024
>
> [ 4.025943] Call trace:
> [ 4.025945] __iommu_probe_device+0x2b0/0x4ac (P)
> [ 4.030453] iommu_probe_device+0x38/0x7c
> [ 4.030455] of_iommu_configure+0x188/0x26c
> [ 4.030457] of_dma_configure_id+0xcc/0x300
> [ 4.030460] platform_dma_configure+0x74/0xac
> [ 4.030462] really_probe+0x74/0x38c

Indeed this is exactly what is *not* supposed to be happening - does
this patch help at all?

https://lore.kernel.org/linux-iommu/09d901ad11b3a410fbb6e27f7d04ad4609c3fe4a.1741706365.git.robin.murphy@arm.com/

If not then I guess I do need to do something to explicitly distinguish
the "iommu_device_register() is still running" state after all...

Thanks,
Robin.

> [ 4.030464] __driver_probe_device+0x7c/0x160
> [ 4.030465] driver_probe_device+0x40/0x110
> [ 4.030467] __device_attach_driver+0xbc/0x158
> [ 4.030468] bus_for_each_drv+0x84/0xe0
> [ 4.030470] __device_attach+0xa8/0x1d4
> [ 4.030472] device_initial_probe+0x14/0x20
> [ 4.030473] bus_probe_device+0xb0/0xb4
> [ 4.030476] deferred_probe_work_func+0xa0/0xf4
>
> [ 4.030501] ---[ end trace 0000000000000000 ]---
> [ 4.031269] adreno 3d00000.gpu: Adding to iommu group 9
>
> Johan