public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
From: Robin Murphy <robin.murphy@arm.com>
To: Tudor Ambarus <tudor.ambarus@linaro.org>,
	Lorenzo Pieralisi <lpieralisi@kernel.org>,
	Hanjun Guo <guohanjun@huawei.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Len Brown <lenb@kernel.org>, Russell King <linux@armlinux.org.uk>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Danilo Krummrich <dakr@kernel.org>,
	Stuart Yoder <stuyoder@gmail.com>,
	Laurentiu Tudor <laurentiu.tudor@nxp.com>,
	Nipun Gupta <nipun.gupta@amd.com>,
	Nikhil Agarwal <nikhil.agarwal@amd.com>,
	Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Rob Herring <robh@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	devicetree@vger.kernel.org, linux-pci@vger.kernel.org,
	"Charan Teja Kalla" <quic_charante@quicinc.com>,
	"Peter Griffin" <peter.griffin@linaro.org>,
	"André Draszik" <andre.draszik@linaro.org>,
	"Juan Yescas" <jyescas@google.com>,
	kernel-team@android.com
Subject: Re: [PATCH v2 4/4] iommu: Get DT/ACPI parsing into the proper probe path
Date: Mon, 23 Mar 2026 20:49:40 +0000	[thread overview]
Message-ID: <6c3b506e-8cc4-45ad-a801-326886c694c4@arm.com> (raw)
In-Reply-To: <67b32e90-1f60-4bf5-b534-b4a901d5a796@linaro.org>

On 23/03/2026 5:18 pm, Tudor Ambarus wrote:
> Hi, Robin,
> 
> On 2/28/25 5:46 PM, Robin Murphy wrote:
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index a3b45b84f42b..1cec7074367a 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -414,9 +414,21 @@ static int iommu_init_device(struct device *dev)
>>   	if (!dev_iommu_get(dev))
>>   		return -ENOMEM;
>>   	/*
>> -	 * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
>> -	 * instances with non-NULL fwnodes, and client devices should have been
>> -	 * identified with a fwspec by this point. Otherwise, we can currently
>> +	 * For FDT-based systems and ACPI IORT/VIOT, the common firmware parsing
>> +	 * is buried in the bus dma_configure path. Properly unpicking that is
>> +	 * still a big job, so for now just invoke the whole thing. The device
>> +	 * already having a driver bound means dma_configure has already run and
>> +	 * either found no IOMMU to wait for, or we're in its replay call right
>> +	 * now, so either way there's no point calling it again.
>> +	 */
>> +	if (!dev->driver && dev->bus->dma_configure) {
>> +		mutex_unlock(&iommu_probe_device_lock);
>> +		dev->bus->dma_configure(dev);
>> +		mutex_lock(&iommu_probe_device_lock);
>> +	}
> 
> I was chasing the "something fishy" dev_WARN on a 6.19+ downstream
> android kernel and while looking at the IOMMU code I couldn't help
> myself and ask whether we shall prevent concurrent execution of
> dma_configure().
> 
> It seems to me that while the IOMMU subsystem is executing
> dma_configure(), the deferred probe workqueue can concurrently pick up
> the same device, enter really_probe(), set dev->driver, and execute
> dma_configure(). Is it worth protecting against this?

Yes, it's certainly still possible to hit a false-positive if thread A 
in iommu_device_register()->bus_iommu_probe() races against thread B 
attempting to bind, simply because thread B can set dev->driver long 
before it gets to any point where ends up serialising on 
iommu_probe_device_lock again, so thread A can observe that even while 
it is doing the IOMMU probe in the "correct" context. Other than the 
warning though, it's still functionally OK even if the "wrong" thread 
does end up finishing the probe, at least after 0c8e9c148e29 ("iommu: 
Avoid introducing more races").

> I can try to prove it if needed, using a downstream iommu driver (sigh).
> 
> Thanks!
> ta
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index e61927b4d41f..5f0c1a8064b5 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -461,9 +461,19 @@ static int iommu_init_device(struct device *dev)
>           * already having a driver bound means dma_configure has already run and
>           * found no IOMMU to wait for, so there's no point calling it again.
>           */
> -       if (!dev->iommu->fwspec && !dev->driver && dev->bus->dma_configure) {
> +       if (!dev->iommu->fwspec && !READ_ONCE(dev->driver) &&
> +           dev->bus->dma_configure) {
>                  mutex_unlock(&iommu_probe_device_lock);
> -               dev->bus->dma_configure(dev);
> +
> +               /*
> +                * Serialize with really_probe(). Recheck dev->driver in case a
> +                * driver bound while we were waiting for the lock.
> +                */
> +               device_lock(dev);
> +               if (!dev->driver)
> +                       dev->bus->dma_configure(dev);
> +               device_unlock(dev);

Much as I can't wait to get rid of iommu_probe_device_lock, the main 
reason we still can't rely on device_lock() at the moment is not 
actually the remaining sketchy replay-dependers per the comment in 
__iommu_probe_device(), but more fundamentally that for most IOMMU 
drivers this will deadlock in that same 
iommu_device_register()->bus_iommu_probe() path, when the bus walk 
happens to stumble across the IOMMU device itself, which of course is 
already locked as it's still in the middle of its own driver bind. I 
couldn't see an easy, clean and reliable way to get around that, so that 
can got kicked down the road in order to get the "call of_xlate in the 
right order and make iommu_device_register() actually work" basics 
landed (and start shaking out all these other problems...)

Thanks,
Robin.

> +
>                  mutex_lock(&iommu_probe_device_lock);
>                  /* If another instance finished the job for us, skip it */
>                  if (!dev->iommu || dev->iommu_group)
> (END)



  reply	other threads:[~2026-03-23 20:57 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-28 15:46 [PATCH v2 0/4] iommu: Fix the longstanding probe issues Robin Murphy
2025-02-28 15:46 ` [PATCH v2 1/4] iommu: Handle race with default domain setup Robin Murphy
2025-02-28 15:46 ` [PATCH v2 2/4] iommu: Resolve ops in iommu_init_device() Robin Murphy
2025-03-05 17:55   ` Jason Gunthorpe
2025-02-28 15:46 ` [PATCH v2 3/4] iommu: Keep dev->iommu state consistent Robin Murphy
2025-03-05 18:14   ` Jason Gunthorpe
2025-02-28 15:46 ` [PATCH v2 4/4] iommu: Get DT/ACPI parsing into the proper probe path Robin Murphy
2025-03-05 18:28   ` Jason Gunthorpe
2025-03-07 14:24   ` Lorenzo Pieralisi
2025-03-07 20:20     ` Robin Murphy
2025-03-11 18:42   ` Joerg Roedel
2025-03-12  7:07     ` Baolu Lu
2025-03-12 10:10     ` Robin Murphy
2025-03-12 14:34       ` Baolu Lu
2025-03-12 15:21       ` Joerg Roedel
2025-03-13  9:56   ` Marek Szyprowski
2025-03-13 11:01     ` Robin Murphy
2025-03-13 12:23       ` Marek Szyprowski
2025-03-13 13:06         ` Robin Murphy
2025-03-13 14:12           ` Robin Murphy
2025-03-17  7:37             ` Marek Szyprowski
2025-03-17 18:22               ` Robin Murphy
2025-03-21 12:15                 ` Marek Szyprowski
2025-03-21 16:48                   ` Robin Murphy
2025-04-01 20:34                     ` Marek Szyprowski
2025-03-13 16:30       ` Anders Roxell
2025-03-18 16:37   ` Geert Uytterhoeven
2025-03-18 17:24     ` Robin Murphy
2025-03-25 15:32       ` Geert Uytterhoeven
2025-03-27  9:47   ` Chen-Yu Tsai
2025-03-27 11:00     ` Louis-Alexis Eyraud
2025-04-11  8:02   ` Johan Hovold
2025-04-14 15:37     ` Robin Murphy
2025-04-15 15:08       ` Johan Hovold
2025-04-24 13:58         ` Robin Murphy
2025-04-21 21:19   ` William McVicker
2025-04-22 19:00     ` Jason Gunthorpe
2025-04-22 21:55       ` William McVicker
2025-04-22 23:41         ` Jason Gunthorpe
2025-04-23 17:31           ` William McVicker
2025-04-23 18:18             ` Jason Gunthorpe
2025-08-11 16:44   ` Eric Auger
2025-08-11 17:01     ` Bjorn Helgaas
2026-03-23 17:18   ` Tudor Ambarus
2026-03-23 20:49     ` Robin Murphy [this message]
2026-04-01 11:49       ` Tudor Ambarus
2025-03-10  8:29 ` [PATCH v2 0/4] iommu: Fix the longstanding probe issues Joerg Roedel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6c3b506e-8cc4-45ad-a801-326886c694c4@arm.com \
    --to=robin.murphy@arm.com \
    --cc=andre.draszik@linaro.org \
    --cc=bhelgaas@google.com \
    --cc=dakr@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=guohanjun@huawei.com \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=jyescas@google.com \
    --cc=kernel-team@android.com \
    --cc=laurentiu.tudor@nxp.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=lpieralisi@kernel.org \
    --cc=nikhil.agarwal@amd.com \
    --cc=nipun.gupta@amd.com \
    --cc=peter.griffin@linaro.org \
    --cc=quic_charante@quicinc.com \
    --cc=rafael@kernel.org \
    --cc=robh@kernel.org \
    --cc=stuyoder@gmail.com \
    --cc=sudeep.holla@arm.com \
    --cc=tudor.ambarus@linaro.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox