From: Robin Murphy <robin.murphy@arm.com>
To: Johan Hovold <johan@kernel.org>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>,
	Hanjun Guo <guohanjun@huawei.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Len Brown <lenb@kernel.org>, Russell King <linux@armlinux.org.uk>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Danilo Krummrich <dakr@kernel.org>,
	Stuart Yoder <stuyoder@gmail.com>,
	Laurentiu Tudor <laurentiu.tudor@nxp.com>,
	Nipun Gupta <nipun.gupta@amd.com>,
	Nikhil Agarwal <nikhil.agarwal@amd.com>,
	Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Rob Herring <robh@kernel.org>,
	Saravana Kannan <saravanak@google.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	devicetree@vger.kernel.org, linux-pci@vger.kernel.org,
	Charan Teja Kalla <quic_charante@quicinc.com>
Subject: Re: [PATCH v2 4/4] iommu: Get DT/ACPI parsing into the proper probe path
Date: Thu, 24 Apr 2025 14:58:32 +0100
Message-ID: <f55869ea-0a96-4cef-b394-7c6bf0359617@arm.com>
In-Reply-To: <Z_52heGno2Y5M6uF@hovoldconsulting.com>

On 15/04/2025 4:08 pm, Johan Hovold wrote:
> On Mon, Apr 14, 2025 at 04:37:59PM +0100, Robin Murphy wrote:
>> On 2025-04-11 9:02 am, Johan Hovold wrote:
>>> On Fri, Feb 28, 2025 at 03:46:33PM +0000, Robin Murphy wrote:
> 
>>>> @@ -155,7 +155,12 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np,
>>>>    		dev_iommu_free(dev);
>>>>    	mutex_unlock(&iommu_probe_device_lock);
>>>>    
>>>> -	if (!err && dev->bus)
>>>> +	/*
>>>> +	 * If we're not on the iommu_probe_device() path (as indicated by the
>>>> +	 * initial dev->iommu) then try to simulate it. This should no longer
>>>> +	 * happen unless of_dma_configure() is being misused outside bus code.
>>>> +	 */
>>>
>>> This assumption does not hold as there is nothing preventing iommu
>>> driver probe from racing with a client driver probe.
>>
>> Not sure I follow - *this* assumption is that if we arrived here with
>> dev->iommu already allocated then __iommu_probe_device() is already in
>> progress for this device, either in the current callchain or on another
>> thread, and so we can (and should) skip calling into it again. There's
>> no ambiguity about that.
> 
> I was referring to the "should no longer happen unless
> of_dma_configure() is being misused outside bus code" claim, which
> appears to be false given the splat below.

That's not an assumption so much as a statement of intent. And really 
it's the other way round anyway, as a reminder that this replay call is 
only still here (unlike in the ACPI path) because there *is* still 
plenty of sketchy usage of of_dma_configure() which I'm wary of breaking.
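
To make the intent of that check concrete, here is a small userspace model of the tail of of_iommu_configure(). It is only a sketch under stated assumptions, not the kernel code: struct model_dev and every model_* name are made up for illustration, and the firmware parsing step is reduced to a caller-supplied error value.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bits of struct device that matter here. */
struct model_dev {
	bool has_bus;        /* models dev->bus != NULL */
	bool iommu_present;  /* models dev->iommu != NULL */
	int replay_calls;    /* how many times the replay path ran */
};

/* Models iommu_probe_device(): records the call and sets up IOMMU state. */
int model_iommu_probe_device(struct model_dev *dev)
{
	dev->replay_calls++;
	dev->iommu_present = true;
	return 0;
}

/*
 * Models the end of of_iommu_configure(). parse_err stands in for the
 * result of the DT parsing that the real function does first.
 */
int model_of_iommu_configure(struct model_dev *dev, int parse_err)
{
	/* Sampled on entry: was IOMMU state already there when we arrived? */
	bool dev_iommu_present = dev->iommu_present;
	int err = parse_err;

	/*
	 * Only replay the probe if we did not arrive with IOMMU state already
	 * allocated, i.e. if no probe is already in flight for this device.
	 */
	if (!err && dev->has_bus && !dev_iommu_present)
		err = model_iommu_probe_device(dev);

	return err;
}
```

In this model a device that arrives with iommu_present already set never triggers the replay, which is the "skip calling into it again" case; a device that arrives without it gets exactly one replay call.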

>>>> +	if (!err && dev->bus && !dev_iommu_present)
>>>>    		err = iommu_probe_device(dev);
>>>>    
>>>>    	if (err && err != -EPROBE_DEFER)
>>>
>>> I hit the (now moved) dev_WARN() on the ThinkPad T14s where the GPU SMMU
>>> is probed late due to a clock dependency and can end up probing in
>>> parallel with the GPU driver.
>>
>> And what *should* happen is that the GPU driver probe waits for the
>> IOMMU driver probe to finish. Do you have fw_devlink enabled?
> 
> Yes, but you shouldn't rely on devlinks for correctness.
> 
> That said it does seem like something is not working as you think it is
> here, and indeed the iommu supplier link is not created until SMMUv2
> probe_device() (see arm_smmu_probe_device()).
> 
> So client devices can start to be probed (bus dma_configure() is called)
> before their iommu is ready also with devlinks enabled (and I do see
> this happen on every boot).

I didn't mean the explicit power management links created by the SMMU
driver itself, I meant the fwnode links created automatically by
fw_devlink_link_device() at device_add() time, which infer a
supplier-consumer relationship from the "iommus" DT property.
device_links_check_suppliers() would then defer the GPU driver probe
until the SMMU driver has probed successfully and
device_links_driver_bound() has been called for it.

Except it turns out that "iommus" is one of the optional properties 
which are only linked that way under "fw_devlink=strict", so that 
explains that, fair enough.
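
As a rough illustration of the behaviour described above, the following userspace sketch models how an optional supplier link only blocks a consumer probe in strict mode. It is an assumption-laden toy, not the driver core implementation: struct model_link, model_check_suppliers() and MODEL_EPROBE_DEFER are hypothetical names, and the "strict" argument stands in for booting with "fw_devlink=strict".

```c
#include <stdbool.h>
#include <stddef.h>

/* Same numeric value as the kernel's EPROBE_DEFER; the name is made up. */
#define MODEL_EPROBE_DEFER 517

/* Hypothetical model of one inferred supplier link of a consumer device. */
struct model_link {
	bool supplier_bound; /* supplier driver has finished probing */
	bool optional;       /* link came from an optional property, e.g. "iommus" */
};

/*
 * Returns 0 if the consumer may probe now, or -MODEL_EPROBE_DEFER if it
 * has to wait for a supplier. Optional links are only enforced when
 * strict is true.
 */
int model_check_suppliers(const struct model_link *links, size_t n, bool strict)
{
	for (size_t i = 0; i < n; i++) {
		if (links[i].optional && !strict)
			continue; /* not enforced: consumer is free to race ahead */
		if (!links[i].supplier_bound)
			return -MODEL_EPROBE_DEFER;
	}
	return 0;
}
```

With a single optional link whose supplier is not yet bound, the model lets the consumer probe in the default mode and defers it only when strict is set, which matches the GPU being allowed to start probing before its SMMU is ready.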

>>> [    3.805282] arm-smmu 3da0000.iommu: probing hardware configuration...
> 
>>> [    3.829042] platform 3d6a000.gmu: Adding to iommu group 8
>>>
>>> [    3.992050] ------------[ cut here ]------------
>>> [    3.993045] adreno 3d00000.gpu: late IOMMU probe at driver bind, something fishy here!
>>> [    3.994058] WARNING: CPU: 9 PID: 343 at drivers/iommu/iommu.c:579 __iommu_probe_device+0x2b0/0x4ac
>>>
>>> [    4.003272] CPU: 9 UID: 0 PID: 343 Comm: kworker/u50:2 Not tainted 6.15.0-rc1 #109 PREEMPT
>>> [    4.003276] Hardware name: LENOVO 21N2ZC5PUS/21N2ZC5PUS, BIOS N42ET83W (2.13 ) 10/04/2024
>>>
>>> [    4.025943] Call trace:
>>> [    4.025945]  __iommu_probe_device+0x2b0/0x4ac (P)
>>> [    4.030453]  iommu_probe_device+0x38/0x7c
>>> [    4.030455]  of_iommu_configure+0x188/0x26c
>>> [    4.030457]  of_dma_configure_id+0xcc/0x300
>>> [    4.030460]  platform_dma_configure+0x74/0xac
>>> [    4.030462]  really_probe+0x74/0x38c
>>
>> Indeed this is exactly what is *not* supposed to be happening - does
>> this patch help at all?
>>
>> https://lore.kernel.org/linux-iommu/09d901ad11b3a410fbb6e27f7d04ad4609c3fe4a.1741706365.git.robin.murphy@arm.com/
> 
> I've only seen that splat once so far so I don't have a reliable
> reproducer.
> 
> But AFAICS that patch won't help here where we appear to have iommu
> probe racing with bus dma_configure() called from really_probe() for the
> client device.

Well, tightening up __iommu_probe_device() would stand to slightly 
reduce the window in general while bus_iommu_probe() is running. However
you're right that this is a different race from the ones implicated 
there. I have now managed to provoke it on my Juno board with 
"driver_async_probe=*" (which does also require that patch for other 
reasons), and I think I've got a reasonable fix which I shall finish 
writing up and send shortly. Thanks for helping me nail this one down!

Cheers,
Robin.

