From: Jason Gunthorpe <jgg@nvidia.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
Len Brown <lenb@kernel.org>,
linux-acpi@vger.kernel.org,
"Rafael J. Wysocki" <rafael@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
Will Deacon <will@kernel.org>,
Lu Baolu <baolu.lu@linux.intel.com>,
Joerg Roedel <jroedel@suse.de>, Kevin Tian <kevin.tian@intel.com>,
Chen-Yu Tsai <wenst@chromium.org>
Subject: Re: [PATCH v2 0/4] Fix device_lock deadlock on two probe() paths
Date: Thu, 17 Aug 2023 15:33:16 -0300 [thread overview]
Message-ID: <ZN5n7GnlrTS6s5Yg@nvidia.com> (raw)
In-Reply-To: <d090f196-a5c2-b188-31bf-e42553d8d251@samsung.com>
On Thu, Aug 17, 2023 at 10:31:31AM +0200, Marek Szyprowski wrote:
> Hi Jason,
>
> On 09.08.2023 16:43, Jason Gunthorpe wrote:
> > I missed two paths where __iommu_probe_device() can be called while
> > already holding the device_lock() for the device that is to be probed.
> >
> > This causes a deadlock because __iommu_probe_device() will attempt to
> > re-acquire the lock.
> >
> > Organize things so that these two paths can re-use the existing already
> > held device_lock by marking the call chains based on if the lock is held
> > or not.
> >
> > Also the order of iommu_init_device() was not correct for
> > generic_single_device_group()
>
> I've just noticed that there is still one more issue left to fix after
> applying this patchset. On Qualcomm's RB5 board I get the following
> warning (captured on vanilla next-20230817):
> Call trace:
> iommu_probe_device_locked+0xd4/0xe4
> of_iommu_configure+0x10c/0x200
> of_dma_configure_id+0x104/0x3b8
So it open codes a call to of_dma_configure_id for some reason
This is a bizzaro thing to do, it is taking the OF handle from a dt:
node = of_parse_phandle(pdev->dev.of_node, "qcom,gmu", 0);
And the sort of co-opts the platform device it is associated with:
struct platform_device *pdev = of_find_device_by_node(node);
gmu->dev = &pdev->dev;
of_dma_configure(gmu->dev, node, true);
And hackily sets up the iommu? It needs to do this because the iommu
doesn't get setup until something probes on the device. But this code
is just stealing the platform device, it doesn't actually probe a
driver.
So this is all kinds of wrong. Attach a driver if you intend to claim
the device and use it :(
Anyhow, you can "fix" it with this:
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 5deb79924897af..8dbd75c200b91c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1556,7 +1556,9 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
gmu->dev = &pdev->dev;
+ device_lock(gmu->dev);
of_dma_configure(gmu->dev, node, true);
+ device_unlock(gmu->dev);
/* Fow now, don't do anything fancy until we get our feet under us */
gmu->idle_level = GMU_IDLE_STATE_ACTIVE;
But there is much more broken here than just that. Maybe leaving the
warning is a good thing so people can fix this properly. (eg remove
the call to of_dma_configure and use the driver core correctly)
Though looking around I see other places open coding of_dma_configure,
so I suppose this will turn into a persisting issue.
Some of these are not going to throw a warning because they are
adjusting the same device the driver is probing for. However those are
busted up since they try to attach the iommu driver to a device after
the driver core has passed iommu_device_use_default_domain(). It will
explode if the driver is removed.
Bascially.. Yikes!
drivers/bcma/main.c: of_dma_configure(&core->dev, node, false);
No idea..
drivers/dma/qcom/hidma_mgmt.c: of_dma_configure(&new_pdev->dev, child, true);
Registers a platformdevice then does of_dma_configure(), seems
wrong. of_dma_configure() should be done when a driver is probed.
drivers/gpu/drm/etnaviv/etnaviv_drv.c: of_dma_configure(&pdev->dev, first_node, true);
platform_dma_configure() chooses the wrong of_node so this overrides
the logic?
drivers/gpu/drm/msm/adreno/a6xx_gmu.c: of_dma_configure(gmu->dev, node, true);
drivers/gpu/drm/msm/adreno/a6xx_gmu.c: of_dma_configure(gmu->dev, node, true);
co-opts a platform device without attaching a driver to it
drivers/gpu/drm/sun4i/sun4i_backend.c: ret = of_dma_configure(drm->dev, dev->of_node, true);
drivers/gpu/drm/sun4i/sun8i_mixer.c: ret = of_dma_configure(drm->dev, dev->of_node, true);
No idea.. Same issue with using the wrong of_node?
drivers/gpu/drm/vc4/vc4_drv.c: ret = of_dma_configure(dev, node, true);
Similar to adreno
drivers/gpu/host1x/bus.c: of_dma_configure(&device->dev, host1x->dev->of_node, true);
drivers/gpu/host1x/context.c: err = of_dma_configure_id(&ctx->dev, node, true, &i);
Creating a device and then immediately doing of_dma_configure
(before registering even).
drivers/media/platform/qcom/venus/firmware.c: ret = of_dma_configure(&pdev->dev, np, true);
Looks like wrong of_node on the platform_device
drivers/media/platform/samsung/s5p-mfc/s5p_mfc.c: of_dma_configure(child, dev->of_node, true);
Says it all:
/*
* The memdevs are not proper OF platform devices, so in order for them
* to be treated as valid DMA masters we need a bit of a hack to force
* them to inherit the MFC node's DMA configuration.
*/
of_dma_configure(child, dev->of_node, true);
drivers/net/wireless/ath/ath10k/snoc.c: ret = of_dma_configure(&pdev->dev, node, true);
Looks like wrong of_node on the platform_device
drivers/net/wireless/ath/ath11k/ahb.c: ret = of_dma_configure(&pdev->dev, node, true);
Registering a platform device
sound/soc/bcm/bcm63xx-pcm-whistler.c: of_dma_configure(pcm->card->dev, pcm->card->dev->of_node, 1);
No idea
Jason
next prev parent reply other threads:[~2023-08-17 18:34 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20230809144403eucas1p1345aec6ec34440f1794594426e0402ab@eucas1p1.samsung.com>
2023-08-09 14:43 ` [PATCH v2 0/4] Fix device_lock deadlock on two probe() paths Jason Gunthorpe
2023-08-09 14:43 ` [PATCH v2 1/4] iommu: Provide iommu_probe_device_locked() Jason Gunthorpe
2023-08-09 14:43 ` [PATCH v2 2/4] iommu: Pass in the iommu_device to probe for in bus_iommu_probe() Jason Gunthorpe
2023-08-09 14:43 ` [PATCH v2 3/4] iommu: Do not attempt to re-lock the iommu device when probing Jason Gunthorpe
2023-08-10 2:37 ` Tian, Kevin
2023-08-09 14:43 ` [PATCH v2 4/4] iommu: dev->iommu->iommu_dev must be set before ops->device_group() Jason Gunthorpe
2023-08-10 2:37 ` Tian, Kevin
2023-08-09 15:49 ` [PATCH v2 0/4] Fix device_lock deadlock on two probe() paths Joerg Roedel
2023-08-09 15:55 ` Jason Gunthorpe
2023-08-10 16:15 ` Jeffrey Hugo
2023-08-17 8:31 ` Marek Szyprowski
2023-08-17 18:33 ` Jason Gunthorpe [this message]
2023-08-18 15:56 ` Joerg Roedel
2023-08-18 16:06 ` Jason Gunthorpe
2023-08-18 18:00 ` Eric Farman
2023-08-18 18:15 ` Jason Gunthorpe
2023-08-18 18:32 ` Eric Farman
2023-08-18 18:24 ` Joerg Roedel
2023-08-18 18:50 ` Robin Murphy
2023-08-18 19:19 ` Jason Gunthorpe
2023-08-21 11:35 ` Robin Murphy
2023-08-18 19:18 ` Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZN5n7GnlrTS6s5Yg@nvidia.com \
--to=jgg@nvidia.com \
--cc=baolu.lu@linux.intel.com \
--cc=iommu@lists.linux.dev \
--cc=joro@8bytes.org \
--cc=jroedel@suse.de \
--cc=kevin.tian@intel.com \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=m.szyprowski@samsung.com \
--cc=rafael@kernel.org \
--cc=robin.murphy@arm.com \
--cc=wenst@chromium.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.