From: Jason Gunthorpe <jgg@nvidia.com>
To: Jon Hunter <jonathanh@nvidia.com>
Cc: David Airlie <airlied@gmail.com>, Daniel Vetter <daniel@ffwll.ch>,
dri-devel@lists.freedesktop.org, linux-tegra@vger.kernel.org,
iommu@lists.linux.dev, Mikko Perttunen <mperttunen@nvidia.com>,
Thierry Reding <thierry.reding@gmail.com>,
Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>,
patches@lists.linux.dev
Subject: Re: [PATCH] drm/tegra: Remove of_dma_configure() from host1x_device_add()
Date: Wed, 7 Feb 2024 09:05:32 -0400 [thread overview]
Message-ID: <20240207130532.GA1833900@nvidia.com> (raw)
In-Reply-To: <20240202161540.GU1455070@nvidia.com>
On Fri, Feb 02, 2024 at 12:15:40PM -0400, Jason Gunthorpe wrote:
> > Yes looks like a race of some sort. Adding a bit of debug also makes the
> > issue go away so difficult to see what is happening.
>
> I'm wondering if it is racing with iommu driver probing? I looked but
> didn't notice anything obviously wrong there that would cause this
> though.
Oh there is at least one racy thing here..
The of_xlate hackjob is out of order and is racy if you have multiple instances:
struct tegra_smmu *tegra_smmu_probe(struct device *dev,
const struct tegra_smmu_soc *soc,
struct tegra_mc *mc)
{
/*
* This is a bit of a hack. Ideally we'd want to simply return this
* value. However iommu_device_register() will attempt to add
* all devices to the IOMMU before we get that far. In order
* not to rely on global variables to track the IOMMU instance, we
* set it here so that it can be looked up from the .probe_device()
* callback via the IOMMU device's .drvdata field.
*/
mc->smmu = smmu;
^^^^^^^^^^^^^
After this of_xlate and probe_device will succeed
[...]
err = iommu_device_sysfs_add(&smmu->iommu, dev, NULL, dev_name(dev));
err = iommu_device_register(&smmu->iommu, &tegra_smmu_ops, dev);
^^^^^^^^^
But the iommu instance is not fully initialized yet.
So:
static int tegra_smmu_of_xlate(struct device *dev,
struct of_phandle_args *args)
{
struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);
dev_iommu_priv_set(dev, mc->smmu);
^^^^^^^^^^^^
Gets the partially initialized iommu
instance
static struct iommu_device *tegra_smmu_probe_device(struct device *dev)
{
smmu = dev_iommu_priv_get(dev);
if (!smmu)
return ERR_PTR(-ENODEV);
^^^^^^^^^^^^^
Allows the driver to bind to a partially setup instance
ie if you have multiple instances of tegra-smmu and you manage to do
concurrent probe where the iommu instance is probing concurrently with
the failing device_add flow then you can a situation like you have
described where the sysfs is not fully setup.
Is this making sense to you? Add a sleep after the mc->smmu store and
confirm with printing?
I think this is all an insane design. I fixed this race and removed
all this hackery in my fwspec removal series. There the iommu instance
only ever comes out of the locked list that iommu_device_register()
populates and drivers have a safe and simple flow.
Maybe just moving the store to mc->smmu later would improve it? I
didn't look closely..
Jason
prev parent reply other threads:[~2024-02-07 13:05 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-30 16:15 [PATCH] drm/tegra: Remove of_dma_configure() from host1x_device_add() Jason Gunthorpe
2024-01-30 16:15 ` Jason Gunthorpe
2024-01-30 21:55 ` Jon Hunter
2024-01-31 15:33 ` Jason Gunthorpe
2024-01-31 15:33 ` Jason Gunthorpe
2024-02-01 19:35 ` Jon Hunter
2024-02-01 20:02 ` Jason Gunthorpe
2024-02-02 10:40 ` Jon Hunter
2024-02-02 14:35 ` Jason Gunthorpe
2024-02-02 15:56 ` Jon Hunter
2024-02-02 16:15 ` Jason Gunthorpe
2024-02-07 13:05 ` Jason Gunthorpe [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240207130532.GA1833900@nvidia.com \
--to=jgg@nvidia.com \
--cc=airlied@gmail.com \
--cc=daniel@ffwll.ch \
--cc=diogo.ivo@tecnico.ulisboa.pt \
--cc=dri-devel@lists.freedesktop.org \
--cc=iommu@lists.linux.dev \
--cc=jonathanh@nvidia.com \
--cc=linux-tegra@vger.kernel.org \
--cc=mperttunen@nvidia.com \
--cc=patches@lists.linux.dev \
--cc=thierry.reding@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.