From: Jason Gunthorpe <jgg@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: acpica-devel@lists.linux.dev,
"Alyssa Rosenzweig" <alyssa@rosenzweig.io>,
"Albert Ou" <aou@eecs.berkeley.edu>,
asahi@lists.linux.dev,
"Catalin Marinas" <catalin.marinas@arm.com>,
"Dexuan Cui" <decui@microsoft.com>,
devicetree@vger.kernel.org,
"David Woodhouse" <dwmw2@infradead.org>,
"Frank Rowand" <frowand.list@gmail.com>,
"Hanjun Guo" <guohanjun@huawei.com>,
"Haiyang Zhang" <haiyangz@microsoft.com>,
iommu@lists.linux.dev,
"Jean-Philippe Brucker" <jean-philippe@linaro.org>,
"Jonathan Hunter" <jonathanh@nvidia.com>,
"Joerg Roedel" <joro@8bytes.org>,
"K. Y. Srinivasan" <kys@microsoft.com>,
"Len Brown" <lenb@kernel.org>,
linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-hyperv@vger.kernel.org, linux-mips@vger.kernel.org,
linux-riscv@lists.infradead.org,
linux-snps-arc@lists.infradead.org, linux-tegra@vger.kernel.org,
"Russell King" <linux@armlinux.org.uk>,
"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
"Marek Szyprowski" <m.szyprowski@samsung.com>,
"Hector Martin" <marcan@marcan.st>,
"Palmer Dabbelt" <palmer@dabbelt.com>,
patches@lists.linux.dev,
"Paul Walmsley" <paul.walmsley@sifive.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
"Robert Moore" <robert.moore@intel.com>,
"Rob Herring" <robh+dt@kernel.org>,
"Sudeep Holla" <sudeep.holla@arm.com>,
"Suravee Suthikulpanit" <suravee.suthikulpanit@amd.com>,
"Sven Peter" <sven@svenpeter.dev>,
"Thierry Reding" <thierry.reding@gmail.com>,
"Thomas Bogendoerfer" <tsbogend@alpha.franken.de>,
"Krishna Reddy" <vdumpa@nvidia.com>,
"Vineet Gupta" <vgupta@kernel.org>,
virtualization@lists.linux.dev, "Wei Liu" <wei.liu@kernel.org>,
"Will Deacon" <will@kernel.org>,
"André Draszik" <andre.draszik@linaro.org>,
"Lu Baolu" <baolu.lu@linux.intel.com>,
"Christoph Hellwig" <hch@lst.de>,
"Jerry Snitselaar" <jsnitsel@redhat.com>,
"Moritz Fischer" <mdf@kernel.org>,
"Zhenhua Huang" <quic_zhenhuah@quicinc.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
"Rob Herring" <robh@kernel.org>
Subject: Re: [PATCH v2 00/17] Solve iommu probe races around iommu_fwspec
Date: Tue, 21 Nov 2023 13:55:57 -0400 [thread overview]
Message-ID: <20231121175557.GH6083@nvidia.com> (raw)
In-Reply-To: <e926d2d8-8209-4c0f-a0cb-dcea4edf839e@arm.com>
On Tue, Nov 21, 2023 at 04:06:15PM +0000, Robin Murphy wrote:
> > Obviously. I rejected that right away because of how incredibly
> > wrongly layered and hacky it is to do something like that.
>
> What, and dressing up the fundamental layering violation by baking it even
> further into the API flow, while still not actually fixing it or any of its
> *other* symptoms, is somehow better?
It puts the locks and the controlled data in the right place, in the
right layer.
> Ultimately, this series is still basically doing the same thing my patch
> does - extending the scope of the existing iommu_probe_device_lock hack to
> cover fwspec creation. A hack is a hack, so frankly I'd rather it be simple
> and obvious and look like one, and being easy to remove again is an obvious
> bonus too.
Not entirely, as I've said this series removes the use of the dev->iommu
during fwspec creation. This is different than just extending the lock.
> > So, I would move toward having the driver's probe invoke a helper like:
> >
> > iommu_of_xlate(dev, fwspec, &per_fwnode_function, &args);
> >
> > Which generates the same list of (fwnode_handle, of_phandle_args) that
> > was passed to of_xlate today, but is ordered sensibly within the
> > sequence of probe for what many drivers seem to want to do.
>
> Grep for of_xlate. It is a standard and well-understood callback pattern for
> a subsystem to parse a common DT binding and pass a driver-specific
> specifier to a driver to interpret. Or maybe you just have a peculiar
> definition of what you think "micro-optimisation" means? :/
Don't know about other subsystems, here is making a mess. Look at what
the drivers are doing. The two SMMU drivers are sort of sane, but
everything else has coded half their pobe into of_xlate. It doesn't
make alot of sense.
> > So, it is not so much that that the idea of of_xlate goes away, but
> > the specific op->of_xlate does, it gets shifted into a helper that
> > invokes the same function in a more logical spot.
>
> I'm curious how you imagine an IOMMU driver's ->probe function could be
> called *before* parsing the firmware to work out what, if any, IOMMU, and
> thus driver, a device is associated with.
You've jumped ahead here, I'm talking about removing ops->of_xlate.
With everything kept the same as today this means we'd scan the FW
description twice. Once to find the ops and once inside the driver to
parse it.
When I say micro-optimization this is what I mean - structuring the
code to try to avoid multiple-scans of the FW table.
Once the drivers are self-parsing I see there are two paths to keep it
at one FW parse:
1) Have the core code parse and cache common cases in the iommu_fwspec.
Driver then pulls out the common cases from the iommu_fwspec and
reparsed in uncommon cases.
2) Accept we have only a single ops in all real systems and just
invoke the driver to parse it. That parse will cache enough
information to decide if EPROBE_DEFER is needed.
In either case the drivers would call the same APIs and have the same
logic. The choice is just infrastructure-side stuff.
It is a different view than trying to do everything up front, but I'm
not seeing that it is so differently efficient as to be a performance
problem.
On the plus side it looks to make the drivers alot simpler and more
logical.
> And then every driver has to have
> identical boilerplate to go off and parse the generic "iommus" binding...
> which is the whole damn reason for *not* going down that route and instead
> using an of_xlate mechanism in the first place.
Let's not guess. I've attached below a sketch conversion for
apple-dart.
Diffstat says it *saves* 1 line. But also we fix several bugs!
- iommu_of_xlate() will check for NULL fwspec and reject the bus
probe path
- iommu_of_xlate() can match the iommu's from the iommu list and
check that the OF actually points only to iommus with this driver's
ops (missed sanity check)
- The of parsing machinery already computes the iommu_driver but throws
it out. This forces all of the drivers to do their own thing. Pass
it in and thus apple_dart_of_xlate() doesn't need to mess around
with of_find_device_by_node() and we fix the bug where it leaks the
reference on the struct device (woops!)
- We don't leak the cfg allocation that apple_dart_of_xlate() did on
various error paths. All error paths logically free in probe. We
don't have to think about what happens if of_xlate is called twice
for the same args on error/reprobe paths.
Many drivers follow this pattern of apple-dart and would end up like this.
Drivers that just need the u32 array would be simpler, more like:
struct iommu_driver *instance;
unsigned int num_ids;
instance = iommu_of_get_iommu(..., &num_ids);
if (IS_ERR(instance))
return ERR_CAST(instance);
cfg = kzalloc(struct_size(cfg, sids, num_ids), GFP_KERNEL);
if (!cfg)
return -ENOMEM;
iommu_of_read_u32_ids(..., &cfg->sids);
[..]
return instance;
I haven't sketched *every* driver yet, but I've sketched enough to be
confident.
Robin, I know it is not what you have in your head, but you should
stop with the insults and be more open to other perspectives.
> > The per-device data can be allocated at the top of probe and passed
> > through args to fix the lifetime bugs.
> >
> > It is pretty simple to do.
>
> I believe the kids these days would say "Say you don't understand the code
> without saying you don't understand the code."
I think you are too fixated on what you have in your mind. It will
take me a bit but I will come with a series to move all the FW parsing
into probe along this model. Once done it is trivial to make bus probe
work like it should.
Regarding this series, I don't really see a problem. There is no
"concrete" or anything like that.
> > Not quite. This decouples two unrelated things into seperate
> > concerns. It is not so much about the concurrency but removing the
> > abuse of dev->iommu by code that has no need to touch it at all.
>
> Sorry, the "abuse" of storing IOMMU-API-specific data in the place we
> intentionally created to consolidate all the IOMMU-API-specific data
> into?
The global state should not be filled until the driver is probed. It
is an abuse to use a global instead of an on-stack variable when
building it. Publishing only fully initialized data to global
visibility is concurrency 101. :(
Jason
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index ee05f4824bfad1..476938722460b8 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -721,19 +721,29 @@ static struct iommu_domain apple_dart_blocked_domain = {
static struct iommu_device *apple_dart_probe_device(struct device *dev)
{
- struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+ struct apple_dart_master_cfg *cfg;
struct apple_dart_stream_map *stream_map;
+ int ret;
int i;
+ cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
if (!cfg)
- return ERR_PTR(-ENODEV);
+ return ERR_PTR(-ENOMEM);
+ ret = iommu_of_xlate(dev_iommu_fwspec_get(dev), &apple_dart_iommu_ops,
+ 1, &apple_dart_of_xlate, cfg);
+ if (ret)
+ goto err_free;
for_each_stream_map(i, cfg, stream_map)
device_link_add(
dev, stream_map->dart->dev,
DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);
+ dev_iommu_priv_set(dev, cfg);
return &cfg->stream_maps[0].dart->iommu;
+err_free:
+ kfree(cfg);
+ return ret;
}
static void apple_dart_release_device(struct device *dev)
@@ -777,25 +787,15 @@ static void apple_dart_domain_free(struct iommu_domain *domain)
kfree(dart_domain);
}
-static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args)
+static int apple_dart_of_xlate(struct iommu_device *iommu,
+ struct of_phandle_args *args, void *priv)
{
- struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
- struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
- struct apple_dart *dart = platform_get_drvdata(iommu_pdev);
- struct apple_dart *cfg_dart;
- int i, sid;
+ struct apple_dart *dart = container_of(iommu, struct apple_dart, iommu);
+ struct apple_dart_master_cfg *cfg = priv;
+ struct apple_dart *cfg_dart = cfg->stream_maps[0].dart;
+ int sid = args->args[0];
+ int i;
- if (args->args_count != 1)
- return -EINVAL;
- sid = args->args[0];
-
- if (!cfg)
- cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
- if (!cfg)
- return -ENOMEM;
- dev_iommu_priv_set(dev, cfg);
-
- cfg_dart = cfg->stream_maps[0].dart;
if (cfg_dart) {
if (cfg_dart->supports_bypass != dart->supports_bypass)
return -EINVAL;
@@ -980,7 +980,6 @@ static const struct iommu_ops apple_dart_iommu_ops = {
.probe_device = apple_dart_probe_device,
.release_device = apple_dart_release_device,
.device_group = apple_dart_device_group,
- .of_xlate = apple_dart_of_xlate,
.def_domain_type = apple_dart_def_domain_type,
.get_resv_regions = apple_dart_get_resv_regions,
.pgsize_bitmap = -1UL, /* Restricted during dart probe */
prev parent reply other threads:[~2023-11-21 17:56 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-15 14:05 [PATCH v2 00/17] Solve iommu probe races around iommu_fwspec Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 01/17] iommu: Remove struct iommu_ops *iommu from arch_setup_dma_ops() Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 02/17] iommmu/of: Do not return struct iommu_ops from of_iommu_configure() Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 03/17] iommu/of: Use -ENODEV consistently in of_iommu_configure() Jason Gunthorpe
2023-11-15 14:42 ` Jerry Snitselaar
2023-11-15 14:05 ` [PATCH v2 04/17] acpi: Do not return struct iommu_ops from acpi_iommu_configure_id() Jason Gunthorpe
2023-11-15 14:45 ` Jerry Snitselaar
2023-11-15 14:05 ` [PATCH v2 05/17] iommu: Make iommu_fwspec->ids a distinct allocation Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 06/17] iommu: Add iommu_fwspec_alloc/dealloc() Jason Gunthorpe
2023-11-19 8:10 ` Hector Martin
2023-11-19 9:19 ` Hector Martin
2023-11-19 14:13 ` Jason Gunthorpe
2023-11-21 6:47 ` Hector Martin
2023-11-21 16:00 ` Jason Gunthorpe
2023-11-23 9:08 ` Hector Martin
2023-11-15 14:05 ` [PATCH v2 07/17] iommu: Add iommu_probe_device_fwspec() Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 08/17] iommu/of: Do not use dev->iommu within of_iommu_configure() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 09/17] iommu: Add iommu_fwspec_append_ids() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 10/17] acpi: Do not use dev->iommu within acpi_iommu_configure() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 11/17] iommu: Hold iommu_probe_device_lock while calling ops->of_xlate Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 12/17] iommu: Make iommu_ops_from_fwnode() static Jason Gunthorpe
2023-11-15 15:09 ` Jerry Snitselaar
2023-11-16 14:36 ` Moritz Fischer
2023-11-15 14:06 ` [PATCH v2 13/17] iommu: Remove dev_iommu_fwspec_set() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 14/17] iommu: Remove pointless iommu_fwspec_free() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 15/17] iommu: Add ops->of_xlate_fwspec() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 16/17] iommu: Mark dev_iommu_get() with lockdep Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 17/17] iommu: Mark dev_iommu_priv_set() with a lockdep Jason Gunthorpe
2023-11-15 14:54 ` [PATCH v2 00/17] Solve iommu probe races around iommu_fwspec Jerry Snitselaar
2023-11-15 15:22 ` Robin Murphy
2023-11-15 15:36 ` Jason Gunthorpe
2023-11-15 20:23 ` Robin Murphy
2023-11-16 4:17 ` Jason Gunthorpe
2023-11-21 16:06 ` Robin Murphy
2023-11-21 17:55 ` Jason Gunthorpe [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231121175557.GH6083@nvidia.com \
--to=jgg@nvidia.com \
--cc=acpica-devel@lists.linux.dev \
--cc=alyssa@rosenzweig.io \
--cc=andre.draszik@linaro.org \
--cc=aou@eecs.berkeley.edu \
--cc=asahi@lists.linux.dev \
--cc=baolu.lu@linux.intel.com \
--cc=catalin.marinas@arm.com \
--cc=decui@microsoft.com \
--cc=devicetree@vger.kernel.org \
--cc=dwmw2@infradead.org \
--cc=frowand.list@gmail.com \
--cc=guohanjun@huawei.com \
--cc=haiyangz@microsoft.com \
--cc=hch@lst.de \
--cc=iommu@lists.linux.dev \
--cc=jean-philippe@linaro.org \
--cc=jonathanh@nvidia.com \
--cc=joro@8bytes.org \
--cc=jsnitsel@redhat.com \
--cc=kys@microsoft.com \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-mips@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=linux-snps-arc@lists.infradead.org \
--cc=linux-tegra@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=lpieralisi@kernel.org \
--cc=m.szyprowski@samsung.com \
--cc=marcan@marcan.st \
--cc=mdf@kernel.org \
--cc=palmer@dabbelt.com \
--cc=patches@lists.linux.dev \
--cc=paul.walmsley@sifive.com \
--cc=quic_zhenhuah@quicinc.com \
--cc=rafael.j.wysocki@intel.com \
--cc=rafael@kernel.org \
--cc=robert.moore@intel.com \
--cc=robh+dt@kernel.org \
--cc=robh@kernel.org \
--cc=robin.murphy@arm.com \
--cc=sudeep.holla@arm.com \
--cc=suravee.suthikulpanit@amd.com \
--cc=sven@svenpeter.dev \
--cc=thierry.reding@gmail.com \
--cc=tsbogend@alpha.franken.de \
--cc=vdumpa@nvidia.com \
--cc=vgupta@kernel.org \
--cc=virtualization@lists.linux.dev \
--cc=wei.liu@kernel.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).