patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: acpica-devel@lists.linux.dev,
	"Alyssa Rosenzweig" <alyssa@rosenzweig.io>,
	"Albert Ou" <aou@eecs.berkeley.edu>,
	asahi@lists.linux.dev,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Dexuan Cui" <decui@microsoft.com>,
	devicetree@vger.kernel.org,
	"David Woodhouse" <dwmw2@infradead.org>,
	"Frank Rowand" <frowand.list@gmail.com>,
	"Hanjun Guo" <guohanjun@huawei.com>,
	"Haiyang Zhang" <haiyangz@microsoft.com>,
	iommu@lists.linux.dev,
	"Jean-Philippe Brucker" <jean-philippe@linaro.org>,
	"Jonathan Hunter" <jonathanh@nvidia.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	"Len Brown" <lenb@kernel.org>,
	linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-hyperv@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-riscv@lists.infradead.org,
	linux-snps-arc@lists.infradead.org, linux-tegra@vger.kernel.org,
	"Russell King" <linux@armlinux.org.uk>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Hector Martin" <marcan@marcan.st>,
	"Palmer Dabbelt" <palmer@dabbelt.com>,
	patches@lists.linux.dev,
	"Paul Walmsley" <paul.walmsley@sifive.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	"Robert Moore" <robert.moore@intel.com>,
	"Rob Herring" <robh+dt@kernel.org>,
	"Sudeep Holla" <sudeep.holla@arm.com>,
	"Suravee Suthikulpanit" <suravee.suthikulpanit@amd.com>,
	"Sven Peter" <sven@svenpeter.dev>,
	"Thierry Reding" <thierry.reding@gmail.com>,
	"Thomas Bogendoerfer" <tsbogend@alpha.franken.de>,
	"Krishna Reddy" <vdumpa@nvidia.com>,
	"Vineet Gupta" <vgupta@kernel.org>,
	virtualization@lists.linux.dev, "Wei Liu" <wei.liu@kernel.org>,
	"Will Deacon" <will@kernel.org>,
	"André Draszik" <andre.draszik@linaro.org>,
	"Lu Baolu" <baolu.lu@linux.intel.com>,
	"Christoph Hellwig" <hch@lst.de>,
	"Jerry Snitselaar" <jsnitsel@redhat.com>,
	"Moritz Fischer" <mdf@kernel.org>,
	"Zhenhua Huang" <quic_zhenhuah@quicinc.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	"Rob Herring" <robh@kernel.org>
Subject: Re: [PATCH v2 00/17] Solve iommu probe races around iommu_fwspec
Date: Tue, 21 Nov 2023 13:55:57 -0400	[thread overview]
Message-ID: <20231121175557.GH6083@nvidia.com> (raw)
In-Reply-To: <e926d2d8-8209-4c0f-a0cb-dcea4edf839e@arm.com>

On Tue, Nov 21, 2023 at 04:06:15PM +0000, Robin Murphy wrote:

> > Obviously. I rejected that right away because of how incredibly
> > wrongly layered and hacky it is to do something like that.
> 
> What, and dressing up the fundamental layering violation by baking it even
> further into the API flow, while still not actually fixing it or any of its
> *other* symptoms, is somehow better?

It puts the locks and the controlled data in the right place, in the
right layer.

> Ultimately, this series is still basically doing the same thing my patch
> does - extending the scope of the existing iommu_probe_device_lock hack to
> cover fwspec creation. A hack is a hack, so frankly I'd rather it be simple
> and obvious and look like one, and being easy to remove again is an obvious
> bonus too.

Not entirely, as I've said this series removes the use of the dev->iommu
during fwspec creation. This is different than just extending the lock.

> > So, I would move toward having the driver's probe invoke a helper like:
> > 
> >     iommu_of_xlate(dev, fwspec, &per_fwnode_function, &args);
> > 
> > Which generates the same list of (fwnode_handle, of_phandle_args) that
> > was passed to of_xlate today, but is ordered sensibly within the
> > sequence of probe for what many drivers seem to want to do.
> 
> Grep for of_xlate. It is a standard and well-understood callback pattern for
> a subsystem to parse a common DT binding and pass a driver-specific
> specifier to a driver to interpret. Or maybe you just have a peculiar
> definition of what you think "micro-optimisation" means? :/

Don't know about other subsystems, here is making a mess. Look at what
the drivers are doing. The two SMMU drivers are sort of sane, but
everything else has coded half their pobe into of_xlate. It doesn't
make alot of sense.

> > So, it is not so much that that the idea of of_xlate goes away, but
> > the specific op->of_xlate does, it gets shifted into a helper that
> > invokes the same function in a more logical spot.
> 
> I'm curious how you imagine an IOMMU driver's ->probe function could be
> called *before* parsing the firmware to work out what, if any, IOMMU, and
> thus driver, a device is associated with.

You've jumped ahead here, I'm talking about removing ops->of_xlate.

With everything kept the same as today this means we'd scan the FW
description twice. Once to find the ops and once inside the driver to
parse it.

When I say micro-optimization this is what I mean - structuring the
code to try to avoid multiple-scans of the FW table.

Once the drivers are self-parsing I see there are two paths to keep it
at one FW parse:

 1) Have the core code parse and cache common cases in the iommu_fwspec.
    Driver then pulls out the common cases from the iommu_fwspec and
    reparsed in uncommon cases.

 2) Accept we have only a single ops in all real systems and just
    invoke the driver to parse it. That parse will cache enough
    information to decide if EPROBE_DEFER is needed.

In either case the drivers would call the same APIs and have the same
logic. The choice is just infrastructure-side stuff.

It is a different view than trying to do everything up front, but I'm
not seeing that it is so differently efficient as to be a performance
problem.

On the plus side it looks to make the drivers alot simpler and more
logical.

> And then every driver has to have
> identical boilerplate to go off and parse the generic "iommus" binding...
> which is the whole damn reason for *not* going down that route and instead
> using an of_xlate mechanism in the first place.

Let's not guess. I've attached below a sketch conversion for
apple-dart.

Diffstat says it *saves* 1 line. But also we fix several bugs!

 - iommu_of_xlate() will check for NULL fwspec and reject the bus
   probe path

 - iommu_of_xlate() can match the iommu's from the iommu list and
   check that the OF actually points only to iommus with this driver's
   ops (missed sanity check)

 - The of parsing machinery already computes the iommu_driver but throws
   it out. This forces all of the drivers to do their own thing. Pass
   it in and thus apple_dart_of_xlate() doesn't need to mess around
   with of_find_device_by_node() and we fix the bug where it leaks the
   reference on the struct device (woops!)

 - We don't leak the cfg allocation that apple_dart_of_xlate() did on
   various error paths. All error paths logically free in probe. We
   don't have to think about what happens if of_xlate is called twice
   for the same args on error/reprobe paths.

Many drivers follow this pattern of apple-dart and would end up like this.

Drivers that just need the u32 array would be simpler, more like:

   struct iommu_driver *instance;
   unsigned int num_ids;

   instance = iommu_of_get_iommu(..., &num_ids);
   if (IS_ERR(instance))
      return ERR_CAST(instance);
   cfg = kzalloc(struct_size(cfg, sids, num_ids), GFP_KERNEL);
   if (!cfg)
      return -ENOMEM;
   iommu_of_read_u32_ids(..., &cfg->sids);
   [..]
   return instance;

I haven't sketched *every* driver yet, but I've sketched enough to be
confident.

Robin, I know it is not what you have in your head, but you should
stop with the insults and be more open to other perspectives.

> > The per-device data can be allocated at the top of probe and passed
> > through args to fix the lifetime bugs.
> > 
> > It is pretty simple to do.
> 
> I believe the kids these days would say "Say you don't understand the code
> without saying you don't understand the code."

I think you are too fixated on what you have in your mind. It will
take me a bit but I will come with a series to move all the FW parsing
into probe along this model. Once done it is trivial to make bus probe
work like it should.

Regarding this series, I don't really see a problem. There is no
"concrete" or anything like that.

> > Not quite. This decouples two unrelated things into seperate
> > concerns. It is not so much about the concurrency but removing the
> > abuse of dev->iommu by code that has no need to touch it at all.
> 
> Sorry, the "abuse" of storing IOMMU-API-specific data in the place we
> intentionally created to consolidate all the IOMMU-API-specific data
> into?

The global state should not be filled until the driver is probed. It
is an abuse to use a global instead of an on-stack variable when
building it. Publishing only fully initialized data to global
visibility is concurrency 101. :(

Jason

diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index ee05f4824bfad1..476938722460b8 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -721,19 +721,29 @@ static struct iommu_domain apple_dart_blocked_domain = {
 
 static struct iommu_device *apple_dart_probe_device(struct device *dev)
 {
-	struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+	struct apple_dart_master_cfg *cfg;
 	struct apple_dart_stream_map *stream_map;
+	int ret;
 	int i;
 
+	cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
 	if (!cfg)
-		return ERR_PTR(-ENODEV);
+		return ERR_PTR(-ENOMEM);
+	ret = iommu_of_xlate(dev_iommu_fwspec_get(dev), &apple_dart_iommu_ops,
+			     1, &apple_dart_of_xlate, cfg);
+	if (ret)
+		goto err_free;
 
 	for_each_stream_map(i, cfg, stream_map)
 		device_link_add(
 			dev, stream_map->dart->dev,
 			DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);
 
+	dev_iommu_priv_set(dev, cfg);
 	return &cfg->stream_maps[0].dart->iommu;
+err_free:
+	kfree(cfg);
+	return ret;
 }
 
 static void apple_dart_release_device(struct device *dev)
@@ -777,25 +787,15 @@ static void apple_dart_domain_free(struct iommu_domain *domain)
 	kfree(dart_domain);
 }
 
-static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args)
+static int apple_dart_of_xlate(struct iommu_device *iommu,
+			       struct of_phandle_args *args, void *priv)
 {
-	struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
-	struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
-	struct apple_dart *dart = platform_get_drvdata(iommu_pdev);
-	struct apple_dart *cfg_dart;
-	int i, sid;
+	struct apple_dart *dart = container_of(iommu, struct apple_dart, iommu);
+	struct apple_dart_master_cfg *cfg = priv;
+	struct apple_dart *cfg_dart = cfg->stream_maps[0].dart;
+	int sid = args->args[0];
+	int i;
 
-	if (args->args_count != 1)
-		return -EINVAL;
-	sid = args->args[0];
-
-	if (!cfg)
-		cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
-	if (!cfg)
-		return -ENOMEM;
-	dev_iommu_priv_set(dev, cfg);
-
-	cfg_dart = cfg->stream_maps[0].dart;
 	if (cfg_dart) {
 		if (cfg_dart->supports_bypass != dart->supports_bypass)
 			return -EINVAL;
@@ -980,7 +980,6 @@ static const struct iommu_ops apple_dart_iommu_ops = {
 	.probe_device = apple_dart_probe_device,
 	.release_device = apple_dart_release_device,
 	.device_group = apple_dart_device_group,
-	.of_xlate = apple_dart_of_xlate,
 	.def_domain_type = apple_dart_def_domain_type,
 	.get_resv_regions = apple_dart_get_resv_regions,
 	.pgsize_bitmap = -1UL, /* Restricted during dart probe */

      reply	other threads:[~2023-11-21 17:56 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-15 14:05 [PATCH v2 00/17] Solve iommu probe races around iommu_fwspec Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 01/17] iommu: Remove struct iommu_ops *iommu from arch_setup_dma_ops() Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 02/17] iommmu/of: Do not return struct iommu_ops from of_iommu_configure() Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 03/17] iommu/of: Use -ENODEV consistently in of_iommu_configure() Jason Gunthorpe
2023-11-15 14:42   ` Jerry Snitselaar
2023-11-15 14:05 ` [PATCH v2 04/17] acpi: Do not return struct iommu_ops from acpi_iommu_configure_id() Jason Gunthorpe
2023-11-15 14:45   ` Jerry Snitselaar
2023-11-15 14:05 ` [PATCH v2 05/17] iommu: Make iommu_fwspec->ids a distinct allocation Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 06/17] iommu: Add iommu_fwspec_alloc/dealloc() Jason Gunthorpe
2023-11-19  8:10   ` Hector Martin
2023-11-19  9:19     ` Hector Martin
2023-11-19 14:13       ` Jason Gunthorpe
2023-11-21  6:47         ` Hector Martin
2023-11-21 16:00           ` Jason Gunthorpe
2023-11-23  9:08             ` Hector Martin
2023-11-15 14:05 ` [PATCH v2 07/17] iommu: Add iommu_probe_device_fwspec() Jason Gunthorpe
2023-11-15 14:05 ` [PATCH v2 08/17] iommu/of: Do not use dev->iommu within of_iommu_configure() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 09/17] iommu: Add iommu_fwspec_append_ids() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 10/17] acpi: Do not use dev->iommu within acpi_iommu_configure() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 11/17] iommu: Hold iommu_probe_device_lock while calling ops->of_xlate Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 12/17] iommu: Make iommu_ops_from_fwnode() static Jason Gunthorpe
2023-11-15 15:09   ` Jerry Snitselaar
2023-11-16 14:36   ` Moritz Fischer
2023-11-15 14:06 ` [PATCH v2 13/17] iommu: Remove dev_iommu_fwspec_set() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 14/17] iommu: Remove pointless iommu_fwspec_free() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 15/17] iommu: Add ops->of_xlate_fwspec() Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 16/17] iommu: Mark dev_iommu_get() with lockdep Jason Gunthorpe
2023-11-15 14:06 ` [PATCH v2 17/17] iommu: Mark dev_iommu_priv_set() with a lockdep Jason Gunthorpe
2023-11-15 14:54 ` [PATCH v2 00/17] Solve iommu probe races around iommu_fwspec Jerry Snitselaar
2023-11-15 15:22 ` Robin Murphy
2023-11-15 15:36   ` Jason Gunthorpe
2023-11-15 20:23     ` Robin Murphy
2023-11-16  4:17       ` Jason Gunthorpe
2023-11-21 16:06         ` Robin Murphy
2023-11-21 17:55           ` Jason Gunthorpe [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231121175557.GH6083@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=acpica-devel@lists.linux.dev \
    --cc=alyssa@rosenzweig.io \
    --cc=andre.draszik@linaro.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=asahi@lists.linux.dev \
    --cc=baolu.lu@linux.intel.com \
    --cc=catalin.marinas@arm.com \
    --cc=decui@microsoft.com \
    --cc=devicetree@vger.kernel.org \
    --cc=dwmw2@infradead.org \
    --cc=frowand.list@gmail.com \
    --cc=guohanjun@huawei.com \
    --cc=haiyangz@microsoft.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux.dev \
    --cc=jean-philippe@linaro.org \
    --cc=jonathanh@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=jsnitsel@redhat.com \
    --cc=kys@microsoft.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-snps-arc@lists.infradead.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=lpieralisi@kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=marcan@marcan.st \
    --cc=mdf@kernel.org \
    --cc=palmer@dabbelt.com \
    --cc=patches@lists.linux.dev \
    --cc=paul.walmsley@sifive.com \
    --cc=quic_zhenhuah@quicinc.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rafael@kernel.org \
    --cc=robert.moore@intel.com \
    --cc=robh+dt@kernel.org \
    --cc=robh@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=sudeep.holla@arm.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=sven@svenpeter.dev \
    --cc=thierry.reding@gmail.com \
    --cc=tsbogend@alpha.franken.de \
    --cc=vdumpa@nvidia.com \
    --cc=vgupta@kernel.org \
    --cc=virtualization@lists.linux.dev \
    --cc=wei.liu@kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).