From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: "Lukas Wunner" <lukas@wunner.de>,
"Bjorn Helgaas" <helgaas@kernel.org>,
linuxarm@huawei.com, "Mahesh J Salgaonkar" <mahesh@linux.ibm.com>,
"Bjorn Helgaas" <bhelgaas@google.com>,
linux-cxl@vger.kernel.org, linux-pci@vger.kernel.org,
"Davidlohr Bueso" <dave@stgolabs.net>,
"Dave Jiang" <dave.jiang@intel.com>,
"Alison Schofield" <alison.schofield@intel.com>,
"Vishal Verma" <vishal.l.verma@intel.com>,
"Ira Weiny" <ira.weiny@intel.com>,
"Dan Williams" <dan.j.williams@intel.com>,
"Will Deacon" <will@kernel.org>,
"Mark Rutland" <mark.rutland@arm.com>,
"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
terry.bowman@amd.com,
"Kuppuswamy Sathyanarayanan"
<sathyanarayanan.kuppuswamy@linux.intel.com>,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Subject: Re: [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer)
Date: Fri, 6 Sep 2024 18:18:32 +0100 [thread overview]
Message-ID: <20240906181832.00007dc7@Huawei.com> (raw)
In-Reply-To: <87jzfpdrc7.ffs@tglx>
On Fri, 06 Sep 2024 12:11:36 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, Sep 05 2024 at 12:23, Jonathan Cameron wrote:
> >> Look at how the PCI device manages interrupts with the per device MSI
> >> mechanism. Forget doing this with either one of the legacy mechanisms.
> >>
> >> 1) It creates a hierarchical interrupt domain and gets the required
> >> resources from the provided parent domain. The PCI side does not
> >> care whether this is x86 or arm64 and it neither cares whether the
> >> parent domain does remapping or not. The only way it cares is about
> >> the features supported by the different parent domains (think
> >> multi-MSI).
> >>
> >> 2) Driver side allocations go through the per device domain
> >>
> >> That AUXbus is not any different. When the CPMU driver binds it wants to
> >> allocate interrupts. So instead of having a convoluted construct
> >> reaching into the parent PCI device, you simply can do:
> >>
> >> 1) Let the cxl_pci driver create a MSI parent domain and set that in
> >> the subdevice::msi::irqdomain pointer.
> >>
> >> 2) Provide cxl_aux_bus_create_irqdomain() which allows the CPMU device
> >> driver to create a per device interrupt domain.
> >>
> >> 3) Let the CPMU driver create its per device interrupt domain with
> >> the provided parent domain
> >>
> >> 4) Let the CPMU driver allocate its MSI-X interrupts through the per
> >> device domain
> >>
> >> Now on the cxl_pci side the AUXbus parent interrupt domain allocation
> >> function does:
> >>
> >> if (!pci_msix_enabled(pdev))
> >> return pci_msix_enable_and_alloc_irq_at(pdev, ....);
> >> else
> >> return pci_msix_alloc_irq_at(pdev, ....);
>
> Ignore the above. Brainfart.
>
> > I'm struggling to follow this suggestion
> > Would you expect the cxl_pci MSI parent domain to set it's parent as
> > msi_domain = irq_domain_create_hierarchy(dev_get_msi_domain(&pdev->dev),
> > IRQ_DOMAIN_FLAG_MSI_PARENT,
> > ...
> > which seems to cause a problem with deadlocks in __irq_domain_alloc_irqs()
> > or create a separate domain structure and provide callbacks that reach into
> > the parent domain as necessary?
> >
> > Or do I have this entirely wrong? I'm struggling to relate what existing
> > code like PCI does to what I need to do here.
>
> dev_get_msi_domain(&pdev->dev) is a nightmare due to the 4 different
> models we have:
>
> - Legacy has no domain
>
> - Non-hierarchical domain
>
> - Hierarchical domain v1
>
> That's the global PCI/MSI domain
>
> - Hierarchical domain v2
>
> That's the underlying MSI_PARENT domain, which is on x86
> either the interrupt remap unit or the vector domain. On arm64
> it's the ITS domain.
>
> See e.g. pci_msi_domain_supports() which handles the mess.
>
> Now this proposal will only work on hierarchical domain v2 because that
> can do the post enable allocations on MSI-X. Let's put a potential
> solution for MSI aside for now to avoid complexity explosion.
>
> So there are two ways to solve this:
>
> 1) Implement the CXL MSI parent domain as disjunct domain
>
> 2) Teach the V2 per device MSI-X domain to be a parent domain
>
> #1 Looks pretty straight forward, but does not seemlessly fit the
> hierarchical domain model and creates horrible indirections.
>
> #2 Is the proper abstraction, but requires to teach the MSI core code
> about stacked MSI parent domains, which should not be horribly hard
> or maybe just works out of the box.
>
> The PCI device driver invokes the not yet existing function
> pci_enable_msi_parent_domain(pci_dev). This function does:
>
> A) validate that this PCI device has a V2 parent domain
>
> B) validate that the device has enabled MSI-X
>
> C) validate that the PCI/MSI-X domain supports dynamic MSI-X
> allocation
>
> D) if #A to #C are true, it enables the PCI device parent domain
>
> I made #B a prerequisite for now, as that's an orthogonal problem, which
> does not need to be solved upfront. Maybe it does not need to be solved
> at all because the underlying PCI driver always allocates a management
> interrupt before dealing with the underlying "bus", which is IMHO a
> reasonable expectation. At least it's a reasonable restriction for
> getting started.
>
> That function does:
>
> msix_dom = pci_msi_get_msix_domain(pdev);
> msix_dom->flags |= IRQ_DOMAIN_FLAG_MSI_PARENT;
> msix_dom->msi_parent_ops = &pci_msix_parent_ops;
>
> When creating the CXL devices the CXL code invokes
> pci_msi_init_subdevice(pdev, &cxldev->device), which just does:
>
> dev_set_msi_domain(dev, pci_msi_get_msix_subdevice_parent(pdev));
>
> That allows the CXL devices to create their own per device MSI
> domain via a new function pci_msi_create_subdevice_irq_domain().
>
> That function can just use a variant of pci_create_device_domain() with
> a different domain template and a different irqchip, where the irqchip
> just redirects to the underlying parent domain chip, aka PCI-MSI-X.
>
> I don't think the CXL device will require a special chip as all they
> should need to know is the linux interrupt number. If there are special
> requirements then we can bring the IMS code back or do something similar
> to the platform MSI code.
>
> Then you need pci_subdev_msix_alloc_at() and pci_subdev_msix_free()
> which are the only two functions which the CXL (or similar) need.
>
> The existing pci_msi*() API function might need a safety check so they
> can't be abused by subdevices, but that's a problem for a final
> solution.
>
> That's pretty much all what it takes. Hope that helps.
Absolutely! Thanks for providing the detailed suggestion
this definitely smells a lot less nasty than previous approach.
I have things sort of working now but it's ugly code with a few
cross layer hacks that need tidying up (around programming the
msi registers from wrong 'device'), so may be a little
while before I get it in a state to post.
Jonathan
>
> Thanks,
>
> tglx
>
>
>
next prev parent reply other threads:[~2024-09-06 17:18 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-29 16:40 [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer) Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 1/9] pci: pcie: Drop priv_data from struct pcie_device and use dev_get/set_drvdata() instead Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 2/9] pci: portdrv: Drop driver field for port type Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 3/9] pci: pcie: portdrv: Use managed device handling to simplify error and remove flows Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 4/9] auxiliary_bus: expose auxiliary_bus_type Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 5/9] pci: pcie: portdrv: Add a auxiliary_bus Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 6/9] cxl: Move CPMU register definitions to header Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 7/9] pci: pcie/cxl: Register an auxiliary device for each CPMU instance Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 8/9] perf: cxl: Make the cpmu driver also work with auxiliary_devices Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 9/9] pci: pcie: portdrv: aer: Switch to auxiliary_bus Jonathan Cameron
2024-06-05 18:04 ` [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer) Bjorn Helgaas
2024-06-05 19:44 ` Jonathan Cameron
2024-06-06 12:57 ` Jonathan Cameron
2024-06-06 13:21 ` Lukas Wunner
2024-08-23 11:05 ` Jonathan Cameron
2024-08-28 21:11 ` Thomas Gleixner
2024-08-29 12:17 ` Jonathan Cameron
2024-09-05 11:23 ` Jonathan Cameron
2024-09-06 10:11 ` Thomas Gleixner
2024-09-06 17:18 ` Jonathan Cameron [this message]
2024-09-10 16:47 ` Jonathan Cameron
2024-09-10 17:37 ` Jonathan Cameron
2024-09-10 20:04 ` Thomas Gleixner
2024-09-12 16:37 ` Jonathan Cameron
2024-09-12 17:34 ` Jonathan Cameron
2024-09-13 16:24 ` Thomas Gleixner
2024-06-17 7:03 ` Ilpo Järvinen
2024-07-04 16:14 ` Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240906181832.00007dc7@Huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=bhelgaas@google.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=helgaas@kernel.org \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linuxarm@huawei.com \
--cc=lpieralisi@kernel.org \
--cc=lukas@wunner.de \
--cc=mahesh@linux.ibm.com \
--cc=mark.rutland@arm.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=terry.bowman@amd.com \
--cc=tglx@linutronix.de \
--cc=vishal.l.verma@intel.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox