public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: "Lukas Wunner" <lukas@wunner.de>,
	"Bjorn Helgaas" <helgaas@kernel.org>,
	linuxarm@huawei.com, "Mahesh J Salgaonkar" <mahesh@linux.ibm.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	linux-cxl@vger.kernel.org, linux-pci@vger.kernel.org,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	"Dave Jiang" <dave.jiang@intel.com>,
	"Alison Schofield" <alison.schofield@intel.com>,
	"Vishal Verma" <vishal.l.verma@intel.com>,
	"Ira Weiny" <ira.weiny@intel.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Will Deacon" <will@kernel.org>,
	"Mark Rutland" <mark.rutland@arm.com>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	terry.bowman@amd.com,
	"Kuppuswamy Sathyanarayanan"
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Subject: Re: [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer)
Date: Fri, 6 Sep 2024 18:18:32 +0100	[thread overview]
Message-ID: <20240906181832.00007dc7@Huawei.com> (raw)
In-Reply-To: <87jzfpdrc7.ffs@tglx>

On Fri, 06 Sep 2024 12:11:36 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Thu, Sep 05 2024 at 12:23, Jonathan Cameron wrote:
> >> Look at how the PCI device manages interrupts with the per device MSI
> >> mechanism. Forget doing this with either one of the legacy mechanisms.
> >> 
> >>   1) It creates a hierarchical interrupt domain and gets the required
> >>      resources from the provided parent domain. The PCI side does not
> >>      care whether this is x86 or arm64 and it neither cares whether the
> >>      parent domain does remapping or not. The only way it cares is about
> >>      the features supported by the different parent domains (think
> >>      multi-MSI).
> >>      
> >>   2) Driver side allocations go through the per device domain
> >> 
> >> That AUXbus is not any different. When the CPMU driver binds it wants to
> >> allocate interrupts. So instead of having a convoluted construct
> >> reaching into the parent PCI device, you simply can do:
> >> 
> >>   1) Let the cxl_pci driver create a MSI parent domain and set that in
> >>      the subdevice::msi::irqdomain pointer.
> >> 
> >>   2) Provide cxl_aux_bus_create_irqdomain() which allows the CPMU device
> >>      driver to create a per device interrupt domain.
> >> 
> >>   3) Let the CPMU driver create its per device interrupt domain with
> >>      the provided parent domain
> >> 
> >>   4) Let the CPMU driver allocate its MSI-X interrupts through the per
> >>      device domain
> >> 
> >> Now on the cxl_pci side the AUXbus parent interrupt domain allocation
> >> function does:
> >> 
> >>     if (!pci_msix_enabled(pdev))
> >>     	return pci_msix_enable_and_alloc_irq_at(pdev, ....);
> >>     else
> >>         return pci_msix_alloc_irq_at(pdev, ....);  
> 
> Ignore the above. Brainfart.
> 
> > I'm struggling to follow this suggestion
> > Would you expect the cxl_pci MSI parent domain to set it's parent as
> > msi_domain = irq_domain_create_hierarchy(dev_get_msi_domain(&pdev->dev),
> > 					 IRQ_DOMAIN_FLAG_MSI_PARENT,
> > 					 ...
> > which seems to cause a problem with deadlocks in __irq_domain_alloc_irqs()
> > or create a separate domain structure and provide callbacks that reach into
> > the parent domain as necessary?
> >
> > Or do I have this entirely wrong? I'm struggling to relate what existing
> > code like PCI does to what I need to do here.  
> 
> dev_get_msi_domain(&pdev->dev) is a nightmare due to the 4 different
> models we have:
> 
>    - Legacy has no domain
> 
>    - Non-hierarchical domain
> 
>    - Hierarchical domain v1
> 
>          That's the global PCI/MSI domain
> 
>    - Hierarchical domain v2
> 
>       That's the underlying MSI_PARENT domain, which is on x86
>       either the interrupt remap unit or the vector domain. On arm64
>       it's the ITS domain.
> 
> See e.g. pci_msi_domain_supports() which handles the mess.
> 
> Now this proposal will only work on hierarchical domain v2 because that
> can do the post enable allocations on MSI-X. Let's put a potential
> solution for MSI aside for now to avoid complexity explosion.
> 
> So there are two ways to solve this:
> 
>   1) Implement the CXL MSI parent domain as disjunct domain
> 
>   2) Teach the V2 per device MSI-X domain to be a parent domain
> 
> #1 Looks pretty straight forward, but does not seemlessly fit the
>    hierarchical domain model and creates horrible indirections.
> 
> #2 Is the proper abstraction, but requires to teach the MSI core code
>    about stacked MSI parent domains, which should not be horribly hard
>    or maybe just works out of the box.
> 
> The PCI device driver invokes the not yet existing function
> pci_enable_msi_parent_domain(pci_dev). This function does:
> 
>   A) validate that this PCI device has a V2 parent domain
> 
>   B) validate that the device has enabled MSI-X
> 
>   C) validate that the PCI/MSI-X domain supports dynamic MSI-X
>      allocation
> 
>   D) if #A to #C are true, it enables the PCI device parent domain
> 
> I made #B a prerequisite for now, as that's an orthogonal problem, which
> does not need to be solved upfront. Maybe it does not need to be solved
> at all because the underlying PCI driver always allocates a management
> interrupt before dealing with the underlying "bus", which is IMHO a
> reasonable expectation. At least it's a reasonable restriction for
> getting started.
> 
> That function does:
> 
>      msix_dom = pci_msi_get_msix_domain(pdev);
>      msix_dom->flags |= IRQ_DOMAIN_FLAG_MSI_PARENT;
>      msix_dom->msi_parent_ops = &pci_msix_parent_ops;
> 
> When creating the CXL devices the CXL code invokes
> pci_msi_init_subdevice(pdev, &cxldev->device), which just does:
> 
>   dev_set_msi_domain(dev, pci_msi_get_msix_subdevice_parent(pdev));
> 
> That allows the CXL devices to create their own per device MSI
> domain via a new function pci_msi_create_subdevice_irq_domain().
> 
> That function can just use a variant of pci_create_device_domain() with
> a different domain template and a different irqchip, where the irqchip
> just redirects to the underlying parent domain chip, aka PCI-MSI-X.
> 
> I don't think the CXL device will require a special chip as all they
> should need to know is the linux interrupt number. If there are special
> requirements then we can bring the IMS code back or do something similar
> to the platform MSI code. 
> 
> Then you need pci_subdev_msix_alloc_at() and pci_subdev_msix_free()
> which are the only two functions which the CXL (or similar) need.
> 
> The existing pci_msi*() API function might need a safety check so they
> can't be abused by subdevices, but that's a problem for a final
> solution.
> 
> That's pretty much all what it takes. Hope that helps.

Absolutely!  Thanks for providing the detailed suggestion
this definitely smells a lot less nasty than previous approach.

I have things sort of working now but it's ugly code with a few
cross layer hacks that need tidying up (around programming the
msi registers from wrong 'device'), so may be a little
while before I get it in a state to post.

Jonathan

> 
> Thanks,
> 
>         tglx
> 
> 
> 


  reply	other threads:[~2024-09-06 17:18 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-29 16:40 [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer) Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 1/9] pci: pcie: Drop priv_data from struct pcie_device and use dev_get/set_drvdata() instead Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 2/9] pci: portdrv: Drop driver field for port type Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 3/9] pci: pcie: portdrv: Use managed device handling to simplify error and remove flows Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 4/9] auxiliary_bus: expose auxiliary_bus_type Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 5/9] pci: pcie: portdrv: Add a auxiliary_bus Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 6/9] cxl: Move CPMU register definitions to header Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 7/9] pci: pcie/cxl: Register an auxiliary device for each CPMU instance Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 8/9] perf: cxl: Make the cpmu driver also work with auxiliary_devices Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 9/9] pci: pcie: portdrv: aer: Switch to auxiliary_bus Jonathan Cameron
2024-06-05 18:04 ` [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer) Bjorn Helgaas
2024-06-05 19:44   ` Jonathan Cameron
2024-06-06 12:57     ` Jonathan Cameron
2024-06-06 13:21       ` Lukas Wunner
2024-08-23 11:05         ` Jonathan Cameron
2024-08-28 21:11           ` Thomas Gleixner
2024-08-29 12:17             ` Jonathan Cameron
2024-09-05 11:23             ` Jonathan Cameron
2024-09-06 10:11               ` Thomas Gleixner
2024-09-06 17:18                 ` Jonathan Cameron [this message]
2024-09-10 16:47                   ` Jonathan Cameron
2024-09-10 17:37                     ` Jonathan Cameron
2024-09-10 20:04                     ` Thomas Gleixner
2024-09-12 16:37                       ` Jonathan Cameron
2024-09-12 17:34                         ` Jonathan Cameron
2024-09-13 16:24                           ` Thomas Gleixner
2024-06-17  7:03       ` Ilpo Järvinen
2024-07-04 16:14       ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240906181832.00007dc7@Huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=helgaas@kernel.org \
    --cc=ilpo.jarvinen@linux.intel.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=lpieralisi@kernel.org \
    --cc=lukas@wunner.de \
    --cc=mahesh@linux.ibm.com \
    --cc=mark.rutland@arm.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=terry.bowman@amd.com \
    --cc=tglx@linutronix.de \
    --cc=vishal.l.verma@intel.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox