From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: "Lukas Wunner" <lukas@wunner.de>,
"Bjorn Helgaas" <helgaas@kernel.org>,
"Mahesh J Salgaonkar" <mahesh@linux.ibm.com>,
"Bjorn Helgaas" <bhelgaas@google.com>,
linux-cxl@vger.kernel.org, linux-pci@vger.kernel.org,
"Davidlohr Bueso" <dave@stgolabs.net>,
"Dave Jiang" <dave.jiang@intel.com>,
"Alison Schofield" <alison.schofield@intel.com>,
"Vishal Verma" <vishal.l.verma@intel.com>,
"Ira Weiny" <ira.weiny@intel.com>,
"Dan Williams" <dan.j.williams@intel.com>,
"Will Deacon" <will@kernel.org>,
"Mark Rutland" <mark.rutland@arm.com>,
"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
terry.bowman@amd.com,
"Kuppuswamy Sathyanarayanan"
<sathyanarayanan.kuppuswamy@linux.intel.com>,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Subject: Re: [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer)
Date: Thu, 12 Sep 2024 17:37:20 +0100 [thread overview]
Message-ID: <20240912173720.000077ac@Huawei.com> (raw)
In-Reply-To: <87h6an9sxp.ffs@tglx>
On Tue, 10 Sep 2024 22:04:18 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:
> On Tue, Sep 10 2024 at 17:47, Jonathan Cameron wrote:
> > There are quite a few places that make the assumption that the
> > preirq_data->parent->chip is the right chip to for example call
> > irq_set_affinity on.
> >
> > So the simple way to make it all work is to just have
> > a msi_domain_template->msi_domain_ops->prepare_desc
> > that sets the desc->dev = to the parent device (here the
> > pci_dev->dev)
>
> Creative :)
One bit that is challenging me is that the PCI access functions
use the pci_dev that is embedded via irq_data->common->msi_desc->dev
(and hence doesn't give us a obvious path to any hierarchical structures)
E.g. __pci_write_msi_msg() and which checks the pci_dev is
powered up.
I'd like to be able to do a call to the parent similar to irq_chip_unmask_parent
to handle write_msi_msg() for the new device domain.
So how should this be avoided or should msi_desc->dev be the
PCI device?
I can see some options but maybe I'm missing the big picture.
1) Stop them using that path to get to the device because it
might not be the right one. Instead use device via irq_data->chip_data
(which is currently unused for these cases).
Some slightly fiddly work will be needed to handle changing __pci_write_msi_msg()
to talk irq_data though.
2) Maybe setting the msi descriptor dev to the one we actually write
on is the right solution? (smells bad to me but I'm still getting
my head around this stuff).
Any immediate thoughts? Or am I still thinking about this the wrong
way? (which is likely)
>
> > At that point everything more or less just works and all the
> > rest of the callbacks can use generic forms.
> >
> > Alternative might be to make calls like the one in
> > arch/x86/kernel/apic/msi.c msi_set_affinity search
> > for the first ancestor device that has an irq_set_affinity().
> > That unfortunately may affect quite a few places.
>
> What's the problem? None of the CXL drivers should care about that at
> all. Delegating other callbacks to the parent domain is what hierarchical
> domains are about. If there is some helper missing, so we add it.
>
> > Anyhow, I'll probably send out the prepare_desc hack set with driver
> > usage etc after I've written up a proper description of problems
> > encountered etc so you can see what it all looks like and will be more
> > palatable in general but in the meantime this is the guts of it of the
> > variant where the subdev related desc has the dev set to the parent
> > device.
>
> Let me look at this pile then :)
>
> > Note for the avoidance of doubt, I don't much like this
> > solution but maybe it will grow on me ;)
>
> Maybe, but I doubt it will survive my abstraction taste check.
>
> > +static const struct msi_parent_ops pci_msix_parent_ops = {
> > + .supported_flags = MSI_FLAG_PCI_MSIX | MSI_FLAG_PCI_MSIX_ALLOC_DYN,
> > + .prefix = "PCI-MSIX-PROXY-",
> > + .init_dev_msi_info = msi_parent_init_dev_msi_info,
>
> The obligatory link to:
>
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#struct-declarations-and-initializers
>
Thanks. Will fix up before I get anywhere near a real posting!
> > +};
> > +
> > +int pci_msi_enable_parent_domain(struct pci_dev *pdev)
> > +{
> > + struct irq_domain *msix_dom;
> > + /* TGLX: Validate has v2 parent domain */
> > + /* TGLX: validate msix enabled */
> > + /* TGLX: Validate msix domain supports dynamics msi-x */
> > +
> > + /* Enable PCI device parent domain */
> > + msix_dom = dev_msi_get_msix_device_domain(&pdev->dev);
> > + msix_dom->flags |= IRQ_DOMAIN_FLAG_MSI_PARENT;
> > + msix_dom->msi_parent_ops = &pci_msix_parent_ops;
>
> That want's to be a call into the MSI core code, Something like:
>
> msi_enable_sub_parent_domain(&pdev->dev, DOMAIN_BUS_PCI_DEVICE_MSIX)
>
> or something like that. I really don't want to expose the internals of
> MSI. I spent a lot of effort to encapsulate that...
>
> > +void pci_msi_init_subdevice(struct pci_dev *pdev, struct device *dev)
> > +{
> > + dev_set_msi_domain(dev, dev_msi_get_msix_device_domain(&pdev->dev));
>
> msi_set_parent_domain(.....);
>
> > +static void pci_subdev_msix_prepare_desc(struct irq_domain *domain, msi_alloc_info_t *arg,
> > + struct msi_desc *desc)
> > +{
> > + struct device *parent = desc->dev->parent;
> > +
> > + if (!desc->pci.mask_base) {
> > + /* Not elegant - but needed for irq affinity to work? */
>
> Which makes me ask the question why desc->pci.mask_base is NULL in the
> first place. That needs some thought at the place which initializes the
> descriptor.
Maybe this is the other way around? The descriptor is 'never' initialized
for this case. The equivalent code in msix_prepare_msi_desc() has
comments
" This is separate from msix_setup_msi_descs() below to handle dynamic
allocations for MSI-X after initial enablement."
So I think this !desc->pci.mask_base should always succeed
(and so I should get rid of it and run the descriptor initialize unconditionally)
It might be appropriate to call the prepare_desc from the parent msi_domain though
via irq_domain->parent_domain->host_data
However this has the same need to not be based on the msi_desc->dev subject
to the question above.
Jonathan
>
> > --- a/kernel/irq/msi.c
> > +++ b/kernel/irq/msi.c
> > @@ -1720,3 +1720,8 @@ bool msi_device_has_isolated_msi(struct device *dev)
> > return arch_is_isolated_msi();
> > }
> > EXPORT_SYMBOL_GPL(msi_device_has_isolated_msi);
> > +struct irq_domain *dev_msi_get_msix_device_domain(struct device *dev)
>
> You clearly run out of new lines here :)
>
> > +{
> > + return dev->msi.data->__domains[0].domain;
> > +}
>
> But aside of that. No. See above.
>
> Thanks,
>
> tglx
>
next prev parent reply other threads:[~2024-09-12 16:37 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-29 16:40 [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer) Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 1/9] pci: pcie: Drop priv_data from struct pcie_device and use dev_get/set_drvdata() instead Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 2/9] pci: portdrv: Drop driver field for port type Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 3/9] pci: pcie: portdrv: Use managed device handling to simplify error and remove flows Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 4/9] auxiliary_bus: expose auxiliary_bus_type Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 5/9] pci: pcie: portdrv: Add a auxiliary_bus Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 6/9] cxl: Move CPMU register definitions to header Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 7/9] pci: pcie/cxl: Register an auxiliary device for each CPMU instance Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 8/9] perf: cxl: Make the cpmu driver also work with auxiliary_devices Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 9/9] pci: pcie: portdrv: aer: Switch to auxiliary_bus Jonathan Cameron
2024-06-05 18:04 ` [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer) Bjorn Helgaas
2024-06-05 19:44 ` Jonathan Cameron
2024-06-06 12:57 ` Jonathan Cameron
2024-06-06 13:21 ` Lukas Wunner
2024-08-23 11:05 ` Jonathan Cameron
2024-08-28 21:11 ` Thomas Gleixner
2024-08-29 12:17 ` Jonathan Cameron
2024-09-05 11:23 ` Jonathan Cameron
2024-09-06 10:11 ` Thomas Gleixner
2024-09-06 17:18 ` Jonathan Cameron
2024-09-10 16:47 ` Jonathan Cameron
2024-09-10 17:37 ` Jonathan Cameron
2024-09-10 20:04 ` Thomas Gleixner
2024-09-12 16:37 ` Jonathan Cameron [this message]
2024-09-12 17:34 ` Jonathan Cameron
2024-09-13 16:24 ` Thomas Gleixner
2024-06-17 7:03 ` Ilpo Järvinen
2024-07-04 16:14 ` Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240912173720.000077ac@Huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=bhelgaas@google.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=helgaas@kernel.org \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lpieralisi@kernel.org \
--cc=lukas@wunner.de \
--cc=mahesh@linux.ibm.com \
--cc=mark.rutland@arm.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=terry.bowman@amd.com \
--cc=tglx@linutronix.de \
--cc=vishal.l.verma@intel.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox