public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: "Lukas Wunner" <lukas@wunner.de>,
	"Bjorn Helgaas" <helgaas@kernel.org>,
	"Mahesh J Salgaonkar" <mahesh@linux.ibm.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	linux-cxl@vger.kernel.org, linux-pci@vger.kernel.org,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	"Dave Jiang" <dave.jiang@intel.com>,
	"Alison Schofield" <alison.schofield@intel.com>,
	"Vishal Verma" <vishal.l.verma@intel.com>,
	"Ira Weiny" <ira.weiny@intel.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Will Deacon" <will@kernel.org>,
	"Mark Rutland" <mark.rutland@arm.com>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	terry.bowman@amd.com,
	"Kuppuswamy Sathyanarayanan"
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Subject: Re: [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer)
Date: Thu, 12 Sep 2024 18:34:29 +0100	[thread overview]
Message-ID: <20240912183429.00006fa7@Huawei.com> (raw)
In-Reply-To: <20240912173720.000077ac@Huawei.com>

On Thu, 12 Sep 2024 17:37:20 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Tue, 10 Sep 2024 22:04:18 +0200
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > On Tue, Sep 10 2024 at 17:47, Jonathan Cameron wrote:
> > > There are quite a few places that make the assumption that the
> > > preirq_data->parent->chip is the right chip to for example call
> > > irq_set_affinity on.
> > >
> > > So the simple way to make it all work is to just have
> > > a msi_domain_template->msi_domain_ops->prepare_desc
> > > that sets the desc->dev = to the parent device (here the
> > > pci_dev->dev)  
> > 
> > Creative :)
> 
> One bit that is challenging me is that the PCI access functions
> use the pci_dev that is embedded via irq_data->common->msi_desc->dev
> (and hence doesn't give us a obvious path to any hierarchical structures)
> E.g. __pci_write_msi_msg() and which checks the pci_dev is
> powered up.
> 
> I'd like to be able to do a call to the parent similar to irq_chip_unmask_parent
> to handle write_msi_msg() for the new device domain.
> 
> So how should this be avoided or should msi_desc->dev be the
> PCI device?

I thought about this whilst cycling home.  Perhaps my fundamental
question is more basic  Which device should
msi_dec->dev be?
* The ultimate requester of the msi  -  so here the one calling
our new pci_msix_subdev_alloc_at(),
* The one on which the msi_write_msg() should occur. - here
  that would be the struct pci_dev given the checks needed on
  whether the device is powered up.  If this is the case, can
  we instead use the irq_data->dev in __pci_write_msi_msg()?

Also, is it valid to use the irq_domain->dev for callbacks such
as msi_prepare_desc on basis that (I think) is the device
that 'hosts' the irq domain, so can we use it to replace the
use of msi_desc->dev in pci_msix_prepare_desc()
If we can that enables our subdev to use a prepare_desc callback
that does something like

	struct msi_domain *info;

	domain = domain->parent;
	info = domain->host_data;

	info->ops->prepare_desc(domain, arg, desc);

Thanks for any pointers!

Jonathan


> 
> I can see some options but maybe I'm missing the big picture.
> 1) Stop them using that path to get to the device because it
>   might not be the right one. Instead use device via irq_data->chip_data
>   (which is currently unused for these cases).
>   Some slightly fiddly work will be needed to handle changing __pci_write_msi_msg()
>   to talk irq_data though.
> 2) Maybe setting the msi descriptor dev to the one we actually write
>    on is the right solution? (smells bad to me but I'm still getting
>    my head around this stuff).
> 
> Any immediate thoughts?  Or am I still thinking about this the wrong
> way? (which is likely)
> 
> > 
> > > At that point everything more or less just works and all the
> > > rest of the callbacks can use generic forms.
> > >
> > > Alternative might be to make calls like the one in
> > > arch/x86/kernel/apic/msi.c msi_set_affinity search
> > > for the first ancestor device that has an irq_set_affinity().
> > > That unfortunately may affect quite a few places.  
> > 
> > What's the problem? None of the CXL drivers should care about that at
> > all. Delegating other callbacks to the parent domain is what hierarchical
> > domains are about. If there is some helper missing, so we add it.
> > 
> > > Anyhow, I'll probably send out the prepare_desc hack set with driver
> > > usage etc after I've written up a proper description of problems
> > > encountered etc so you can see what it all looks like and will be more
> > > palatable in general but in the meantime this is the guts of it of the
> > > variant where the subdev related desc has the dev set to the parent
> > > device.  
> > 
> > Let me look at this pile then :)
> > 
> > > Note for the avoidance of doubt, I don't much like this
> > > solution but maybe it will grow on me ;)  
> > 
> > Maybe, but I doubt it will survive my abstraction taste check.
> > 
> > > +static const struct msi_parent_ops pci_msix_parent_ops = {
> > > +	.supported_flags = MSI_FLAG_PCI_MSIX | MSI_FLAG_PCI_MSIX_ALLOC_DYN,
> > > +	.prefix = "PCI-MSIX-PROXY-",
> > > +	.init_dev_msi_info = msi_parent_init_dev_msi_info,  
> > 
> > The obligatory link to:
> > 
> > https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#struct-declarations-and-initializers
> > 
> 
> Thanks. Will fix up before I get anywhere near a real posting!
> 
> > > +};
> > > +
> > > +int pci_msi_enable_parent_domain(struct pci_dev *pdev)
> > > +{
> > > +	struct irq_domain *msix_dom;
> > > +	/* TGLX: Validate has v2 parent domain */
> > > +	/* TGLX: validate msix enabled */
> > > +	/* TGLX: Validate msix domain supports dynamics msi-x */
> > > +
> > > +	/* Enable PCI device parent domain */
> > > +	msix_dom = dev_msi_get_msix_device_domain(&pdev->dev);
> > > +	msix_dom->flags |= IRQ_DOMAIN_FLAG_MSI_PARENT;
> > > +	msix_dom->msi_parent_ops = &pci_msix_parent_ops;  
> > 
> > That want's to be a call into the MSI core code, Something like:
> > 
> >      msi_enable_sub_parent_domain(&pdev->dev, DOMAIN_BUS_PCI_DEVICE_MSIX)
> > 
> > or something like that. I really don't want to expose the internals of
> > MSI. I spent a lot of effort to encapsulate that...
> > 
> > > +void pci_msi_init_subdevice(struct pci_dev *pdev, struct device *dev)
> > > +{
> > > +	dev_set_msi_domain(dev, dev_msi_get_msix_device_domain(&pdev->dev));  
> > 
> >      msi_set_parent_domain(.....);
> > 
> > > +static void pci_subdev_msix_prepare_desc(struct irq_domain *domain, msi_alloc_info_t *arg,
> > > +					struct msi_desc *desc)
> > > +{
> > > +	struct device *parent = desc->dev->parent;
> > > +
> > > +	if (!desc->pci.mask_base) {
> > > +		/* Not elegant - but needed for irq affinity to work? */  
> > 
> > Which makes me ask the question why desc->pci.mask_base is NULL in the
> > first place. That needs some thought at the place which initializes the
> > descriptor.
> 
> Maybe this is the other way around?  The descriptor is 'never' initialized
> for this case.  The equivalent code in msix_prepare_msi_desc() has
> comments 
> " This is separate from msix_setup_msi_descs() below to handle dynamic
>  allocations for MSI-X after initial enablement."
> So I think this !desc->pci.mask_base should always succeed 
> (and so I should get rid of it and run the descriptor initialize unconditionally)
> 
> It might be appropriate to call the prepare_desc from the parent msi_domain though
> via irq_domain->parent_domain->host_data
> However this has the same need to not be based on the msi_desc->dev subject
> to the question above.
> 
> Jonathan
> 
> 
> 
> > 
> > > --- a/kernel/irq/msi.c
> > > +++ b/kernel/irq/msi.c
> > > @@ -1720,3 +1720,8 @@ bool msi_device_has_isolated_msi(struct device *dev)
> > >  	return arch_is_isolated_msi();
> > >  }
> > >  EXPORT_SYMBOL_GPL(msi_device_has_isolated_msi);
> > > +struct irq_domain *dev_msi_get_msix_device_domain(struct device *dev)  
> > 
> > You clearly run out of new lines here :)
> > 
> > > +{
> > > +	return dev->msi.data->__domains[0].domain;
> > > +}  
> > 
> > But aside of that. No. See above.
> > 
> > Thanks,
> > 
> >         tglx
> > 
> 


  reply	other threads:[~2024-09-12 17:34 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-29 16:40 [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer) Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 1/9] pci: pcie: Drop priv_data from struct pcie_device and use dev_get/set_drvdata() instead Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 2/9] pci: portdrv: Drop driver field for port type Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 3/9] pci: pcie: portdrv: Use managed device handling to simplify error and remove flows Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 4/9] auxiliary_bus: expose auxiliary_bus_type Jonathan Cameron
2024-05-29 16:40 ` [RFC PATCH 5/9] pci: pcie: portdrv: Add a auxiliary_bus Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 6/9] cxl: Move CPMU register definitions to header Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 7/9] pci: pcie/cxl: Register an auxiliary device for each CPMU instance Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 8/9] perf: cxl: Make the cpmu driver also work with auxiliary_devices Jonathan Cameron
2024-05-29 16:41 ` [RFC PATCH 9/9] pci: pcie: portdrv: aer: Switch to auxiliary_bus Jonathan Cameron
2024-06-05 18:04 ` [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer) Bjorn Helgaas
2024-06-05 19:44   ` Jonathan Cameron
2024-06-06 12:57     ` Jonathan Cameron
2024-06-06 13:21       ` Lukas Wunner
2024-08-23 11:05         ` Jonathan Cameron
2024-08-28 21:11           ` Thomas Gleixner
2024-08-29 12:17             ` Jonathan Cameron
2024-09-05 11:23             ` Jonathan Cameron
2024-09-06 10:11               ` Thomas Gleixner
2024-09-06 17:18                 ` Jonathan Cameron
2024-09-10 16:47                   ` Jonathan Cameron
2024-09-10 17:37                     ` Jonathan Cameron
2024-09-10 20:04                     ` Thomas Gleixner
2024-09-12 16:37                       ` Jonathan Cameron
2024-09-12 17:34                         ` Jonathan Cameron [this message]
2024-09-13 16:24                           ` Thomas Gleixner
2024-06-17  7:03       ` Ilpo Järvinen
2024-07-04 16:14       ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240912183429.00006fa7@Huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=helgaas@kernel.org \
    --cc=ilpo.jarvinen@linux.intel.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=lukas@wunner.de \
    --cc=mahesh@linux.ibm.com \
    --cc=mark.rutland@arm.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=terry.bowman@amd.com \
    --cc=tglx@linutronix.de \
    --cc=vishal.l.verma@intel.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox