From: Nam Cao <namcao@linutronix.de>
To: Michael Kelley <mhklinux@outlook.com>
Cc: "Marc Zyngier" <maz@kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
	"Manivannan Sadhasivam" <mani@kernel.org>,
	"Rob Herring" <robh@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Karthikeyan Mitran" <m.karthikeyan@mobiveil.co.in>,
	"Hou Zhiqiang" <Zhiqiang.Hou@nxp.com>,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>,
	"Pali Rohár" <pali@kernel.org>,
	"K . Y . Srinivasan" <kys@microsoft.com>,
	"Haiyang Zhang" <haiyangz@microsoft.com>,
	"Wei Liu" <wei.liu@kernel.org>,
	"Dexuan Cui" <decui@microsoft.com>,
	"Joyce Ooi" <joyce.ooi@intel.com>,
	"Jim Quinlan" <jim2101024@gmail.com>,
	"Nicolas Saenz Julienne" <nsaenz@kernel.org>,
	"Florian Fainelli" <florian.fainelli@broadcom.com>,
	"Broadcom internal kernel review list"
	<bcm-kernel-feedback-list@broadcom.com>,
	"Ray Jui" <rjui@broadcom.com>,
	"Scott Branden" <sbranden@broadcom.com>,
	"Ryder Lee" <ryder.lee@mediatek.com>,
	"Jianjun Wang" <jianjun.wang@mediatek.com>,
	"Marek Vasut" <marek.vasut+renesas@gmail.com>,
	"Yoshihiro Shimoda" <yoshihiro.shimoda.uh@renesas.com>,
	"Michal Simek" <michal.simek@amd.com>,
	"Daire McNamara" <daire.mcnamara@microchip.com>,
	"Nirmal Patel" <nirmal.patel@linux.intel.com>,
	"Jonathan Derrick" <jonathan.derrick@linux.dev>,
	"Matthias Brugger" <matthias.bgg@gmail.com>,
	"AngeloGioacchino Del Regno"
	<angelogioacchino.delregno@collabora.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-rpi-kernel@lists.infradead.org"
	<linux-rpi-kernel@lists.infradead.org>,
	"linux-mediatek@lists.infradead.org"
	<linux-mediatek@lists.infradead.org>,
	"linux-renesas-soc@vger.kernel.org"
	<linux-renesas-soc@vger.kernel.org>
Subject: Re: [PATCH 14/16] PCI: hv: Switch to msi_create_parent_irq_domain()
Date: Sat, 5 Jul 2025 11:46:55 +0200
Message-ID: <20250705094655.sEu3KWbJ@linutronix.de>
In-Reply-To: <SN6PR02MB41571145B5ECA505CDA6BD90D44DA@SN6PR02MB4157.namprd02.prod.outlook.com>

On Sat, Jul 05, 2025 at 03:51:48AM +0000, Michael Kelley wrote:
> From: Nam Cao <namcao@linutronix.de> Sent: Thursday, June 26, 2025 7:48 AM
> > 
> > Move away from the legacy MSI domain setup, switch to use
> > msi_create_parent_irq_domain().
> 
> With the additional tweak to this patch that you supplied separately,
> everything in my testing on both x86 and arm64 seems to work OK. So
> that's all good.
> 
> On arm64, I did notice the following IRQ domain information from
> /sys/kernel/debug/irq/domains:
> 
> # cat HV-PCI-MSIX-1e03\:00\:00.0-12
> name:   HV-PCI-MSIX-1e03:00:00.0-12
>  size:   0
>  mapped: 7
>  flags:  0x00000213
>             IRQ_DOMAIN_FLAG_HIERARCHY
>             IRQ_DOMAIN_NAME_ALLOCATED
>             IRQ_DOMAIN_FLAG_MSI
>             IRQ_DOMAIN_FLAG_MSI_DEVICE
>  parent: 5D202AA8-1E03-4F0F-A786-390A0D2749E9-3
>     name:   5D202AA8-1E03-4F0F-A786-390A0D2749E9-3
>      size:   0
>      mapped: 7
>      flags:  0x00000103
>                 IRQ_DOMAIN_FLAG_HIERARCHY
>                 IRQ_DOMAIN_NAME_ALLOCATED
>                 IRQ_DOMAIN_FLAG_MSI_PARENT
>      parent: hv_vpci_arm64
>         name:   hv_vpci_arm64
>          size:   956
>          mapped: 31
>          flags:  0x00000003
>                     IRQ_DOMAIN_FLAG_HIERARCHY
>                     IRQ_DOMAIN_NAME_ALLOCATED
>          parent: irqchip@0x00000000ffff0000-1
>             name:   irqchip@0x00000000ffff0000-1
>              size:   0
>              mapped: 47
>              flags:  0x00000003
>                         IRQ_DOMAIN_FLAG_HIERARCHY
>                         IRQ_DOMAIN_NAME_ALLOCATED
> 
> The 5D202AA8-1E03-4F0F-A786-390A0D2749E9-3 domain has
> IRQ_DOMAIN_FLAG_MSI_PARENT set. But the hv_vpci_arm64
> and irqchip@... domains do not.  Is that a problem?  On x86,
> the output is this, with IRQ_DOMAIN_FLAG_MSI_PARENT set
> in the next level up VECTOR domain:

That looks normal. IRQ_DOMAIN_FLAG_MSI_PARENT is set only on domains which
provide MSI parent domain capability, which happens to be the case for the
x86 VECTOR domain.
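
To make the flag values in these dumps easier to read, here is a small
userspace sketch (not kernel code) that decodes them. The bit positions are
an assumption: they are chosen to match the dumps quoted in this mail and
should be checked against include/linux/irqdomain.h in your tree. The helper
names are made up for illustration.

```c
#include <stdio.h>

/* Assumed bit positions, consistent with the debugfs dumps in this thread. */
#define IRQ_DOMAIN_FLAG_HIERARCHY	(1u << 0)
#define IRQ_DOMAIN_NAME_ALLOCATED	(1u << 1)
#define IRQ_DOMAIN_FLAG_MSI		(1u << 4)
#define IRQ_DOMAIN_FLAG_MSI_PARENT	(1u << 8)
#define IRQ_DOMAIN_FLAG_MSI_DEVICE	(1u << 9)

struct flag_name {
	unsigned int bit;
	const char *name;
};

static const struct flag_name flag_names[] = {
	{ IRQ_DOMAIN_FLAG_HIERARCHY,	"IRQ_DOMAIN_FLAG_HIERARCHY" },
	{ IRQ_DOMAIN_NAME_ALLOCATED,	"IRQ_DOMAIN_NAME_ALLOCATED" },
	{ IRQ_DOMAIN_FLAG_MSI,		"IRQ_DOMAIN_FLAG_MSI" },
	{ IRQ_DOMAIN_FLAG_MSI_PARENT,	"IRQ_DOMAIN_FLAG_MSI_PARENT" },
	{ IRQ_DOMAIN_FLAG_MSI_DEVICE,	"IRQ_DOMAIN_FLAG_MSI_DEVICE" },
};

/* Hypothetical helper: non-zero if the domain can act as an MSI parent. */
int domain_is_msi_parent(unsigned int flags)
{
	return (flags & IRQ_DOMAIN_FLAG_MSI_PARENT) != 0;
}

/* Print one line per known flag that is set, similar to the debugfs file. */
void print_domain_flags(const char *domain, unsigned int flags)
{
	size_t i;

	printf("%s: flags 0x%08x\n", domain, flags);
	for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
		if (flags & flag_names[i].bit)
			printf("    %s\n", flag_names[i].name);
	}
}
```

With the values from the dumps, 0x00000103 (the GUID-named domains and
VECTOR) is reported as an MSI parent, while 0x00000003 (hv_vpci_arm64,
irqchip@...) and 0x00000213 (the per-device HV-PCI-MSIX domains) are not.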

> # cat HV-PCI-MSIX-6b71\:00\:02.0-12
> name:   HV-PCI-MSIX-6b71:00:02.0-12
>  size:   0
>  mapped: 17
>  flags:  0x00000213
>             IRQ_DOMAIN_FLAG_HIERARCHY
>             IRQ_DOMAIN_NAME_ALLOCATED
>             IRQ_DOMAIN_FLAG_MSI
>             IRQ_DOMAIN_FLAG_MSI_DEVICE
>  parent: 8564CB14-6B71-477C-B189-F175118E6FF0-3
>     name:   8564CB14-6B71-477C-B189-F175118E6FF0-3
>      size:   0
>      mapped: 17
>      flags:  0x00000103
>                 IRQ_DOMAIN_FLAG_HIERARCHY
>                 IRQ_DOMAIN_NAME_ALLOCATED
>                 IRQ_DOMAIN_FLAG_MSI_PARENT
>      parent: VECTOR
>         name:   VECTOR
>          size:   0
>          mapped: 67
>          flags:  0x00000103
>                     IRQ_DOMAIN_FLAG_HIERARCHY
>                     IRQ_DOMAIN_NAME_ALLOCATED
>                     IRQ_DOMAIN_FLAG_MSI_PARENT
> 
> Finally, I've noted a couple of code review comments below. These
> comments may reflect my lack of fully understanding the MSI
> IRQ handling, in which case, please set me straight. Thanks,
> 
> Michael
> 
> > 
> > Signed-off-by: Nam Cao <namcao@linutronix.de>
> > ---
> > Cc: K. Y. Srinivasan <kys@microsoft.com>
> > Cc: Haiyang Zhang <haiyangz@microsoft.com>
> > Cc: Wei Liu <wei.liu@kernel.org>
> > Cc: Dexuan Cui <decui@microsoft.com>
> > Cc: linux-hyperv@vger.kernel.org
> > ---
> >  drivers/pci/Kconfig                 |  1 +
> >  drivers/pci/controller/pci-hyperv.c | 98 +++++++++++++++++++++++------
> >  2 files changed, 80 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> > index 9c0e4aaf4e8cb..9a249c65aedcd 100644
> > --- a/drivers/pci/Kconfig
> > +++ b/drivers/pci/Kconfig
> > @@ -223,6 +223,7 @@ config PCI_HYPERV
> >  	tristate "Hyper-V PCI Frontend"
> >  	depends on ((X86 && X86_64) || ARM64) && HYPERV && PCI_MSI && SYSFS
> >  	select PCI_HYPERV_INTERFACE
> > +	select IRQ_MSI_LIB
> >  	help
> >  	  The PCI device frontend driver allows the kernel to import arbitrary
> >  	  PCI devices from a PCI backend to support PCI driver domains.
> > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> > index ef5d655a0052c..3a24fadddb83b 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -44,6 +44,7 @@
> >  #include <linux/delay.h>
> >  #include <linux/semaphore.h>
> >  #include <linux/irq.h>
> > +#include <linux/irqchip/irq-msi-lib.h>
> >  #include <linux/msi.h>
> >  #include <linux/hyperv.h>
> >  #include <linux/refcount.h>
> > @@ -508,7 +509,6 @@ struct hv_pcibus_device {
> >  	struct list_head children;
> >  	struct list_head dr_list;
> > 
> > -	struct msi_domain_info msi_info;
> >  	struct irq_domain *irq_domain;
> > 
> >  	struct workqueue_struct *wq;
> > @@ -1687,7 +1687,7 @@ static void hv_msi_free(struct irq_domain *domain, struct msi_domain_info *info,
> >  	struct msi_desc *msi = irq_data_get_msi_desc(irq_data);
> > 
> >  	pdev = msi_desc_to_pci_dev(msi);
> > -	hbus = info->data;
> > +	hbus = domain->host_data;
> >  	int_desc = irq_data_get_irq_chip_data(irq_data);
> >  	if (!int_desc)
> >  		return;
> > @@ -1705,7 +1705,6 @@ static void hv_msi_free(struct irq_domain *domain, struct msi_domain_info *info,
> > 
> >  static void hv_irq_mask(struct irq_data *data)
> >  {
> > -	pci_msi_mask_irq(data);
> >  	if (data->parent_data->chip->irq_mask)
> >  		irq_chip_mask_parent(data);
> >  }
> > @@ -1716,7 +1715,6 @@ static void hv_irq_unmask(struct irq_data *data)
> > 
> >  	if (data->parent_data->chip->irq_unmask)
> >  		irq_chip_unmask_parent(data);
> > -	pci_msi_unmask_irq(data);
> >  }
> > 
> >  struct compose_comp_ctxt {
> > @@ -2101,6 +2099,44 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
> >  	msg->data = 0;
> >  }
> > 
> > +static bool hv_pcie_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
> > +				      struct irq_domain *real_parent, struct msi_domain_info *info)
> > +{
> > +	struct irq_chip *chip = info->chip;
> > +
> > +	if (!msi_lib_init_dev_msi_info(dev, domain, real_parent, info))
> > +		return false;
> > +
> > +	info->ops->msi_prepare = hv_msi_prepare;
> > +
> > +	chip->irq_set_affinity = irq_chip_set_affinity_parent;
> > +
> > +	if (IS_ENABLED(CONFIG_X86))
> > +		chip->flags |= IRQCHIP_MOVE_DEFERRED;
> > +
> > +	return true;
> > +}
> > +
> > +#define HV_PCIE_MSI_FLAGS_REQUIRED (MSI_FLAG_USE_DEF_DOM_OPS	| \
> > +				    MSI_FLAG_USE_DEF_CHIP_OPS		| \
> > +				    MSI_FLAG_PCI_MSI_MASK_PARENT)
> > +#define HV_PCIE_MSI_FLAGS_SUPPORTED (MSI_FLAG_MULTI_PCI_MSI	| \
> > +				     MSI_FLAG_PCI_MSIX			| \
> > +				     MSI_GENERIC_FLAGS_MASK)
> > +
> > +static const struct msi_parent_ops hv_pcie_msi_parent_ops = {
> > +	.required_flags		= HV_PCIE_MSI_FLAGS_REQUIRED,
> > +	.supported_flags	= HV_PCIE_MSI_FLAGS_SUPPORTED,
> > +	.bus_select_token	= DOMAIN_BUS_PCI_MSI,
> > +#ifdef CONFIG_X86
> > +	.chip_flags		= MSI_CHIP_FLAG_SET_ACK,
> > +#elif defined(CONFIG_ARM64)
> > +	.chip_flags		= MSI_CHIP_FLAG_SET_EOI,
> > +#endif
> > +	.prefix			= "HV-",
> > +	.init_dev_msi_info	= hv_pcie_init_dev_msi_info,
> > +};
> > +
> >  /* HW Interrupt Chip Descriptor */
> >  static struct irq_chip hv_msi_irq_chip = {
> >  	.name			= "Hyper-V PCIe MSI",
> > @@ -2108,7 +2144,6 @@ static struct irq_chip hv_msi_irq_chip = {
> >  	.irq_set_affinity	= irq_chip_set_affinity_parent,
> >  #ifdef CONFIG_X86
> >  	.irq_ack		= irq_chip_ack_parent,
> > -	.flags			= IRQCHIP_MOVE_DEFERRED,
> >  #elif defined(CONFIG_ARM64)
> >  	.irq_eoi		= irq_chip_eoi_parent,
> >  #endif
> 
> Would it work to drop the #ifdef's and always set both .irq_ack and
> .irq_eoi on x86 and on ARM64?  Is which one gets called controlled by the
> child HV-PCI-MSIX- ... domain, based on the .chip_flags?
>
> I'm trying to reduce the #ifdef clutter. I
> tested without the #ifdefs on both x86 and arm64, and
> everything works, but I know that doesn't prove that it's
> OK.

Nothing is wrong with that, as far as I can tell.

> If the #ifdefs can go away, then I'd like to see a tweak to the way
> .chip_flags is set. Rather than do an #ifdef inline for struct
> msi_parent_ops hv_pcie_msi_parent_ops, add a #define
> HV_MSI_CHIP_FLAGS in the existing #ifdef X86 and #ifdef ARM64
> sections respectively near the top of this source file, and then
> use HV_MSI_CHIP_FLAGS in struct msi_parent_ops
> hv_pcie_msi_parent_ops.  As much as is reasonable, I'd like to
> not clutter the code with #ifdef X86 #elif ARM64, but instead
> group all the differences under the existing #ifdefs near the top.
> There are some places where this isn't practical, but this seems
> like a place that is practical.

Yes, that would be better. I will do it in v2.
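
A rough sketch of the shape this could take, written so that it compiles in
userspace. Everything here is an assumption for illustration: the compiler
macros __x86_64__ and __aarch64__ stand in for CONFIG_X86 and CONFIG_ARM64,
the flag values are placeholders, and the cut-down struct is hypothetical,
not the kernel's struct msi_parent_ops. It is not the v2 patch.

```c
/* Placeholder values; in the kernel these come from <linux/msi.h>. */
#define MSI_CHIP_FLAG_SET_EOI	(1u << 0)
#define MSI_CHIP_FLAG_SET_ACK	(1u << 1)

/*
 * In pci-hyperv.c this would sit in the existing per-architecture
 * sections near the top of the file, next to the other differences.
 */
#if defined(__x86_64__)			/* stands in for CONFIG_X86 */
#define HV_MSI_CHIP_FLAGS	MSI_CHIP_FLAG_SET_ACK
#elif defined(__aarch64__)		/* stands in for CONFIG_ARM64 */
#define HV_MSI_CHIP_FLAGS	MSI_CHIP_FLAG_SET_EOI
#else
#define HV_MSI_CHIP_FLAGS	0u	/* only so the sketch builds elsewhere */
#endif

/* Hypothetical, cut-down stand-in for struct msi_parent_ops. */
struct msi_parent_ops_sketch {
	unsigned int chip_flags;
	const char *prefix;
};

/* The initializer no longer needs an inline #ifdef. */
const struct msi_parent_ops_sketch hv_pcie_msi_parent_ops_sketch = {
	.chip_flags	= HV_MSI_CHIP_FLAGS,
	.prefix		= "HV-",
};
```

The point of the arrangement is only that the architecture choice is made
once, in one place, and the ops structure stays free of conditionals.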

> > @@ -2116,9 +2151,37 @@ static struct irq_chip hv_msi_irq_chip = {
> >  	.irq_unmask		= hv_irq_unmask,
> >  };
> > 
> > -static struct msi_domain_ops hv_msi_ops = {
> > -	.msi_prepare	= hv_msi_prepare,
> > -	.msi_free	= hv_msi_free,
> > +static int hv_pcie_domain_alloc(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs,
> > +			       void *arg)
> > +{
> > +	/* TODO: move the content of hv_compose_msi_msg() in here */
> 
> Could you elaborate on this TODO? Is the idea to loop through all the IRQs and
> generate the MSI message for each one? What is the advantage to doing it here?
> I noticed in Patch 3 of the series, the Aardvark controller has
> advk_msi_irq_compose_msi_msg(), but you had not moved it into the domain
> allocation path.

Sorry for being unclear. hv_compose_msi_msg() should not be moved here
entirely. Let me elaborate on this in v2.

What I meant is that hv_compose_msi_msg() does more than this callback is
supposed to do, which is only to compose the message. It works, but it is
not correct. Interrupt allocation is the responsibility of
irq_domain_ops::alloc(). Allocating and populating int_desc should be in
hv_pcie_domain_alloc() instead.

irq_domain_ops's .alloc() and .free() should be symmetric.
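
To illustrate the split of responsibilities, here is a userspace model, not
the driver code. All names (domain_sketch, int_desc_sketch, the sketch_*
functions) and the address/data values are hypothetical. The alloc step
creates and populates the per-interrupt descriptor, the free step releases
exactly what alloc created, and the compose step only reads the descriptor.

```c
#include <stdlib.h>

#define NR_SLOTS 8	/* arbitrary size for this sketch */

struct int_desc_sketch {
	unsigned int address;
	unsigned int data;
};

struct domain_sketch {
	struct int_desc_sketch *chip_data[NR_SLOTS];
	unsigned int live;	/* descriptors currently allocated */
};

/* Mirror of alloc: release every descriptor in [virq, virq + nr_irqs). */
void sketch_domain_free(struct domain_sketch *d, unsigned int virq,
			unsigned int nr_irqs)
{
	unsigned int i;

	if (virq >= NR_SLOTS || nr_irqs > NR_SLOTS - virq)
		return;

	for (i = virq; i < virq + nr_irqs; i++) {
		if (!d->chip_data[i])
			continue;
		free(d->chip_data[i]);
		d->chip_data[i] = NULL;
		d->live--;
	}
}

/* Allocate and populate one descriptor per interrupt; undo on failure. */
int sketch_domain_alloc(struct domain_sketch *d, unsigned int virq,
			unsigned int nr_irqs)
{
	unsigned int i;

	if (virq >= NR_SLOTS || nr_irqs > NR_SLOTS - virq)
		return -1;

	for (i = virq; i < virq + nr_irqs; i++) {
		struct int_desc_sketch *desc;

		if (d->chip_data[i] || !(desc = calloc(1, sizeof(*desc)))) {
			/* Release only what this call allocated so far. */
			sketch_domain_free(d, virq, i - virq);
			return -1;
		}
		desc->address = 0x1000u + i;	/* placeholder values */
		desc->data = i;
		d->chip_data[i] = desc;
		d->live++;
	}
	return 0;
}

/* Compose only reads the descriptor; it never allocates anything. */
int sketch_compose_msg(const struct domain_sketch *d, unsigned int virq,
		       unsigned int *address, unsigned int *data)
{
	if (virq >= NR_SLOTS || !d->chip_data[virq])
		return -1;

	*address = d->chip_data[virq]->address;
	*data = d->chip_data[virq]->data;
	return 0;
}
```

In this model, composing a message any number of times leaves the
allocation state untouched, and a free after an alloc returns the domain to
where it started.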

> 
> Also, is there some point in the future where the "TODO" is likely to
> become a "MUST DO"?

There's nothing planned that would make this non-functional, as far as I
know.

Thanks so much for examining the patch,
Nam
