public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] xen: fix usage of devices behind a VMD bridge
@ 2025-01-10 14:01 Roger Pau Monne
  2025-01-10 14:01 ` [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope Roger Pau Monne
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Roger Pau Monne @ 2025-01-10 14:01 UTC (permalink / raw)
  To: linux-kernel, xen-devel; +Cc: Roger Pau Monne

Hello,

The following series should fix the usage of devices behind a VMD bridge
when running Linux as a Xen PV hardware domain (dom0).  I've only been
able to test PV. I think PVH should also work but I don't have hardware
capable of testing it right now.

I don't expect the first two patches to be problematic, the last patch
is likely to be more controversial.  I've tested it internally and
didn't see any issues, but my testing of PV mode is mostly limited to
dom0.

Thanks, Roger.

Roger Pau Monne (3):
  xen/pci: do not register devices outside of PCI segment scope
  vmd: disable MSI remapping bypass under Xen
  pci/msi: remove pci_msi_ignore_mask

 arch/x86/pci/xen.c           |  8 ++------
 drivers/pci/controller/vmd.c |  9 +++++++++
 drivers/pci/msi/msi.c        | 36 ++++++++++++++++++++----------------
 drivers/xen/pci.c            | 19 +++++++++++++++++++
 include/linux/msi.h          |  3 ++-
 kernel/irq/msi.c             |  2 +-
 6 files changed, 53 insertions(+), 24 deletions(-)

-- 
2.46.0


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
  2025-01-10 14:01 [PATCH 0/3] xen: fix usage of devices behind a VMD bridge Roger Pau Monne
@ 2025-01-10 14:01 ` Roger Pau Monne
  2025-01-10 22:21   ` Bjorn Helgaas
  2025-01-13  7:17   ` Jan Beulich
  2025-01-10 14:01 ` [PATCH 2/3] vmd: disable MSI remapping bypass under Xen Roger Pau Monne
  2025-01-10 14:01 ` [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask Roger Pau Monne
  2 siblings, 2 replies; 24+ messages in thread
From: Roger Pau Monne @ 2025-01-10 14:01 UTC (permalink / raw)
  To: linux-kernel, xen-devel
  Cc: Roger Pau Monne, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko

The PCI segment value is limited to 16 bits, however there are buses like VMD
that fake being part of the PCI topology by adding segment with a number
outside the scope of the PCI firmware specification range (>= 0x10000). The
MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 bit
width.

Attempting to register or manage those devices with Xen would result in errors
at best, or overlaps with existing devices living on the truncated equivalent
segment values.

Skip notifying Xen about those devices.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 drivers/xen/pci.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
index 416f231809cb..08e82fd1263e 100644
--- a/drivers/xen/pci.c
+++ b/drivers/xen/pci.c
@@ -43,6 +43,13 @@ static int xen_add_device(struct device *dev)
 		pci_mcfg_reserved = true;
 	}
 #endif
+
+	if (pci_domain_nr(pci_dev->bus) >> 16) {
+		dev_info(dev,
+			 "not registering with Xen: invalid PCI segment\n");
+		return 0;
+	}
+
 	if (pci_seg_supported) {
 		DEFINE_RAW_FLEX(struct physdev_pci_device_add, add, optarr, 1);
 
@@ -149,6 +156,12 @@ static int xen_remove_device(struct device *dev)
 	int r;
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 
+	if (pci_domain_nr(pci_dev->bus) >> 16) {
+		dev_info(dev,
+			 "not unregistering with Xen: invalid PCI segment\n");
+		return 0;
+	}
+
 	if (pci_seg_supported) {
 		struct physdev_pci_device device = {
 			.seg = pci_domain_nr(pci_dev->bus),
@@ -182,6 +195,12 @@ int xen_reset_device(const struct pci_dev *dev)
 		.flags = PCI_DEVICE_RESET_FLR,
 	};
 
+	if (pci_domain_nr(dev->bus) >> 16) {
+		dev_info(&dev->dev,
+			 "unable to notify Xen of device reset: invalid PCI segment\n");
+		return 0;
+	}
+
 	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_reset, &device);
 }
 EXPORT_SYMBOL_GPL(xen_reset_device);
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-10 14:01 [PATCH 0/3] xen: fix usage of devices behind a VMD bridge Roger Pau Monne
  2025-01-10 14:01 ` [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope Roger Pau Monne
@ 2025-01-10 14:01 ` Roger Pau Monne
  2025-01-10 22:25   ` Bjorn Helgaas
  2025-01-12  2:57   ` kernel test robot
  2025-01-10 14:01 ` [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask Roger Pau Monne
  2 siblings, 2 replies; 24+ messages in thread
From: Roger Pau Monne @ 2025-01-10 14:01 UTC (permalink / raw)
  To: linux-kernel, xen-devel, linux-pci
  Cc: Roger Pau Monne, Nirmal Patel, Jonathan Derrick,
	Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

MSI remapping bypass (directly configuring MSI entries for devices on the VMD
bus) won't work under Xen, as Xen is not aware of devices in such bus, and
hence cannot configure the entries using the pIRQ interface in the PV case, and
in the PVH case traps won't be setup for MSI entries for such devices.

Until Xen is aware of devices in the VMD bus prevent the
VMD_FEAT_CAN_BYPASS_MSI_REMAP capability from being used when running as any
kind of Xen guest.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 drivers/pci/controller/vmd.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 264a180403a0..d9b7510ace29 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -965,6 +965,15 @@ static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	struct vmd_dev *vmd;
 	int err;
 
+	if (xen_domain())
+		/*
+		 * Xen doesn't have knowledge about devices in the VMD bus.
+		 * Bypass of MSI remapping won't work in that case as direct
+		 * write to the MSI entries won't result in functional
+		 * interrupts.
+		 */
+		features &= ~VMD_FEAT_CAN_BYPASS_MSI_REMAP;
+
 	if (resource_size(&dev->resource[VMD_CFGBAR]) < (1 << 20))
 		return -ENOMEM;
 
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask
  2025-01-10 14:01 [PATCH 0/3] xen: fix usage of devices behind a VMD bridge Roger Pau Monne
  2025-01-10 14:01 ` [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope Roger Pau Monne
  2025-01-10 14:01 ` [PATCH 2/3] vmd: disable MSI remapping bypass under Xen Roger Pau Monne
@ 2025-01-10 14:01 ` Roger Pau Monne
  2025-01-10 22:30   ` Bjorn Helgaas
                     ` (2 more replies)
  2 siblings, 3 replies; 24+ messages in thread
From: Roger Pau Monne @ 2025-01-10 14:01 UTC (permalink / raw)
  To: linux-kernel, xen-devel, linux-pci
  Cc: Roger Pau Monne, Juergen Gross, Bjorn Helgaas, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin

Setting pci_msi_ignore_mask inhibits the toggling of the mask bit for both MSI
and MSI-X entries globally, regardless of the IRQ chip they are using.  Only
Xen sets the pci_msi_ignore_mask when routing physical interrupts over event
channels, to prevent PCI code from attempting to toggle the maskbit, as it's
Xen that controls the bit.

However, the pci_msi_ignore_mask being global will affect devices that use MSI
interrupts but are not routing those interrupts over event channels (not using
the Xen pIRQ chip).  One example is devices behind a VMD PCI bridge.  In that
scenario the VMD bridge configures MSI(-X) using the normal IRQ chip (the pIRQ
one in the Xen case), and devices behind the bridge configure the MSI entries
using indexes into the VMD bridge MSI table.  The VMD bridge then demultiplexes
such interrupts and delivers to the destination device(s).  Having
pci_msi_ignore_mask set in that scenario prevents (un)masking of MSI entries
for devices behind the VMD bridge.

Move the signaling of no entry masking into the MSI domain flags, as that
allows setting it on a per-domain basis.  Set it for the Xen MSI domain that
uses the pIRQ chip, while leaving it unset for the rest of the cases.

Remove pci_msi_ignore_mask at once, since it was only used by Xen code, and
with Xen dropping usage the variable is unneeded.

This fixes using devices behind a VMD bridge on Xen PV hardware domains.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 arch/x86/pci/xen.c    |  8 ++------
 drivers/pci/msi/msi.c | 36 ++++++++++++++++++++----------------
 include/linux/msi.h   |  3 ++-
 kernel/irq/msi.c      |  2 +-
 4 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 0f2fe524f60d..b8755cde2419 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -436,7 +436,8 @@ static struct msi_domain_ops xen_pci_msi_domain_ops = {
 };
 
 static struct msi_domain_info xen_pci_msi_domain_info = {
-	.flags			= MSI_FLAG_PCI_MSIX | MSI_FLAG_FREE_MSI_DESCS | MSI_FLAG_DEV_SYSFS,
+	.flags			= MSI_FLAG_PCI_MSIX | MSI_FLAG_FREE_MSI_DESCS |
+				  MSI_FLAG_DEV_SYSFS | MSI_FLAG_NO_MASK,
 	.ops			= &xen_pci_msi_domain_ops,
 };
 
@@ -484,11 +485,6 @@ static __init void xen_setup_pci_msi(void)
 	 * in allocating the native domain and never use it.
 	 */
 	x86_init.irqs.create_pci_msi_domain = xen_create_pci_msi_domain;
-	/*
-	 * With XEN PIRQ/Eventchannels in use PCI/MSI[-X] masking is solely
-	 * controlled by the hypervisor.
-	 */
-	pci_msi_ignore_mask = 1;
 }
 
 #else /* CONFIG_PCI_MSI */
diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 3a45879d85db..cb42298f6a97 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -15,7 +15,6 @@
 #include "msi.h"
 
 int pci_msi_enable = 1;
-int pci_msi_ignore_mask;
 
 /**
  * pci_msi_supported - check whether MSI may be enabled on a device
@@ -285,6 +284,8 @@ static void pci_msi_set_enable(struct pci_dev *dev, int enable)
 static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
 			      struct irq_affinity_desc *masks)
 {
+	const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
+	const struct msi_domain_info *info = d->host_data;
 	struct msi_desc desc;
 	u16 control;
 
@@ -295,8 +296,7 @@ static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
 	/* Lies, damned lies, and MSIs */
 	if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
 		control |= PCI_MSI_FLAGS_MASKBIT;
-	/* Respect XEN's mask disabling */
-	if (pci_msi_ignore_mask)
+	if (info->flags & MSI_FLAG_NO_MASK)
 		control &= ~PCI_MSI_FLAGS_MASKBIT;
 
 	desc.nvec_used			= nvec;
@@ -600,12 +600,15 @@ static void __iomem *msix_map_region(struct pci_dev *dev,
  */
 void msix_prepare_msi_desc(struct pci_dev *dev, struct msi_desc *desc)
 {
+	const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
+	const struct msi_domain_info *info = d->host_data;
+
 	desc->nvec_used				= 1;
 	desc->pci.msi_attrib.is_msix		= 1;
 	desc->pci.msi_attrib.is_64		= 1;
 	desc->pci.msi_attrib.default_irq	= dev->irq;
 	desc->pci.mask_base			= dev->msix_base;
-	desc->pci.msi_attrib.can_mask		= !pci_msi_ignore_mask &&
+	desc->pci.msi_attrib.can_mask		= !(info->flags & MSI_FLAG_NO_MASK) &&
 						  !desc->pci.msi_attrib.is_virtual;
 
 	if (desc->pci.msi_attrib.can_mask) {
@@ -655,9 +658,6 @@ static void msix_mask_all(void __iomem *base, int tsize)
 	u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT;
 	int i;
 
-	if (pci_msi_ignore_mask)
-		return;
-
 	for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE)
 		writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
 }
@@ -710,6 +710,8 @@ static int msix_setup_interrupts(struct pci_dev *dev, struct msix_entry *entries
 static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 				int nvec, struct irq_affinity *affd)
 {
+	const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
+	const struct msi_domain_info *info = d->host_data;
 	int ret, tsize;
 	u16 control;
 
@@ -740,15 +742,17 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 	/* Disable INTX */
 	pci_intx_for_msi(dev, 0);
 
-	/*
-	 * Ensure that all table entries are masked to prevent
-	 * stale entries from firing in a crash kernel.
-	 *
-	 * Done late to deal with a broken Marvell NVME device
-	 * which takes the MSI-X mask bits into account even
-	 * when MSI-X is disabled, which prevents MSI delivery.
-	 */
-	msix_mask_all(dev->msix_base, tsize);
+	if (!(info->flags & MSI_FLAG_NO_MASK)) {
+		/*
+		 * Ensure that all table entries are masked to prevent
+		 * stale entries from firing in a crash kernel.
+		 *
+		 * Done late to deal with a broken Marvell NVME device
+		 * which takes the MSI-X mask bits into account even
+		 * when MSI-X is disabled, which prevents MSI delivery.
+		 */
+		msix_mask_all(dev->msix_base, tsize);
+	}
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 
 	pcibios_free_irq(dev);
diff --git a/include/linux/msi.h b/include/linux/msi.h
index b10093c4d00e..59a421fc42bf 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -73,7 +73,6 @@ struct msi_msg {
 	};
 };
 
-extern int pci_msi_ignore_mask;
 /* Helper functions */
 struct msi_desc;
 struct pci_dev;
@@ -556,6 +555,8 @@ enum {
 	MSI_FLAG_PCI_MSIX_ALLOC_DYN	= (1 << 20),
 	/* PCI MSIs cannot be steered separately to CPU cores */
 	MSI_FLAG_NO_AFFINITY		= (1 << 21),
+	/* Inhibit usage of entry masking */
+	MSI_FLAG_NO_MASK		= (1 << 22),
 };
 
 /**
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 396a067a8a56..7682c36cbccc 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -1143,7 +1143,7 @@ static bool msi_check_reservation_mode(struct irq_domain *domain,
 	if (!(info->flags & MSI_FLAG_MUST_REACTIVATE))
 		return false;
 
-	if (IS_ENABLED(CONFIG_PCI_MSI) && pci_msi_ignore_mask)
+	if (info->flags & MSI_FLAG_NO_MASK)
 		return false;
 
 	/*
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
  2025-01-10 14:01 ` [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope Roger Pau Monne
@ 2025-01-10 22:21   ` Bjorn Helgaas
  2025-01-13  7:20     ` Jan Beulich
  2025-01-13 10:18     ` Roger Pau Monné
  2025-01-13  7:17   ` Jan Beulich
  1 sibling, 2 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2025-01-10 22:21 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: linux-kernel, xen-devel, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko

On Fri, Jan 10, 2025 at 03:01:48PM +0100, Roger Pau Monne wrote:
> The PCI segment value is limited to 16 bits, however there are buses like VMD
> that fake being part of the PCI topology by adding segment with a number
> outside the scope of the PCI firmware specification range (>= 0x10000). The
> MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 bit
> width.
>
> Attempting to register or manage those devices with Xen would result in errors
> at best, or overlaps with existing devices living on the truncated equivalent
> segment values.

The ACPI _SEG method (ACPI r6.5, sec 6.5.6) and the corresponding
value in the MCFG table (PCI Firmware r3.3, sec 4.1.2) are clearly
16-bit values.

But otherwise, the segment value is pretty much an arbitrary software
value, and the kernel works fine with the larger domain values from
vmd_find_free_domain(), so this isn't quite enough to explain what the
issue with Xen is.

Does Xen truncate the domain to 16 bits or use it to look up something
in ACPI?

> Skip notifying Xen about those devices.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>  drivers/xen/pci.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
> index 416f231809cb..08e82fd1263e 100644
> --- a/drivers/xen/pci.c
> +++ b/drivers/xen/pci.c
> @@ -43,6 +43,13 @@ static int xen_add_device(struct device *dev)
>  		pci_mcfg_reserved = true;
>  	}
>  #endif
> +
> +	if (pci_domain_nr(pci_dev->bus) >> 16) {
> +		dev_info(dev,
> +			 "not registering with Xen: invalid PCI segment\n");
> +		return 0;
> +	}
> +
>  	if (pci_seg_supported) {
>  		DEFINE_RAW_FLEX(struct physdev_pci_device_add, add, optarr, 1);
>  
> @@ -149,6 +156,12 @@ static int xen_remove_device(struct device *dev)
>  	int r;
>  	struct pci_dev *pci_dev = to_pci_dev(dev);
>  
> +	if (pci_domain_nr(pci_dev->bus) >> 16) {
> +		dev_info(dev,
> +			 "not unregistering with Xen: invalid PCI segment\n");
> +		return 0;
> +	}
> +
>  	if (pci_seg_supported) {
>  		struct physdev_pci_device device = {
>  			.seg = pci_domain_nr(pci_dev->bus),
> @@ -182,6 +195,12 @@ int xen_reset_device(const struct pci_dev *dev)
>  		.flags = PCI_DEVICE_RESET_FLR,
>  	};
>  
> +	if (pci_domain_nr(dev->bus) >> 16) {
> +		dev_info(&dev->dev,
> +			 "unable to notify Xen of device reset: invalid PCI segment\n");
> +		return 0;
> +	}
> +
>  	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_reset, &device);
>  }
>  EXPORT_SYMBOL_GPL(xen_reset_device);
> -- 
> 2.46.0
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-10 14:01 ` [PATCH 2/3] vmd: disable MSI remapping bypass under Xen Roger Pau Monne
@ 2025-01-10 22:25   ` Bjorn Helgaas
  2025-01-11  5:02     ` Jonathan Derrick
  2025-01-13 10:03     ` Roger Pau Monné
  2025-01-12  2:57   ` kernel test robot
  1 sibling, 2 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2025-01-10 22:25 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: linux-kernel, xen-devel, linux-pci, Nirmal Patel,
	Jonathan Derrick, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

Match historical subject line style for prefix and capitalization:

  PCI: vmd: Set devices to D0 before enabling PM L1 Substates
  PCI: vmd: Add DID 8086:B06F and 8086:B60B for Intel client SKUs
  PCI: vmd: Fix indentation issue in vmd_shutdown()

On Fri, Jan 10, 2025 at 03:01:49PM +0100, Roger Pau Monne wrote:
> MSI remapping bypass (directly configuring MSI entries for devices on the VMD
> bus) won't work under Xen, as Xen is not aware of devices in such bus, and
> hence cannot configure the entries using the pIRQ interface in the PV case, and
> in the PVH case traps won't be setup for MSI entries for such devices.
> 
> Until Xen is aware of devices in the VMD bus prevent the
> VMD_FEAT_CAN_BYPASS_MSI_REMAP capability from being used when running as any
> kind of Xen guest.

Wrap to fit in 75 columns.

Can you include a hint about *why* Xen is not aware of devices below
VMD?  That will help to know whether it's a permanent unfixable
situation or something that could be done eventually.

> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>  drivers/pci/controller/vmd.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 264a180403a0..d9b7510ace29 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -965,6 +965,15 @@ static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  	struct vmd_dev *vmd;
>  	int err;
>  
> +	if (xen_domain())
> +		/*
> +		 * Xen doesn't have knowledge about devices in the VMD bus.

Also here.

> +		 * Bypass of MSI remapping won't work in that case as direct
> +		 * write to the MSI entries won't result in functional
> +		 * interrupts.
> +		 */
> +		features &= ~VMD_FEAT_CAN_BYPASS_MSI_REMAP;
> +
>  	if (resource_size(&dev->resource[VMD_CFGBAR]) < (1 << 20))
>  		return -ENOMEM;
>  
> -- 
> 2.46.0
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask
  2025-01-10 14:01 ` [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask Roger Pau Monne
@ 2025-01-10 22:30   ` Bjorn Helgaas
  2025-01-13 10:25     ` Roger Pau Monné
  2025-01-11 11:08   ` kernel test robot
  2025-01-11 12:24   ` kernel test robot
  2 siblings, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2025-01-10 22:30 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: linux-kernel, xen-devel, linux-pci, Juergen Gross, Bjorn Helgaas,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin

Match subject line style again.

On Fri, Jan 10, 2025 at 03:01:50PM +0100, Roger Pau Monne wrote:
> Setting pci_msi_ignore_mask inhibits the toggling of the mask bit for both MSI
> and MSI-X entries globally, regardless of the IRQ chip they are using.  Only
> Xen sets the pci_msi_ignore_mask when routing physical interrupts over event
> channels, to prevent PCI code from attempting to toggle the maskbit, as it's
> Xen that controls the bit.
> 
> However, the pci_msi_ignore_mask being global will affect devices that use MSI
> interrupts but are not routing those interrupts over event channels (not using
> the Xen pIRQ chip).  One example is devices behind a VMD PCI bridge.  In that
> scenario the VMD bridge configures MSI(-X) using the normal IRQ chip (the pIRQ
> one in the Xen case), and devices behind the bridge configure the MSI entries
> using indexes into the VMD bridge MSI table.  The VMD bridge then demultiplexes
> such interrupts and delivers to the destination device(s).  Having
> pci_msi_ignore_mask set in that scenario prevents (un)masking of MSI entries
> for devices behind the VMD bridge.
> 
> Move the signaling of no entry masking into the MSI domain flags, as that
> allows setting it on a per-domain basis.  Set it for the Xen MSI domain that
> uses the pIRQ chip, while leaving it unset for the rest of the cases.
> 
> Remove pci_msi_ignore_mask at once, since it was only used by Xen code, and
> with Xen dropping usage the variable is unneeded.
> 
> This fixes using devices behind a VMD bridge on Xen PV hardware domains.

Wrap to fit in 75 columns.

The first two patches talk about devices behind VMD not being usable
for Xen, but this one says they now work.  But this doesn't undo the
code changes or comments added by the first two, so the result is a
bit confusing (probably because I know nothing about Xen).

Bjorn

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-10 22:25   ` Bjorn Helgaas
@ 2025-01-11  5:02     ` Jonathan Derrick
  2025-01-13 10:07       ` Roger Pau Monné
  2025-01-13 10:03     ` Roger Pau Monné
  1 sibling, 1 reply; 24+ messages in thread
From: Jonathan Derrick @ 2025-01-11  5:02 UTC (permalink / raw)
  To: Bjorn Helgaas, Roger Pau Monne
  Cc: linux-kernel, xen-devel, linux-pci, Nirmal Patel,
	Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

Hi Bjorn,

On 1/10/25 3:25 PM, Bjorn Helgaas wrote:
> Match historical subject line style for prefix and capitalization:
> 
>    PCI: vmd: Set devices to D0 before enabling PM L1 Substates
>    PCI: vmd: Add DID 8086:B06F and 8086:B60B for Intel client SKUs
>    PCI: vmd: Fix indentation issue in vmd_shutdown()
> 
> On Fri, Jan 10, 2025 at 03:01:49PM +0100, Roger Pau Monne wrote:
>> MSI remapping bypass (directly configuring MSI entries for devices on the VMD
>> bus) won't work under Xen, as Xen is not aware of devices in such bus, and
>> hence cannot configure the entries using the pIRQ interface in the PV case, and
>> in the PVH case traps won't be setup for MSI entries for such devices.
>>
>> Until Xen is aware of devices in the VMD bus prevent the
>> VMD_FEAT_CAN_BYPASS_MSI_REMAP capability from being used when running as any
>> kind of Xen guest.
> 
> Wrap to fit in 75 columns.
> 
> Can you include a hint about *why* Xen is not aware of devices below
> VMD?  That will help to know whether it's a permanent unfixable
> situation or something that could be done eventually.
> 
I wasn't aware of the Xen issue with VMD but if I had to guess it's 
probably due to the special handling of the downstream device into the 
dmar table.

[1] 4fda230ecddc2
[2] 9b4a824b889e1


>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>> ---
>>   drivers/pci/controller/vmd.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
>> index 264a180403a0..d9b7510ace29 100644
>> --- a/drivers/pci/controller/vmd.c
>> +++ b/drivers/pci/controller/vmd.c
>> @@ -965,6 +965,15 @@ static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
>>   	struct vmd_dev *vmd;
>>   	int err;
>>   
>> +	if (xen_domain())
>> +		/*
>> +		 * Xen doesn't have knowledge about devices in the VMD bus.
> 
> Also here.
> 
>> +		 * Bypass of MSI remapping won't work in that case as direct
>> +		 * write to the MSI entries won't result in functional
>> +		 * interrupts.
>> +		 */
>> +		features &= ~VMD_FEAT_CAN_BYPASS_MSI_REMAP;
>> +
>>   	if (resource_size(&dev->resource[VMD_CFGBAR]) < (1 << 20))
>>   		return -ENOMEM;
>>   
>> -- 
>> 2.46.0
>>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask
  2025-01-10 14:01 ` [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask Roger Pau Monne
  2025-01-10 22:30   ` Bjorn Helgaas
@ 2025-01-11 11:08   ` kernel test robot
  2025-01-11 12:24   ` kernel test robot
  2 siblings, 0 replies; 24+ messages in thread
From: kernel test robot @ 2025-01-11 11:08 UTC (permalink / raw)
  To: Roger Pau Monne, linux-kernel, xen-devel, linux-pci
  Cc: llvm, oe-kbuild-all, Roger Pau Monne, Juergen Gross,
	Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin

Hi Roger,

kernel test robot noticed the following build errors:

[auto build test ERROR on pci/next]
[also build test ERROR on pci/for-linus xen-tip/linux-next tip/irq/core linus/master v6.13-rc6 next-20250110]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Roger-Pau-Monne/xen-pci-do-not-register-devices-outside-of-PCI-segment-scope/20250110-220331
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next
patch link:    https://lore.kernel.org/r/20250110140152.27624-4-roger.pau%40citrix.com
patch subject: [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask
config: arm64-randconfig-003-20250111 (https://download.01.org/0day-ci/archive/20250111/202501111839.HXJGe5FL-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250111/202501111839.HXJGe5FL-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501111839.HXJGe5FL-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/pci/msi/msi.c:288:40: error: incomplete definition of type 'struct irq_domain'
     288 |         const struct msi_domain_info *info = d->host_data;
         |                                              ~^
   include/linux/irq.h:130:8: note: forward declaration of 'struct irq_domain'
     130 | struct irq_domain;
         |        ^
   drivers/pci/msi/msi.c:604:40: error: incomplete definition of type 'struct irq_domain'
     604 |         const struct msi_domain_info *info = d->host_data;
         |                                              ~^
   include/linux/irq.h:130:8: note: forward declaration of 'struct irq_domain'
     130 | struct irq_domain;
         |        ^
   drivers/pci/msi/msi.c:714:40: error: incomplete definition of type 'struct irq_domain'
     714 |         const struct msi_domain_info *info = d->host_data;
         |                                              ~^
   include/linux/irq.h:130:8: note: forward declaration of 'struct irq_domain'
     130 | struct irq_domain;
         |        ^
   3 errors generated.


vim +288 drivers/pci/msi/msi.c

   283	
   284	static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
   285				      struct irq_affinity_desc *masks)
   286	{
   287		const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
 > 288		const struct msi_domain_info *info = d->host_data;
   289		struct msi_desc desc;
   290		u16 control;
   291	
   292		/* MSI Entry Initialization */
   293		memset(&desc, 0, sizeof(desc));
   294	
   295		pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
   296		/* Lies, damned lies, and MSIs */
   297		if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
   298			control |= PCI_MSI_FLAGS_MASKBIT;
   299		if (info->flags & MSI_FLAG_NO_MASK)
   300			control &= ~PCI_MSI_FLAGS_MASKBIT;
   301	
   302		desc.nvec_used			= nvec;
   303		desc.pci.msi_attrib.is_64	= !!(control & PCI_MSI_FLAGS_64BIT);
   304		desc.pci.msi_attrib.can_mask	= !!(control & PCI_MSI_FLAGS_MASKBIT);
   305		desc.pci.msi_attrib.default_irq	= dev->irq;
   306		desc.pci.msi_attrib.multi_cap	= FIELD_GET(PCI_MSI_FLAGS_QMASK, control);
   307		desc.pci.msi_attrib.multiple	= ilog2(__roundup_pow_of_two(nvec));
   308		desc.affinity			= masks;
   309	
   310		if (control & PCI_MSI_FLAGS_64BIT)
   311			desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_64;
   312		else
   313			desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_32;
   314	
   315		/* Save the initial mask status */
   316		if (desc.pci.msi_attrib.can_mask)
   317			pci_read_config_dword(dev, desc.pci.mask_pos, &desc.pci.msi_mask);
   318	
   319		return msi_insert_msi_desc(&dev->dev, &desc);
   320	}
   321	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask
  2025-01-10 14:01 ` [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask Roger Pau Monne
  2025-01-10 22:30   ` Bjorn Helgaas
  2025-01-11 11:08   ` kernel test robot
@ 2025-01-11 12:24   ` kernel test robot
  2 siblings, 0 replies; 24+ messages in thread
From: kernel test robot @ 2025-01-11 12:24 UTC (permalink / raw)
  To: Roger Pau Monne, linux-kernel, xen-devel, linux-pci
  Cc: llvm, oe-kbuild-all, Roger Pau Monne, Juergen Gross,
	Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin

Hi Roger,

kernel test robot noticed the following build errors:

[auto build test ERROR on pci/next]
[also build test ERROR on pci/for-linus xen-tip/linux-next tip/irq/core linus/master v6.13-rc6 next-20250110]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Roger-Pau-Monne/xen-pci-do-not-register-devices-outside-of-PCI-segment-scope/20250110-220331
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next
patch link:    https://lore.kernel.org/r/20250110140152.27624-4-roger.pau%40citrix.com
patch subject: [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask
config: riscv-defconfig (https://download.01.org/0day-ci/archive/20250111/202501112048.6yCFh2ma-lkp@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250111/202501112048.6yCFh2ma-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501112048.6yCFh2ma-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/pci/msi/msi.c:12:
   In file included from include/linux/irq.h:23:
   In file included from arch/riscv/include/asm/irq.h:10:
   In file included from include/linux/interrupt.h:22:
   In file included from arch/riscv/include/asm/sections.h:9:
   In file included from include/linux/mm.h:2223:
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
>> drivers/pci/msi/msi.c:288:40: error: incomplete definition of type 'const struct irq_domain'
     288 |         const struct msi_domain_info *info = d->host_data;
         |                                              ~^
   include/linux/irq.h:130:8: note: forward declaration of 'struct irq_domain'
     130 | struct irq_domain;
         |        ^
   drivers/pci/msi/msi.c:604:40: error: incomplete definition of type 'const struct irq_domain'
     604 |         const struct msi_domain_info *info = d->host_data;
         |                                              ~^
   include/linux/irq.h:130:8: note: forward declaration of 'struct irq_domain'
     130 | struct irq_domain;
         |        ^
   drivers/pci/msi/msi.c:714:40: error: incomplete definition of type 'const struct irq_domain'
     714 |         const struct msi_domain_info *info = d->host_data;
         |                                              ~^
   include/linux/irq.h:130:8: note: forward declaration of 'struct irq_domain'
     130 | struct irq_domain;
         |        ^
   1 warning and 3 errors generated.


vim +288 drivers/pci/msi/msi.c

   283	
   284	static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
   285				      struct irq_affinity_desc *masks)
   286	{
   287		const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
 > 288		const struct msi_domain_info *info = d->host_data;
   289		struct msi_desc desc;
   290		u16 control;
   291	
   292		/* MSI Entry Initialization */
   293		memset(&desc, 0, sizeof(desc));
   294	
   295		pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
   296		/* Lies, damned lies, and MSIs */
   297		if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
   298			control |= PCI_MSI_FLAGS_MASKBIT;
   299		if (info->flags & MSI_FLAG_NO_MASK)
   300			control &= ~PCI_MSI_FLAGS_MASKBIT;
   301	
   302		desc.nvec_used			= nvec;
   303		desc.pci.msi_attrib.is_64	= !!(control & PCI_MSI_FLAGS_64BIT);
   304		desc.pci.msi_attrib.can_mask	= !!(control & PCI_MSI_FLAGS_MASKBIT);
   305		desc.pci.msi_attrib.default_irq	= dev->irq;
   306		desc.pci.msi_attrib.multi_cap	= FIELD_GET(PCI_MSI_FLAGS_QMASK, control);
   307		desc.pci.msi_attrib.multiple	= ilog2(__roundup_pow_of_two(nvec));
   308		desc.affinity			= masks;
   309	
   310		if (control & PCI_MSI_FLAGS_64BIT)
   311			desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_64;
   312		else
   313			desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_32;
   314	
   315		/* Save the initial mask status */
   316		if (desc.pci.msi_attrib.can_mask)
   317			pci_read_config_dword(dev, desc.pci.mask_pos, &desc.pci.msi_mask);
   318	
   319		return msi_insert_msi_desc(&dev->dev, &desc);
   320	}
   321	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-10 14:01 ` [PATCH 2/3] vmd: disable MSI remapping bypass under Xen Roger Pau Monne
  2025-01-10 22:25   ` Bjorn Helgaas
@ 2025-01-12  2:57   ` kernel test robot
  1 sibling, 0 replies; 24+ messages in thread
From: kernel test robot @ 2025-01-12  2:57 UTC (permalink / raw)
  To: Roger Pau Monne, linux-kernel, xen-devel, linux-pci
  Cc: oe-kbuild-all, Roger Pau Monne, Nirmal Patel, Jonathan Derrick,
	Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

Hi Roger,

kernel test robot noticed the following build errors:

[auto build test ERROR on pci/next]
[also build test ERROR on pci/for-linus tip/irq/core linus/master v6.13-rc6 next-20250110]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Roger-Pau-Monne/xen-pci-do-not-register-devices-outside-of-PCI-segment-scope/20250110-220331
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next
patch link:    https://lore.kernel.org/r/20250110140152.27624-3-roger.pau%40citrix.com
patch subject: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
config: x86_64-buildonly-randconfig-001-20250112 (https://download.01.org/0day-ci/archive/20250112/202501121029.dJk0TBPr-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250112/202501121029.dJk0TBPr-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501121029.dJk0TBPr-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/pci/controller/vmd.c: In function 'vmd_probe':
>> drivers/pci/controller/vmd.c:973:13: error: implicit declaration of function 'xen_domain' [-Werror=implicit-function-declaration]
     973 |         if (xen_domain())
         |             ^~~~~~~~~~
   cc1: some warnings being treated as errors


vim +/xen_domain +973 drivers/pci/controller/vmd.c

   966	
   967	static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
   968	{
   969		unsigned long features = (unsigned long) id->driver_data;
   970		struct vmd_dev *vmd;
   971		int err;
   972	
 > 973		if (xen_domain())
   974			/*
   975			 * Xen doesn't have knowledge about devices in the VMD bus.
   976			 * Bypass of MSI remapping won't work in that case as direct
   977			 * write to the MSI entries won't result in functional
   978			 * interrupts.
   979			 */
   980			features &= ~VMD_FEAT_CAN_BYPASS_MSI_REMAP;
   981	
   982		if (resource_size(&dev->resource[VMD_CFGBAR]) < (1 << 20))
   983			return -ENOMEM;
   984	
   985		vmd = devm_kzalloc(&dev->dev, sizeof(*vmd), GFP_KERNEL);
   986		if (!vmd)
   987			return -ENOMEM;
   988	
   989		vmd->dev = dev;
   990		vmd->instance = ida_alloc(&vmd_instance_ida, GFP_KERNEL);
   991		if (vmd->instance < 0)
   992			return vmd->instance;
   993	
   994		vmd->name = devm_kasprintf(&dev->dev, GFP_KERNEL, "vmd%d",
   995					   vmd->instance);
   996		if (!vmd->name) {
   997			err = -ENOMEM;
   998			goto out_release_instance;
   999		}
  1000	
  1001		err = pcim_enable_device(dev);
  1002		if (err < 0)
  1003			goto out_release_instance;
  1004	
  1005		vmd->cfgbar = pcim_iomap(dev, VMD_CFGBAR, 0);
  1006		if (!vmd->cfgbar) {
  1007			err = -ENOMEM;
  1008			goto out_release_instance;
  1009		}
  1010	
  1011		pci_set_master(dev);
  1012		if (dma_set_mask_and_coherent(&dev->dev, DMA_BIT_MASK(64)) &&
  1013		    dma_set_mask_and_coherent(&dev->dev, DMA_BIT_MASK(32))) {
  1014			err = -ENODEV;
  1015			goto out_release_instance;
  1016		}
  1017	
  1018		if (features & VMD_FEAT_OFFSET_FIRST_VECTOR)
  1019			vmd->first_vec = 1;
  1020	
  1021		spin_lock_init(&vmd->cfg_lock);
  1022		pci_set_drvdata(dev, vmd);
  1023		err = vmd_enable_domain(vmd, features);
  1024		if (err)
  1025			goto out_release_instance;
  1026	
  1027		dev_info(&vmd->dev->dev, "Bound to PCI domain %04x\n",
  1028			 vmd->sysdata.domain);
  1029		return 0;
  1030	
  1031	 out_release_instance:
  1032		ida_free(&vmd_instance_ida, vmd->instance);
  1033		return err;
  1034	}
  1035	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
  2025-01-10 14:01 ` [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope Roger Pau Monne
  2025-01-10 22:21   ` Bjorn Helgaas
@ 2025-01-13  7:17   ` Jan Beulich
  2025-01-13 10:13     ` Roger Pau Monné
  1 sibling, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2025-01-13  7:17 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	linux-kernel, xen-devel

On 10.01.2025 15:01, Roger Pau Monne wrote:
> The PCI segment value is limited to 16 bits, however there are buses like VMD
> that fake being part of the PCI topology by adding segment with a number
> outside the scope of the PCI firmware specification range (>= 0x10000). The
> MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 bit
> width.
> 
> Attempting to register or manage those devices with Xen would result in errors
> at best, or overlaps with existing devices living on the truncated equivalent
> segment values.
> 
> Skip notifying Xen about those devices.

Hmm, is simply omitting the notification really all it takes? How would Xen
manage MSI / MSI-X, for example, without knowing of the device? As per the
BoF on the summit in Prague(?), I continue to think we need partial driver
logic in Xen for VMD ...

Jan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
  2025-01-10 22:21   ` Bjorn Helgaas
@ 2025-01-13  7:20     ` Jan Beulich
  2025-01-13 10:18     ` Roger Pau Monné
  1 sibling, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2025-01-13  7:20 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, xen-devel, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Roger Pau Monne

On 10.01.2025 23:21, Bjorn Helgaas wrote:
> On Fri, Jan 10, 2025 at 03:01:48PM +0100, Roger Pau Monne wrote:
>> The PCI segment value is limited to 16 bits, however there are buses like VMD
>> that fake being part of the PCI topology by adding segment with a number
>> outside the scope of the PCI firmware specification range (>= 0x10000). The
>> MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 bit
>> width.
>>
>> Attempting to register or manage those devices with Xen would result in errors
>> at best, or overlaps with existing devices living on the truncated equivalent
>> segment values.
> 
> The ACPI _SEG method (ACPI r6.5, sec 6.5.6) and the corresponding
> value in the MCFG table (PCI Firmware r3.3, sec 4.1.2) are clearly
> 16-bit values.
> 
> But otherwise, the segment value is pretty much an arbitrary software
> value, and the kernel works fine with the larger domain values from
> vmd_find_free_domain(), so this isn't quite enough to explain what the
> issue with Xen is.
> 
> Does Xen truncate the domain to 16 bits or use it to look up something
> in ACPI?

One of the involved public interface structs starts like this:

struct physdev_pci_device_add {
    /* IN */
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
    ...

So yes, wider segment values would be truncated. Plus, even if they weren't,
there would need to be coordination between Dom0 and Xen on which devices
gets which segment number, since - as you say - the assignment in Linux is
pretty much arbitrary.

Jan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-10 22:25   ` Bjorn Helgaas
  2025-01-11  5:02     ` Jonathan Derrick
@ 2025-01-13 10:03     ` Roger Pau Monné
  2025-01-13 15:11       ` Keith Busch
  1 sibling, 1 reply; 24+ messages in thread
From: Roger Pau Monné @ 2025-01-13 10:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, xen-devel, linux-pci, Nirmal Patel,
	Jonathan Derrick, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

On Fri, Jan 10, 2025 at 04:25:25PM -0600, Bjorn Helgaas wrote:
> Match historical subject line style for prefix and capitalization:
> 
>   PCI: vmd: Set devices to D0 before enabling PM L1 Substates
>   PCI: vmd: Add DID 8086:B06F and 8086:B60B for Intel client SKUs
>   PCI: vmd: Fix indentation issue in vmd_shutdown()
> 
> On Fri, Jan 10, 2025 at 03:01:49PM +0100, Roger Pau Monne wrote:
> > MSI remapping bypass (directly configuring MSI entries for devices on the VMD
> > bus) won't work under Xen, as Xen is not aware of devices in such bus, and
> > hence cannot configure the entries using the pIRQ interface in the PV case, and
> > in the PVH case traps won't be setup for MSI entries for such devices.
> > 
> > Until Xen is aware of devices in the VMD bus prevent the
> > VMD_FEAT_CAN_BYPASS_MSI_REMAP capability from being used when running as any
> > kind of Xen guest.
> 
> Wrap to fit in 75 columns.

Hm, OK, but isn't the limit 80 columns according to the kernel coding
style (Documentation/process/coding-style.rst)?

I don't mind adjusting, but if you are going to ask every submitter to
limit to 75 columns then the coding style document should be updated
to reflect that.

> Can you include a hint about *why* Xen is not aware of devices below
> VMD?  That will help to know whether it's a permanent unfixable
> situation or something that could be done eventually.

Xen would need to be made aware of the devices exposed behind the VMD
bridge, so it can manage them.  For example Xen is the entity that
controls the local APICs, and hence interrupts must be configured by
Xen.  Xen needs knowledge about the devices behind the VMD bridge,
and how to access those devices PCI config space to at least configure
MSI or MSI-X capabilities.  It could possibly be exposed similarly to
how Xen currently deals with ECAM areas.

None of this is present at the moment, could always be added later and
Linux be made aware that the limitation no longer applies.  That would
require changes in both Xen and Linux to propagate the VMD information
into Xen.

> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> >  drivers/pci/controller/vmd.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > index 264a180403a0..d9b7510ace29 100644
> > --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -965,6 +965,15 @@ static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
> >  	struct vmd_dev *vmd;
> >  	int err;
> >  
> > +	if (xen_domain())
> > +		/*
> > +		 * Xen doesn't have knowledge about devices in the VMD bus.
> 
> Also here.

Would you be OK with something like:

"Xen doesn't have knowledge about devices in the VMD bus because the
config space of devices behind the VMD bridge is not known to Xen, and
hence Xen cannot discover or configure them in any way.

Bypass of MSI remapping won't work in that case as direct write by
Linux to the MSI entries won't result in functional interrupts, as
it's Xen the entity that manages the local APIC and must configure
interrupts."

Thanks, Roger.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-11  5:02     ` Jonathan Derrick
@ 2025-01-13 10:07       ` Roger Pau Monné
  0 siblings, 0 replies; 24+ messages in thread
From: Roger Pau Monné @ 2025-01-13 10:07 UTC (permalink / raw)
  To: Jonathan Derrick
  Cc: Bjorn Helgaas, linux-kernel, xen-devel, linux-pci, Nirmal Patel,
	Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

On Fri, Jan 10, 2025 at 10:02:00PM -0700, Jonathan Derrick wrote:
> Hi Bjorn,
> 
> On 1/10/25 3:25 PM, Bjorn Helgaas wrote:
> > Match historical subject line style for prefix and capitalization:
> > 
> >    PCI: vmd: Set devices to D0 before enabling PM L1 Substates
> >    PCI: vmd: Add DID 8086:B06F and 8086:B60B for Intel client SKUs
> >    PCI: vmd: Fix indentation issue in vmd_shutdown()
> > 
> > On Fri, Jan 10, 2025 at 03:01:49PM +0100, Roger Pau Monne wrote:
> > > MSI remapping bypass (directly configuring MSI entries for devices on the VMD
> > > bus) won't work under Xen, as Xen is not aware of devices in such bus, and
> > > hence cannot configure the entries using the pIRQ interface in the PV case, and
> > > in the PVH case traps won't be setup for MSI entries for such devices.
> > > 
> > > Until Xen is aware of devices in the VMD bus prevent the
> > > VMD_FEAT_CAN_BYPASS_MSI_REMAP capability from being used when running as any
> > > kind of Xen guest.
> > 
> > Wrap to fit in 75 columns.
> > 
> > Can you include a hint about *why* Xen is not aware of devices below
> > VMD?  That will help to know whether it's a permanent unfixable
> > situation or something that could be done eventually.
> > 
> I wasn't aware of the Xen issue with VMD but if I had to guess it's probably
> due to the special handling of the downstream device into the dmar table.

Nothing to do with DMAR or IOMMUs, it's just that on a Xen system it
must be Xen the one that configures the MSI entries, and that requires
Xen being aware of the VMD devices and it's MSI or MSI-X
capabilities.

None of this is currently done, as Xen has no visibility at all of
devices behind a VMD bridge because is doesn't even know about VMD
bridges, neither about the exposed ECAM-like region on those
devices.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
  2025-01-13  7:17   ` Jan Beulich
@ 2025-01-13 10:13     ` Roger Pau Monné
  0 siblings, 0 replies; 24+ messages in thread
From: Roger Pau Monné @ 2025-01-13 10:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	linux-kernel, xen-devel

On Mon, Jan 13, 2025 at 08:17:23AM +0100, Jan Beulich wrote:
> On 10.01.2025 15:01, Roger Pau Monne wrote:
> > The PCI segment value is limited to 16 bits, however there are buses like VMD
> > that fake being part of the PCI topology by adding segment with a number
> > outside the scope of the PCI firmware specification range (>= 0x10000). The
> > MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 bit
> > width.
> > 
> > Attempting to register or manage those devices with Xen would result in errors
> > at best, or overlaps with existing devices living on the truncated equivalent
> > segment values.
> > 
> > Skip notifying Xen about those devices.
> 
> Hmm, is simply omitting the notification really all it takes? How would Xen
> manage MSI / MSI-X, for example, without knowing of the device? As per the
> BoF on the summit in Prague(?), I continue to think we need partial driver
> logic in Xen for VMD ...

The basic mode of operation of devices behind a VMD bridge is to
reference the interrupts of the bridge device in its MSI(-X) entries,
so the VMD bridge acts as a de-multiplexer and forwards the interrupts
to the device behind the VMD bridge.  See vmd_alloc_irqs() (and
calling context) in the VMD driver for a reference about how this is
setup and operates.  Note also that devices behind a VMD bridge
operate using a different MSI domain, that uses a custom
irq_compose_msi_msg hook.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
  2025-01-10 22:21   ` Bjorn Helgaas
  2025-01-13  7:20     ` Jan Beulich
@ 2025-01-13 10:18     ` Roger Pau Monné
  2025-01-13 23:29       ` Bjorn Helgaas
  1 sibling, 1 reply; 24+ messages in thread
From: Roger Pau Monné @ 2025-01-13 10:18 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, xen-devel, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko

On Fri, Jan 10, 2025 at 04:21:29PM -0600, Bjorn Helgaas wrote:
> On Fri, Jan 10, 2025 at 03:01:48PM +0100, Roger Pau Monne wrote:
> > The PCI segment value is limited to 16 bits, however there are buses like VMD
> > that fake being part of the PCI topology by adding segment with a number
> > outside the scope of the PCI firmware specification range (>= 0x10000). The
> > MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 bit
> > width.
> >
> > Attempting to register or manage those devices with Xen would result in errors
> > at best, or overlaps with existing devices living on the truncated equivalent
> > segment values.
> 
> The ACPI _SEG method (ACPI r6.5, sec 6.5.6) and the corresponding
> value in the MCFG table (PCI Firmware r3.3, sec 4.1.2) are clearly
> 16-bit values.
> 
> But otherwise, the segment value is pretty much an arbitrary software
> value, and the kernel works fine with the larger domain values from
> vmd_find_free_domain(), so this isn't quite enough to explain what the
> issue with Xen is.
> 
> Does Xen truncate the domain to 16 bits or use it to look up something
> in ACPI?

In the interface between Xen and Linux the segment field is 16 bit
width, so with the current interface is not possible to reference
devices that are past the 0xffff segment.

I also wonder whether Xen and Linux (or guest OSes in general) would
agree on how to reference such devices.  AFAICT VMD segment numbers
are purely a software construct, but not something enforced by the
specification.  Could for example FreeBSD assign a different segment
to VMD devices?

If so we would need some kind of specification about how VMD segment
values are assigned so that OSes have a coherent way of referencing
VMD devices without ambiguity.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask
  2025-01-10 22:30   ` Bjorn Helgaas
@ 2025-01-13 10:25     ` Roger Pau Monné
  2025-01-13 23:32       ` Bjorn Helgaas
  0 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monné @ 2025-01-13 10:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, xen-devel, linux-pci, Juergen Gross, Bjorn Helgaas,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin

On Fri, Jan 10, 2025 at 04:30:57PM -0600, Bjorn Helgaas wrote:
> Match subject line style again.
> 
> On Fri, Jan 10, 2025 at 03:01:50PM +0100, Roger Pau Monne wrote:
> > Setting pci_msi_ignore_mask inhibits the toggling of the mask bit for both MSI
> > and MSI-X entries globally, regardless of the IRQ chip they are using.  Only
> > Xen sets the pci_msi_ignore_mask when routing physical interrupts over event
> > channels, to prevent PCI code from attempting to toggle the maskbit, as it's
> > Xen that controls the bit.
> > 
> > However, the pci_msi_ignore_mask being global will affect devices that use MSI
> > interrupts but are not routing those interrupts over event channels (not using
> > the Xen pIRQ chip).  One example is devices behind a VMD PCI bridge.  In that
> > scenario the VMD bridge configures MSI(-X) using the normal IRQ chip (the pIRQ
> > one in the Xen case), and devices behind the bridge configure the MSI entries
> > using indexes into the VMD bridge MSI table.  The VMD bridge then demultiplexes
> > such interrupts and delivers to the destination device(s).  Having
> > pci_msi_ignore_mask set in that scenario prevents (un)masking of MSI entries
> > for devices behind the VMD bridge.
> > 
> > Move the signaling of no entry masking into the MSI domain flags, as that
> > allows setting it on a per-domain basis.  Set it for the Xen MSI domain that
> > uses the pIRQ chip, while leaving it unset for the rest of the cases.
> > 
> > Remove pci_msi_ignore_mask at once, since it was only used by Xen code, and
> > with Xen dropping usage the variable is unneeded.
> > 
> > This fixes using devices behind a VMD bridge on Xen PV hardware domains.
> 
> Wrap to fit in 75 columns.
> 
> The first two patches talk about devices behind VMD not being usable
> for Xen, but this one says they now work.

Sorry, let me try to clarify:

Devices behind a VMD bridge are not known to Xen, however that doesn't
mean Linux cannot use them.  That's what this series achieves.  By
inhibiting the usage of VMD_FEAT_CAN_BYPASS_MSI_REMAP and the removal
of the pci_msi_ignore_mask bodge devices behind a VMD bridge do work
fine when use from a Linux Xen hardware domain.  That's the whole
point of the series.

> But this doesn't undo the
> code changes or comments added by the first two, so the result is a
> bit confusing (probably because I know nothing about Xen).

All patches are needed.  There's probably some confusion about devices
behind a VMD bridge not being known by Xen vs not being usable by
Linux running under Xen as a hardware domain.

With all three patches applied devices behind a VMD bridge work under
Linux with Xen.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-13 10:03     ` Roger Pau Monné
@ 2025-01-13 15:11       ` Keith Busch
  2025-01-13 16:45         ` Roger Pau Monné
  0 siblings, 1 reply; 24+ messages in thread
From: Keith Busch @ 2025-01-13 15:11 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Bjorn Helgaas, linux-kernel, xen-devel, linux-pci, Nirmal Patel,
	Jonathan Derrick, Lorenzo Pieralisi, Krzysztof Wilczy´nski,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

On Mon, Jan 13, 2025 at 11:03:58AM +0100, Roger Pau Monné wrote:
> 
> Hm, OK, but isn't the limit 80 columns according to the kernel coding
> style (Documentation/process/coding-style.rst)?

That's the coding style. The commit message style is described in a
different doc:

  https://docs.kernel.org/process/submitting-patches.html#the-canonical-patch-format

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-13 15:11       ` Keith Busch
@ 2025-01-13 16:45         ` Roger Pau Monné
  2025-01-13 16:53           ` Keith Busch
  0 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monné @ 2025-01-13 16:45 UTC (permalink / raw)
  To: Keith Busch
  Cc: Bjorn Helgaas, linux-kernel, xen-devel, linux-pci, Nirmal Patel,
	Jonathan Derrick, Lorenzo Pieralisi, Krzysztof Wilczy´nski,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

On Mon, Jan 13, 2025 at 08:11:19AM -0700, Keith Busch wrote:
> On Mon, Jan 13, 2025 at 11:03:58AM +0100, Roger Pau Monné wrote:
> > 
> > Hm, OK, but isn't the limit 80 columns according to the kernel coding
> > style (Documentation/process/coding-style.rst)?
> 
> That's the coding style. The commit message style is described in a
> different doc:
> 
>   https://docs.kernel.org/process/submitting-patches.html#the-canonical-patch-format

It's quite confusing to have different line length for code vs commit
messages, but my fault for not reading those. Will adjust to 75
columns them.

Regards, Roger.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-13 16:45         ` Roger Pau Monné
@ 2025-01-13 16:53           ` Keith Busch
  2025-01-14 11:03             ` Roger Pau Monné
  0 siblings, 1 reply; 24+ messages in thread
From: Keith Busch @ 2025-01-13 16:53 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Bjorn Helgaas, linux-kernel, xen-devel, linux-pci, Nirmal Patel,
	Jonathan Derrick, Lorenzo Pieralisi, Krzysztof Wilczy´nski,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

On Mon, Jan 13, 2025 at 05:45:20PM +0100, Roger Pau Monné wrote:
> On Mon, Jan 13, 2025 at 08:11:19AM -0700, Keith Busch wrote:
> > On Mon, Jan 13, 2025 at 11:03:58AM +0100, Roger Pau Monné wrote:
> > > 
> > > Hm, OK, but isn't the limit 80 columns according to the kernel coding
> > > style (Documentation/process/coding-style.rst)?
> > 
> > That's the coding style. The commit message style is described in a
> > different doc:
> > 
> >   https://docs.kernel.org/process/submitting-patches.html#the-canonical-patch-format
> 
> It's quite confusing to have different line length for code vs commit
> messages, but my fault for not reading those. Will adjust to 75
> columns them.

The output of 'git log' prepends spaces to the subject and body of the
listed commits. The lower limit for commit messages vs. code makes the
change log look readable in an 80-char terminal.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
  2025-01-13 10:18     ` Roger Pau Monné
@ 2025-01-13 23:29       ` Bjorn Helgaas
  0 siblings, 0 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2025-01-13 23:29 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: linux-kernel, xen-devel, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko

On Mon, Jan 13, 2025 at 11:18:57AM +0100, Roger Pau Monné wrote:
> On Fri, Jan 10, 2025 at 04:21:29PM -0600, Bjorn Helgaas wrote:
> > On Fri, Jan 10, 2025 at 03:01:48PM +0100, Roger Pau Monne wrote:
> > > The PCI segment value is limited to 16 bits, however there are buses like VMD
> > > that fake being part of the PCI topology by adding segment with a number
> > > outside the scope of the PCI firmware specification range (>= 0x10000). The
> > > MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 bit
> > > width.
> > >
> > > Attempting to register or manage those devices with Xen would result in errors
> > > at best, or overlaps with existing devices living on the truncated equivalent
> > > segment values.
> > 
> > The ACPI _SEG method (ACPI r6.5, sec 6.5.6) and the corresponding
> > value in the MCFG table (PCI Firmware r3.3, sec 4.1.2) are clearly
> > 16-bit values.
> > 
> > But otherwise, the segment value is pretty much an arbitrary software
> > value, and the kernel works fine with the larger domain values from
> > vmd_find_free_domain(), so this isn't quite enough to explain what the
> > issue with Xen is.
> > 
> > Does Xen truncate the domain to 16 bits or use it to look up something
> > in ACPI?
> 
> In the interface between Xen and Linux the segment field is 16 bit
> width, so with the current interface is not possible to reference
> devices that are past the 0xffff segment.

I think this specific reason (and maybe even struct
physdev_pci_device_add) would be more useful than the ACPI _SEG and
MCFG things, which are not as directly connected here.

Bjorn

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask
  2025-01-13 10:25     ` Roger Pau Monné
@ 2025-01-13 23:32       ` Bjorn Helgaas
  0 siblings, 0 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2025-01-13 23:32 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: linux-kernel, xen-devel, linux-pci, Juergen Gross, Bjorn Helgaas,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin

On Mon, Jan 13, 2025 at 11:25:58AM +0100, Roger Pau Monné wrote:
> On Fri, Jan 10, 2025 at 04:30:57PM -0600, Bjorn Helgaas wrote:
> > On Fri, Jan 10, 2025 at 03:01:50PM +0100, Roger Pau Monne wrote:
> > > Setting pci_msi_ignore_mask inhibits the toggling of the mask bit for both MSI
> > > and MSI-X entries globally, regardless of the IRQ chip they are using.  Only
> > > Xen sets the pci_msi_ignore_mask when routing physical interrupts over event
> > > channels, to prevent PCI code from attempting to toggle the maskbit, as it's
> > > Xen that controls the bit.
> > > 
> > > However, the pci_msi_ignore_mask being global will affect devices that use MSI
> > > interrupts but are not routing those interrupts over event channels (not using
> > > the Xen pIRQ chip).  One example is devices behind a VMD PCI bridge.  In that
> > > scenario the VMD bridge configures MSI(-X) using the normal IRQ chip (the pIRQ
> > > one in the Xen case), and devices behind the bridge configure the MSI entries
> > > using indexes into the VMD bridge MSI table.  The VMD bridge then demultiplexes
> > > such interrupts and delivers to the destination device(s).  Having
> > > pci_msi_ignore_mask set in that scenario prevents (un)masking of MSI entries
> > > for devices behind the VMD bridge.
> > > 
> > > Move the signaling of no entry masking into the MSI domain flags, as that
> > > allows setting it on a per-domain basis.  Set it for the Xen MSI domain that
> > > uses the pIRQ chip, while leaving it unset for the rest of the cases.
> > > 
> > > Remove pci_msi_ignore_mask at once, since it was only used by Xen code, and
> > > with Xen dropping usage the variable is unneeded.
> > > 
> > > This fixes using devices behind a VMD bridge on Xen PV hardware domains.
> > 
> > Wrap to fit in 75 columns.
> > 
> > The first two patches talk about devices behind VMD not being usable
> > for Xen, but this one says they now work.
> 
> Sorry, let me try to clarify:
> 
> Devices behind a VMD bridge are not known to Xen, however that doesn't
> mean Linux cannot use them.  That's what this series achieves.  By
> inhibiting the usage of VMD_FEAT_CAN_BYPASS_MSI_REMAP and the removal
> of the pci_msi_ignore_mask bodge devices behind a VMD bridge do work
> fine when use from a Linux Xen hardware domain.  That's the whole
> point of the series.
> 
> > But this doesn't undo the
> > code changes or comments added by the first two, so the result is a
> > bit confusing (probably because I know nothing about Xen).
> 
> All patches are needed.  There's probably some confusion about devices
> behind a VMD bridge not being known by Xen vs not being usable by
> Linux running under Xen as a hardware domain.
> 
> With all three patches applied devices behind a VMD bridge work under
> Linux with Xen.

There's certainly confusion in *my* mind because I don't know enough
about Xen to understand the subtlety about devices behind VMD not
being known by Xen but still being usable by Linux running under Xen.

Bjorn

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] vmd: disable MSI remapping bypass under Xen
  2025-01-13 16:53           ` Keith Busch
@ 2025-01-14 11:03             ` Roger Pau Monné
  0 siblings, 0 replies; 24+ messages in thread
From: Roger Pau Monné @ 2025-01-14 11:03 UTC (permalink / raw)
  To: Keith Busch
  Cc: Bjorn Helgaas, linux-kernel, xen-devel, linux-pci, Nirmal Patel,
	Jonathan Derrick, Lorenzo Pieralisi, Krzysztof Wilczy´nski,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas

On Mon, Jan 13, 2025 at 09:53:21AM -0700, Keith Busch wrote:
> On Mon, Jan 13, 2025 at 05:45:20PM +0100, Roger Pau Monné wrote:
> > On Mon, Jan 13, 2025 at 08:11:19AM -0700, Keith Busch wrote:
> > > On Mon, Jan 13, 2025 at 11:03:58AM +0100, Roger Pau Monné wrote:
> > > > 
> > > > Hm, OK, but isn't the limit 80 columns according to the kernel coding
> > > > style (Documentation/process/coding-style.rst)?
> > > 
> > > That's the coding style. The commit message style is described in a
> > > different doc:
> > > 
> > >   https://docs.kernel.org/process/submitting-patches.html#the-canonical-patch-format
> > 
> > It's quite confusing to have different line length for code vs commit
> > messages, but my fault for not reading those. Will adjust to 75
> > columns them.
> 
> The output of 'git log' prepends spaces to the subject and body of the
> listed commits. The lower limit for commit messages vs. code makes the
> change log look readable in an 80-char terminal.

Oh, I see, thanks for explaining.

The 80 column limit for code mean the resulting diff (and `git log`
output) could have a maximum width of 81 characters (because of the
prepended +,-, ), but since Linux uses hard tabs for indentation this
is not really an issue as it would be if using spaces.

Regards, Roger.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-01-14 11:03 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-10 14:01 [PATCH 0/3] xen: fix usage of devices behind a VMD bridge Roger Pau Monne
2025-01-10 14:01 ` [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope Roger Pau Monne
2025-01-10 22:21   ` Bjorn Helgaas
2025-01-13  7:20     ` Jan Beulich
2025-01-13 10:18     ` Roger Pau Monné
2025-01-13 23:29       ` Bjorn Helgaas
2025-01-13  7:17   ` Jan Beulich
2025-01-13 10:13     ` Roger Pau Monné
2025-01-10 14:01 ` [PATCH 2/3] vmd: disable MSI remapping bypass under Xen Roger Pau Monne
2025-01-10 22:25   ` Bjorn Helgaas
2025-01-11  5:02     ` Jonathan Derrick
2025-01-13 10:07       ` Roger Pau Monné
2025-01-13 10:03     ` Roger Pau Monné
2025-01-13 15:11       ` Keith Busch
2025-01-13 16:45         ` Roger Pau Monné
2025-01-13 16:53           ` Keith Busch
2025-01-14 11:03             ` Roger Pau Monné
2025-01-12  2:57   ` kernel test robot
2025-01-10 14:01 ` [PATCH 3/3] pci/msi: remove pci_msi_ignore_mask Roger Pau Monne
2025-01-10 22:30   ` Bjorn Helgaas
2025-01-13 10:25     ` Roger Pau Monné
2025-01-13 23:32       ` Bjorn Helgaas
2025-01-11 11:08   ` kernel test robot
2025-01-11 12:24   ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox