* [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver
@ 2025-06-28 17:29 Marc Zyngier
2025-06-28 17:29 ` [PATCH 01/12] genirq: Teach handle_simple_irq() to resend an in-progress interrupt Marc Zyngier
` (11 more replies)
0 siblings, 12 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:29 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
Having recently dipped into the xgene-msi driver to bring it to use
the MSI-parent concept, I have realised that some of it was slightly
sub-par (read: downright broken).
The driver is playing horrible tricks behind the core code, missing
proper affinity management, is terribly over-designed for no good
reason, and despite what MAINTAINERS says, completely unmaintained.
This series is an attempt to fix most of the issues, and effectively
results more or less in a full rewrite of the driver, removing a lot
of cruft and fixing the interactions with the PCI host driver in the
process (there really isn't any reason to rely on initcall ordering
anymore).
I've stopped short of repainting the MAINTAINERS file, but given how
reactive Toan Le has been, maybe that's on the cards. Patches on top
of -rc3, tested on a Mustang board.
Marc Zyngier (12):
genirq: Teach handle_simple_irq() to resend an in-progress interrupt
PCI: xgene: Defer probing if the MSI widget driver hasn't probed yet
PCI: xgene: Drop useless conditional compilation
PCI: xgene: Drop XGENE_PCIE_IP_VER_UNKN
PCI: xgene-msi: Make per-CPU interrupt setup robust
PCI: xgene-msi: Drop superfluous fields from xgene_msi structure
PCI: xgene-msi: Use device-managed memory allocations
PCI: xgene-msi: Get rid of intermediate tracking structure
PCI: xgene-msi: Sanitise MSI allocation and affinity setting
PCI: xgene-msi: Resend an MSI racing with itself on a different CPU
PCI: xgene-msi: Probe as a standard platform driver
PCI: xgene-msi: Restructure handler setup/teardown
drivers/pci/controller/pci-xgene-msi.c | 418 +++++++++----------------
drivers/pci/controller/pci-xgene.c | 33 +-
kernel/irq/chip.c | 8 +-
3 files changed, 176 insertions(+), 283 deletions(-)
--
2.39.2
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 01/12] genirq: Teach handle_simple_irq() to resend an in-progress interrupt
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
@ 2025-06-28 17:29 ` Marc Zyngier
2025-07-03 16:29 ` Thomas Gleixner
2025-06-28 17:29 ` [PATCH 02/12] PCI: xgene: Defer probing if the MSI widget driver hasn't probed yet Marc Zyngier
` (10 subsequent siblings)
11 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:29 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
It appears that the defect outlined in 9c15eeb5362c4 ("genirq: Allow
fasteoi handler to resend interrupts on concurrent handling") also
affects some other less stellar MSI controllers, this time using
the handle_simple_irq() flow.
Teach this flow about irqd_needs_resend_when_in_progress(). Given
the invasive nature of this workaround, only this flow is updated.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
kernel/irq/chip.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 2b274007e8bab..6e789035919f7 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -559,7 +559,13 @@ void handle_simple_irq(struct irq_desc *desc)
{
guard(raw_spinlock)(&desc->lock);
- if (!irq_can_handle(desc))
+ if (!irq_can_handle_pm(desc)) {
+ if (irqd_needs_resend_when_in_progress(&desc->irq_data))
+ desc->istate |= IRQS_PENDING;
+ return;
+ }
+
+ if (!irq_can_handle_actions(desc))
return;
kstat_incr_irqs_this_cpu(desc);
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 02/12] PCI: xgene: Defer probing if the MSI widget driver hasn't probed yet
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
2025-06-28 17:29 ` [PATCH 01/12] genirq: Teach handle_simple_irq() to resend an in-progress interrupt Marc Zyngier
@ 2025-06-28 17:29 ` Marc Zyngier
2025-06-28 17:29 ` [PATCH 03/12] PCI: xgene: Drop useless conditional compilation Marc Zyngier
` (9 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:29 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
As a preparatory work to make the XGene MSI driver probe less of
a sorry hack, make the PCI driver check for the availability of
the MSI parent domain, and defer the probing otherwise.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
index 1e2ebbfa36d19..f26cb58f814ec 100644
--- a/drivers/pci/controller/pci-xgene.c
+++ b/drivers/pci/controller/pci-xgene.c
@@ -12,6 +12,7 @@
#include <linux/jiffies.h>
#include <linux/memblock.h>
#include <linux/init.h>
+#include <linux/irqdomain.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_pci.h>
@@ -594,6 +595,24 @@ static struct pci_ops xgene_pcie_ops = {
.write = pci_generic_config_write32,
};
+static bool xgene_check_pcie_msi_ready(void)
+{
+ struct device_node *np;
+ struct irq_domain *d;
+
+ if (!IS_ENABLED(CONFIG_PCI_XGENE_MSI))
+ return true;
+
+ np = of_find_compatible_node(NULL, NULL, "apm,xgene1-msi");
+ if (!np)
+ return true;
+
+ d = irq_find_matching_host(np, DOMAIN_BUS_PCI_MSI);
+ of_node_put(np);
+
+ return d && irq_domain_is_msi_parent(d);
+}
+
static int xgene_pcie_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
@@ -602,6 +621,10 @@ static int xgene_pcie_probe(struct platform_device *pdev)
struct pci_host_bridge *bridge;
int ret;
+ if (!xgene_check_pcie_msi_ready())
+ return dev_err_probe(&pdev->dev, -EPROBE_DEFER,
+ "MSI driver not ready\n");
+
bridge = devm_pci_alloc_host_bridge(dev, sizeof(*port));
if (!bridge)
return -ENOMEM;
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 03/12] PCI: xgene: Drop useless conditional compilation
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
2025-06-28 17:29 ` [PATCH 01/12] genirq: Teach handle_simple_irq() to resend an in-progress interrupt Marc Zyngier
2025-06-28 17:29 ` [PATCH 02/12] PCI: xgene: Defer probing if the MSI widget driver hasn't probed yet Marc Zyngier
@ 2025-06-28 17:29 ` Marc Zyngier
2025-06-28 17:29 ` [PATCH 04/12] PCI: xgene: Drop XGENE_PCIE_IP_VER_UNKN Marc Zyngier
` (8 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:29 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
pci-xgene.c only gets compiled if CONFIG_PCI_XGENE is selected.
It is therefore pointless to check for CONFIG_PCI_XGENE inside
the driver.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
index f26cb58f814ec..a848f98203ae4 100644
--- a/drivers/pci/controller/pci-xgene.c
+++ b/drivers/pci/controller/pci-xgene.c
@@ -58,7 +58,6 @@
#define XGENE_PCIE_IP_VER_1 1
#define XGENE_PCIE_IP_VER_2 2
-#if defined(CONFIG_PCI_XGENE) || (defined(CONFIG_ACPI) && defined(CONFIG_PCI_QUIRKS))
struct xgene_pcie {
struct device_node *node;
struct device *dev;
@@ -189,7 +188,6 @@ static int xgene_pcie_config_read32(struct pci_bus *bus, unsigned int devfn,
return PCIBIOS_SUCCESSFUL;
}
-#endif
#if defined(CONFIG_ACPI) && defined(CONFIG_PCI_QUIRKS)
static int xgene_get_csr_resource(struct acpi_device *adev,
@@ -280,7 +278,6 @@ const struct pci_ecam_ops xgene_v2_pcie_ecam_ops = {
};
#endif
-#if defined(CONFIG_PCI_XGENE)
static u64 xgene_pcie_set_ib_mask(struct xgene_pcie *port, u32 addr,
u32 flags, u64 size)
{
@@ -670,4 +667,3 @@ static struct platform_driver xgene_pcie_driver = {
.probe = xgene_pcie_probe,
};
builtin_platform_driver(xgene_pcie_driver);
-#endif
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 04/12] PCI: xgene: Drop XGENE_PCIE_IP_VER_UNKN
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
` (2 preceding siblings ...)
2025-06-28 17:29 ` [PATCH 03/12] PCI: xgene: Drop useless conditional compilation Marc Zyngier
@ 2025-06-28 17:29 ` Marc Zyngier
2025-06-28 17:29 ` [PATCH 05/12] PCI: xgene-msi: Make per-CPU interrupt setup robust Marc Zyngier
` (7 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:29 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
XGENE_PCIE_IP_VER_UNKN is only refered to when probing for the
original XGene PCIe implementation, and get immediately overridden
if the device has the "apm,xgene-pcie" compatible string.
Given that the only way to get there is by finding this very string in
the DT, it is obvious that we will always ovwrite the version with
XGENE_PCIE_IP_VER_1.
Drop the whole thing.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
index a848f98203ae4..b95afa35201d0 100644
--- a/drivers/pci/controller/pci-xgene.c
+++ b/drivers/pci/controller/pci-xgene.c
@@ -54,7 +54,6 @@
#define XGENE_V1_PCI_EXP_CAP 0x40
/* PCIe IP version */
-#define XGENE_PCIE_IP_VER_UNKN 0
#define XGENE_PCIE_IP_VER_1 1
#define XGENE_PCIE_IP_VER_2 2
@@ -630,10 +629,7 @@ static int xgene_pcie_probe(struct platform_device *pdev)
port->node = of_node_get(dn);
port->dev = dev;
-
- port->version = XGENE_PCIE_IP_VER_UNKN;
- if (of_device_is_compatible(port->node, "apm,xgene-pcie"))
- port->version = XGENE_PCIE_IP_VER_1;
+ port->version = XGENE_PCIE_IP_VER_1;
ret = xgene_pcie_map_reg(port, pdev);
if (ret)
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 05/12] PCI: xgene-msi: Make per-CPU interrupt setup robust
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
` (3 preceding siblings ...)
2025-06-28 17:29 ` [PATCH 04/12] PCI: xgene: Drop XGENE_PCIE_IP_VER_UNKN Marc Zyngier
@ 2025-06-28 17:29 ` Marc Zyngier
2025-06-28 17:29 ` [PATCH 06/12] PCI: xgene-msi: Drop superfluous fields from xgene_msi structure Marc Zyngier
` (6 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:29 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
The way the per-CPU interrupts are dealt with in the XGene MSI
driver isn't great:
- the affinity is set after the interrupt is enabled
- nothing prevents userspace from moving the interrupt around
- the affinity setting code pointlessly allocates memory
- the driver checks for conditions that cannot possibly happen
Address all of this in one go, resulting in slightly simpler setup
code.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene-msi.c | 26 ++++++--------------------
1 file changed, 6 insertions(+), 20 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
index b05ec8b0bb93f..25cb4119bab07 100644
--- a/drivers/pci/controller/pci-xgene-msi.c
+++ b/drivers/pci/controller/pci-xgene-msi.c
@@ -355,40 +355,26 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
{
struct xgene_msi *msi = &xgene_msi_ctrl;
struct xgene_msi_group *msi_group;
- cpumask_var_t mask;
int i;
int err;
for (i = cpu; i < NR_HW_IRQS; i += msi->num_cpus) {
msi_group = &msi->msi_groups[i];
- if (!msi_group->gic_irq)
- continue;
-
- irq_set_chained_handler_and_data(msi_group->gic_irq,
- xgene_msi_isr, msi_group);
/*
* Statically allocate MSI GIC IRQs to each CPU core.
* With 8-core X-Gene v1, 2 MSI GIC IRQs are allocated
* to each core.
*/
- if (alloc_cpumask_var(&mask, GFP_KERNEL)) {
- cpumask_clear(mask);
- cpumask_set_cpu(cpu, mask);
- err = irq_set_affinity(msi_group->gic_irq, mask);
- if (err)
- pr_err("failed to set affinity for GIC IRQ");
- free_cpumask_var(mask);
- } else {
- pr_err("failed to alloc CPU mask for affinity\n");
- err = -EINVAL;
- }
-
+ irq_set_status_flags(msi_group->gic_irq, IRQ_NO_BALANCING);
+ err = irq_set_affinity(msi_group->gic_irq, cpumask_of(cpu));
if (err) {
- irq_set_chained_handler_and_data(msi_group->gic_irq,
- NULL, NULL);
+ pr_err("failed to set affinity for GIC IRQ");
return err;
}
+
+ irq_set_chained_handler_and_data(msi_group->gic_irq,
+ xgene_msi_isr, msi_group);
}
return 0;
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 06/12] PCI: xgene-msi: Drop superfluous fields from xgene_msi structure
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
` (4 preceding siblings ...)
2025-06-28 17:29 ` [PATCH 05/12] PCI: xgene-msi: Make per-CPU interrupt setup robust Marc Zyngier
@ 2025-06-28 17:29 ` Marc Zyngier
2025-06-28 17:30 ` [PATCH 07/12] PCI: xgene-msi: Use device-managed memory allocations Marc Zyngier
` (5 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:29 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
The xgene_msi structure remembers both the of_node of the device
and the number of CPUs. All of which are perfectly useless.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene-msi.c | 24 ++++++++++--------------
1 file changed, 10 insertions(+), 14 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
index 25cb4119bab07..3ca9a13cf38d3 100644
--- a/drivers/pci/controller/pci-xgene-msi.c
+++ b/drivers/pci/controller/pci-xgene-msi.c
@@ -31,14 +31,12 @@ struct xgene_msi_group {
};
struct xgene_msi {
- struct device_node *node;
struct irq_domain *inner_domain;
u64 msi_addr;
void __iomem *msi_regs;
unsigned long *bitmap;
struct mutex bitmap_lock;
struct xgene_msi_group *msi_groups;
- int num_cpus;
};
/* Global data */
@@ -147,7 +145,7 @@ static void xgene_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
*/
static int hwirq_to_cpu(unsigned long hwirq)
{
- return (hwirq % xgene_msi_ctrl.num_cpus);
+ return (hwirq % num_possible_cpus());
}
static unsigned long hwirq_to_canonical_hwirq(unsigned long hwirq)
@@ -186,9 +184,9 @@ static int xgene_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
mutex_lock(&msi->bitmap_lock);
msi_irq = bitmap_find_next_zero_area(msi->bitmap, NR_MSI_VEC, 0,
- msi->num_cpus, 0);
+ num_possible_cpus(), 0);
if (msi_irq < NR_MSI_VEC)
- bitmap_set(msi->bitmap, msi_irq, msi->num_cpus);
+ bitmap_set(msi->bitmap, msi_irq, num_possible_cpus());
else
msi_irq = -ENOSPC;
@@ -214,7 +212,7 @@ static void xgene_irq_domain_free(struct irq_domain *domain,
mutex_lock(&msi->bitmap_lock);
hwirq = hwirq_to_canonical_hwirq(d->hwirq);
- bitmap_clear(msi->bitmap, hwirq, msi->num_cpus);
+ bitmap_clear(msi->bitmap, hwirq, num_possible_cpus());
mutex_unlock(&msi->bitmap_lock);
@@ -235,10 +233,11 @@ static const struct msi_parent_ops xgene_msi_parent_ops = {
.init_dev_msi_info = msi_lib_init_dev_msi_info,
};
-static int xgene_allocate_domains(struct xgene_msi *msi)
+static int xgene_allocate_domains(struct device_node *node,
+ struct xgene_msi *msi)
{
struct irq_domain_info info = {
- .fwnode = of_fwnode_handle(msi->node),
+ .fwnode = of_fwnode_handle(node),
.ops = &xgene_msi_domain_ops,
.size = NR_MSI_VEC,
.host_data = msi,
@@ -358,7 +357,7 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
int i;
int err;
- for (i = cpu; i < NR_HW_IRQS; i += msi->num_cpus) {
+ for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
msi_group = &msi->msi_groups[i];
/*
@@ -386,11 +385,10 @@ static int xgene_msi_hwirq_free(unsigned int cpu)
struct xgene_msi_group *msi_group;
int i;
- for (i = cpu; i < NR_HW_IRQS; i += msi->num_cpus) {
+ for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
msi_group = &msi->msi_groups[i];
if (!msi_group->gic_irq)
continue;
-
irq_set_chained_handler_and_data(msi_group->gic_irq, NULL,
NULL);
}
@@ -420,8 +418,6 @@ static int xgene_msi_probe(struct platform_device *pdev)
goto error;
}
xgene_msi->msi_addr = res->start;
- xgene_msi->node = pdev->dev.of_node;
- xgene_msi->num_cpus = num_possible_cpus();
rc = xgene_msi_init_allocator(xgene_msi);
if (rc) {
@@ -429,7 +425,7 @@ static int xgene_msi_probe(struct platform_device *pdev)
goto error;
}
- rc = xgene_allocate_domains(xgene_msi);
+ rc = xgene_allocate_domains(dev_of_node(&pdev->dev), xgene_msi);
if (rc) {
dev_err(&pdev->dev, "Failed to allocate MSI domain\n");
goto error;
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 07/12] PCI: xgene-msi: Use device-managed memory allocations
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
` (5 preceding siblings ...)
2025-06-28 17:29 ` [PATCH 06/12] PCI: xgene-msi: Drop superfluous fields from xgene_msi structure Marc Zyngier
@ 2025-06-28 17:30 ` Marc Zyngier
2025-06-28 17:30 ` [PATCH 08/12] PCI: xgene-msi: Get rid of intermediate tracking structure Marc Zyngier
` (4 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:30 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
Since the MSI driver is probed as a platform device, there is no
reason to not use device-managed allocations. That's including
the top-level bookkeeping structure, which is better dynamically
alocated than being static.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene-msi.c | 37 +++++++++++++-------------
1 file changed, 19 insertions(+), 18 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
index 3ca9a13cf38d3..b3ac0125b3b40 100644
--- a/drivers/pci/controller/pci-xgene-msi.c
+++ b/drivers/pci/controller/pci-xgene-msi.c
@@ -40,7 +40,7 @@ struct xgene_msi {
};
/* Global data */
-static struct xgene_msi xgene_msi_ctrl;
+static struct xgene_msi *xgene_msi_ctrl;
/*
* X-Gene v1 has 16 groups of MSI termination registers MSInIRx, where
@@ -253,18 +253,18 @@ static void xgene_free_domains(struct xgene_msi *msi)
irq_domain_remove(msi->inner_domain);
}
-static int xgene_msi_init_allocator(struct xgene_msi *xgene_msi)
+static int xgene_msi_init_allocator(struct device *dev)
{
- xgene_msi->bitmap = bitmap_zalloc(NR_MSI_VEC, GFP_KERNEL);
- if (!xgene_msi->bitmap)
+ xgene_msi_ctrl->bitmap = devm_bitmap_zalloc(dev, NR_MSI_VEC, GFP_KERNEL);
+ if (!xgene_msi_ctrl->bitmap)
return -ENOMEM;
- mutex_init(&xgene_msi->bitmap_lock);
+ mutex_init(&xgene_msi_ctrl->bitmap_lock);
- xgene_msi->msi_groups = kcalloc(NR_HW_IRQS,
- sizeof(struct xgene_msi_group),
- GFP_KERNEL);
- if (!xgene_msi->msi_groups)
+ xgene_msi_ctrl->msi_groups = devm_kcalloc(dev, NR_HW_IRQS,
+ sizeof(struct xgene_msi_group),
+ GFP_KERNEL);
+ if (!xgene_msi_ctrl->msi_groups)
return -ENOMEM;
return 0;
@@ -273,15 +273,14 @@ static int xgene_msi_init_allocator(struct xgene_msi *xgene_msi)
static void xgene_msi_isr(struct irq_desc *desc)
{
struct irq_chip *chip = irq_desc_get_chip(desc);
+ struct xgene_msi *xgene_msi = xgene_msi_ctrl;
struct xgene_msi_group *msi_groups;
- struct xgene_msi *xgene_msi;
int msir_index, msir_val, hw_irq, ret;
u32 intr_index, grp_select, msi_grp;
chained_irq_enter(chip, desc);
msi_groups = irq_desc_get_handler_data(desc);
- xgene_msi = msi_groups->msi;
msi_grp = msi_groups->msi_grp;
/*
@@ -344,15 +343,12 @@ static void xgene_msi_remove(struct platform_device *pdev)
kfree(msi->msi_groups);
- bitmap_free(msi->bitmap);
- msi->bitmap = NULL;
-
xgene_free_domains(msi);
}
static int xgene_msi_hwirq_alloc(unsigned int cpu)
{
- struct xgene_msi *msi = &xgene_msi_ctrl;
+ struct xgene_msi *msi = xgene_msi_ctrl;
struct xgene_msi_group *msi_group;
int i;
int err;
@@ -381,7 +377,7 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
static int xgene_msi_hwirq_free(unsigned int cpu)
{
- struct xgene_msi *msi = &xgene_msi_ctrl;
+ struct xgene_msi *msi = xgene_msi_ctrl;
struct xgene_msi_group *msi_group;
int i;
@@ -408,7 +404,12 @@ static int xgene_msi_probe(struct platform_device *pdev)
int virt_msir;
u32 msi_val, msi_idx;
- xgene_msi = &xgene_msi_ctrl;
+ xgene_msi_ctrl = devm_kzalloc(&pdev->dev, sizeof(*xgene_msi_ctrl),
+ GFP_KERNEL);
+ if (!xgene_msi_ctrl)
+ return -ENOMEM;
+
+ xgene_msi = xgene_msi_ctrl;
platform_set_drvdata(pdev, xgene_msi);
@@ -419,7 +420,7 @@ static int xgene_msi_probe(struct platform_device *pdev)
}
xgene_msi->msi_addr = res->start;
- rc = xgene_msi_init_allocator(xgene_msi);
+ rc = xgene_msi_init_allocator(&pdev->dev);
if (rc) {
dev_err(&pdev->dev, "Error allocating MSI bitmap\n");
goto error;
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 08/12] PCI: xgene-msi: Get rid of intermediate tracking structure
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
` (6 preceding siblings ...)
2025-06-28 17:30 ` [PATCH 07/12] PCI: xgene-msi: Use device-managed memory allocations Marc Zyngier
@ 2025-06-28 17:30 ` Marc Zyngier
2025-07-07 15:22 ` Lorenzo Pieralisi
2025-06-28 17:30 ` [PATCH 09/12] PCI: xgene-msi: Sanitise MSI allocation and affinity setting Marc Zyngier
` (3 subsequent siblings)
11 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:30 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
The xgene-msi driver uses an odd construct in the form of an
intermediate tracking structure, evidently designed to deal with
multiple instances of the MSI widget. However, the existing HW
only has one set, and it is obvious that there won't be new HW
coming down that particular line.
Simplify the driver by using a bit of pointer arithmetic instead,
directly tracking the interrupt and avoiding extra memory allocation.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene-msi.c | 58 ++++++++------------------
1 file changed, 17 insertions(+), 41 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
index b3ac0125b3b40..4be79b9ff80df 100644
--- a/drivers/pci/controller/pci-xgene-msi.c
+++ b/drivers/pci/controller/pci-xgene-msi.c
@@ -24,19 +24,13 @@
#define NR_HW_IRQS 16
#define NR_MSI_VEC (IDX_PER_GROUP * IRQS_PER_IDX * NR_HW_IRQS)
-struct xgene_msi_group {
- struct xgene_msi *msi;
- int gic_irq;
- u32 msi_grp;
-};
-
struct xgene_msi {
struct irq_domain *inner_domain;
u64 msi_addr;
void __iomem *msi_regs;
unsigned long *bitmap;
struct mutex bitmap_lock;
- struct xgene_msi_group *msi_groups;
+ unsigned int gic_irq[NR_HW_IRQS];
};
/* Global data */
@@ -261,27 +255,20 @@ static int xgene_msi_init_allocator(struct device *dev)
mutex_init(&xgene_msi_ctrl->bitmap_lock);
- xgene_msi_ctrl->msi_groups = devm_kcalloc(dev, NR_HW_IRQS,
- sizeof(struct xgene_msi_group),
- GFP_KERNEL);
- if (!xgene_msi_ctrl->msi_groups)
- return -ENOMEM;
-
return 0;
}
static void xgene_msi_isr(struct irq_desc *desc)
{
+ unsigned int *irqp = irq_desc_get_handler_data(desc);
struct irq_chip *chip = irq_desc_get_chip(desc);
struct xgene_msi *xgene_msi = xgene_msi_ctrl;
- struct xgene_msi_group *msi_groups;
int msir_index, msir_val, hw_irq, ret;
u32 intr_index, grp_select, msi_grp;
chained_irq_enter(chip, desc);
- msi_groups = irq_desc_get_handler_data(desc);
- msi_grp = msi_groups->msi_grp;
+ msi_grp = irqp - xgene_msi->gic_irq;
/*
* MSIINTn (n is 0..F) indicates if there is a pending MSI interrupt
@@ -341,35 +328,31 @@ static void xgene_msi_remove(struct platform_device *pdev)
cpuhp_remove_state(pci_xgene_online);
cpuhp_remove_state(CPUHP_PCI_XGENE_DEAD);
- kfree(msi->msi_groups);
-
xgene_free_domains(msi);
}
static int xgene_msi_hwirq_alloc(unsigned int cpu)
{
- struct xgene_msi *msi = xgene_msi_ctrl;
- struct xgene_msi_group *msi_group;
int i;
int err;
for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
- msi_group = &msi->msi_groups[i];
+ unsigned int irq = xgene_msi_ctrl->gic_irq[i];
/*
* Statically allocate MSI GIC IRQs to each CPU core.
* With 8-core X-Gene v1, 2 MSI GIC IRQs are allocated
* to each core.
*/
- irq_set_status_flags(msi_group->gic_irq, IRQ_NO_BALANCING);
- err = irq_set_affinity(msi_group->gic_irq, cpumask_of(cpu));
+ irq_set_status_flags(irq, IRQ_NO_BALANCING);
+ err = irq_set_affinity(irq, cpumask_of(cpu));
if (err) {
pr_err("failed to set affinity for GIC IRQ");
return err;
}
- irq_set_chained_handler_and_data(msi_group->gic_irq,
- xgene_msi_isr, msi_group);
+ irq_set_chained_handler_and_data(irq, xgene_msi_isr,
+ &xgene_msi_ctrl->gic_irq[i]);
}
return 0;
@@ -378,15 +361,12 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
static int xgene_msi_hwirq_free(unsigned int cpu)
{
struct xgene_msi *msi = xgene_msi_ctrl;
- struct xgene_msi_group *msi_group;
int i;
for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
- msi_group = &msi->msi_groups[i];
- if (!msi_group->gic_irq)
+ if (!msi->gic_irq[i])
continue;
- irq_set_chained_handler_and_data(msi_group->gic_irq, NULL,
- NULL);
+ irq_set_chained_handler_and_data(msi->gic_irq[i], NULL, NULL);
}
return 0;
}
@@ -399,10 +379,9 @@ static const struct of_device_id xgene_msi_match_table[] = {
static int xgene_msi_probe(struct platform_device *pdev)
{
struct resource *res;
- int rc, irq_index;
struct xgene_msi *xgene_msi;
- int virt_msir;
u32 msi_val, msi_idx;
+ int rc;
xgene_msi_ctrl = devm_kzalloc(&pdev->dev, sizeof(*xgene_msi_ctrl),
GFP_KERNEL);
@@ -432,15 +411,12 @@ static int xgene_msi_probe(struct platform_device *pdev)
goto error;
}
- for (irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
- virt_msir = platform_get_irq(pdev, irq_index);
- if (virt_msir < 0) {
- rc = virt_msir;
+ for (int irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
+ rc = platform_get_irq(pdev, irq_index);
+ if (rc < 0)
goto error;
- }
- xgene_msi->msi_groups[irq_index].gic_irq = virt_msir;
- xgene_msi->msi_groups[irq_index].msi_grp = irq_index;
- xgene_msi->msi_groups[irq_index].msi = xgene_msi;
+
+ xgene_msi->gic_irq[irq_index] = rc;
}
/*
@@ -448,7 +424,7 @@ static int xgene_msi_probe(struct platform_device *pdev)
* interrupt handlers, read all of them to clear spurious
* interrupts that may occur before the driver is probed.
*/
- for (irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
+ for (int irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
for (msi_idx = 0; msi_idx < IDX_PER_GROUP; msi_idx++)
xgene_msi_ir_read(xgene_msi, irq_index, msi_idx);
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 09/12] PCI: xgene-msi: Sanitise MSI allocation and affinity setting
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
` (7 preceding siblings ...)
2025-06-28 17:30 ` [PATCH 08/12] PCI: xgene-msi: Get rid of intermediate tracking structure Marc Zyngier
@ 2025-06-28 17:30 ` Marc Zyngier
2025-07-07 14:57 ` Lorenzo Pieralisi
2025-06-28 17:30 ` [PATCH 10/12] PCI: xgene-msi: Resend an MSI racing with itself on a different CPU Marc Zyngier
` (2 subsequent siblings)
11 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:30 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
Plugging a device that doesn't use managed affinity on an XGene-1
machine results in messages such as:
genirq: irq_chip PCI-MSIX-0000:01:00.0 did not update eff. affinity mask of irq 39
As it turns out, the driver was never updated to populate the effective
affinity on irq_set_affinity() call, and the core code is prickly about
that.
But upon further investigation, it appears that the driver keeps repainting
the hwirq field of the irq_data structure as a way to track the affinity
of the MSI, something that is very much frowned upon as it breaks the
fundamentals of an IRQ domain (an array indexed by hwirq).
Fixing this results more or less in a rewrite of the driver:
- Define how a hwirq and a cpu affinity map onto the MSI termination
registers
- Allocate a single entry in the bitmap per MSI instead of *8*
- Correctly track CPU affinity
- Fix the documentation so that it actually means something (to me)
- Use standard bitmap iterators
- and plenty of other cleanups
With this, the driver behaves correctly on my vintage Mustang board.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene-msi.c | 212 ++++++++++---------------
1 file changed, 82 insertions(+), 130 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
index 4be79b9ff80df..fbfdc80942596 100644
--- a/drivers/pci/controller/pci-xgene-msi.c
+++ b/drivers/pci/controller/pci-xgene-msi.c
@@ -6,6 +6,7 @@
* Author: Tanmay Inamdar <tinamdar@apm.com>
* Duc Dang <dhdang@apm.com>
*/
+#include <linux/bitfield.h>
#include <linux/cpu.h>
#include <linux/interrupt.h>
#include <linux/irqdomain.h>
@@ -22,7 +23,12 @@
#define IDX_PER_GROUP 8
#define IRQS_PER_IDX 16
#define NR_HW_IRQS 16
-#define NR_MSI_VEC (IDX_PER_GROUP * IRQS_PER_IDX * NR_HW_IRQS)
+#define NR_MSI_BITS (IDX_PER_GROUP * IRQS_PER_IDX * NR_HW_IRQS)
+#define NR_MSI_VEC (NR_MSI_BITS / num_possible_cpus())
+
+#define MSI_GROUP_MASK GENMASK(22, 19)
+#define MSI_INDEX_MASK GENMASK(18, 16)
+#define MSI_INTR_MASK GENMASK(19, 16)
struct xgene_msi {
struct irq_domain *inner_domain;
@@ -37,8 +43,21 @@ struct xgene_msi {
static struct xgene_msi *xgene_msi_ctrl;
/*
- * X-Gene v1 has 16 groups of MSI termination registers MSInIRx, where
- * n is group number (0..F), x is index of registers in each group (0..7)
+ * X-Gene v1 has 16 frames of MSI termination registers MSInIRx, where n is
+ * frame number (0..15), x is index of registers in each frame (0..7). Each
+ * 32b register is at the beginning of a 64kB region, each frame occupying
+ * 512kB (and the whole thing 8MB of PA space).
+ *
+ * Each register supports 16 MSI vectors (0..15) to generate interrupts.
+ *
+ * Additionally, each MSI termination frame has 1 MSIINTn register (n is
+ * 0..15) to indicate the MSI pending status caused by 1 of its 8 index
+ * registers. Each 32b register is at the beginning of a 64kB region (and
+ * overall occupying an extra 1MB).
+ *
+ * There is one GIC IRQ assigned for each MSI termination frame, 16 in
+ * total.
+ *
* The register layout is as follows:
* MSI0IR0 base_addr
* MSI0IR1 base_addr + 0x10000
@@ -59,13 +78,6 @@ static struct xgene_msi *xgene_msi_ctrl;
* MSIINT1 base_addr + 0x810000
* ... ...
* MSIINTF base_addr + 0x8F0000
- *
- * Each index register supports 16 MSI vectors (0..15) to generate interrupt.
- * There are total 16 GIC IRQs assigned for these 16 groups of MSI termination
- * registers.
- *
- * Each MSI termination group has 1 MSIINTn register (n is 0..15) to indicate
- * the MSI pending status caused by 1 of its 8 index registers.
*/
/* MSInIRx read helper */
@@ -73,93 +85,60 @@ static u32 xgene_msi_ir_read(struct xgene_msi *msi,
u32 msi_grp, u32 msir_idx)
{
return readl_relaxed(msi->msi_regs + MSI_IR0 +
- (msi_grp << 19) + (msir_idx << 16));
+ (FIELD_PREP(MSI_GROUP_MASK, msi_grp) |
+ FIELD_PREP(MSI_INDEX_MASK, msir_idx)));
}
/* MSIINTn read helper */
static u32 xgene_msi_int_read(struct xgene_msi *msi, u32 msi_grp)
{
- return readl_relaxed(msi->msi_regs + MSI_INT0 + (msi_grp << 16));
+ return readl_relaxed(msi->msi_regs + MSI_INT0 +
+ FIELD_PREP(MSI_INTR_MASK, msi_grp));
}
/*
- * With 2048 MSI vectors supported, the MSI message can be constructed using
- * following scheme:
- * - Divide into 8 256-vector groups
- * Group 0: 0-255
- * Group 1: 256-511
- * Group 2: 512-767
- * ...
- * Group 7: 1792-2047
- * - Each 256-vector group is divided into 16 16-vector groups
- * As an example: 16 16-vector groups for 256-vector group 0-255 is
- * Group 0: 0-15
- * Group 1: 16-32
- * ...
- * Group 15: 240-255
- * - The termination address of MSI vector in 256-vector group n and 16-vector
- * group x is the address of MSIxIRn
- * - The data for MSI vector in 16-vector group x is x
+ * In order to allow an MSI to be moved from one CPU to another without
+ * having to repaint both the address and the data (which cannot be done
+ * atomically), we statically partitions the MSI frames between CPUs. Given
+ * that XGene-1 has 8 CPUs, each CPU gets two frames assigned to it
+ *
+ * We adopt the convention that when an MSI is moved, it is configured to
+ * target the same register number in the new frame. This reserves a given
+ * MSI across all CPUs, and reduces the MSI capacity from 2048 to 256.
+ *
+ * Effectively, this amounts to:
+ * - hwirq[7]::cpu[2:0] is the target frame number
+ * - hwirq[6:4] is the register index in any given frame
+ * - hwirq[3:0] is the MSI data
*/
-static u32 hwirq_to_reg_set(unsigned long hwirq)
-{
- return (hwirq / (NR_HW_IRQS * IRQS_PER_IDX));
-}
-
-static u32 hwirq_to_group(unsigned long hwirq)
-{
- return (hwirq % NR_HW_IRQS);
-}
-
-static u32 hwirq_to_msi_data(unsigned long hwirq)
-{
- return ((hwirq / NR_HW_IRQS) % IRQS_PER_IDX);
-}
-
static void xgene_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
{
struct xgene_msi *msi = irq_data_get_irq_chip_data(data);
- u32 reg_set = hwirq_to_reg_set(data->hwirq);
- u32 group = hwirq_to_group(data->hwirq);
- u64 target_addr = msi->msi_addr + (((8 * group) + reg_set) << 16);
+ u64 target_addr;
+ u32 frame, msir;
+ int cpu;
- msg->address_hi = upper_32_bits(target_addr);
- msg->address_lo = lower_32_bits(target_addr);
- msg->data = hwirq_to_msi_data(data->hwirq);
-}
+ cpu = cpumask_first(irq_data_get_effective_affinity_mask(data));
+ msir = FIELD_GET(GENMASK(6, 4), data->hwirq);
+ frame = FIELD_GET(BIT(7), data->hwirq) << 3 | cpu;
-/*
- * X-Gene v1 only has 16 MSI GIC IRQs for 2048 MSI vectors. To maintain
- * the expected behaviour of .set_affinity for each MSI interrupt, the 16
- * MSI GIC IRQs are statically allocated to 8 X-Gene v1 cores (2 GIC IRQs
- * for each core). The MSI vector is moved from 1 MSI GIC IRQ to another
- * MSI GIC IRQ to steer its MSI interrupt to correct X-Gene v1 core. As a
- * consequence, the total MSI vectors that X-Gene v1 supports will be
- * reduced to 256 (2048/8) vectors.
- */
-static int hwirq_to_cpu(unsigned long hwirq)
-{
- return (hwirq % num_possible_cpus());
-}
+ target_addr = msi->msi_addr;
+ target_addr += (FIELD_PREP(MSI_GROUP_MASK, frame) |
+ FIELD_PREP(MSI_INTR_MASK, msir));
-static unsigned long hwirq_to_canonical_hwirq(unsigned long hwirq)
-{
- return (hwirq - hwirq_to_cpu(hwirq));
+ msg->address_hi = upper_32_bits(target_addr);
+ msg->address_lo = lower_32_bits(target_addr);
+ msg->data = data->hwirq & GENMASK(3, 0);
}
static int xgene_msi_set_affinity(struct irq_data *irqdata,
const struct cpumask *mask, bool force)
{
int target_cpu = cpumask_first(mask);
- int curr_cpu;
- curr_cpu = hwirq_to_cpu(irqdata->hwirq);
- if (curr_cpu == target_cpu)
- return IRQ_SET_MASK_OK_DONE;
-
- /* Update MSI number to target the new CPU */
- irqdata->hwirq = hwirq_to_canonical_hwirq(irqdata->hwirq) + target_cpu;
+ irq_data_update_effective_affinity(irqdata, cpumask_of(target_cpu));
+ /* Force the core code to regenerate the message */
return IRQ_SET_MASK_OK;
}
@@ -173,23 +152,20 @@ static int xgene_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
unsigned int nr_irqs, void *args)
{
struct xgene_msi *msi = domain->host_data;
- int msi_irq;
+ irq_hw_number_t hwirq;
mutex_lock(&msi->bitmap_lock);
- msi_irq = bitmap_find_next_zero_area(msi->bitmap, NR_MSI_VEC, 0,
- num_possible_cpus(), 0);
- if (msi_irq < NR_MSI_VEC)
- bitmap_set(msi->bitmap, msi_irq, num_possible_cpus());
- else
- msi_irq = -ENOSPC;
+ hwirq = find_first_zero_bit(msi->bitmap, NR_MSI_VEC);
+ if (hwirq < NR_MSI_VEC)
+ set_bit(hwirq, msi->bitmap);
mutex_unlock(&msi->bitmap_lock);
- if (msi_irq < 0)
- return msi_irq;
+ if (hwirq >= NR_MSI_VEC)
+ return -ENOSPC;
- irq_domain_set_info(domain, virq, msi_irq,
+ irq_domain_set_info(domain, virq, hwirq,
&xgene_msi_bottom_irq_chip, domain->host_data,
handle_simple_irq, NULL, NULL);
@@ -201,12 +177,10 @@ static void xgene_irq_domain_free(struct irq_domain *domain,
{
struct irq_data *d = irq_domain_get_irq_data(domain, virq);
struct xgene_msi *msi = irq_data_get_irq_chip_data(d);
- u32 hwirq;
mutex_lock(&msi->bitmap_lock);
- hwirq = hwirq_to_canonical_hwirq(d->hwirq);
- bitmap_clear(msi->bitmap, hwirq, num_possible_cpus());
+ clear_bit(d->hwirq, msi->bitmap);
mutex_unlock(&msi->bitmap_lock);
@@ -263,55 +237,33 @@ static void xgene_msi_isr(struct irq_desc *desc)
unsigned int *irqp = irq_desc_get_handler_data(desc);
struct irq_chip *chip = irq_desc_get_chip(desc);
struct xgene_msi *xgene_msi = xgene_msi_ctrl;
- int msir_index, msir_val, hw_irq, ret;
- u32 intr_index, grp_select, msi_grp;
+ unsigned long grp_pending;
+ int msir_idx;
+ u32 msi_grp;
chained_irq_enter(chip, desc);
msi_grp = irqp - xgene_msi->gic_irq;
- /*
- * MSIINTn (n is 0..F) indicates if there is a pending MSI interrupt
- * If bit x of this register is set (x is 0..7), one or more interrupts
- * corresponding to MSInIRx is set.
- */
- grp_select = xgene_msi_int_read(xgene_msi, msi_grp);
- while (grp_select) {
- msir_index = ffs(grp_select) - 1;
- /*
- * Calculate MSInIRx address to read to check for interrupts
- * (refer to termination address and data assignment
- * described in xgene_compose_msi_msg() )
- */
- msir_val = xgene_msi_ir_read(xgene_msi, msi_grp, msir_index);
- while (msir_val) {
- intr_index = ffs(msir_val) - 1;
- /*
- * Calculate MSI vector number (refer to the termination
- * address and data assignment described in
- * xgene_compose_msi_msg function)
- */
- hw_irq = (((msir_index * IRQS_PER_IDX) + intr_index) *
- NR_HW_IRQS) + msi_grp;
- /*
- * As we have multiple hw_irq that maps to single MSI,
- * always look up the virq using the hw_irq as seen from
- * CPU0
- */
- hw_irq = hwirq_to_canonical_hwirq(hw_irq);
- ret = generic_handle_domain_irq(xgene_msi->inner_domain, hw_irq);
+ grp_pending = xgene_msi_int_read(xgene_msi, msi_grp);
+
+ for_each_set_bit(msir_idx, &grp_pending, IDX_PER_GROUP) {
+ unsigned long msir;
+ int intr_idx;
+
+ msir = xgene_msi_ir_read(xgene_msi, msi_grp, msir_idx);
+
+ for_each_set_bit(intr_idx, &msir, IRQS_PER_IDX) {
+ irq_hw_number_t hwirq;
+ int ret;
+
+ hwirq = ((FIELD_GET(BIT(3), msi_grp) << 7) |
+ (msir_idx << 4) |
+ intr_idx);
+
+ ret = generic_handle_domain_irq(xgene_msi->inner_domain,
+ hwirq);
WARN_ON_ONCE(ret);
- msir_val &= ~(1 << intr_index);
- }
- grp_select &= ~(1 << msir_index);
-
- if (!grp_select) {
- /*
- * We handled all interrupts happened in this group,
- * resample this group MSI_INTx register in case
- * something else has been made pending in the meantime
- */
- grp_select = xgene_msi_int_read(xgene_msi, msi_grp);
}
}
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 10/12] PCI: xgene-msi: Resend an MSI racing with itself on a different CPU
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
` (8 preceding siblings ...)
2025-06-28 17:30 ` [PATCH 09/12] PCI: xgene-msi: Sanitise MSI allocation and affinity setting Marc Zyngier
@ 2025-06-28 17:30 ` Marc Zyngier
2025-06-28 17:30 ` [PATCH 11/12] PCI: xgene-msi: Probe as a standard platform driver Marc Zyngier
2025-06-28 17:30 ` [PATCH 12/12] PCI: xgene-msi: Restructure handler setup/teardown Marc Zyngier
11 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:30 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
Since changing the affinity of an MSI really is about changing
the target address and that it isn't possible to mask an individual
MSI, it is completely possible for an interrupt to race with itself,
usually resulting in a lost interrupt.
Paper over the design blunder by informing the core code of this
sad state of affair.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene-msi.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
index fbfdc80942596..f797ba0524783 100644
--- a/drivers/pci/controller/pci-xgene-msi.c
+++ b/drivers/pci/controller/pci-xgene-msi.c
@@ -168,6 +168,7 @@ static int xgene_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
irq_domain_set_info(domain, virq, hwirq,
&xgene_msi_bottom_irq_chip, domain->host_data,
handle_simple_irq, NULL, NULL);
+ irqd_set_resend_when_in_progress(irq_get_irq_data(virq));
return 0;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 11/12] PCI: xgene-msi: Probe as a standard platform driver
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
` (9 preceding siblings ...)
2025-06-28 17:30 ` [PATCH 10/12] PCI: xgene-msi: Resend an MSI racing with itself on a different CPU Marc Zyngier
@ 2025-06-28 17:30 ` Marc Zyngier
2025-06-28 17:30 ` [PATCH 12/12] PCI: xgene-msi: Restructure handler setup/teardown Marc Zyngier
11 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:30 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
Now that we have made the dependedncy between the PCI driver and
the MSI driver explicit, there is no need to use subsys_initcall()
as a probing hook, and we can rely on builtin_platform_driver()
instead.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene-msi.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
index f797ba0524783..a22a6df7808c7 100644
--- a/drivers/pci/controller/pci-xgene-msi.c
+++ b/drivers/pci/controller/pci-xgene-msi.c
@@ -419,9 +419,4 @@ static struct platform_driver xgene_msi_driver = {
.probe = xgene_msi_probe,
.remove = xgene_msi_remove,
};
-
-static int __init xgene_pcie_msi_init(void)
-{
- return platform_driver_register(&xgene_msi_driver);
-}
-subsys_initcall(xgene_pcie_msi_init);
+builtin_platform_driver(xgene_msi_driver);
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 12/12] PCI: xgene-msi: Restructure handler setup/teardown
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
` (10 preceding siblings ...)
2025-06-28 17:30 ` [PATCH 11/12] PCI: xgene-msi: Probe as a standard platform driver Marc Zyngier
@ 2025-06-28 17:30 ` Marc Zyngier
2025-07-07 11:14 ` Lorenzo Pieralisi
11 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2025-06-28 17:30 UTC (permalink / raw)
To: linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas,
Thomas Gleixner
Another utterly pointless aspect of the xgene-msi driver is that
it is built around CPU hotplug. Which is quite amusing since this
is one of the few arm64 platforms that, by construction, cannot
do CPU hotplug in a supported way (no EL3, no PSCI, no luck).
Drop the CPU hotplug nonsense and just setup the IRQs and handlers
in a less overdesigned way, grouping things more logically in the
process.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/pci/controller/pci-xgene-msi.c | 109 +++++++++----------------
1 file changed, 37 insertions(+), 72 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
index a22a6df7808c7..9f05c2a12da94 100644
--- a/drivers/pci/controller/pci-xgene-msi.c
+++ b/drivers/pci/controller/pci-xgene-msi.c
@@ -216,12 +216,6 @@ static int xgene_allocate_domains(struct device_node *node,
return msi->inner_domain ? 0 : -ENOMEM;
}
-static void xgene_free_domains(struct xgene_msi *msi)
-{
- if (msi->inner_domain)
- irq_domain_remove(msi->inner_domain);
-}
-
static int xgene_msi_init_allocator(struct device *dev)
{
xgene_msi_ctrl->bitmap = devm_bitmap_zalloc(dev, NR_MSI_VEC, GFP_KERNEL);
@@ -271,26 +265,48 @@ static void xgene_msi_isr(struct irq_desc *desc)
chained_irq_exit(chip, desc);
}
-static enum cpuhp_state pci_xgene_online;
-
static void xgene_msi_remove(struct platform_device *pdev)
{
- struct xgene_msi *msi = platform_get_drvdata(pdev);
-
- if (pci_xgene_online)
- cpuhp_remove_state(pci_xgene_online);
- cpuhp_remove_state(CPUHP_PCI_XGENE_DEAD);
+ for (int i = 0; i < NR_HW_IRQS; i++) {
+ unsigned int irq = xgene_msi_ctrl->gic_irq[i];
+ if (!irq)
+ continue;
+ irq_set_chained_handler_and_data(irq, NULL, NULL);
+ }
- xgene_free_domains(msi);
+ if (xgene_msi_ctrl->inner_domain)
+ irq_domain_remove(xgene_msi_ctrl->inner_domain);
}
-static int xgene_msi_hwirq_alloc(unsigned int cpu)
+static int xgene_msi_handler_setup(struct platform_device *pdev)
{
+ struct xgene_msi *xgene_msi = xgene_msi_ctrl;
int i;
- int err;
- for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
- unsigned int irq = xgene_msi_ctrl->gic_irq[i];
+ for (i = 0; i < NR_HW_IRQS; i++) {
+ u32 msi_val;
+ int irq, err;
+
+ /*
+ * MSInIRx registers are read-to-clear; before registering
+ * interrupt handlers, read all of them to clear spurious
+ * interrupts that may occur before the driver is probed.
+ */
+ for (int msi_idx = 0; msi_idx < IDX_PER_GROUP; msi_idx++)
+ xgene_msi_ir_read(xgene_msi, i, msi_idx);
+
+ /* Read MSIINTn to confirm */
+ msi_val = xgene_msi_int_read(xgene_msi, i);
+ if (msi_val) {
+ dev_err(&pdev->dev, "Failed to clear spurious IRQ\n");
+ return EINVAL;
+ }
+
+ irq = platform_get_irq(pdev, i);
+ if (irq < 0)
+ return irq;
+
+ xgene_msi->gic_irq[i] = irq;
/*
* Statically allocate MSI GIC IRQs to each CPU core.
@@ -298,7 +314,7 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
* to each core.
*/
irq_set_status_flags(irq, IRQ_NO_BALANCING);
- err = irq_set_affinity(irq, cpumask_of(cpu));
+ err = irq_set_affinity(irq, cpumask_of(i % num_possible_cpus()));
if (err) {
pr_err("failed to set affinity for GIC IRQ");
return err;
@@ -311,19 +327,6 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
return 0;
}
-static int xgene_msi_hwirq_free(unsigned int cpu)
-{
- struct xgene_msi *msi = xgene_msi_ctrl;
- int i;
-
- for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
- if (!msi->gic_irq[i])
- continue;
- irq_set_chained_handler_and_data(msi->gic_irq[i], NULL, NULL);
- }
- return 0;
-}
-
static const struct of_device_id xgene_msi_match_table[] = {
{.compatible = "apm,xgene1-msi"},
{},
@@ -333,7 +336,6 @@ static int xgene_msi_probe(struct platform_device *pdev)
{
struct resource *res;
struct xgene_msi *xgene_msi;
- u32 msi_val, msi_idx;
int rc;
xgene_msi_ctrl = devm_kzalloc(&pdev->dev, sizeof(*xgene_msi_ctrl),
@@ -343,8 +345,6 @@ static int xgene_msi_probe(struct platform_device *pdev)
xgene_msi = xgene_msi_ctrl;
- platform_set_drvdata(pdev, xgene_msi);
-
xgene_msi->msi_regs = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
if (IS_ERR(xgene_msi->msi_regs)) {
rc = PTR_ERR(xgene_msi->msi_regs);
@@ -364,48 +364,13 @@ static int xgene_msi_probe(struct platform_device *pdev)
goto error;
}
- for (int irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
- rc = platform_get_irq(pdev, irq_index);
- if (rc < 0)
- goto error;
-
- xgene_msi->gic_irq[irq_index] = rc;
- }
-
- /*
- * MSInIRx registers are read-to-clear; before registering
- * interrupt handlers, read all of them to clear spurious
- * interrupts that may occur before the driver is probed.
- */
- for (int irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
- for (msi_idx = 0; msi_idx < IDX_PER_GROUP; msi_idx++)
- xgene_msi_ir_read(xgene_msi, irq_index, msi_idx);
-
- /* Read MSIINTn to confirm */
- msi_val = xgene_msi_int_read(xgene_msi, irq_index);
- if (msi_val) {
- dev_err(&pdev->dev, "Failed to clear spurious IRQ\n");
- rc = -EINVAL;
- goto error;
- }
- }
-
- rc = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "pci/xgene:online",
- xgene_msi_hwirq_alloc, NULL);
- if (rc < 0)
- goto err_cpuhp;
- pci_xgene_online = rc;
- rc = cpuhp_setup_state(CPUHP_PCI_XGENE_DEAD, "pci/xgene:dead", NULL,
- xgene_msi_hwirq_free);
+ rc = xgene_msi_handler_setup(pdev);
if (rc)
- goto err_cpuhp;
+ goto error;
dev_info(&pdev->dev, "APM X-Gene PCIe MSI driver loaded\n");
return 0;
-
-err_cpuhp:
- dev_err(&pdev->dev, "failed to add CPU MSI notifier\n");
error:
xgene_msi_remove(pdev);
return rc;
--
2.39.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 01/12] genirq: Teach handle_simple_irq() to resend an in-progress interrupt
2025-06-28 17:29 ` [PATCH 01/12] genirq: Teach handle_simple_irq() to resend an in-progress interrupt Marc Zyngier
@ 2025-07-03 16:29 ` Thomas Gleixner
0 siblings, 0 replies; 20+ messages in thread
From: Thomas Gleixner @ 2025-07-03 16:29 UTC (permalink / raw)
To: Marc Zyngier, linux-pci, linux-arm-kernel, linux-kernel
Cc: Toan Le, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas
On Sat, Jun 28 2025 at 18:29, Marc Zyngier wrote:
> It appears that the defect outlined in 9c15eeb5362c4 ("genirq: Allow
> fasteoi handler to resend interrupts on concurrent handling") also
> affects some other less stellar MSI controllers, this time using
> the handle_simple_irq() flow.
>
> Teach this flow about irqd_needs_resend_when_in_progress(). Given
> the invasive nature of this workaround, only this flow is updated.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
As I have no conflicting changes pending in this area, this can go
through the PCI tree along with the rest of this lot. Therefore:
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Thanks,
tglx
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 12/12] PCI: xgene-msi: Restructure handler setup/teardown
2025-06-28 17:30 ` [PATCH 12/12] PCI: xgene-msi: Restructure handler setup/teardown Marc Zyngier
@ 2025-07-07 11:14 ` Lorenzo Pieralisi
2025-07-07 15:58 ` Marc Zyngier
0 siblings, 1 reply; 20+ messages in thread
From: Lorenzo Pieralisi @ 2025-07-07 11:14 UTC (permalink / raw)
To: Marc Zyngier
Cc: linux-pci, linux-arm-kernel, linux-kernel, Toan Le,
Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
Bjorn Helgaas, Thomas Gleixner
On Sat, Jun 28, 2025 at 06:30:05PM +0100, Marc Zyngier wrote:
> Another utterly pointless aspect of the xgene-msi driver is that
> it is built around CPU hotplug. Which is quite amusing since this
> is one of the few arm64 platforms that, by construction, cannot
> do CPU hotplug in a supported way (no EL3, no PSCI, no luck).
>
> Drop the CPU hotplug nonsense and just setup the IRQs and handlers
> in a less overdesigned way, grouping things more logically in the
> process.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> drivers/pci/controller/pci-xgene-msi.c | 109 +++++++++----------------
> 1 file changed, 37 insertions(+), 72 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
> index a22a6df7808c7..9f05c2a12da94 100644
> --- a/drivers/pci/controller/pci-xgene-msi.c
> +++ b/drivers/pci/controller/pci-xgene-msi.c
> @@ -216,12 +216,6 @@ static int xgene_allocate_domains(struct device_node *node,
> return msi->inner_domain ? 0 : -ENOMEM;
> }
>
> -static void xgene_free_domains(struct xgene_msi *msi)
> -{
> - if (msi->inner_domain)
> - irq_domain_remove(msi->inner_domain);
> -}
> -
> static int xgene_msi_init_allocator(struct device *dev)
> {
> xgene_msi_ctrl->bitmap = devm_bitmap_zalloc(dev, NR_MSI_VEC, GFP_KERNEL);
> @@ -271,26 +265,48 @@ static void xgene_msi_isr(struct irq_desc *desc)
> chained_irq_exit(chip, desc);
> }
>
> -static enum cpuhp_state pci_xgene_online;
> -
> static void xgene_msi_remove(struct platform_device *pdev)
> {
> - struct xgene_msi *msi = platform_get_drvdata(pdev);
> -
> - if (pci_xgene_online)
> - cpuhp_remove_state(pci_xgene_online);
> - cpuhp_remove_state(CPUHP_PCI_XGENE_DEAD);
No question on the patch - just noticed we could remove
CPUHP_PCI_XGENE_DEAD from cpuhp_state since it would become
unused AFAICS.
Thanks,
Lorenzo
> + for (int i = 0; i < NR_HW_IRQS; i++) {
> + unsigned int irq = xgene_msi_ctrl->gic_irq[i];
> + if (!irq)
> + continue;
> + irq_set_chained_handler_and_data(irq, NULL, NULL);
> + }
>
> - xgene_free_domains(msi);
> + if (xgene_msi_ctrl->inner_domain)
> + irq_domain_remove(xgene_msi_ctrl->inner_domain);
> }
>
> -static int xgene_msi_hwirq_alloc(unsigned int cpu)
> +static int xgene_msi_handler_setup(struct platform_device *pdev)
> {
> + struct xgene_msi *xgene_msi = xgene_msi_ctrl;
> int i;
> - int err;
>
> - for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
> - unsigned int irq = xgene_msi_ctrl->gic_irq[i];
> + for (i = 0; i < NR_HW_IRQS; i++) {
> + u32 msi_val;
> + int irq, err;
> +
> + /*
> + * MSInIRx registers are read-to-clear; before registering
> + * interrupt handlers, read all of them to clear spurious
> + * interrupts that may occur before the driver is probed.
> + */
> + for (int msi_idx = 0; msi_idx < IDX_PER_GROUP; msi_idx++)
> + xgene_msi_ir_read(xgene_msi, i, msi_idx);
> +
> + /* Read MSIINTn to confirm */
> + msi_val = xgene_msi_int_read(xgene_msi, i);
> + if (msi_val) {
> + dev_err(&pdev->dev, "Failed to clear spurious IRQ\n");
> + return EINVAL;
> + }
> +
> + irq = platform_get_irq(pdev, i);
> + if (irq < 0)
> + return irq;
> +
> + xgene_msi->gic_irq[i] = irq;
>
> /*
> * Statically allocate MSI GIC IRQs to each CPU core.
> @@ -298,7 +314,7 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
> * to each core.
> */
> irq_set_status_flags(irq, IRQ_NO_BALANCING);
> - err = irq_set_affinity(irq, cpumask_of(cpu));
> + err = irq_set_affinity(irq, cpumask_of(i % num_possible_cpus()));
> if (err) {
> pr_err("failed to set affinity for GIC IRQ");
> return err;
> @@ -311,19 +327,6 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
> return 0;
> }
>
> -static int xgene_msi_hwirq_free(unsigned int cpu)
> -{
> - struct xgene_msi *msi = xgene_msi_ctrl;
> - int i;
> -
> - for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
> - if (!msi->gic_irq[i])
> - continue;
> - irq_set_chained_handler_and_data(msi->gic_irq[i], NULL, NULL);
> - }
> - return 0;
> -}
> -
> static const struct of_device_id xgene_msi_match_table[] = {
> {.compatible = "apm,xgene1-msi"},
> {},
> @@ -333,7 +336,6 @@ static int xgene_msi_probe(struct platform_device *pdev)
> {
> struct resource *res;
> struct xgene_msi *xgene_msi;
> - u32 msi_val, msi_idx;
> int rc;
>
> xgene_msi_ctrl = devm_kzalloc(&pdev->dev, sizeof(*xgene_msi_ctrl),
> @@ -343,8 +345,6 @@ static int xgene_msi_probe(struct platform_device *pdev)
>
> xgene_msi = xgene_msi_ctrl;
>
> - platform_set_drvdata(pdev, xgene_msi);
> -
> xgene_msi->msi_regs = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
> if (IS_ERR(xgene_msi->msi_regs)) {
> rc = PTR_ERR(xgene_msi->msi_regs);
> @@ -364,48 +364,13 @@ static int xgene_msi_probe(struct platform_device *pdev)
> goto error;
> }
>
> - for (int irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
> - rc = platform_get_irq(pdev, irq_index);
> - if (rc < 0)
> - goto error;
> -
> - xgene_msi->gic_irq[irq_index] = rc;
> - }
> -
> - /*
> - * MSInIRx registers are read-to-clear; before registering
> - * interrupt handlers, read all of them to clear spurious
> - * interrupts that may occur before the driver is probed.
> - */
> - for (int irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
> - for (msi_idx = 0; msi_idx < IDX_PER_GROUP; msi_idx++)
> - xgene_msi_ir_read(xgene_msi, irq_index, msi_idx);
> -
> - /* Read MSIINTn to confirm */
> - msi_val = xgene_msi_int_read(xgene_msi, irq_index);
> - if (msi_val) {
> - dev_err(&pdev->dev, "Failed to clear spurious IRQ\n");
> - rc = -EINVAL;
> - goto error;
> - }
> - }
> -
> - rc = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "pci/xgene:online",
> - xgene_msi_hwirq_alloc, NULL);
> - if (rc < 0)
> - goto err_cpuhp;
> - pci_xgene_online = rc;
> - rc = cpuhp_setup_state(CPUHP_PCI_XGENE_DEAD, "pci/xgene:dead", NULL,
> - xgene_msi_hwirq_free);
> + rc = xgene_msi_handler_setup(pdev);
> if (rc)
> - goto err_cpuhp;
> + goto error;
>
> dev_info(&pdev->dev, "APM X-Gene PCIe MSI driver loaded\n");
>
> return 0;
> -
> -err_cpuhp:
> - dev_err(&pdev->dev, "failed to add CPU MSI notifier\n");
> error:
> xgene_msi_remove(pdev);
> return rc;
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 09/12] PCI: xgene-msi: Sanitise MSI allocation and affinity setting
2025-06-28 17:30 ` [PATCH 09/12] PCI: xgene-msi: Sanitise MSI allocation and affinity setting Marc Zyngier
@ 2025-07-07 14:57 ` Lorenzo Pieralisi
2025-07-08 14:41 ` Marc Zyngier
0 siblings, 1 reply; 20+ messages in thread
From: Lorenzo Pieralisi @ 2025-07-07 14:57 UTC (permalink / raw)
To: Marc Zyngier
Cc: linux-pci, linux-arm-kernel, linux-kernel, Toan Le,
Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
Bjorn Helgaas, Thomas Gleixner
On Sat, Jun 28, 2025 at 06:30:02PM +0100, Marc Zyngier wrote:
> Plugging a device that doesn't use managed affinity on an XGene-1
> machine results in messages such as:
>
> genirq: irq_chip PCI-MSIX-0000:01:00.0 did not update eff. affinity mask of irq 39
>
> As it turns out, the driver was never updated to populate the effective
> affinity on irq_set_affinity() call, and the core code is prickly about
> that.
>
> But upon further investigation, it appears that the driver keeps repainting
> the hwirq field of the irq_data structure as a way to track the affinity
> of the MSI, something that is very much frowned upon as it breaks the
> fundamentals of an IRQ domain (an array indexed by hwirq).
>
> Fixing this results more or less in a rewrite of the driver:
>
> - Define how a hwirq and a cpu affinity map onto the MSI termination
> registers
>
> - Allocate a single entry in the bitmap per MSI instead of *8*
>
> - Correctly track CPU affinity
>
> - Fix the documentation so that it actually means something (to me)
>
> - Use standard bitmap iterators
>
> - and plenty of other cleanups
>
> With this, the driver behaves correctly on my vintage Mustang board.
This makes sense to me, a tiny request below.
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> drivers/pci/controller/pci-xgene-msi.c | 212 ++++++++++---------------
> 1 file changed, 82 insertions(+), 130 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
> index 4be79b9ff80df..fbfdc80942596 100644
> --- a/drivers/pci/controller/pci-xgene-msi.c
> +++ b/drivers/pci/controller/pci-xgene-msi.c
> @@ -6,6 +6,7 @@
> * Author: Tanmay Inamdar <tinamdar@apm.com>
> * Duc Dang <dhdang@apm.com>
> */
> +#include <linux/bitfield.h>
> #include <linux/cpu.h>
> #include <linux/interrupt.h>
> #include <linux/irqdomain.h>
> @@ -22,7 +23,12 @@
> #define IDX_PER_GROUP 8
> #define IRQS_PER_IDX 16
> #define NR_HW_IRQS 16
> -#define NR_MSI_VEC (IDX_PER_GROUP * IRQS_PER_IDX * NR_HW_IRQS)
> +#define NR_MSI_BITS (IDX_PER_GROUP * IRQS_PER_IDX * NR_HW_IRQS)
> +#define NR_MSI_VEC (NR_MSI_BITS / num_possible_cpus())
> +
> +#define MSI_GROUP_MASK GENMASK(22, 19)
> +#define MSI_INDEX_MASK GENMASK(18, 16)
> +#define MSI_INTR_MASK GENMASK(19, 16)
>
> struct xgene_msi {
> struct irq_domain *inner_domain;
> @@ -37,8 +43,21 @@ struct xgene_msi {
> static struct xgene_msi *xgene_msi_ctrl;
>
> /*
> - * X-Gene v1 has 16 groups of MSI termination registers MSInIRx, where
> - * n is group number (0..F), x is index of registers in each group (0..7)
> + * X-Gene v1 has 16 frames of MSI termination registers MSInIRx, where n is
> + * frame number (0..15), x is index of registers in each frame (0..7). Each
> + * 32b register is at the beginning of a 64kB region, each frame occupying
> + * 512kB (and the whole thing 8MB of PA space).
> + *
> + * Each register supports 16 MSI vectors (0..15) to generate interrupts.
> + *
> + * Additionally, each MSI termination frame has 1 MSIINTn register (n is
> + * 0..15) to indicate the MSI pending status caused by 1 of its 8 index
> + * registers. Each 32b register is at the beginning of a 64kB region (and
> + * overall occupying an extra 1MB).
> + *
> + * There is one GIC IRQ assigned for each MSI termination frame, 16 in
> + * total.
> + *
> * The register layout is as follows:
> * MSI0IR0 base_addr
> * MSI0IR1 base_addr + 0x10000
> @@ -59,13 +78,6 @@ static struct xgene_msi *xgene_msi_ctrl;
> * MSIINT1 base_addr + 0x810000
> * ... ...
> * MSIINTF base_addr + 0x8F0000
> - *
> - * Each index register supports 16 MSI vectors (0..15) to generate interrupt.
> - * There are total 16 GIC IRQs assigned for these 16 groups of MSI termination
> - * registers.
> - *
> - * Each MSI termination group has 1 MSIINTn register (n is 0..15) to indicate
> - * the MSI pending status caused by 1 of its 8 index registers.
> */
>
> /* MSInIRx read helper */
> @@ -73,93 +85,60 @@ static u32 xgene_msi_ir_read(struct xgene_msi *msi,
> u32 msi_grp, u32 msir_idx)
> {
> return readl_relaxed(msi->msi_regs + MSI_IR0 +
> - (msi_grp << 19) + (msir_idx << 16));
> + (FIELD_PREP(MSI_GROUP_MASK, msi_grp) |
> + FIELD_PREP(MSI_INDEX_MASK, msir_idx)));
> }
>
> /* MSIINTn read helper */
> static u32 xgene_msi_int_read(struct xgene_msi *msi, u32 msi_grp)
> {
> - return readl_relaxed(msi->msi_regs + MSI_INT0 + (msi_grp << 16));
> + return readl_relaxed(msi->msi_regs + MSI_INT0 +
> + FIELD_PREP(MSI_INTR_MASK, msi_grp));
> }
>
> /*
> - * With 2048 MSI vectors supported, the MSI message can be constructed using
> - * following scheme:
> - * - Divide into 8 256-vector groups
> - * Group 0: 0-255
> - * Group 1: 256-511
> - * Group 2: 512-767
> - * ...
> - * Group 7: 1792-2047
> - * - Each 256-vector group is divided into 16 16-vector groups
> - * As an example: 16 16-vector groups for 256-vector group 0-255 is
> - * Group 0: 0-15
> - * Group 1: 16-32
> - * ...
> - * Group 15: 240-255
> - * - The termination address of MSI vector in 256-vector group n and 16-vector
> - * group x is the address of MSIxIRn
> - * - The data for MSI vector in 16-vector group x is x
> + * In order to allow an MSI to be moved from one CPU to another without
> + * having to repaint both the address and the data (which cannot be done
> + * atomically), we statically partitions the MSI frames between CPUs. Given
> + * that XGene-1 has 8 CPUs, each CPU gets two frames assigned to it
> + *
> + * We adopt the convention that when an MSI is moved, it is configured to
> + * target the same register number in the new frame. This reserves a given
> + * MSI across all CPUs, and reduces the MSI capacity from 2048 to 256.
> + *
> + * Effectively, this amounts to:
> + * - hwirq[7]::cpu[2:0] is the target frame number
> + * - hwirq[6:4] is the register index in any given frame
> + * - hwirq[3:0] is the MSI data
I think that adding macros to define these subfields shifts would simplify
reading and reviewing the code - while reviewing xgene_msi_isr() I
realized it is hard to understand where the shifts to pack/unpack hwirq come
from. I'd understand you don't want to use FIELD_PREP/GET on hwirq, it
is not a HW field but rather a SW encoding you created but at least defining
the shifts and using them throughout would help.
Other than that the code looks fine to me, thanks a lot for cleaning
this up (and fixing it in the process, by the way).
Lorenzo
> */
> -static u32 hwirq_to_reg_set(unsigned long hwirq)
> -{
> - return (hwirq / (NR_HW_IRQS * IRQS_PER_IDX));
> -}
> -
> -static u32 hwirq_to_group(unsigned long hwirq)
> -{
> - return (hwirq % NR_HW_IRQS);
> -}
> -
> -static u32 hwirq_to_msi_data(unsigned long hwirq)
> -{
> - return ((hwirq / NR_HW_IRQS) % IRQS_PER_IDX);
> -}
> -
> static void xgene_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
> {
> struct xgene_msi *msi = irq_data_get_irq_chip_data(data);
> - u32 reg_set = hwirq_to_reg_set(data->hwirq);
> - u32 group = hwirq_to_group(data->hwirq);
> - u64 target_addr = msi->msi_addr + (((8 * group) + reg_set) << 16);
> + u64 target_addr;
> + u32 frame, msir;
> + int cpu;
>
> - msg->address_hi = upper_32_bits(target_addr);
> - msg->address_lo = lower_32_bits(target_addr);
> - msg->data = hwirq_to_msi_data(data->hwirq);
> -}
> + cpu = cpumask_first(irq_data_get_effective_affinity_mask(data));
> + msir = FIELD_GET(GENMASK(6, 4), data->hwirq);
> + frame = FIELD_GET(BIT(7), data->hwirq) << 3 | cpu;
>
> -/*
> - * X-Gene v1 only has 16 MSI GIC IRQs for 2048 MSI vectors. To maintain
> - * the expected behaviour of .set_affinity for each MSI interrupt, the 16
> - * MSI GIC IRQs are statically allocated to 8 X-Gene v1 cores (2 GIC IRQs
> - * for each core). The MSI vector is moved from 1 MSI GIC IRQ to another
> - * MSI GIC IRQ to steer its MSI interrupt to correct X-Gene v1 core. As a
> - * consequence, the total MSI vectors that X-Gene v1 supports will be
> - * reduced to 256 (2048/8) vectors.
> - */
> -static int hwirq_to_cpu(unsigned long hwirq)
> -{
> - return (hwirq % num_possible_cpus());
> -}
> + target_addr = msi->msi_addr;
> + target_addr += (FIELD_PREP(MSI_GROUP_MASK, frame) |
> + FIELD_PREP(MSI_INTR_MASK, msir));
>
> -static unsigned long hwirq_to_canonical_hwirq(unsigned long hwirq)
> -{
> - return (hwirq - hwirq_to_cpu(hwirq));
> + msg->address_hi = upper_32_bits(target_addr);
> + msg->address_lo = lower_32_bits(target_addr);
> + msg->data = data->hwirq & GENMASK(3, 0);
> }
>
> static int xgene_msi_set_affinity(struct irq_data *irqdata,
> const struct cpumask *mask, bool force)
> {
> int target_cpu = cpumask_first(mask);
> - int curr_cpu;
>
> - curr_cpu = hwirq_to_cpu(irqdata->hwirq);
> - if (curr_cpu == target_cpu)
> - return IRQ_SET_MASK_OK_DONE;
> -
> - /* Update MSI number to target the new CPU */
> - irqdata->hwirq = hwirq_to_canonical_hwirq(irqdata->hwirq) + target_cpu;
> + irq_data_update_effective_affinity(irqdata, cpumask_of(target_cpu));
>
> + /* Force the core code to regenerate the message */
> return IRQ_SET_MASK_OK;
> }
>
> @@ -173,23 +152,20 @@ static int xgene_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
> unsigned int nr_irqs, void *args)
> {
> struct xgene_msi *msi = domain->host_data;
> - int msi_irq;
> + irq_hw_number_t hwirq;
>
> mutex_lock(&msi->bitmap_lock);
>
> - msi_irq = bitmap_find_next_zero_area(msi->bitmap, NR_MSI_VEC, 0,
> - num_possible_cpus(), 0);
> - if (msi_irq < NR_MSI_VEC)
> - bitmap_set(msi->bitmap, msi_irq, num_possible_cpus());
> - else
> - msi_irq = -ENOSPC;
> + hwirq = find_first_zero_bit(msi->bitmap, NR_MSI_VEC);
> + if (hwirq < NR_MSI_VEC)
> + set_bit(hwirq, msi->bitmap);
>
> mutex_unlock(&msi->bitmap_lock);
>
> - if (msi_irq < 0)
> - return msi_irq;
> + if (hwirq >= NR_MSI_VEC)
> + return -ENOSPC;
>
> - irq_domain_set_info(domain, virq, msi_irq,
> + irq_domain_set_info(domain, virq, hwirq,
> &xgene_msi_bottom_irq_chip, domain->host_data,
> handle_simple_irq, NULL, NULL);
>
> @@ -201,12 +177,10 @@ static void xgene_irq_domain_free(struct irq_domain *domain,
> {
> struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> struct xgene_msi *msi = irq_data_get_irq_chip_data(d);
> - u32 hwirq;
>
> mutex_lock(&msi->bitmap_lock);
>
> - hwirq = hwirq_to_canonical_hwirq(d->hwirq);
> - bitmap_clear(msi->bitmap, hwirq, num_possible_cpus());
> + clear_bit(d->hwirq, msi->bitmap);
>
> mutex_unlock(&msi->bitmap_lock);
>
> @@ -263,55 +237,33 @@ static void xgene_msi_isr(struct irq_desc *desc)
> unsigned int *irqp = irq_desc_get_handler_data(desc);
> struct irq_chip *chip = irq_desc_get_chip(desc);
> struct xgene_msi *xgene_msi = xgene_msi_ctrl;
> - int msir_index, msir_val, hw_irq, ret;
> - u32 intr_index, grp_select, msi_grp;
> + unsigned long grp_pending;
> + int msir_idx;
> + u32 msi_grp;
>
> chained_irq_enter(chip, desc);
>
> msi_grp = irqp - xgene_msi->gic_irq;
>
> - /*
> - * MSIINTn (n is 0..F) indicates if there is a pending MSI interrupt
> - * If bit x of this register is set (x is 0..7), one or more interrupts
> - * corresponding to MSInIRx is set.
> - */
> - grp_select = xgene_msi_int_read(xgene_msi, msi_grp);
> - while (grp_select) {
> - msir_index = ffs(grp_select) - 1;
> - /*
> - * Calculate MSInIRx address to read to check for interrupts
> - * (refer to termination address and data assignment
> - * described in xgene_compose_msi_msg() )
> - */
> - msir_val = xgene_msi_ir_read(xgene_msi, msi_grp, msir_index);
> - while (msir_val) {
> - intr_index = ffs(msir_val) - 1;
> - /*
> - * Calculate MSI vector number (refer to the termination
> - * address and data assignment described in
> - * xgene_compose_msi_msg function)
> - */
> - hw_irq = (((msir_index * IRQS_PER_IDX) + intr_index) *
> - NR_HW_IRQS) + msi_grp;
> - /*
> - * As we have multiple hw_irq that maps to single MSI,
> - * always look up the virq using the hw_irq as seen from
> - * CPU0
> - */
> - hw_irq = hwirq_to_canonical_hwirq(hw_irq);
> - ret = generic_handle_domain_irq(xgene_msi->inner_domain, hw_irq);
> + grp_pending = xgene_msi_int_read(xgene_msi, msi_grp);
> +
> + for_each_set_bit(msir_idx, &grp_pending, IDX_PER_GROUP) {
> + unsigned long msir;
> + int intr_idx;
> +
> + msir = xgene_msi_ir_read(xgene_msi, msi_grp, msir_idx);
> +
> + for_each_set_bit(intr_idx, &msir, IRQS_PER_IDX) {
> + irq_hw_number_t hwirq;
> + int ret;
> +
> + hwirq = ((FIELD_GET(BIT(3), msi_grp) << 7) |
> + (msir_idx << 4) |
> + intr_idx);
> +
> + ret = generic_handle_domain_irq(xgene_msi->inner_domain,
> + hwirq);
> WARN_ON_ONCE(ret);
> - msir_val &= ~(1 << intr_index);
> - }
> - grp_select &= ~(1 << msir_index);
> -
> - if (!grp_select) {
> - /*
> - * We handled all interrupts happened in this group,
> - * resample this group MSI_INTx register in case
> - * something else has been made pending in the meantime
> - */
> - grp_select = xgene_msi_int_read(xgene_msi, msi_grp);
> }
> }
>
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 08/12] PCI: xgene-msi: Get rid of intermediate tracking structure
2025-06-28 17:30 ` [PATCH 08/12] PCI: xgene-msi: Get rid of intermediate tracking structure Marc Zyngier
@ 2025-07-07 15:22 ` Lorenzo Pieralisi
2025-07-08 14:46 ` Marc Zyngier
0 siblings, 1 reply; 20+ messages in thread
From: Lorenzo Pieralisi @ 2025-07-07 15:22 UTC (permalink / raw)
To: Marc Zyngier
Cc: linux-pci, linux-arm-kernel, linux-kernel, Toan Le,
Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
Bjorn Helgaas, Thomas Gleixner
On Sat, Jun 28, 2025 at 06:30:01PM +0100, Marc Zyngier wrote:
> The xgene-msi driver uses an odd construct in the form of an
> intermediate tracking structure, evidently designed to deal with
> multiple instances of the MSI widget. However, the existing HW
> only has one set, and it is obvious that there won't be new HW
> coming down that particular line.
>
> Simplify the driver by using a bit of pointer arithmetic instead,
> directly tracking the interrupt and avoiding extra memory allocation.
A couple of nits, nothing else.
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> drivers/pci/controller/pci-xgene-msi.c | 58 ++++++++------------------
> 1 file changed, 17 insertions(+), 41 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
> index b3ac0125b3b40..4be79b9ff80df 100644
> --- a/drivers/pci/controller/pci-xgene-msi.c
> +++ b/drivers/pci/controller/pci-xgene-msi.c
> @@ -24,19 +24,13 @@
> #define NR_HW_IRQS 16
> #define NR_MSI_VEC (IDX_PER_GROUP * IRQS_PER_IDX * NR_HW_IRQS)
>
> -struct xgene_msi_group {
> - struct xgene_msi *msi;
> - int gic_irq;
> - u32 msi_grp;
> -};
> -
> struct xgene_msi {
> struct irq_domain *inner_domain;
> u64 msi_addr;
> void __iomem *msi_regs;
> unsigned long *bitmap;
> struct mutex bitmap_lock;
> - struct xgene_msi_group *msi_groups;
> + unsigned int gic_irq[NR_HW_IRQS];
> };
>
> /* Global data */
> @@ -261,27 +255,20 @@ static int xgene_msi_init_allocator(struct device *dev)
>
> mutex_init(&xgene_msi_ctrl->bitmap_lock);
>
> - xgene_msi_ctrl->msi_groups = devm_kcalloc(dev, NR_HW_IRQS,
> - sizeof(struct xgene_msi_group),
> - GFP_KERNEL);
> - if (!xgene_msi_ctrl->msi_groups)
> - return -ENOMEM;
> -
> return 0;
> }
>
> static void xgene_msi_isr(struct irq_desc *desc)
> {
> + unsigned int *irqp = irq_desc_get_handler_data(desc);
> struct irq_chip *chip = irq_desc_get_chip(desc);
> struct xgene_msi *xgene_msi = xgene_msi_ctrl;
> - struct xgene_msi_group *msi_groups;
> int msir_index, msir_val, hw_irq, ret;
> u32 intr_index, grp_select, msi_grp;
>
> chained_irq_enter(chip, desc);
>
> - msi_groups = irq_desc_get_handler_data(desc);
> - msi_grp = msi_groups->msi_grp;
> + msi_grp = irqp - xgene_msi->gic_irq;
>
> /*
> * MSIINTn (n is 0..F) indicates if there is a pending MSI interrupt
> @@ -341,35 +328,31 @@ static void xgene_msi_remove(struct platform_device *pdev)
> cpuhp_remove_state(pci_xgene_online);
> cpuhp_remove_state(CPUHP_PCI_XGENE_DEAD);
>
> - kfree(msi->msi_groups);
> -
> xgene_free_domains(msi);
> }
>
> static int xgene_msi_hwirq_alloc(unsigned int cpu)
> {
> - struct xgene_msi *msi = xgene_msi_ctrl;
> - struct xgene_msi_group *msi_group;
> int i;
> int err;
>
> for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
> - msi_group = &msi->msi_groups[i];
> + unsigned int irq = xgene_msi_ctrl->gic_irq[i];
>
> /*
> * Statically allocate MSI GIC IRQs to each CPU core.
> * With 8-core X-Gene v1, 2 MSI GIC IRQs are allocated
> * to each core.
> */
> - irq_set_status_flags(msi_group->gic_irq, IRQ_NO_BALANCING);
> - err = irq_set_affinity(msi_group->gic_irq, cpumask_of(cpu));
> + irq_set_status_flags(irq, IRQ_NO_BALANCING);
> + err = irq_set_affinity(irq, cpumask_of(cpu));
> if (err) {
> pr_err("failed to set affinity for GIC IRQ");
> return err;
> }
>
> - irq_set_chained_handler_and_data(msi_group->gic_irq,
> - xgene_msi_isr, msi_group);
> + irq_set_chained_handler_and_data(irq, xgene_msi_isr,
> + &xgene_msi_ctrl->gic_irq[i]);
> }
>
> return 0;
> @@ -378,15 +361,12 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
> static int xgene_msi_hwirq_free(unsigned int cpu)
> {
> struct xgene_msi *msi = xgene_msi_ctrl;
> - struct xgene_msi_group *msi_group;
> int i;
>
> for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
> - msi_group = &msi->msi_groups[i];
> - if (!msi_group->gic_irq)
> + if (!msi->gic_irq[i])
In patch 5 we removed this check in xgene_msi_hwirq_alloc(), if it
superfluous there it should be here too.
> continue;
> - irq_set_chained_handler_and_data(msi_group->gic_irq, NULL,
> - NULL);
> + irq_set_chained_handler_and_data(msi->gic_irq[i], NULL, NULL);
> }
> return 0;
> }
> @@ -399,10 +379,9 @@ static const struct of_device_id xgene_msi_match_table[] = {
> static int xgene_msi_probe(struct platform_device *pdev)
> {
> struct resource *res;
> - int rc, irq_index;
Just noticed, insignificant nit: don't see why moving irq_index to a
local loop variable is required in this patch - fine to leave the
code in the patch as-is - reporting it to make sure I have not
missed anything.
Lorenzo
> struct xgene_msi *xgene_msi;
> - int virt_msir;
> u32 msi_val, msi_idx;
> + int rc;
>
> xgene_msi_ctrl = devm_kzalloc(&pdev->dev, sizeof(*xgene_msi_ctrl),
> GFP_KERNEL);
> @@ -432,15 +411,12 @@ static int xgene_msi_probe(struct platform_device *pdev)
> goto error;
> }
>
> - for (irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
> - virt_msir = platform_get_irq(pdev, irq_index);
> - if (virt_msir < 0) {
> - rc = virt_msir;
> + for (int irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
> + rc = platform_get_irq(pdev, irq_index);
> + if (rc < 0)
> goto error;
> - }
> - xgene_msi->msi_groups[irq_index].gic_irq = virt_msir;
> - xgene_msi->msi_groups[irq_index].msi_grp = irq_index;
> - xgene_msi->msi_groups[irq_index].msi = xgene_msi;
> +
> + xgene_msi->gic_irq[irq_index] = rc;
> }
>
> /*
> @@ -448,7 +424,7 @@ static int xgene_msi_probe(struct platform_device *pdev)
> * interrupt handlers, read all of them to clear spurious
> * interrupts that may occur before the driver is probed.
> */
> - for (irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
> + for (int irq_index = 0; irq_index < NR_HW_IRQS; irq_index++) {
> for (msi_idx = 0; msi_idx < IDX_PER_GROUP; msi_idx++)
> xgene_msi_ir_read(xgene_msi, irq_index, msi_idx);
>
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 12/12] PCI: xgene-msi: Restructure handler setup/teardown
2025-07-07 11:14 ` Lorenzo Pieralisi
@ 2025-07-07 15:58 ` Marc Zyngier
0 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-07-07 15:58 UTC (permalink / raw)
To: Lorenzo Pieralisi
Cc: linux-pci, linux-arm-kernel, linux-kernel, Toan Le,
Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
Bjorn Helgaas, Thomas Gleixner
On Mon, 07 Jul 2025 12:14:09 +0100,
Lorenzo Pieralisi <lpieralisi@kernel.org> wrote:
>
> On Sat, Jun 28, 2025 at 06:30:05PM +0100, Marc Zyngier wrote:
> > Another utterly pointless aspect of the xgene-msi driver is that
> > it is built around CPU hotplug. Which is quite amusing since this
> > is one of the few arm64 platforms that, by construction, cannot
> > do CPU hotplug in a supported way (no EL3, no PSCI, no luck).
> >
> > Drop the CPU hotplug nonsense and just setup the IRQs and handlers
> > in a less overdesigned way, grouping things more logically in the
> > process.
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > drivers/pci/controller/pci-xgene-msi.c | 109 +++++++++----------------
> > 1 file changed, 37 insertions(+), 72 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
> > index a22a6df7808c7..9f05c2a12da94 100644
> > --- a/drivers/pci/controller/pci-xgene-msi.c
> > +++ b/drivers/pci/controller/pci-xgene-msi.c
> > @@ -216,12 +216,6 @@ static int xgene_allocate_domains(struct device_node *node,
> > return msi->inner_domain ? 0 : -ENOMEM;
> > }
> >
> > -static void xgene_free_domains(struct xgene_msi *msi)
> > -{
> > - if (msi->inner_domain)
> > - irq_domain_remove(msi->inner_domain);
> > -}
> > -
> > static int xgene_msi_init_allocator(struct device *dev)
> > {
> > xgene_msi_ctrl->bitmap = devm_bitmap_zalloc(dev, NR_MSI_VEC, GFP_KERNEL);
> > @@ -271,26 +265,48 @@ static void xgene_msi_isr(struct irq_desc *desc)
> > chained_irq_exit(chip, desc);
> > }
> >
> > -static enum cpuhp_state pci_xgene_online;
> > -
> > static void xgene_msi_remove(struct platform_device *pdev)
> > {
> > - struct xgene_msi *msi = platform_get_drvdata(pdev);
> > -
> > - if (pci_xgene_online)
> > - cpuhp_remove_state(pci_xgene_online);
> > - cpuhp_remove_state(CPUHP_PCI_XGENE_DEAD);
>
> No question on the patch - just noticed we could remove
> CPUHP_PCI_XGENE_DEAD from cpuhp_state since it would become
> unused AFAICS.
Good point. I'll add that to the queue.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 09/12] PCI: xgene-msi: Sanitise MSI allocation and affinity setting
2025-07-07 14:57 ` Lorenzo Pieralisi
@ 2025-07-08 14:41 ` Marc Zyngier
0 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-07-08 14:41 UTC (permalink / raw)
To: Lorenzo Pieralisi
Cc: linux-pci, linux-arm-kernel, linux-kernel, Toan Le,
Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
Bjorn Helgaas, Thomas Gleixner
On Mon, 07 Jul 2025 15:57:47 +0100,
Lorenzo Pieralisi <lpieralisi@kernel.org> wrote:
>
> On Sat, Jun 28, 2025 at 06:30:02PM +0100, Marc Zyngier wrote:
[...]
> > + * Effectively, this amounts to:
> > + * - hwirq[7]::cpu[2:0] is the target frame number
> > + * - hwirq[6:4] is the register index in any given frame
> > + * - hwirq[3:0] is the MSI data
>
> I think that adding macros to define these subfields shifts would simplify
> reading and reviewing the code - while reviewing xgene_msi_isr() I
> realized it is hard to understand where the shifts to pack/unpack hwirq come
> from. I'd understand you don't want to use FIELD_PREP/GET on hwirq, it
> is not a HW field but rather a SW encoding you created but at least defining
> the shifts and using them throughout would help.
I have no problem using FIELD_*() for that. I've now reworked this to
make it clearer as well as added a bit more documentation on the
behaviour of the MSInRx registers (they are quite funky).
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 08/12] PCI: xgene-msi: Get rid of intermediate tracking structure
2025-07-07 15:22 ` Lorenzo Pieralisi
@ 2025-07-08 14:46 ` Marc Zyngier
0 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2025-07-08 14:46 UTC (permalink / raw)
To: Lorenzo Pieralisi
Cc: linux-pci, linux-arm-kernel, linux-kernel, Toan Le,
Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
Bjorn Helgaas, Thomas Gleixner
On Mon, 07 Jul 2025 16:22:43 +0100,
Lorenzo Pieralisi <lpieralisi@kernel.org> wrote:
>
> On Sat, Jun 28, 2025 at 06:30:01PM +0100, Marc Zyngier wrote:
> > The xgene-msi driver uses an odd construct in the form of an
> > intermediate tracking structure, evidently designed to deal with
> > multiple instances of the MSI widget. However, the existing HW
> > only has one set, and it is obvious that there won't be new HW
> > coming down that particular line.
> >
> > Simplify the driver by using a bit of pointer arithmetic instead,
> > directly tracking the interrupt and avoiding extra memory allocation.
>
> A couple of nits, nothing else.
>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > drivers/pci/controller/pci-xgene-msi.c | 58 ++++++++------------------
> > 1 file changed, 17 insertions(+), 41 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c
> > index b3ac0125b3b40..4be79b9ff80df 100644
> > --- a/drivers/pci/controller/pci-xgene-msi.c
> > +++ b/drivers/pci/controller/pci-xgene-msi.c
> > @@ -24,19 +24,13 @@
> > #define NR_HW_IRQS 16
> > #define NR_MSI_VEC (IDX_PER_GROUP * IRQS_PER_IDX * NR_HW_IRQS)
> >
> > -struct xgene_msi_group {
> > - struct xgene_msi *msi;
> > - int gic_irq;
> > - u32 msi_grp;
> > -};
> > -
> > struct xgene_msi {
> > struct irq_domain *inner_domain;
> > u64 msi_addr;
> > void __iomem *msi_regs;
> > unsigned long *bitmap;
> > struct mutex bitmap_lock;
> > - struct xgene_msi_group *msi_groups;
> > + unsigned int gic_irq[NR_HW_IRQS];
> > };
> >
> > /* Global data */
> > @@ -261,27 +255,20 @@ static int xgene_msi_init_allocator(struct device *dev)
> >
> > mutex_init(&xgene_msi_ctrl->bitmap_lock);
> >
> > - xgene_msi_ctrl->msi_groups = devm_kcalloc(dev, NR_HW_IRQS,
> > - sizeof(struct xgene_msi_group),
> > - GFP_KERNEL);
> > - if (!xgene_msi_ctrl->msi_groups)
> > - return -ENOMEM;
> > -
> > return 0;
> > }
> >
> > static void xgene_msi_isr(struct irq_desc *desc)
> > {
> > + unsigned int *irqp = irq_desc_get_handler_data(desc);
> > struct irq_chip *chip = irq_desc_get_chip(desc);
> > struct xgene_msi *xgene_msi = xgene_msi_ctrl;
> > - struct xgene_msi_group *msi_groups;
> > int msir_index, msir_val, hw_irq, ret;
> > u32 intr_index, grp_select, msi_grp;
> >
> > chained_irq_enter(chip, desc);
> >
> > - msi_groups = irq_desc_get_handler_data(desc);
> > - msi_grp = msi_groups->msi_grp;
> > + msi_grp = irqp - xgene_msi->gic_irq;
> >
> > /*
> > * MSIINTn (n is 0..F) indicates if there is a pending MSI interrupt
> > @@ -341,35 +328,31 @@ static void xgene_msi_remove(struct platform_device *pdev)
> > cpuhp_remove_state(pci_xgene_online);
> > cpuhp_remove_state(CPUHP_PCI_XGENE_DEAD);
> >
> > - kfree(msi->msi_groups);
> > -
> > xgene_free_domains(msi);
> > }
> >
> > static int xgene_msi_hwirq_alloc(unsigned int cpu)
> > {
> > - struct xgene_msi *msi = xgene_msi_ctrl;
> > - struct xgene_msi_group *msi_group;
> > int i;
> > int err;
> >
> > for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
> > - msi_group = &msi->msi_groups[i];
> > + unsigned int irq = xgene_msi_ctrl->gic_irq[i];
> >
> > /*
> > * Statically allocate MSI GIC IRQs to each CPU core.
> > * With 8-core X-Gene v1, 2 MSI GIC IRQs are allocated
> > * to each core.
> > */
> > - irq_set_status_flags(msi_group->gic_irq, IRQ_NO_BALANCING);
> > - err = irq_set_affinity(msi_group->gic_irq, cpumask_of(cpu));
> > + irq_set_status_flags(irq, IRQ_NO_BALANCING);
> > + err = irq_set_affinity(irq, cpumask_of(cpu));
> > if (err) {
> > pr_err("failed to set affinity for GIC IRQ");
> > return err;
> > }
> >
> > - irq_set_chained_handler_and_data(msi_group->gic_irq,
> > - xgene_msi_isr, msi_group);
> > + irq_set_chained_handler_and_data(irq, xgene_msi_isr,
> > + &xgene_msi_ctrl->gic_irq[i]);
> > }
> >
> > return 0;
> > @@ -378,15 +361,12 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu)
> > static int xgene_msi_hwirq_free(unsigned int cpu)
> > {
> > struct xgene_msi *msi = xgene_msi_ctrl;
> > - struct xgene_msi_group *msi_group;
> > int i;
> >
> > for (i = cpu; i < NR_HW_IRQS; i += num_possible_cpus()) {
> > - msi_group = &msi->msi_groups[i];
> > - if (!msi_group->gic_irq)
> > + if (!msi->gic_irq[i])
>
> In patch 5 we removed this check in xgene_msi_hwirq_alloc(), if it
> superfluous there it should be here too.
Hmmm, good point. I'll get rid of that one too.
>
> > continue;
> > - irq_set_chained_handler_and_data(msi_group->gic_irq, NULL,
> > - NULL);
> > + irq_set_chained_handler_and_data(msi->gic_irq[i], NULL, NULL);
> > }
> > return 0;
> > }
> > @@ -399,10 +379,9 @@ static const struct of_device_id xgene_msi_match_table[] = {
> > static int xgene_msi_probe(struct platform_device *pdev)
> > {
> > struct resource *res;
> > - int rc, irq_index;
>
> Just noticed, insignificant nit: don't see why moving irq_index to a
> local loop variable is required in this patch - fine to leave the
> code in the patch as-is - reporting it to make sure I have not
> missed anything.
Not required, just my own obsession with scope reduction of local
variables. I thought that given the magnitude of the changes, I might
as well give myself some artistic license! ;-)
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-07-08 14:46 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-28 17:29 [PATCH 00/12] PCI: xgene: Fix and simplify the MSI driver Marc Zyngier
2025-06-28 17:29 ` [PATCH 01/12] genirq: Teach handle_simple_irq() to resend an in-progress interrupt Marc Zyngier
2025-07-03 16:29 ` Thomas Gleixner
2025-06-28 17:29 ` [PATCH 02/12] PCI: xgene: Defer probing if the MSI widget driver hasn't probed yet Marc Zyngier
2025-06-28 17:29 ` [PATCH 03/12] PCI: xgene: Drop useless conditional compilation Marc Zyngier
2025-06-28 17:29 ` [PATCH 04/12] PCI: xgene: Drop XGENE_PCIE_IP_VER_UNKN Marc Zyngier
2025-06-28 17:29 ` [PATCH 05/12] PCI: xgene-msi: Make per-CPU interrupt setup robust Marc Zyngier
2025-06-28 17:29 ` [PATCH 06/12] PCI: xgene-msi: Drop superfluous fields from xgene_msi structure Marc Zyngier
2025-06-28 17:30 ` [PATCH 07/12] PCI: xgene-msi: Use device-managed memory allocations Marc Zyngier
2025-06-28 17:30 ` [PATCH 08/12] PCI: xgene-msi: Get rid of intermediate tracking structure Marc Zyngier
2025-07-07 15:22 ` Lorenzo Pieralisi
2025-07-08 14:46 ` Marc Zyngier
2025-06-28 17:30 ` [PATCH 09/12] PCI: xgene-msi: Sanitise MSI allocation and affinity setting Marc Zyngier
2025-07-07 14:57 ` Lorenzo Pieralisi
2025-07-08 14:41 ` Marc Zyngier
2025-06-28 17:30 ` [PATCH 10/12] PCI: xgene-msi: Resend an MSI racing with itself on a different CPU Marc Zyngier
2025-06-28 17:30 ` [PATCH 11/12] PCI: xgene-msi: Probe as a standard platform driver Marc Zyngier
2025-06-28 17:30 ` [PATCH 12/12] PCI: xgene-msi: Restructure handler setup/teardown Marc Zyngier
2025-07-07 11:14 ` Lorenzo Pieralisi
2025-07-07 15:58 ` Marc Zyngier
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).