linux-pci.vger.kernel.org archive mirror
* [PATCH 0/3] Enable MSI affinity support for dwc PCI
@ 2025-10-03 16:04 Radu Rendec
  2025-10-03 16:04 ` [PATCH 1/3] genirq: Add interrupt redirection infrastructure Radu Rendec
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Radu Rendec @ 2025-10-03 16:04 UTC
  To: Thomas Gleixner, Manivannan Sadhasivam
  Cc: Bjorn Helgaas, Rob Herring, Krzysztof Wilczyński,
	Lorenzo Pieralisi, Jingoo Han, Brian Masney, Eric Chanudet,
	Alessandro Carminati, Jared Kangas, linux-pci, linux-kernel

Various attempts have been made so far to support CPU affinity control
for (de)multiplexed interrupts. Some examples are [1] and [2]. That work
centered on the idea of controlling the parent interrupt's CPU
affinity, since the child interrupt handler runs in the context of the
parent interrupt handler, on whichever CPU the parent was triggered on.

This is a new attempt based on a different approach. Instead of touching
the parent interrupt's CPU affinity, the child interrupt is allowed to
freely change its affinity setting, independently of the parent. If the
interrupt handler happens to be triggered on an "incompatible" CPU (a
CPU that's not part of the child interrupt's affinity mask), the handler
is redirected and runs in IRQ work context on a "compatible" CPU. This
is a direct follow-up to the (unsubmitted) patches that Thomas Gleixner
proposed in [3].

The first patch adds support for interrupt redirection to the IRQ core,
without making any functional change to irqchip drivers. The other two
patches modify the dwc PCI core driver to enable interrupt redirection
using the new infrastructure added in the first patch.

Thomas, note that I made a small design change to your original patches.
Instead of keeping track of the parent interrupt's affinity setting (or
rather the first CPU in its affinity mask) and attempting to pick the
same CPU for the child (as the target CPU) if possible, I just check if
the child handler fires on a CPU that's part of its affinity mask (which
is already stored anyway). As an optimization for the case when the
current CPU is *not* part of the mask and the handler needs to be
redirected, I pre-calculate and store the first CPU in the mask, at the
time when the child affinity is set. In my opinion, this is simpler and
cleaner, at the expense of a cpumask_test_cpu() call on the fast path,
because:
- It no longer needs to keep track of the parent interrupt's affinity
  setting.
- If the parent interrupt can run on more than one CPU, the child can
  also run on any of those CPUs without being redirected (provided the
  child's affinity mask is the same as the parent's or a superset).
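
The fast-path check and the cached fallback CPU described above can be
modeled in plain userspace C. This is only a sketch under stated
assumptions: `cpu_mask_t`, `struct child_irq`, `child_set_affinity()` and
`child_target_cpu()` are hypothetical names that do not exist in the
kernel, a 64-bit word stands in for `struct cpumask`, and rejecting an
empty mask is a choice of this model rather than something the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N (N < 64). */
typedef uint64_t cpu_mask_t;

/* Hypothetical model of the per-child state used by the redirection logic. */
struct child_irq {
	cpu_mask_t affinity;		/* child's affinity mask */
	unsigned int fallback_cpu;	/* first CPU in the mask, cached at set time */
};

/*
 * Slow path: store the new mask and pre-calculate its first CPU, so the
 * interrupt path never has to search the mask.
 */
static bool child_set_affinity(struct child_irq *irq, cpu_mask_t mask)
{
	unsigned int cpu = 0;

	if (mask == 0)
		return false;	/* model-only choice: refuse an empty mask */

	/* Terminates because at least one of the 64 bits is set. */
	while (!(mask & (UINT64_C(1) << cpu)))
		cpu++;

	irq->affinity = mask;
	irq->fallback_cpu = cpu;
	return true;
}

/*
 * Fast path: one mask test. If the current CPU is allowed, the handler
 * runs in place; otherwise it would be redirected to the cached CPU.
 * this_cpu must be below 64 in this model.
 */
static unsigned int child_target_cpu(const struct child_irq *irq,
				     unsigned int this_cpu)
{
	if (irq->affinity & (UINT64_C(1) << this_cpu))
		return this_cpu;

	return irq->fallback_cpu;
}
```

With the child bound to CPUs 2 and 3 (mask 0x0C), an interrupt arriving on
CPU 3 runs in place, while one arriving on CPU 0 would be redirected to
CPU 2, the cached first CPU of the mask.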

Last but not least, since most of the code in these patches is your
code, I took the liberty of adding your From and Signed-off-by tags to
properly attribute authorship. I hope that's all right, and if for any
reason you don't want that, then please accept my apologies and I will
remove them in a future version. Of course, you can always remove them
yourself if you want (assuming the patches are merged at some point),
since you are the maintainer :)

[1] https://lore.kernel.org/all/20220502102137.764606ee@thinkpad/
[2] https://lore.kernel.org/all/20230530214550.864894-1-rrendec@redhat.com/
[3] https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/

Radu Rendec (3):
  genirq: Add interrupt redirection infrastructure
  PCI: dwc: Code cleanup
  PCI: dwc: Enable MSI affinity support

 .../pci/controller/dwc/pcie-designware-host.c | 123 ++++++++----------
 drivers/pci/controller/dwc/pcie-designware.h  |   7 +-
 include/linux/irq.h                           |   6 +
 include/linux/irqdesc.h                       |  11 +-
 kernel/irq/chip.c                             |  20 +++
 kernel/irq/irqdesc.c                          |  51 +++++++-
 kernel/irq/manage.c                           |  16 ++-
 7 files changed, 154 insertions(+), 80 deletions(-)

-- 
2.51.0



* [PATCH 1/3] genirq: Add interrupt redirection infrastructure
  2025-10-03 16:04 [PATCH 0/3] Enable MSI affinity support for dwc PCI Radu Rendec
@ 2025-10-03 16:04 ` Radu Rendec
  2025-10-03 16:04 ` [PATCH 2/3] PCI: dwc: Code cleanup Radu Rendec
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Radu Rendec @ 2025-10-03 16:04 UTC
  To: Thomas Gleixner, Manivannan Sadhasivam
  Cc: Bjorn Helgaas, Rob Herring, Krzysztof Wilczyński,
	Lorenzo Pieralisi, Jingoo Han, Brian Masney, Eric Chanudet,
	Alessandro Carminati, Jared Kangas, linux-pci, linux-kernel

From: Thomas Gleixner <tglx@linutronix.de>

Add infrastructure to redirect interrupt handler execution to a
different CPU when the current CPU is not part of the interrupt's CPU
affinity mask.

This is primarily aimed at (de)multiplexed interrupts, where the child
interrupt handler runs in the context of the parent interrupt handler,
and therefore CPU affinity control for the child interrupt is typically
not available.

With the new infrastructure, the child interrupt is allowed to freely
change its affinity setting, independently of the parent. If the
interrupt handler happens to be triggered on an "incompatible" CPU (a
CPU that's not part of the child interrupt's affinity mask), the handler
is redirected and runs in IRQ work context on a "compatible" CPU.

No functional change is being made to any existing irqchip driver, and
irqchip drivers must be explicitly modified to use the newly added
infrastructure to support interrupt redirection.

This is a direct follow-up to the patches that Thomas Gleixner proposed
in https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/
Co-developed-by: Radu Rendec <rrendec@redhat.com>
Signed-off-by: Radu Rendec <rrendec@redhat.com>
---
 include/linux/irq.h     |  6 +++++
 include/linux/irqdesc.h | 11 ++++++++-
 kernel/irq/chip.c       | 20 ++++++++++++++++
 kernel/irq/irqdesc.c    | 51 +++++++++++++++++++++++++++++++++++++++--
 kernel/irq/manage.c     | 16 +++++++++++--
 5 files changed, 99 insertions(+), 5 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index c67e76fbcc077..484d4aed08084 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -459,6 +459,8 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
  *			checks against the supplied affinity mask are not
  *			required. This is used for CPU hotplug where the
  *			target CPU is not yet set in the cpu_online_mask.
+ * @irq_pre_redirect:	Optional function to be invoked before redirecting
+ *			an interrupt via irq_work.
  * @irq_retrigger:	resend an IRQ to the CPU
  * @irq_set_type:	set the flow type (IRQ_TYPE_LEVEL/etc.) of an IRQ
  * @irq_set_wake:	enable/disable power-management wake-on of an IRQ
@@ -503,6 +505,7 @@ struct irq_chip {
 	void		(*irq_eoi)(struct irq_data *data);
 
 	int		(*irq_set_affinity)(struct irq_data *data, const struct cpumask *dest, bool force);
+	void		(*irq_pre_redirect)(struct irq_data *data);
 	int		(*irq_retrigger)(struct irq_data *data);
 	int		(*irq_set_type)(struct irq_data *data, unsigned int flow_type);
 	int		(*irq_set_wake)(struct irq_data *data, unsigned int on);
@@ -690,6 +693,9 @@ extern int irq_chip_request_resources_parent(struct irq_data *data);
 extern void irq_chip_release_resources_parent(struct irq_data *data);
 #endif
 
+int irq_chip_redirect_set_affinity(struct irq_data *data, const struct cpumask *dest, bool force);
+void irq_chip_pre_redirect_parent(struct irq_data *data);
+
 /* Disable or mask interrupts during a kernel kexec */
 extern void machine_kexec_mask_interrupts(void);
 
diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
index fd091c35d5721..aeead91884668 100644
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -2,9 +2,10 @@
 #ifndef _LINUX_IRQDESC_H
 #define _LINUX_IRQDESC_H
 
-#include <linux/rcupdate.h>
+#include <linux/irq_work.h>
 #include <linux/kobject.h>
 #include <linux/mutex.h>
+#include <linux/rcupdate.h>
 
 /*
  * Core internal functions to deal with irq descriptors
@@ -29,6 +30,11 @@ struct irqstat {
 #endif
 };
 
+struct irq_redirect {
+	struct irq_work	work;
+	unsigned int	fallback_cpu;
+};
+
 /**
  * struct irq_desc - interrupt descriptor
  * @irq_common_data:	per irq and chip data passed down to chip functions
@@ -46,6 +52,7 @@ struct irqstat {
  * @threads_handled:	stats field for deferred spurious detection of threaded handlers
  * @threads_handled_last: comparator field for deferred spurious detection of threaded handlers
  * @lock:		locking for SMP
+ * @redirect:		Facility for redirecting interrupts via irq_work
  * @affinity_hint:	hint to user space for preferred irq affinity
  * @affinity_notify:	context for notification of affinity changes
  * @pending_mask:	pending rebalanced interrupts
@@ -84,6 +91,7 @@ struct irq_desc {
 	struct cpumask		*percpu_enabled;
 	const struct cpumask	*percpu_affinity;
 #ifdef CONFIG_SMP
+	struct irq_redirect	redirect;
 	const struct cpumask	*affinity_hint;
 	struct irq_affinity_notify *affinity_notify;
 #ifdef CONFIG_GENERIC_PENDING_IRQ
@@ -186,6 +194,7 @@ int generic_handle_irq_safe(unsigned int irq);
 int generic_handle_domain_irq(struct irq_domain *domain, unsigned int hwirq);
 int generic_handle_domain_irq_safe(struct irq_domain *domain, unsigned int hwirq);
 int generic_handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq);
+bool generic_handle_demux_domain_irq(struct irq_domain *domain, unsigned int hwirq);
 #endif
 
 /* Test to see if a driver has successfully requested an irq */
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 3ffa0d80ddd19..8e74c6fc63f86 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1215,6 +1215,26 @@ EXPORT_SYMBOL_GPL(handle_fasteoi_mask_irq);
 
 #endif /* CONFIG_IRQ_FASTEOI_HIERARCHY_HANDLERS */
 
+#ifdef CONFIG_SMP
+
+int irq_chip_redirect_set_affinity(struct irq_data *data, const struct cpumask *dest, bool force)
+{
+	struct irq_redirect *redir = &irq_data_to_desc(data)->redirect;
+
+	WRITE_ONCE(redir->fallback_cpu, cpumask_first(dest));
+	return IRQ_SET_MASK_OK;
+}
+EXPORT_SYMBOL_GPL(irq_chip_redirect_set_affinity);
+
+void irq_chip_pre_redirect_parent(struct irq_data *data)
+{
+	data = data->parent_data;
+	data->chip->irq_pre_redirect(data);
+}
+EXPORT_SYMBOL_GPL(irq_chip_pre_redirect_parent);
+
+#endif
+
 /**
  * irq_chip_set_parent_state - set the state of a parent interrupt.
  *
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index db714d3014b5f..d704025751315 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -78,8 +78,12 @@ static int alloc_masks(struct irq_desc *desc, int node)
 	return 0;
 }
 
-static void desc_smp_init(struct irq_desc *desc, int node,
-			  const struct cpumask *affinity)
+static void irq_redirect_work(struct irq_work *work)
+{
+	handle_irq_desc(container_of(work, struct irq_desc, redirect.work));
+}
+
+static void desc_smp_init(struct irq_desc *desc, int node, const struct cpumask *affinity)
 {
 	if (!affinity)
 		affinity = irq_default_affinity;
@@ -91,6 +95,7 @@ static void desc_smp_init(struct irq_desc *desc, int node,
 #ifdef CONFIG_NUMA
 	desc->irq_common_data.node = node;
 #endif
+	desc->redirect.work = IRQ_WORK_INIT_HARD(irq_redirect_work);
 }
 
 static void free_masks(struct irq_desc *desc)
@@ -766,6 +771,48 @@ int generic_handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq)
 	WARN_ON_ONCE(!in_nmi());
 	return handle_irq_desc(irq_resolve_mapping(domain, hwirq));
 }
+
+static bool demux_redirect_remote(struct irq_desc *desc)
+{
+#ifdef CONFIG_SMP
+	const struct cpumask *m = irq_data_get_effective_affinity_mask(&desc->irq_data);
+	unsigned int target_cpu = READ_ONCE(desc->redirect.fallback_cpu);
+
+	if (!cpumask_test_cpu(smp_processor_id(), m)) {
+		/* Protect against shutdown */
+		if (desc->action)
+			irq_work_queue_on(&desc->redirect.work, target_cpu);
+		return true;
+	}
+#endif
+	return false;
+}
+
+/**
+ * generic_handle_demux_domain_irq - Invoke the handler for a hardware interrupt
+ *				     of a demultiplexing domain.
+ * @domain:	The domain where to perform the lookup
+ * @hwirq:	The hardware interrupt number to convert to a logical one
+ *
+ * Returns:	True on success, or false if lookup has failed
+ */
+bool generic_handle_demux_domain_irq(struct irq_domain *domain, unsigned int hwirq)
+{
+	struct irq_desc *desc = irq_resolve_mapping(domain, hwirq);
+
+	if (unlikely(!desc))
+		return false;
+
+	scoped_guard(raw_spinlock, &desc->lock) {
+		if (desc->irq_data.chip->irq_pre_redirect)
+			desc->irq_data.chip->irq_pre_redirect(&desc->irq_data);
+		if (demux_redirect_remote(desc))
+			return true;
+	}
+	return !handle_irq_desc(desc);
+}
+EXPORT_SYMBOL_GPL(generic_handle_demux_domain_irq);
+
 #endif
 
 /* Dynamic interrupt handling */
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index c94837382037e..ed8f8b2667b0b 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -35,6 +35,16 @@ static int __init setup_forced_irqthreads(char *arg)
 early_param("threadirqs", setup_forced_irqthreads);
 #endif
 
+#ifdef CONFIG_SMP
+static inline void synchronize_irqwork(struct irq_desc *desc)
+{
+	/* Synchronize pending or on the fly redirect work */
+	irq_work_sync(&desc->redirect.work);
+}
+#else
+static inline void synchronize_irqwork(struct irq_desc *desc) { }
+#endif
+
 static int __irq_get_irqchip_state(struct irq_data *d, enum irqchip_irq_state which, bool *state);
 
 static void __synchronize_hardirq(struct irq_desc *desc, bool sync_chip)
@@ -43,6 +53,8 @@ static void __synchronize_hardirq(struct irq_desc *desc, bool sync_chip)
 	bool inprogress;
 
 	do {
+		synchronize_irqwork(desc);
+
 		/*
 		 * Wait until we're out of the critical section.  This might
 		 * give the wrong answer due to the lack of memory barriers.
@@ -108,6 +120,7 @@ EXPORT_SYMBOL(synchronize_hardirq);
 static void __synchronize_irq(struct irq_desc *desc)
 {
 	__synchronize_hardirq(desc, true);
+
 	/*
 	 * We made sure that no hardirq handler is running. Now verify that no
 	 * threaded handlers are active.
@@ -217,8 +230,7 @@ static inline void irq_validate_effective_affinity(struct irq_data *data) { }
 
 static DEFINE_PER_CPU(struct cpumask, __tmp_mask);
 
-int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
-			bool force)
+int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
 {
 	struct cpumask *tmp_mask = this_cpu_ptr(&__tmp_mask);
 	struct irq_desc *desc = irq_data_to_desc(data);
-- 
2.51.0



* [PATCH 2/3] PCI: dwc: Code cleanup
  2025-10-03 16:04 [PATCH 0/3] Enable MSI affinity support for dwc PCI Radu Rendec
  2025-10-03 16:04 ` [PATCH 1/3] genirq: Add interrupt redirection infrastructure Radu Rendec
@ 2025-10-03 16:04 ` Radu Rendec
  2025-10-03 16:04 ` [PATCH 3/3] PCI: dwc: Enable MSI affinity support Radu Rendec
  2025-10-03 16:32 ` [PATCH 0/3] Enable MSI affinity support for dwc PCI Manivannan Sadhasivam
  3 siblings, 0 replies; 7+ messages in thread
From: Radu Rendec @ 2025-10-03 16:04 UTC
  To: Thomas Gleixner, Manivannan Sadhasivam
  Cc: Bjorn Helgaas, Rob Herring, Krzysztof Wilczyński,
	Lorenzo Pieralisi, Jingoo Han, Brian Masney, Eric Chanudet,
	Alessandro Carminati, Jared Kangas, linux-pci, linux-kernel

From: Thomas Gleixner <tglx@linutronix.de>

Code cleanup with no functional changes. These changes were originally
made by Thomas Gleixner (see Link tag below) in a patch that was never
submitted as is. Other parts of that patch were eventually submitted as
commit 8e717112caf3 ("PCI: dwc: Switch to msi_create_parent_irq_domain()")
and the remaining parts are the code cleanup changes in this patch.

Summary of changes:
- Use guard()/scoped_guard() instead of open-coded lock/unlock.
- Return void in a few functions whose return value is never used.
- Simplify dw_handle_msi_irq() by using for_each_set_bit().

One notable deviation from the original patch is that I reverted to
a simple one-by-one iteration over the controllers inside
dw_handle_msi_irq(). The reason is that with the original changes, the
IRQ offset was calculated incorrectly.
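
The per-controller iteration and the IRQ offset it relies on can be
modeled in plain userspace C. This is a sketch, not the driver code:
`collect_pending_hwirqs()` is a hypothetical helper, the status words are
passed in as an array instead of being read from the DBI registers, an
open-coded bit test stands in for for_each_set_bit(), and the value 32 for
MAX_MSI_IRQS_PER_CTRL is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed for this sketch: each controller serves 32 MSI vectors. */
#define MAX_MSI_IRQS_PER_CTRL	32u

/*
 * Hypothetical helper: walk the controllers one by one and record the
 * hardware IRQ number of every pending bit. The offset is derived from
 * the controller index alone, so bit 'pos' of controller 'i' always maps
 * to i * MAX_MSI_IRQS_PER_CTRL + pos.
 *
 * Returns the number of entries written to 'out' (at most 'out_len').
 */
static size_t collect_pending_hwirqs(const uint32_t *status,
				     unsigned int num_ctrls,
				     unsigned int *out, size_t out_len)
{
	size_t n = 0;

	for (unsigned int i = 0; i < num_ctrls; i++) {
		unsigned int irq_off = i * MAX_MSI_IRQS_PER_CTRL;

		if (!status[i])
			continue;	/* nothing pending on this controller */

		for (unsigned int pos = 0; pos < MAX_MSI_IRQS_PER_CTRL; pos++) {
			if (!(status[i] & (UINT32_C(1) << pos)))
				continue;
			if (n < out_len)
				out[n++] = irq_off + pos;
		}
	}

	return n;
}
```

For two controllers with status words 0x5 and 0x80000000, the helper
reports hardware IRQs 0, 2 and 63: bits 0 and 2 of the first controller,
and bit 31 of the second controller shifted by its offset of 32.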

This patch also prepares the ground for the next patch in the series,
which enables MSI affinity support, and was originally part of that same
series that Thomas Gleixner prepared.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/
Co-developed-by: Radu Rendec <rrendec@redhat.com>
Signed-off-by: Radu Rendec <rrendec@redhat.com>
---
 .../pci/controller/dwc/pcie-designware-host.c | 98 ++++++-------------
 drivers/pci/controller/dwc/pcie-designware.h  |  7 +-
 2 files changed, 34 insertions(+), 71 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 952f8594b5012..3ee6a464726ec 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -42,35 +42,25 @@ static const struct msi_parent_ops dw_pcie_msi_parent_ops = {
 };
 
 /* MSI int handler */
-irqreturn_t dw_handle_msi_irq(struct dw_pcie_rp *pp)
+void dw_handle_msi_irq(struct dw_pcie_rp *pp)
 {
-	int i, pos;
-	unsigned long val;
-	u32 status, num_ctrls;
-	irqreturn_t ret = IRQ_NONE;
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
+	unsigned int i, num_ctrls;
 
 	num_ctrls = pp->num_vectors / MAX_MSI_IRQS_PER_CTRL;
 
 	for (i = 0; i < num_ctrls; i++) {
-		status = dw_pcie_readl_dbi(pci, PCIE_MSI_INTR0_STATUS +
-					   (i * MSI_REG_CTRL_BLOCK_SIZE));
+		unsigned int reg_off = i * MSI_REG_CTRL_BLOCK_SIZE;
+		unsigned int irq_off = i * MAX_MSI_IRQS_PER_CTRL;
+		unsigned long status, pos;
+
+		status = dw_pcie_readl_dbi(pci, PCIE_MSI_INTR0_STATUS + reg_off);
 		if (!status)
 			continue;
 
-		ret = IRQ_HANDLED;
-		val = status;
-		pos = 0;
-		while ((pos = find_next_bit(&val, MAX_MSI_IRQS_PER_CTRL,
-					    pos)) != MAX_MSI_IRQS_PER_CTRL) {
-			generic_handle_domain_irq(pp->irq_domain,
-						  (i * MAX_MSI_IRQS_PER_CTRL) +
-						  pos);
-			pos++;
-		}
+		for_each_set_bit(pos, &status, MAX_MSI_IRQS_PER_CTRL)
+			generic_handle_domain_irq(pp->irq_domain, irq_off + pos);
 	}
-
-	return ret;
 }
 
 /* Chained MSI interrupt service routine */
@@ -91,13 +81,10 @@ static void dw_pci_setup_msi_msg(struct irq_data *d, struct msi_msg *msg)
 {
 	struct dw_pcie_rp *pp = irq_data_get_irq_chip_data(d);
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
-	u64 msi_target;
-
-	msi_target = (u64)pp->msi_data;
+	u64 msi_target = (u64)pp->msi_data;
 
 	msg->address_lo = lower_32_bits(msi_target);
 	msg->address_hi = upper_32_bits(msi_target);
-
 	msg->data = d->hwirq;
 
 	dev_dbg(pci->dev, "msi#%d address_hi %#x address_lo %#x\n",
@@ -109,18 +96,14 @@ static void dw_pci_bottom_mask(struct irq_data *d)
 	struct dw_pcie_rp *pp = irq_data_get_irq_chip_data(d);
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
 	unsigned int res, bit, ctrl;
-	unsigned long flags;
-
-	raw_spin_lock_irqsave(&pp->lock, flags);
 
+	guard(raw_spinlock)(&pp->lock);
 	ctrl = d->hwirq / MAX_MSI_IRQS_PER_CTRL;
 	res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
 	bit = d->hwirq % MAX_MSI_IRQS_PER_CTRL;
 
 	pp->irq_mask[ctrl] |= BIT(bit);
 	dw_pcie_writel_dbi(pci, PCIE_MSI_INTR0_MASK + res, pp->irq_mask[ctrl]);
-
-	raw_spin_unlock_irqrestore(&pp->lock, flags);
 }
 
 static void dw_pci_bottom_unmask(struct irq_data *d)
@@ -128,18 +111,14 @@ static void dw_pci_bottom_unmask(struct irq_data *d)
 	struct dw_pcie_rp *pp = irq_data_get_irq_chip_data(d);
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
 	unsigned int res, bit, ctrl;
-	unsigned long flags;
-
-	raw_spin_lock_irqsave(&pp->lock, flags);
 
+	guard(raw_spinlock)(&pp->lock);
 	ctrl = d->hwirq / MAX_MSI_IRQS_PER_CTRL;
 	res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
 	bit = d->hwirq % MAX_MSI_IRQS_PER_CTRL;
 
 	pp->irq_mask[ctrl] &= ~BIT(bit);
 	dw_pcie_writel_dbi(pci, PCIE_MSI_INTR0_MASK + res, pp->irq_mask[ctrl]);
-
-	raw_spin_unlock_irqrestore(&pp->lock, flags);
 }
 
 static void dw_pci_bottom_ack(struct irq_data *d)
@@ -156,54 +135,42 @@ static void dw_pci_bottom_ack(struct irq_data *d)
 }
 
 static struct irq_chip dw_pci_msi_bottom_irq_chip = {
-	.name = "DWPCI-MSI",
-	.irq_ack = dw_pci_bottom_ack,
-	.irq_compose_msi_msg = dw_pci_setup_msi_msg,
-	.irq_mask = dw_pci_bottom_mask,
-	.irq_unmask = dw_pci_bottom_unmask,
+	.name			= "DWPCI-MSI",
+	.irq_ack		= dw_pci_bottom_ack,
+	.irq_compose_msi_msg	= dw_pci_setup_msi_msg,
+	.irq_mask		= dw_pci_bottom_mask,
+	.irq_unmask		= dw_pci_bottom_unmask,
 };
 
-static int dw_pcie_irq_domain_alloc(struct irq_domain *domain,
-				    unsigned int virq, unsigned int nr_irqs,
-				    void *args)
+static int dw_pcie_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+				    unsigned int nr_irqs, void *args)
 {
 	struct dw_pcie_rp *pp = domain->host_data;
-	unsigned long flags;
-	u32 i;
 	int bit;
 
-	raw_spin_lock_irqsave(&pp->lock, flags);
-
-	bit = bitmap_find_free_region(pp->msi_irq_in_use, pp->num_vectors,
-				      order_base_2(nr_irqs));
-
-	raw_spin_unlock_irqrestore(&pp->lock, flags);
+	scoped_guard (raw_spinlock_irq, &pp->lock) {
+		bit = bitmap_find_free_region(pp->msi_irq_in_use, pp->num_vectors,
+					      order_base_2(nr_irqs));
+	}
 
 	if (bit < 0)
 		return -ENOSPC;
 
-	for (i = 0; i < nr_irqs; i++)
-		irq_domain_set_info(domain, virq + i, bit + i,
-				    pp->msi_irq_chip,
-				    pp, handle_edge_irq,
-				    NULL, NULL);
-
+	for (unsigned int i = 0; i < nr_irqs; i++) {
+		irq_domain_set_info(domain, virq + i, bit + i, pp->msi_irq_chip,
+				    pp, handle_edge_irq, NULL, NULL);
+	}
 	return 0;
 }
 
-static void dw_pcie_irq_domain_free(struct irq_domain *domain,
-				    unsigned int virq, unsigned int nr_irqs)
+static void dw_pcie_irq_domain_free(struct irq_domain *domain, unsigned int virq,
+				    unsigned int nr_irqs)
 {
 	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
 	struct dw_pcie_rp *pp = domain->host_data;
-	unsigned long flags;
-
-	raw_spin_lock_irqsave(&pp->lock, flags);
 
-	bitmap_release_region(pp->msi_irq_in_use, d->hwirq,
-			      order_base_2(nr_irqs));
-
-	raw_spin_unlock_irqrestore(&pp->lock, flags);
+	guard(raw_spinlock_irq)(&pp->lock);
+	bitmap_release_region(pp->msi_irq_in_use, d->hwirq, order_base_2(nr_irqs));
 }
 
 static const struct irq_domain_ops dw_pcie_msi_domain_ops = {
@@ -236,8 +203,7 @@ void dw_pcie_free_msi(struct dw_pcie_rp *pp)
 
 	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++) {
 		if (pp->msi_irq[ctrl] > 0)
-			irq_set_chained_handler_and_data(pp->msi_irq[ctrl],
-							 NULL, NULL);
+			irq_set_chained_handler_and_data(pp->msi_irq[ctrl], NULL, NULL);
 	}
 
 	irq_domain_remove(pp->irq_domain);
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 00f52d472dcdd..226aac41836bc 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -753,7 +753,7 @@ static inline enum dw_pcie_ltssm dw_pcie_get_ltssm(struct dw_pcie *pci)
 #ifdef CONFIG_PCIE_DW_HOST
 int dw_pcie_suspend_noirq(struct dw_pcie *pci);
 int dw_pcie_resume_noirq(struct dw_pcie *pci);
-irqreturn_t dw_handle_msi_irq(struct dw_pcie_rp *pp);
+void dw_handle_msi_irq(struct dw_pcie_rp *pp);
 void dw_pcie_msi_init(struct dw_pcie_rp *pp);
 int dw_pcie_msi_host_init(struct dw_pcie_rp *pp);
 void dw_pcie_free_msi(struct dw_pcie_rp *pp);
@@ -774,10 +774,7 @@ static inline int dw_pcie_resume_noirq(struct dw_pcie *pci)
 	return 0;
 }
 
-static inline irqreturn_t dw_handle_msi_irq(struct dw_pcie_rp *pp)
-{
-	return IRQ_NONE;
-}
+static inline void dw_handle_msi_irq(struct dw_pcie_rp *pp) { }
 
 static inline void dw_pcie_msi_init(struct dw_pcie_rp *pp)
 { }
-- 
2.51.0



* [PATCH 3/3] PCI: dwc: Enable MSI affinity support
  2025-10-03 16:04 [PATCH 0/3] Enable MSI affinity support for dwc PCI Radu Rendec
  2025-10-03 16:04 ` [PATCH 1/3] genirq: Add interrupt redirection infrastructure Radu Rendec
  2025-10-03 16:04 ` [PATCH 2/3] PCI: dwc: Code cleanup Radu Rendec
@ 2025-10-03 16:04 ` Radu Rendec
  2025-10-04  4:42   ` kernel test robot
  2025-10-04  7:27   ` kernel test robot
  2025-10-03 16:32 ` [PATCH 0/3] Enable MSI affinity support for dwc PCI Manivannan Sadhasivam
  3 siblings, 2 replies; 7+ messages in thread
From: Radu Rendec @ 2025-10-03 16:04 UTC
  To: Thomas Gleixner, Manivannan Sadhasivam
  Cc: Bjorn Helgaas, Rob Herring, Krzysztof Wilczyński,
	Lorenzo Pieralisi, Jingoo Han, Brian Masney, Eric Chanudet,
	Alessandro Carminati, Jared Kangas, linux-pci, linux-kernel

From: Thomas Gleixner <tglx@linutronix.de>

Leverage the interrupt redirection infrastructure to enable CPU affinity
support for MSI interrupts. Since the parent interrupt affinity cannot
be changed, affinity control for the child interrupt (MSI) is achieved
by redirecting the handler to run in IRQ work context on the target CPU.

This patch was originally prepared by Thomas Gleixner (see Link tag
below) in a patch series that was never submitted as is, and only
parts of that series have made it upstream so far.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/
Co-developed-by: Radu Rendec <rrendec@redhat.com>
Signed-off-by: Radu Rendec <rrendec@redhat.com>
---
 .../pci/controller/dwc/pcie-designware-host.c | 27 +++++++++++++++----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 3ee6a464726ec..a3d4b423a2ab9 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -24,9 +24,21 @@
 static struct pci_ops dw_pcie_ops;
 static struct pci_ops dw_child_pcie_ops;
 
+static void dw_pcie_msi_ack(struct irq_data *d) { }
+
+static bool dw_pcie_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
+				      struct irq_domain *real_parent, struct msi_domain_info *info)
+{
+	if (!msi_lib_init_dev_msi_info(dev, domain, real_parent, info))
+		return false;
+
+	info->chip->irq_ack = dw_pcie_msi_ack;
+	info->chip->irq_pre_redirect = irq_chip_pre_redirect_parent;
+	return true;
+}
+
 #define DW_PCIE_MSI_FLAGS_REQUIRED (MSI_FLAG_USE_DEF_DOM_OPS		| \
 				    MSI_FLAG_USE_DEF_CHIP_OPS		| \
-				    MSI_FLAG_NO_AFFINITY		| \
 				    MSI_FLAG_PCI_MSI_MASK_PARENT)
 #define DW_PCIE_MSI_FLAGS_SUPPORTED (MSI_FLAG_MULTI_PCI_MSI		| \
 				     MSI_FLAG_PCI_MSIX			| \
@@ -36,9 +48,8 @@ static const struct msi_parent_ops dw_pcie_msi_parent_ops = {
 	.required_flags		= DW_PCIE_MSI_FLAGS_REQUIRED,
 	.supported_flags	= DW_PCIE_MSI_FLAGS_SUPPORTED,
 	.bus_select_token	= DOMAIN_BUS_PCI_MSI,
-	.chip_flags		= MSI_CHIP_FLAG_SET_ACK,
 	.prefix			= "DW-",
-	.init_dev_msi_info	= msi_lib_init_dev_msi_info,
+	.init_dev_msi_info	= dw_pcie_init_dev_msi_info,
 };
 
 /* MSI int handler */
@@ -59,7 +70,7 @@ void dw_handle_msi_irq(struct dw_pcie_rp *pp)
 			continue;
 
 		for_each_set_bit(pos, &status, MAX_MSI_IRQS_PER_CTRL)
-			generic_handle_domain_irq(pp->irq_domain, irq_off + pos);
+			generic_handle_demux_domain_irq(pp->irq_domain, irq_off + pos);
 	}
 }
 
@@ -121,7 +132,9 @@ static void dw_pci_bottom_unmask(struct irq_data *d)
 	dw_pcie_writel_dbi(pci, PCIE_MSI_INTR0_MASK + res, pp->irq_mask[ctrl]);
 }
 
-static void dw_pci_bottom_ack(struct irq_data *d)
+static void dw_pci_bottom_ack(struct irq_data *d) { }
+
+static void dw_pci_pre_redirect(struct irq_data *d)
 {
 	struct dw_pcie_rp *pp  = irq_data_get_irq_chip_data(d);
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
@@ -140,6 +153,10 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip = {
 	.irq_compose_msi_msg	= dw_pci_setup_msi_msg,
 	.irq_mask		= dw_pci_bottom_mask,
 	.irq_unmask		= dw_pci_bottom_unmask,
+#ifdef CONFIG_SMP
+	.irq_pre_redirect	= dw_pci_pre_redirect,
+	.irq_set_affinity	= irq_chip_redirect_set_affinity,
+#endif
 };
 
 static int dw_pcie_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
-- 
2.51.0



* Re: [PATCH 0/3] Enable MSI affinity support for dwc PCI
  2025-10-03 16:04 [PATCH 0/3] Enable MSI affinity support for dwc PCI Radu Rendec
                   ` (2 preceding siblings ...)
  2025-10-03 16:04 ` [PATCH 3/3] PCI: dwc: Enable MSI affinity support Radu Rendec
@ 2025-10-03 16:32 ` Manivannan Sadhasivam
  3 siblings, 0 replies; 7+ messages in thread
From: Manivannan Sadhasivam @ 2025-10-03 16:32 UTC
  To: Radu Rendec
  Cc: Thomas Gleixner, Bjorn Helgaas, Rob Herring,
	Krzysztof Wilczyński, Lorenzo Pieralisi, Jingoo Han,
	Brian Masney, Eric Chanudet, Alessandro Carminati, Jared Kangas,
	linux-pci, linux-kernel, Daniel Tsai, Marek Behún,
	Krishna Chaitanya Chundru

+ folks who were part of previous attempts

On Fri, Oct 03, 2025 at 12:04:18PM -0400, Radu Rendec wrote:
> Various attempts have been made so far to support CPU affinity control
> for (de)multiplexed interrupts. Some examples are [1] and [2]. That work
> was centered around the idea to control the parent interrupt's CPU
> affinity, since the child interrupt handler runs in the context of the
> parent interrupt handler, on whatever CPU it was triggered.
> 
> This is a new attempt based on a different approach. Instead of touching
> the parent interrupt's CPU affinity, the child interrupt is allowed to
> freely change its affinity setting, independently of the parent. If the
> interrupt handler happens to be triggered on an "incompatible" CPU (a
> CPU that's not part of the child interrupt's affinity mask), the handler
> is redirected and runs in IRQ work context on a "compatible" CPU. This
> is a direct follow up to the (unsubmitted) patches that Thomas Gleixner
> proposed in [3].
> 
> The first patch adds support for interrupt redirection to the IRQ core,
> without making any functional change to irqchip drivers. The other two
> patches modify the dwc PCI core driver to enable interrupt redirection
> using the new infrastructure added in the first patch.
> 
> Thomas, however, I made a small design change to your original patches.
> Instead of keeping track of the parent interrupt's affinity setting (or
> rather the first CPU in its affinity mask) and attempting to pick the
> same CPU for the child (as the target CPU) if possible, I just check if
> the child handler fires on a CPU that's part of its affinity mask (which
> is already stored anyway). As an optimization for the case when the
> current CPU is *not* part of the mask and the handler needs to be
> redirected, I pre-calculate and store the first CPU in the mask, at the
> time when the child affinity is set. In my opinion, this is simpler and
> cleaner, at the expense of a cpumask_test_cpu() call on the fast path,
> because:
> - It no longer needs to keep track of the parent interrupt's affinity
>   setting.
> - If the parent interrupt can run on more than one CPU, the child can
>   also run on any of those CPUs without being redirected (in case the
>   child's affinity mask is the same as the parent's or a superset).
> 
> Last but not least, since most of the code in these patches is your
> code, I took the liberty to add your From and Signed-off-by tags to
> properly attribute authorship. I hope that's all right, and if for any
> reason you don't want that, then please accept my apologies and I will
> remove them in a future version. Of course, you can always remove them
> yourself if you want (assuming the patches are merged at some point),
> since you are the maintainer :)
> 
> [1] https://lore.kernel.org/all/20220502102137.764606ee@thinkpad/
> [2] https://lore.kernel.org/all/20230530214550.864894-1-rrendec@redhat.com/
> [3] https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/
> 
> Radu Rendec (3):
>   genirq: Add interrupt redirection infrastructure
>   PCI: dwc: Code cleanup
>   PCI: dwc: Enable MSI affinity support
> 
>  .../pci/controller/dwc/pcie-designware-host.c | 123 ++++++++----------
>  drivers/pci/controller/dwc/pcie-designware.h  |   7 +-
>  include/linux/irq.h                           |   6 +
>  include/linux/irqdesc.h                       |  11 +-
>  kernel/irq/chip.c                             |  20 +++
>  kernel/irq/irqdesc.c                          |  51 +++++++-
>  kernel/irq/manage.c                           |  16 ++-
>  7 files changed, 154 insertions(+), 80 deletions(-)
> 
> -- 
> 2.51.0
> 
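
The fast-path check described in the quoted cover letter can be sketched in
userspace C. This is only a model under stated assumptions: a 64-bit word
stands in for the kernel cpumask, and the names child_irq,
child_set_affinity() and child_pick_cpu() are hypothetical and do not come
from the patches. The actual series uses cpumask_test_cpu() on the fast path
and IRQ work for the redirect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a child interrupt; not a kernel structure. */
struct child_irq {
	uint64_t affinity;       /* bit n set => CPU n is allowed */
	unsigned int target_cpu; /* first CPU in the mask, cached on update */
};

/* Index of the lowest set bit; the caller guarantees mask != 0. */
static unsigned int first_cpu(uint64_t mask)
{
	unsigned int cpu = 0;

	while (!(mask & 1)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/* Slow path: store the mask and pre-calculate the redirect target. */
void child_set_affinity(struct child_irq *irq, uint64_t mask)
{
	assert(mask != 0);
	irq->affinity = mask;
	irq->target_cpu = first_cpu(mask);
}

/*
 * Fast path: return the CPU on which the handler should run.
 * If this_cpu is in the mask the handler runs in place; otherwise
 * it is redirected to the cached first CPU of the mask.
 */
unsigned int child_pick_cpu(const struct child_irq *irq, unsigned int this_cpu)
{
	bool compatible = this_cpu < 64 && ((irq->affinity >> this_cpu) & 1);

	return compatible ? this_cpu : irq->target_cpu;
}
```

With the mask set to CPUs 2 and 3, a handler firing on CPU 3 stays on CPU 3,
while one firing on CPU 0 is redirected to CPU 2. The parent's affinity never
enters the decision, which is the simplification the cover letter argues for.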

-- 
மணிவண்ணன் சதாசிவம்

* Re: [PATCH 3/3] PCI: dwc: Enable MSI affinity support
  2025-10-03 16:04 ` [PATCH 3/3] PCI: dwc: Enable MSI affinity support Radu Rendec
@ 2025-10-04  4:42   ` kernel test robot
  2025-10-04  7:27   ` kernel test robot
  1 sibling, 0 replies; 7+ messages in thread
From: kernel test robot @ 2025-10-04  4:42 UTC (permalink / raw)
  To: Radu Rendec, Thomas Gleixner, Manivannan Sadhasivam
  Cc: oe-kbuild-all, Bjorn Helgaas, Rob Herring,
	Krzysztof Wilczyński, Lorenzo Pieralisi, Jingoo Han,
	Brian Masney, Eric Chanudet, Alessandro Carminati, Jared Kangas,
	linux-pci, linux-kernel

Hi Radu,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tip/irq/core]
[also build test WARNING on pci/next pci/for-linus mani-mhi/mhi-next linus/master v6.17 next-20251003]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Radu-Rendec/genirq-Add-interrupt-redirection-infrastructure/20251004-000948
base:   tip/irq/core
patch link:    https://lore.kernel.org/r/20251003160421.951448-4-rrendec%40redhat.com
patch subject: [PATCH 3/3] PCI: dwc: Enable MSI affinity support
config: i386-randconfig-002-20251004 (https://download.01.org/0day-ci/archive/20251004/202510041241.KgbWC5KM-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251004/202510041241.KgbWC5KM-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510041241.KgbWC5KM-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/pci/controller/dwc/pcie-designware-host.c:137:13: warning: 'dw_pci_pre_redirect' defined but not used [-Wunused-function]
     137 | static void dw_pci_pre_redirect(struct irq_data *d)
         |             ^~~~~~~~~~~~~~~~~~~


vim +/dw_pci_pre_redirect +137 drivers/pci/controller/dwc/pcie-designware-host.c

   136	
 > 137	static void dw_pci_pre_redirect(struct irq_data *d)
   138	{
   139		struct dw_pcie_rp *pp  = irq_data_get_irq_chip_data(d);
   140		struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
   141		unsigned int res, bit, ctrl;
   142	
   143		ctrl = d->hwirq / MAX_MSI_IRQS_PER_CTRL;
   144		res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
   145		bit = d->hwirq % MAX_MSI_IRQS_PER_CTRL;
   146	
   147		dw_pcie_writel_dbi(pci, PCIE_MSI_INTR0_STATUS + res, BIT(bit));
   148	}
   149	
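
The index arithmetic in the listing above can be checked in isolation with a
small userspace sketch. The two constants are assumptions (32 vectors per
controller and a 12-byte register block); verify them against
pcie-designware.h in the tree being built. struct msi_pos and
msi_pos_from_hwirq() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Assumed values; check pcie-designware.h in the tree you build against. */
#define MAX_MSI_IRQS_PER_CTRL   32
#define MSI_REG_CTRL_BLOCK_SIZE 12

/* Hypothetical holder for the values the listing computes from d->hwirq. */
struct msi_pos {
	unsigned int ctrl; /* which MSI controller block */
	unsigned int res;  /* byte offset of that block's registers */
	unsigned int bit;  /* vector index within the block */
	uint32_t mask;     /* value written to the status register */
};

struct msi_pos msi_pos_from_hwirq(unsigned long hwirq)
{
	struct msi_pos p;

	p.ctrl = (unsigned int)(hwirq / MAX_MSI_IRQS_PER_CTRL);
	p.res = p.ctrl * MSI_REG_CTRL_BLOCK_SIZE;
	p.bit = (unsigned int)(hwirq % MAX_MSI_IRQS_PER_CTRL);
	p.mask = UINT32_C(1) << p.bit; /* same as BIT(bit) in the listing */
	return p;
}
```

Under these assumed constants, hwirq 37 maps to controller 1, register offset
12 and bit 5, so the write would target PCIE_MSI_INTR0_STATUS + 12 with the
value 0x20.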

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH 3/3] PCI: dwc: Enable MSI affinity support
  2025-10-03 16:04 ` [PATCH 3/3] PCI: dwc: Enable MSI affinity support Radu Rendec
  2025-10-04  4:42   ` kernel test robot
@ 2025-10-04  7:27   ` kernel test robot
  1 sibling, 0 replies; 7+ messages in thread
From: kernel test robot @ 2025-10-04  7:27 UTC (permalink / raw)
  To: Radu Rendec, Thomas Gleixner, Manivannan Sadhasivam
  Cc: oe-kbuild-all, Bjorn Helgaas, Rob Herring,
	Krzysztof Wilczyński, Lorenzo Pieralisi, Jingoo Han,
	Brian Masney, Eric Chanudet, Alessandro Carminati, Jared Kangas,
	linux-pci, linux-kernel

Hi Radu,

kernel test robot noticed the following build errors:

[auto build test ERROR on tip/irq/core]
[also build test ERROR on pci/next pci/for-linus mani-mhi/mhi-next linus/master v6.17 next-20251003]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Radu-Rendec/genirq-Add-interrupt-redirection-infrastructure/20251004-000948
base:   tip/irq/core
patch link:    https://lore.kernel.org/r/20251003160421.951448-4-rrendec%40redhat.com
patch subject: [PATCH 3/3] PCI: dwc: Enable MSI affinity support
config: x86_64-randconfig-002-20251004 (https://download.01.org/0day-ci/archive/20251004/202510041550.xoRbz92p-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251004/202510041550.xoRbz92p-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510041550.xoRbz92p-lkp@intel.com/

All errors (new ones prefixed by >>):

   ld: vmlinux.o: in function `dw_pcie_init_dev_msi_info':
>> drivers/pci/controller/dwc/pcie-designware-host.c:36:(.text+0x21dd630): undefined reference to `irq_chip_pre_redirect_parent'


vim +36 drivers/pci/controller/dwc/pcie-designware-host.c

    28	
    29	static bool dw_pcie_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
    30					      struct irq_domain *real_parent, struct msi_domain_info *info)
    31	{
    32		if (!msi_lib_init_dev_msi_info(dev, domain, real_parent, info))
    33			return false;
    34	
    35		info->chip->irq_ack = dw_pcie_msi_ack;
  > 36		info->chip->irq_pre_redirect = irq_chip_pre_redirect_parent;
    37		return true;
    38	}
    39	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Thread overview: 7+ messages
2025-10-03 16:04 [PATCH 0/3] Enable MSI affinity support for dwc PCI Radu Rendec
2025-10-03 16:04 ` [PATCH 1/3] genirq: Add interrupt redirection infrastructure Radu Rendec
2025-10-03 16:04 ` [PATCH 2/3] PCI: dwc: Code cleanup Radu Rendec
2025-10-03 16:04 ` [PATCH 3/3] PCI: dwc: Enable MSI affinity support Radu Rendec
2025-10-04  4:42   ` kernel test robot
2025-10-04  7:27   ` kernel test robot
2025-10-03 16:32 ` [PATCH 0/3] Enable MSI affinity support for dwc PCI Manivannan Sadhasivam