linux-kernel.vger.kernel.org archive mirror
* [patch 0/5] genirq, x86: Rework deferred interrupt affinity logic
@ 2024-12-10 10:34 Thomas Gleixner
  2024-12-10 10:34 ` [patch 1/5] ARC: Remove GENERIC_PENDING_IRQ Thomas Gleixner
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Thomas Gleixner @ 2024-12-10 10:34 UTC (permalink / raw)
  To: LKML
  Cc: x86, Anup Patel, Vineet Gupta, Brian Cain, Wei Liu, Steve Wahl,
	Joerg Roedel, Lu Baolu, Juergen Gross

In a recent discussion about the potential race condition with unmaskable
MSI interrupts on RISC-V:

  https://lore.kernel.org/all/87r06gq2di.ffs@tglx

it turned out that RISC-V needs the GENERIC_PENDING_IRQ infrastructure to
close the gap.

The logic behind GENERIC_PENDING_IRQ is slightly convoluted and backwards
for the general case:

   1) The default is to defer, which is not what the majority of interrupt
      controllers need.

   2) Deferral is handled as a per-interrupt flag. That's a pointless
      exercise as the requirement is actually per interrupt controller.

To ease the conversion of RISC-V, rework the logic to base the deferral on
an interrupt chip flag and convert x86 over, which then allows removing the
current double bookkeeping of the non-deferral flag.

The conversion is done in two steps with an intermediate config switch as
RISC-V needs a trivial way to backport the changes.
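
As a rough illustration of points 1) and 2), the following userspace sketch
contrasts the two models. It is not kernel code: the struct and function
names (model_chip, model_irq, old_can_move_immediately,
new_can_move_immediately) are made up for this example, and only the two
flag values are taken from the diffs in this series.

```c
#include <stdbool.h>

/* Illustrative stand-ins; the bit values mirror the diffs in this thread. */
#define MODEL_IRQ_MOVE_PCNTXT		(1u << 14)	/* old: per-interrupt opt-out */
#define MODEL_IRQCHIP_MOVE_DEFERRED	(1u << 12)	/* new: per-chip opt-in */

/* Hypothetical, heavily reduced versions of irq_chip and irq_desc. */
struct model_chip {
	const char	*name;
	unsigned int	flags;
};

struct model_irq {
	const struct model_chip	*chip;
	unsigned int		status;
};

/* Old scheme: deferral is the default, every interrupt has to opt out. */
static bool old_can_move_immediately(const struct model_irq *irq)
{
	return irq->status & MODEL_IRQ_MOVE_PCNTXT;
}

/* New scheme: immediate is the default, the chip opts in to deferral. */
static bool new_can_move_immediately(const struct model_irq *irq)
{
	return !(irq->chip->flags & MODEL_IRQCHIP_MOVE_DEFERRED);
}
```

In this model an interrupt with no flags set anywhere is deferred under the
old scheme and moved immediately under the new one, which matches what the
majority of interrupt controllers need.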

Thanks,

	tglx
---
 arch/arc/Kconfig                    |    1 -
 arch/arc/kernel/mcip.c              |    2 --
 arch/hexagon/Kconfig                |    1 -
 arch/x86/hyperv/irqdomain.c         |    2 +-
 arch/x86/kernel/apic/io_apic.c      |    2 +-
 arch/x86/kernel/apic/msi.c          |    3 ++-
 arch/x86/kernel/hpet.c              |    8 --------
 arch/x86/platform/uv/uv_irq.c       |    2 --
 drivers/iommu/amd/init.c            |    2 +-
 drivers/iommu/amd/iommu.c           |    1 -
 drivers/iommu/intel/irq_remapping.c |    1 -
 drivers/pci/controller/pci-hyperv.c |    1 +
 drivers/xen/events/events_base.c    |    6 ------
 include/linux/irq.h                 |   14 +++-----------
 kernel/irq/Kconfig                  |    4 ++++
 kernel/irq/chip.c                   |    4 +---
 kernel/irq/debugfs.c                |    2 +-
 kernel/irq/internals.h              |    2 +-
 kernel/irq/settings.h               |    6 ------
 19 files changed, 16 insertions(+), 48 deletions(-)

* [patch 1/5] ARC: Remove GENERIC_PENDING_IRQ
  2024-12-10 10:34 [patch 0/5] genirq, x86: Rework deferred interrupt affinity logic Thomas Gleixner
@ 2024-12-10 10:34 ` Thomas Gleixner
  2024-12-10 17:22   ` Vineet Gupta
  2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2024-12-10 10:34 ` [patch 2/5] hexagon: Remove GENERIC_PENDING_IRQ leftover Thomas Gleixner
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 18+ messages in thread
From: Thomas Gleixner @ 2024-12-10 10:34 UTC (permalink / raw)
  To: LKML
  Cc: x86, Anup Patel, Vineet Gupta, Brian Cain, Wei Liu, Steve Wahl,
	Joerg Roedel, Lu Baolu, Juergen Gross

Nothing uses the actual functionality and the MCIP controller sets the
flag which disables the deferred affinity change. The other interrupt
controller does not support affinity setting at all.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Vineet Gupta <vgupta@kernel.org>
---
 arch/arc/Kconfig       |    1 -
 arch/arc/kernel/mcip.c |    2 --
 kernel/irq/debugfs.c   |    1 +
 3 files changed, 1 insertion(+), 3 deletions(-)

--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -24,7 +24,6 @@ config ARC
 	# for now, we don't need GENERIC_IRQ_PROBE, CONFIG_GENERIC_IRQ_CHIP
 	select GENERIC_IRQ_SHOW
 	select GENERIC_PCI_IOMAP
-	select GENERIC_PENDING_IRQ if SMP
 	select GENERIC_SCHED_CLOCK
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_IOREMAP
--- a/arch/arc/kernel/mcip.c
+++ b/arch/arc/kernel/mcip.c
@@ -357,8 +357,6 @@ static void idu_cascade_isr(struct irq_d
 static int idu_irq_map(struct irq_domain *d, unsigned int virq, irq_hw_number_t hwirq)
 {
 	irq_set_chip_and_handler(virq, &idu_irq_chip, handle_level_irq);
-	irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
-
 	return 0;
 }
 
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -53,6 +53,7 @@ static const struct irq_bit_descr irqchi
 	BIT_MASK_DESCR(IRQCHIP_SUPPORTS_NMI),
 	BIT_MASK_DESCR(IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND),
 	BIT_MASK_DESCR(IRQCHIP_IMMUTABLE),
+	BIT_MASK_DESCR(IRQCHIP_MOVE_DEFERRED),
 };
 
 static void


* [patch 2/5] hexagon: Remove GENERIC_PENDING_IRQ leftover
  2024-12-10 10:34 [patch 0/5] genirq, x86: Rework deferred interrupt affinity logic Thomas Gleixner
  2024-12-10 10:34 ` [patch 1/5] ARC: Remove GENERIC_PENDING_IRQ Thomas Gleixner
@ 2024-12-10 10:34 ` Thomas Gleixner
  2024-12-10 15:06   ` Brian Cain
  2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2024-12-10 10:34 ` [patch 3/5] genirq: Provide IRQCHIP_MOVE_DEFERRED Thomas Gleixner
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 18+ messages in thread
From: Thomas Gleixner @ 2024-12-10 10:34 UTC (permalink / raw)
  To: LKML
  Cc: x86, Anup Patel, Brian Cain, Vineet Gupta, Wei Liu, Steve Wahl,
	Joerg Roedel, Lu Baolu, Juergen Gross

Commented out since 2011....

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Brian Cain <bcain@quicinc.com>
---
 arch/hexagon/Kconfig |    1 -
 1 file changed, 1 deletion(-)

--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -20,7 +20,6 @@ config HEXAGON
 	# select ARCH_HAS_CPU_IDLE_WAIT
 	# select GPIOLIB
 	# select HAVE_CLK
-	# select GENERIC_PENDING_IRQ if SMP
 	select GENERIC_ATOMIC64
 	select HAVE_PERF_EVENTS
 	# GENERIC_ALLOCATOR is used by dma_alloc_coherent()


* [patch 3/5] genirq: Provide IRQCHIP_MOVE_DEFERRED
  2024-12-10 10:34 [patch 0/5] genirq, x86: Rework deferred interrupt affinity logic Thomas Gleixner
  2024-12-10 10:34 ` [patch 1/5] ARC: Remove GENERIC_PENDING_IRQ Thomas Gleixner
  2024-12-10 10:34 ` [patch 2/5] hexagon: Remove GENERIC_PENDING_IRQ leftover Thomas Gleixner
@ 2024-12-10 10:34 ` Thomas Gleixner
  2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2024-12-10 10:34 ` [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED Thomas Gleixner
  2024-12-10 10:34 ` [patch 5/5] genirq: Remove IRQ_MOVE_PCNTXT and related code Thomas Gleixner
  4 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2024-12-10 10:34 UTC (permalink / raw)
  To: LKML
  Cc: x86, Anup Patel, Vineet Gupta, Brian Cain, Wei Liu, Steve Wahl,
	Joerg Roedel, Lu Baolu, Juergen Gross

The logic of GENERIC_PENDING_IRQ is backwards for historical reasons. Most
interrupt controllers allow moving the interrupt from arbitrary contexts.
If GENERIC_PENDING_IRQ is enabled by an architecture to support a chip
which requires the affinity change to happen in interrupt context, all
other chips have to be marked with IRQ_MOVE_PCNTXT.

That's tedious and there is no good reason for the extra flags in the irq
descriptor and the irq data status fields. In fact the decision whether
interrupts can be moved in arbitrary context or not is a property of the
interrupt chip.

To simplify adoption for RISC-V, provide a new mechanism which is enabled
via a config switch and allows adding a flag to irq_chip::flags to request
that interrupt affinity changes are deferred. Setting the top level chip of
an interrupt evaluates the flag and maps it into the existing logic.

The config switch and the various PCNTXT flags are temporary until x86 is
converted over to this scheme. This intermediate step also allows trivial
backporting of the mechanism to plug the affinity change race of various
RISC-V interrupt controllers.
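
The intermediate mapping can be pictured with a small userspace model. This
is only a sketch under simplifying assumptions: the types and helpers
(model_chip, model_irq_data, model_set_chip, model_can_move_pcntxt) are
invented for illustration, the chipflags_enabled parameter stands in for
IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS), and the fallback to
no_irq_chip as well as all locking is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Bit values as they appear in the diffs of this series. */
#define MODEL_IRQCHIP_MOVE_DEFERRED	(1u << 12)
#define MODEL_IRQD_MOVE_PCNTXT		(1u << 15)

/* Hypothetical, reduced stand-ins for irq_chip and irq_data. */
struct model_chip {
	unsigned int	flags;
};

struct model_irq_data {
	const struct model_chip	*chip;
	unsigned int		state;
};

/*
 * Models the hunk added to irq_set_chip(): with the config switch enabled,
 * the chip flag is translated into the existing per-interrupt state bit.
 * Without the switch, or with a NULL chip, the state bit is left alone.
 */
static void model_set_chip(struct model_irq_data *d,
			   const struct model_chip *chip,
			   bool chipflags_enabled)
{
	d->chip = chip;

	if (chipflags_enabled && chip) {
		if (chip->flags & MODEL_IRQCHIP_MOVE_DEFERRED)
			d->state &= ~MODEL_IRQD_MOVE_PCNTXT;
		else
			d->state |= MODEL_IRQD_MOVE_PCNTXT;
	}
}

/* The existing logic keeps consulting the per-interrupt state bit. */
static bool model_can_move_pcntxt(const struct model_irq_data *d)
{
	return d->state & MODEL_IRQD_MOVE_PCNTXT;
}
```

The point of the indirection is that the consumers of the state bit do not
have to change in this step. Only the place where the bit is derived moves
from the per-interrupt settings to the chip flags.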

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/irq.h |    2 ++
 kernel/irq/Kconfig  |    4 ++++
 kernel/irq/chip.c   |   18 +++++++++++++++---
 3 files changed, 21 insertions(+), 3 deletions(-)

--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -567,6 +567,7 @@ struct irq_chip {
  *                                    in the suspend path if they are in disabled state
  * IRQCHIP_AFFINITY_PRE_STARTUP:      Default affinity update before startup
  * IRQCHIP_IMMUTABLE:		      Don't ever change anything in this chip
+ * IRQCHIP_MOVE_DEFERRED:	      Move the interrupt in actual interrupt context
  */
 enum {
 	IRQCHIP_SET_TYPE_MASKED			= (1 <<  0),
@@ -581,6 +582,7 @@ enum {
 	IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND	= (1 <<  9),
 	IRQCHIP_AFFINITY_PRE_STARTUP		= (1 << 10),
 	IRQCHIP_IMMUTABLE			= (1 << 11),
+	IRQCHIP_MOVE_DEFERRED			= (1 << 12),
 };
 
 #include <linux/irqdesc.h>
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -31,6 +31,10 @@ config GENERIC_IRQ_EFFECTIVE_AFF_MASK
 config GENERIC_PENDING_IRQ
 	bool
 
+# Deduce delayed migration from top-level interrupt chip flags
+config GENERIC_PENDING_IRQ_CHIPFLAGS
+	bool
+
 # Support for generic irq migrating off cpu before the cpu is offline.
 config GENERIC_IRQ_MIGRATION
 	bool
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -47,6 +47,13 @@ int irq_set_chip(unsigned int irq, const
 		return -EINVAL;
 
 	desc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
+
+	if (IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS) && chip) {
+		if (chip->flags & IRQCHIP_MOVE_DEFERRED)
+			irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
+		else
+			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
+	}
 	irq_put_desc_unlock(desc, flags);
 	/*
 	 * For !CONFIG_SPARSE_IRQ make the irq show up in
@@ -1114,16 +1121,21 @@ void irq_modify_status(unsigned int irq,
 	trigger = irqd_get_trigger_type(&desc->irq_data);
 
 	irqd_clear(&desc->irq_data, IRQD_NO_BALANCING | IRQD_PER_CPU |
-		   IRQD_TRIGGER_MASK | IRQD_LEVEL | IRQD_MOVE_PCNTXT);
+		   IRQD_TRIGGER_MASK | IRQD_LEVEL);
 	if (irq_settings_has_no_balance_set(desc))
 		irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
 	if (irq_settings_is_per_cpu(desc))
 		irqd_set(&desc->irq_data, IRQD_PER_CPU);
-	if (irq_settings_can_move_pcntxt(desc))
-		irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
 	if (irq_settings_is_level(desc))
 		irqd_set(&desc->irq_data, IRQD_LEVEL);
 
+	/* Keep this around until x86 is converted over */
+	if (!IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS)) {
+		irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
+		if (irq_settings_can_move_pcntxt(desc))
+			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
+	}
+
 	tmp = irq_settings_get_trigger_mask(desc);
 	if (tmp != IRQ_TYPE_NONE)
 		trigger = tmp;


* [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
  2024-12-10 10:34 [patch 0/5] genirq, x86: Rework deferred interrupt affinity logic Thomas Gleixner
                   ` (2 preceding siblings ...)
  2024-12-10 10:34 ` [patch 3/5] genirq: Provide IRQCHIP_MOVE_DEFERRED Thomas Gleixner
@ 2024-12-10 10:34 ` Thomas Gleixner
  2024-12-11 16:36   ` Steve Wahl
                     ` (3 more replies)
  2024-12-10 10:34 ` [patch 5/5] genirq: Remove IRQ_MOVE_PCNTXT and related code Thomas Gleixner
  4 siblings, 4 replies; 18+ messages in thread
From: Thomas Gleixner @ 2024-12-10 10:34 UTC (permalink / raw)
  To: LKML
  Cc: x86, Anup Patel, Wei Liu, Steve Wahl, Joerg Roedel, Lu Baolu,
	Juergen Gross, Vineet Gupta, Brian Cain

Instead of marking individual interrupts as safe to be migrated in
arbitrary contexts, mark the interrupt chips, which require the interrupt
to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED
flag. This makes more sense because this is a per interrupt chip property
and not restricted to individual interrupts.

That flips the logic from the historical opt-out to an opt-in model. This is
simpler to handle for other architectures, which default to unrestricted
affinity setting. It also allows cleaning up the redundant core logic
significantly.

All interrupt chips, which belong to a top-level domain sitting directly on
top of the x86 vector domain are marked accordingly, unless the related
setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Steve Wahl <steve.wahl@hpe.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
---
 arch/x86/Kconfig                    |    1 +
 arch/x86/hyperv/irqdomain.c         |    2 +-
 arch/x86/kernel/apic/io_apic.c      |    2 +-
 arch/x86/kernel/apic/msi.c          |    3 ++-
 arch/x86/kernel/hpet.c              |    8 --------
 arch/x86/platform/uv/uv_irq.c       |    2 --
 drivers/iommu/amd/init.c            |    2 +-
 drivers/iommu/amd/iommu.c           |    1 -
 drivers/iommu/intel/irq_remapping.c |    1 -
 drivers/pci/controller/pci-hyperv.c |    1 +
 drivers/xen/events/events_base.c    |    6 ------
 11 files changed, 7 insertions(+), 22 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -173,6 +173,7 @@ config X86
 	select GENERIC_IRQ_RESERVATION_MODE
 	select GENERIC_IRQ_SHOW
 	select GENERIC_PENDING_IRQ		if SMP
+	select GENERIC_PENDING_IRQ_CHIPFLAGS	if SMP
 	select GENERIC_PTDUMP
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
--- a/arch/x86/hyperv/irqdomain.c
+++ b/arch/x86/hyperv/irqdomain.c
@@ -304,7 +304,7 @@ static struct irq_chip hv_pci_msi_contro
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_compose_msi_msg	= hv_irq_compose_msi_msg,
 	.irq_set_affinity	= msi_domain_set_affinity,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED,
 };
 
 static struct msi_domain_ops pci_msi_domain_ops = {
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1861,7 +1861,7 @@ static struct irq_chip ioapic_chip __rea
 	.irq_set_affinity	= ioapic_set_affinity,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
-	.flags			= IRQCHIP_SKIP_SET_WAKE |
+	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED |
 				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -214,6 +214,7 @@ static bool x86_init_dev_msi_info(struct
 		if (WARN_ON_ONCE(domain != real_parent))
 			return false;
 		info->chip->irq_set_affinity = msi_set_affinity;
+		info->chip->flags |= IRQCHIP_MOVE_DEFERRED;
 		break;
 	case DOMAIN_BUS_DMAR:
 	case DOMAIN_BUS_AMDVI:
@@ -315,7 +316,7 @@ static struct irq_chip dmar_msi_controll
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_compose_msi_msg	= dmar_msi_compose_msg,
 	.irq_write_msi_msg	= dmar_msi_write_msg,
-	.flags			= IRQCHIP_SKIP_SET_WAKE |
+	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED |
 				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -516,22 +516,14 @@ static int hpet_msi_init(struct irq_doma
 			 struct msi_domain_info *info, unsigned int virq,
 			 irq_hw_number_t hwirq, msi_alloc_info_t *arg)
 {
-	irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
 	irq_domain_set_info(domain, virq, arg->hwirq, info->chip, NULL,
 			    handle_edge_irq, arg->data, "edge");
 
 	return 0;
 }
 
-static void hpet_msi_free(struct irq_domain *domain,
-			  struct msi_domain_info *info, unsigned int virq)
-{
-	irq_clear_status_flags(virq, IRQ_MOVE_PCNTXT);
-}
-
 static struct msi_domain_ops hpet_msi_domain_ops = {
 	.msi_init	= hpet_msi_init,
-	.msi_free	= hpet_msi_free,
 };
 
 static struct msi_domain_info hpet_msi_domain_info = {
--- a/arch/x86/platform/uv/uv_irq.c
+++ b/arch/x86/platform/uv/uv_irq.c
@@ -92,8 +92,6 @@ static int uv_domain_alloc(struct irq_do
 	if (ret >= 0) {
 		if (info->uv.limit == UV_AFFINITY_CPU)
 			irq_set_status_flags(virq, IRQ_NO_BALANCING);
-		else
-			irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
 
 		chip_data->pnode = uv_blade_to_pnode(info->uv.blade);
 		chip_data->offset = info->uv.offset;
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2332,7 +2332,7 @@ static struct irq_chip intcapxt_controll
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_set_affinity       = intcapxt_set_affinity,
 	.irq_set_wake		= intcapxt_set_wake,
-	.flags			= IRQCHIP_MASK_ON_SUSPEND,
+	.flags			= IRQCHIP_MASK_ON_SUSPEND | IRQCHIP_MOVE_DEFERRED,
 };
 
 static const struct irq_domain_ops intcapxt_domain_ops = {
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3532,7 +3532,6 @@ static int irq_remapping_alloc(struct ir
 		irq_data->chip_data = data;
 		irq_data->chip = &amd_ir_chip;
 		irq_remapping_prepare_irte(data, cfg, info, devid, index, i);
-		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
 	}
 
 	return 0;
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1463,7 +1463,6 @@ static int intel_irq_remapping_alloc(str
 		else
 			irq_data->chip = &intel_ir_chip;
 		intel_irq_remapping_prepare_irte(ird, irq_cfg, info, index, i);
-		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
 	}
 	return 0;
 
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2053,6 +2053,7 @@ static struct irq_chip hv_msi_irq_chip =
 	.irq_set_affinity	= irq_chip_set_affinity_parent,
 #ifdef CONFIG_X86
 	.irq_ack		= irq_chip_ack_parent,
+	.flags			= IRQCHIP_MOVE_DEFERRED,
 #elif defined(CONFIG_ARM64)
 	.irq_eoi		= irq_chip_eoi_parent,
 #endif
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -722,12 +722,6 @@ static struct irq_info *xen_irq_init(uns
 		INIT_RCU_WORK(&info->rwork, delayed_free_irq);
 
 		set_info_for_irq(irq, info);
-		/*
-		 * Interrupt affinity setting can be immediate. No point
-		 * in delaying it until an interrupt is handled.
-		 */
-		irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
-
 		INIT_LIST_HEAD(&info->eoi_list);
 		list_add_tail(&info->list, &xen_irq_list_head);
 	}


* [patch 5/5] genirq: Remove IRQ_MOVE_PCNTXT and related code
  2024-12-10 10:34 [patch 0/5] genirq, x86: Rework deferred interrupt affinity logic Thomas Gleixner
                   ` (3 preceding siblings ...)
  2024-12-10 10:34 ` [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED Thomas Gleixner
@ 2024-12-10 10:34 ` Thomas Gleixner
  2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2025-01-15 20:47   ` tip-bot2 for Thomas Gleixner
  4 siblings, 2 replies; 18+ messages in thread
From: Thomas Gleixner @ 2024-12-10 10:34 UTC (permalink / raw)
  To: LKML
  Cc: x86, Anup Patel, Vineet Gupta, Brian Cain, Wei Liu, Steve Wahl,
	Joerg Roedel, Lu Baolu, Juergen Gross

Now that x86 is converted over to use the IRQCHIP_MOVE_DEFERRED flag,
remove IRQ*_MOVE_PCNTXT and related code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/Kconfig       |    1 -
 include/linux/irq.h    |   12 +-----------
 kernel/irq/chip.c      |   14 --------------
 kernel/irq/debugfs.c   |    1 -
 kernel/irq/internals.h |    2 +-
 kernel/irq/settings.h  |    6 ------
 6 files changed, 2 insertions(+), 34 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -173,7 +173,6 @@ config X86
 	select GENERIC_IRQ_RESERVATION_MODE
 	select GENERIC_IRQ_SHOW
 	select GENERIC_PENDING_IRQ		if SMP
-	select GENERIC_PENDING_IRQ_CHIPFLAGS	if SMP
 	select GENERIC_PTDUMP
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -64,7 +64,6 @@ enum irqchip_irq_state;
  * IRQ_NOAUTOEN			- Interrupt is not automatically enabled in
  *				  request/setup_irq()
  * IRQ_NO_BALANCING		- Interrupt cannot be balanced (affinity set)
- * IRQ_MOVE_PCNTXT		- Interrupt can be migrated from process context
  * IRQ_NESTED_THREAD		- Interrupt nests into another thread
  * IRQ_PER_CPU_DEVID		- Dev_id is a per-cpu variable
  * IRQ_IS_POLLED		- Always polled by another interrupt. Exclude
@@ -93,7 +92,6 @@ enum {
 	IRQ_NOREQUEST		= (1 << 11),
 	IRQ_NOAUTOEN		= (1 << 12),
 	IRQ_NO_BALANCING	= (1 << 13),
-	IRQ_MOVE_PCNTXT		= (1 << 14),
 	IRQ_NESTED_THREAD	= (1 << 15),
 	IRQ_NOTHREAD		= (1 << 16),
 	IRQ_PER_CPU_DEVID	= (1 << 17),
@@ -105,7 +103,7 @@ enum {
 
 #define IRQF_MODIFY_MASK	\
 	(IRQ_TYPE_SENSE_MASK | IRQ_NOPROBE | IRQ_NOREQUEST | \
-	 IRQ_NOAUTOEN | IRQ_MOVE_PCNTXT | IRQ_LEVEL | IRQ_NO_BALANCING | \
+	 IRQ_NOAUTOEN | IRQ_LEVEL | IRQ_NO_BALANCING | \
 	 IRQ_PER_CPU | IRQ_NESTED_THREAD | IRQ_NOTHREAD | IRQ_PER_CPU_DEVID | \
 	 IRQ_IS_POLLED | IRQ_DISABLE_UNLAZY | IRQ_HIDDEN)
 
@@ -201,8 +199,6 @@ struct irq_data {
  * IRQD_LEVEL			- Interrupt is level triggered
  * IRQD_WAKEUP_STATE		- Interrupt is configured for wakeup
  *				  from suspend
- * IRQD_MOVE_PCNTXT		- Interrupt can be moved in process
- *				  context
  * IRQD_IRQ_DISABLED		- Disabled state of the interrupt
  * IRQD_IRQ_MASKED		- Masked state of the interrupt
  * IRQD_IRQ_INPROGRESS		- In progress state of the interrupt
@@ -233,7 +229,6 @@ enum {
 	IRQD_AFFINITY_SET		= BIT(12),
 	IRQD_LEVEL			= BIT(13),
 	IRQD_WAKEUP_STATE		= BIT(14),
-	IRQD_MOVE_PCNTXT		= BIT(15),
 	IRQD_IRQ_DISABLED		= BIT(16),
 	IRQD_IRQ_MASKED			= BIT(17),
 	IRQD_IRQ_INPROGRESS		= BIT(18),
@@ -338,11 +333,6 @@ static inline bool irqd_is_wakeup_set(st
 	return __irqd_to_state(d) & IRQD_WAKEUP_STATE;
 }
 
-static inline bool irqd_can_move_in_process_context(struct irq_data *d)
-{
-	return __irqd_to_state(d) & IRQD_MOVE_PCNTXT;
-}
-
 static inline bool irqd_irq_disabled(struct irq_data *d)
 {
 	return __irqd_to_state(d) & IRQD_IRQ_DISABLED;
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -47,13 +47,6 @@ int irq_set_chip(unsigned int irq, const
 		return -EINVAL;
 
 	desc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
-
-	if (IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS) && chip) {
-		if (chip->flags & IRQCHIP_MOVE_DEFERRED)
-			irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
-		else
-			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
-	}
 	irq_put_desc_unlock(desc, flags);
 	/*
 	 * For !CONFIG_SPARSE_IRQ make the irq show up in
@@ -1129,13 +1122,6 @@ void irq_modify_status(unsigned int irq,
 	if (irq_settings_is_level(desc))
 		irqd_set(&desc->irq_data, IRQD_LEVEL);
 
-	/* Keep this around until x86 is converted over */
-	if (!IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS)) {
-		irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
-		if (irq_settings_can_move_pcntxt(desc))
-			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
-	}
-
 	tmp = irq_settings_get_trigger_mask(desc);
 	if (tmp != IRQ_TYPE_NONE)
 		trigger = tmp;
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -109,7 +109,6 @@ static const struct irq_bit_descr irqdat
 	BIT_MASK_DESCR(IRQD_NO_BALANCING),
 
 	BIT_MASK_DESCR(IRQD_SINGLE_TARGET),
-	BIT_MASK_DESCR(IRQD_MOVE_PCNTXT),
 	BIT_MASK_DESCR(IRQD_AFFINITY_SET),
 	BIT_MASK_DESCR(IRQD_SETAFFINITY_PENDING),
 	BIT_MASK_DESCR(IRQD_AFFINITY_MANAGED),
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -421,7 +421,7 @@ irq_init_generic_chip(struct irq_chip_ge
 #ifdef CONFIG_GENERIC_PENDING_IRQ
 static inline bool irq_can_move_pcntxt(struct irq_data *data)
 {
-	return irqd_can_move_in_process_context(data);
+	return !(data->chip->flags & IRQCHIP_MOVE_DEFERRED);
 }
 static inline bool irq_move_pending(struct irq_data *data)
 {
--- a/kernel/irq/settings.h
+++ b/kernel/irq/settings.h
@@ -11,7 +11,6 @@ enum {
 	_IRQ_NOREQUEST		= IRQ_NOREQUEST,
 	_IRQ_NOTHREAD		= IRQ_NOTHREAD,
 	_IRQ_NOAUTOEN		= IRQ_NOAUTOEN,
-	_IRQ_MOVE_PCNTXT	= IRQ_MOVE_PCNTXT,
 	_IRQ_NO_BALANCING	= IRQ_NO_BALANCING,
 	_IRQ_NESTED_THREAD	= IRQ_NESTED_THREAD,
 	_IRQ_PER_CPU_DEVID	= IRQ_PER_CPU_DEVID,
@@ -142,11 +141,6 @@ static inline void irq_settings_set_nopr
 	desc->status_use_accessors |= _IRQ_NOPROBE;
 }
 
-static inline bool irq_settings_can_move_pcntxt(struct irq_desc *desc)
-{
-	return desc->status_use_accessors & _IRQ_MOVE_PCNTXT;
-}
-
 static inline bool irq_settings_can_autoenable(struct irq_desc *desc)
 {
 	return !(desc->status_use_accessors & _IRQ_NOAUTOEN);


* RE: [patch 2/5] hexagon: Remove GENERIC_PENDING_IRQ leftover
  2024-12-10 10:34 ` [patch 2/5] hexagon: Remove GENERIC_PENDING_IRQ leftover Thomas Gleixner
@ 2024-12-10 15:06   ` Brian Cain
  2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 18+ messages in thread
From: Brian Cain @ 2024-12-10 15:06 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86@kernel.org, Anup Patel, Vineet Gupta, Wei Liu, Steve Wahl,
	Joerg Roedel, Lu Baolu, Juergen Gross



> -----Original Message-----
> From: Thomas Gleixner <tglx@linutronix.de>
> Sent: Tuesday, December 10, 2024 4:34 AM
> To: LKML <linux-kernel@vger.kernel.org>
> Cc: x86@kernel.org; Anup Patel <apatel@ventanamicro.com>; Brian Cain
> <bcain@quicinc.com>; Vineet Gupta <vgupta@kernel.org>; Wei Liu
> <wei.liu@kernel.org>; Steve Wahl <steve.wahl@hpe.com>; Joerg Roedel
> <joro@8bytes.org>; Lu Baolu <baolu.lu@linux.intel.com>; Juergen Gross
> <jgross@suse.com>
> Subject: [patch 2/5] hexagon: Remove GENERIC_PENDING_IRQ leftover
> 
> Commented out since 2011....
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Brian Cain <bcain@quicinc.com>
> ---

Acked-by: Brian Cain <bcain@quicinc.com>

>  arch/hexagon/Kconfig |    1 -
>  1 file changed, 1 deletion(-)
> 
> --- a/arch/hexagon/Kconfig
> +++ b/arch/hexagon/Kconfig
> @@ -20,7 +20,6 @@ config HEXAGON
>         # select ARCH_HAS_CPU_IDLE_WAIT
>         # select GPIOLIB
>         # select HAVE_CLK
> -       # select GENERIC_PENDING_IRQ if SMP
>         select GENERIC_ATOMIC64
>         select HAVE_PERF_EVENTS
>         # GENERIC_ALLOCATOR is used by dma_alloc_coherent()


* Re: [patch 1/5] ARC: Remove GENERIC_PENDING_IRQ
  2024-12-10 10:34 ` [patch 1/5] ARC: Remove GENERIC_PENDING_IRQ Thomas Gleixner
@ 2024-12-10 17:22   ` Vineet Gupta
  2024-12-10 22:52     ` Thomas Gleixner
  2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  1 sibling, 1 reply; 18+ messages in thread
From: Vineet Gupta @ 2024-12-10 17:22 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Anup Patel, Vineet Gupta, Brian Cain, Wei Liu, Steve Wahl,
	Joerg Roedel, Lu Baolu, Juergen Gross

On 12/10/24 02:34, Thomas Gleixner wrote:
> Nothing uses the actual functionality and the MCIP controller sets the
> flag which disables the deferred affinity change. The other interrupt
> controller does not support affinity setting at all.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Vineet Gupta <vgupta@kernel.org>

Acked-by: Vineet Gupta <vgupta@kernel.org>   # arch/arc/

> ---
>  arch/arc/Kconfig       |    1 -
>  arch/arc/kernel/mcip.c |    2 --
>  kernel/irq/debugfs.c   |    1 +
>  3 files changed, 1 insertion(+), 3 deletions(-)
>
> --- a/arch/arc/Kconfig
> +++ b/arch/arc/Kconfig
> @@ -24,7 +24,6 @@ config ARC
>  	# for now, we don't need GENERIC_IRQ_PROBE, CONFIG_GENERIC_IRQ_CHIP
>  	select GENERIC_IRQ_SHOW
>  	select GENERIC_PCI_IOMAP
> -	select GENERIC_PENDING_IRQ if SMP
>  	select GENERIC_SCHED_CLOCK
>  	select GENERIC_SMP_IDLE_THREAD
>  	select GENERIC_IOREMAP
> --- a/arch/arc/kernel/mcip.c
> +++ b/arch/arc/kernel/mcip.c
> @@ -357,8 +357,6 @@ static void idu_cascade_isr(struct irq_d
>  static int idu_irq_map(struct irq_domain *d, unsigned int virq, irq_hw_number_t hwirq)
>  {
>  	irq_set_chip_and_handler(virq, &idu_irq_chip, handle_level_irq);
> -	irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
> -
>  	return 0;
>  }
>  
> --- a/kernel/irq/debugfs.c
> +++ b/kernel/irq/debugfs.c
> @@ -53,6 +53,7 @@ static const struct irq_bit_descr irqchi
>  	BIT_MASK_DESCR(IRQCHIP_SUPPORTS_NMI),
>  	BIT_MASK_DESCR(IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND),
>  	BIT_MASK_DESCR(IRQCHIP_IMMUTABLE),
> +	BIT_MASK_DESCR(IRQCHIP_MOVE_DEFERRED),

I think this leaked in here, needs to be in patch 3/5

Cheers,
-Vineet

>  };
>  
>  static void
>


* Re: [patch 1/5] ARC: Remove GENERIC_PENDING_IRQ
  2024-12-10 17:22   ` Vineet Gupta
@ 2024-12-10 22:52     ` Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Gleixner @ 2024-12-10 22:52 UTC (permalink / raw)
  To: Vineet Gupta, LKML
  Cc: x86, Anup Patel, Vineet Gupta, Brian Cain, Wei Liu, Steve Wahl,
	Joerg Roedel, Lu Baolu, Juergen Gross

On Tue, Dec 10 2024 at 09:22, Vineet Gupta wrote:
> On 12/10/24 02:34, Thomas Gleixner wrote:
>> Nothing uses the actual functionality and the MCIP controller sets the
>>  
>> --- a/kernel/irq/debugfs.c
>> +++ b/kernel/irq/debugfs.c
>> @@ -53,6 +53,7 @@ static const struct irq_bit_descr irqchi
>>  	BIT_MASK_DESCR(IRQCHIP_SUPPORTS_NMI),
>>  	BIT_MASK_DESCR(IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND),
>>  	BIT_MASK_DESCR(IRQCHIP_IMMUTABLE),
>> +	BIT_MASK_DESCR(IRQCHIP_MOVE_DEFERRED),
>
> I think this leaked in here, needs to be in patch 3/5

Ooops. Yes.

* Re: [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
  2024-12-10 10:34 ` [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED Thomas Gleixner
@ 2024-12-11 16:36   ` Steve Wahl
  2024-12-17 19:33   ` Wei Liu
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Steve Wahl @ 2024-12-11 16:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Anup Patel, Wei Liu, Steve Wahl, Joerg Roedel,
	Lu Baolu, Juergen Gross, Vineet Gupta, Brian Cain

Reviewed-by: Steve Wahl <steve.wahl@hpe.com>

Thanks,

--> Steve

On Tue, Dec 10, 2024 at 11:34:15AM +0100, Thomas Gleixner wrote:
> Instead of marking individual interrupts as safe to be migrated in
> arbitrary contexts, mark the interrupt chips, which require the interrupt
> to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED
> flag. This makes more sense because this is a per interrupt chip property
> and not restricted to individual interrupts.
> 
> That flips the logic from the historical opt-out to an opt-in model. This is
> simpler to handle for other architectures, which default to unrestricted
> affinity setting. It also allows cleaning up the redundant core logic
> significantly.
> 
> All interrupt chips, which belong to a top-level domain sitting directly on
> top of the x86 vector domain are marked accordingly, unless the related
> setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN.
> 
> No functional change intended.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Wei Liu <wei.liu@kernel.org>
> Cc: Steve Wahl <steve.wahl@hpe.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: Juergen Gross <jgross@suse.com>
> ---
>  arch/x86/Kconfig                    |    1 +
>  arch/x86/hyperv/irqdomain.c         |    2 +-
>  arch/x86/kernel/apic/io_apic.c      |    2 +-
>  arch/x86/kernel/apic/msi.c          |    3 ++-
>  arch/x86/kernel/hpet.c              |    8 --------
>  arch/x86/platform/uv/uv_irq.c       |    2 --
>  drivers/iommu/amd/init.c            |    2 +-
>  drivers/iommu/amd/iommu.c           |    1 -
>  drivers/iommu/intel/irq_remapping.c |    1 -
>  drivers/pci/controller/pci-hyperv.c |    1 +
>  drivers/xen/events/events_base.c    |    6 ------
>  11 files changed, 7 insertions(+), 22 deletions(-)
> 
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -173,6 +173,7 @@ config X86
>  	select GENERIC_IRQ_RESERVATION_MODE
>  	select GENERIC_IRQ_SHOW
>  	select GENERIC_PENDING_IRQ		if SMP
> +	select GENERIC_PENDING_IRQ_CHIPFLAGS	if SMP
>  	select GENERIC_PTDUMP
>  	select GENERIC_SMP_IDLE_THREAD
>  	select GENERIC_TIME_VSYSCALL
> --- a/arch/x86/hyperv/irqdomain.c
> +++ b/arch/x86/hyperv/irqdomain.c
> @@ -304,7 +304,7 @@ static struct irq_chip hv_pci_msi_contro
>  	.irq_retrigger		= irq_chip_retrigger_hierarchy,
>  	.irq_compose_msi_msg	= hv_irq_compose_msi_msg,
>  	.irq_set_affinity	= msi_domain_set_affinity,
> -	.flags			= IRQCHIP_SKIP_SET_WAKE,
> +	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED,
>  };
>  
>  static struct msi_domain_ops pci_msi_domain_ops = {
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -1861,7 +1861,7 @@ static struct irq_chip ioapic_chip __rea
>  	.irq_set_affinity	= ioapic_set_affinity,
>  	.irq_retrigger		= irq_chip_retrigger_hierarchy,
>  	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
> -	.flags			= IRQCHIP_SKIP_SET_WAKE |
> +	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED |
>  				  IRQCHIP_AFFINITY_PRE_STARTUP,
>  };
>  
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -214,6 +214,7 @@ static bool x86_init_dev_msi_info(struct
>  		if (WARN_ON_ONCE(domain != real_parent))
>  			return false;
>  		info->chip->irq_set_affinity = msi_set_affinity;
> +		info->chip->flags |= IRQCHIP_MOVE_DEFERRED;
>  		break;
>  	case DOMAIN_BUS_DMAR:
>  	case DOMAIN_BUS_AMDVI:
> @@ -315,7 +316,7 @@ static struct irq_chip dmar_msi_controll
>  	.irq_retrigger		= irq_chip_retrigger_hierarchy,
>  	.irq_compose_msi_msg	= dmar_msi_compose_msg,
>  	.irq_write_msi_msg	= dmar_msi_write_msg,
> -	.flags			= IRQCHIP_SKIP_SET_WAKE |
> +	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED |
>  				  IRQCHIP_AFFINITY_PRE_STARTUP,
>  };
>  
> --- a/arch/x86/kernel/hpet.c
> +++ b/arch/x86/kernel/hpet.c
> @@ -516,22 +516,14 @@ static int hpet_msi_init(struct irq_doma
>  			 struct msi_domain_info *info, unsigned int virq,
>  			 irq_hw_number_t hwirq, msi_alloc_info_t *arg)
>  {
> -	irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
>  	irq_domain_set_info(domain, virq, arg->hwirq, info->chip, NULL,
>  			    handle_edge_irq, arg->data, "edge");
>  
>  	return 0;
>  }
>  
> -static void hpet_msi_free(struct irq_domain *domain,
> -			  struct msi_domain_info *info, unsigned int virq)
> -{
> -	irq_clear_status_flags(virq, IRQ_MOVE_PCNTXT);
> -}
> -
>  static struct msi_domain_ops hpet_msi_domain_ops = {
>  	.msi_init	= hpet_msi_init,
> -	.msi_free	= hpet_msi_free,
>  };
>  
>  static struct msi_domain_info hpet_msi_domain_info = {
> --- a/arch/x86/platform/uv/uv_irq.c
> +++ b/arch/x86/platform/uv/uv_irq.c
> @@ -92,8 +92,6 @@ static int uv_domain_alloc(struct irq_do
>  	if (ret >= 0) {
>  		if (info->uv.limit == UV_AFFINITY_CPU)
>  			irq_set_status_flags(virq, IRQ_NO_BALANCING);
> -		else
> -			irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
>  
>  		chip_data->pnode = uv_blade_to_pnode(info->uv.blade);
>  		chip_data->offset = info->uv.offset;
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -2332,7 +2332,7 @@ static struct irq_chip intcapxt_controll
>  	.irq_retrigger		= irq_chip_retrigger_hierarchy,
>  	.irq_set_affinity       = intcapxt_set_affinity,
>  	.irq_set_wake		= intcapxt_set_wake,
> -	.flags			= IRQCHIP_MASK_ON_SUSPEND,
> +	.flags			= IRQCHIP_MASK_ON_SUSPEND | IRQCHIP_MOVE_DEFERRED,
>  };
>  
>  static const struct irq_domain_ops intcapxt_domain_ops = {
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -3532,7 +3532,6 @@ static int irq_remapping_alloc(struct ir
>  		irq_data->chip_data = data;
>  		irq_data->chip = &amd_ir_chip;
>  		irq_remapping_prepare_irte(data, cfg, info, devid, index, i);
> -		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
>  	}
>  
>  	return 0;
> --- a/drivers/iommu/intel/irq_remapping.c
> +++ b/drivers/iommu/intel/irq_remapping.c
> @@ -1463,7 +1463,6 @@ static int intel_irq_remapping_alloc(str
>  		else
>  			irq_data->chip = &intel_ir_chip;
>  		intel_irq_remapping_prepare_irte(ird, irq_cfg, info, index, i);
> -		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
>  	}
>  	return 0;
>  
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -2053,6 +2053,7 @@ static struct irq_chip hv_msi_irq_chip =
>  	.irq_set_affinity	= irq_chip_set_affinity_parent,
>  #ifdef CONFIG_X86
>  	.irq_ack		= irq_chip_ack_parent,
> +	.flags			= IRQCHIP_MOVE_DEFERRED,
>  #elif defined(CONFIG_ARM64)
>  	.irq_eoi		= irq_chip_eoi_parent,
>  #endif
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -722,12 +722,6 @@ static struct irq_info *xen_irq_init(uns
>  		INIT_RCU_WORK(&info->rwork, delayed_free_irq);
>  
>  		set_info_for_irq(irq, info);
> -		/*
> -		 * Interrupt affinity setting can be immediate. No point
> -		 * in delaying it until an interrupt is handled.
> -		 */
> -		irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
> -
>  		INIT_LIST_HEAD(&info->eoi_list);
>  		list_add_tail(&info->list, &xen_irq_list_head);
>  	}
> 

-- 
Steve Wahl, Hewlett Packard Enterprise

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
  2024-12-10 10:34 ` [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED Thomas Gleixner
  2024-12-11 16:36   ` Steve Wahl
@ 2024-12-17 19:33   ` Wei Liu
  2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2025-01-15 20:47   ` tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 18+ messages in thread
From: Wei Liu @ 2024-12-17 19:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Anup Patel, Wei Liu, Steve Wahl, Joerg Roedel,
	Lu Baolu, Juergen Gross, Vineet Gupta, Brian Cain

On Tue, Dec 10, 2024 at 11:34:15AM +0100, Thomas Gleixner wrote:
> Instead of marking individual interrupts as safe to be migrated in
> arbitrary contexts, mark the interrupt chips, which require the interrupt
> to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED
> flag. This makes more sense because this is a per interrupt chip property
> and not restricted to individual interrupts.
> 
> That flips the logic from the historical opt-out to an opt-in model. This is
> simpler to handle for other architectures, which default to unrestricted
> affinity setting. It also allows the redundant core logic to be cleaned up
> significantly.
> 
> All interrupt chips which belong to a top-level domain sitting directly on
> top of the x86 vector domain are marked accordingly, unless the related
> setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN.
> 
> No functional change intended.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[...]
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -2053,6 +2053,7 @@ static struct irq_chip hv_msi_irq_chip =
>  	.irq_set_affinity	= irq_chip_set_affinity_parent,
>  #ifdef CONFIG_X86
>  	.irq_ack		= irq_chip_ack_parent,
> +	.flags			= IRQCHIP_MOVE_DEFERRED,
>  #elif defined(CONFIG_ARM64)
>  	.irq_eoi		= irq_chip_eoi_parent,
>  #endif

Acked-by: Wei Liu <wei.liu@kernel.org>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [tip: irq/core] genirq: Remove IRQ_MOVE_PCNTXT and related code
  2024-12-10 10:34 ` [patch 5/5] genirq: Remove IRQ_MOVE_PCNTXT and related code Thomas Gleixner
@ 2025-01-15 10:04   ` tip-bot2 for Thomas Gleixner
  2025-01-15 20:47   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-01-15 10:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     0fccabd9e3215368983c7161cb69b3fc748893e1
Gitweb:        https://git.kernel.org/tip/0fccabd9e3215368983c7161cb69b3fc748893e1
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 10 Dec 2024 11:34:17 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 15 Jan 2025 10:56:22 +01:00

genirq: Remove IRQ_MOVE_PCNTXT and related code

Now that x86 is converted over to use the IRQCHIP_MOVE_DEFERRED flag,
remove IRQ*_MOVE_PCNTXT and related code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241210103335.626707225@linutronix.de

---
 arch/x86/Kconfig       |  1 -
 include/linux/irq.h    | 12 +-----------
 kernel/irq/chip.c      | 14 --------------
 kernel/irq/debugfs.c   |  1 -
 kernel/irq/internals.h |  2 +-
 kernel/irq/settings.h  |  6 ------
 6 files changed, 2 insertions(+), 34 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index df0fd72..9d7bd0a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -173,7 +173,6 @@ config X86
 	select GENERIC_IRQ_RESERVATION_MODE
 	select GENERIC_IRQ_SHOW
 	select GENERIC_PENDING_IRQ		if SMP
-	select GENERIC_PENDING_IRQ_CHIPFLAGS	if SMP
 	select GENERIC_PTDUMP
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 6e02154..8daa17f 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -64,7 +64,6 @@ enum irqchip_irq_state;
  * IRQ_NOAUTOEN			- Interrupt is not automatically enabled in
  *				  request/setup_irq()
  * IRQ_NO_BALANCING		- Interrupt cannot be balanced (affinity set)
- * IRQ_MOVE_PCNTXT		- Interrupt can be migrated from process context
  * IRQ_NESTED_THREAD		- Interrupt nests into another thread
  * IRQ_PER_CPU_DEVID		- Dev_id is a per-cpu variable
  * IRQ_IS_POLLED		- Always polled by another interrupt. Exclude
@@ -93,7 +92,6 @@ enum {
 	IRQ_NOREQUEST		= (1 << 11),
 	IRQ_NOAUTOEN		= (1 << 12),
 	IRQ_NO_BALANCING	= (1 << 13),
-	IRQ_MOVE_PCNTXT		= (1 << 14),
 	IRQ_NESTED_THREAD	= (1 << 15),
 	IRQ_NOTHREAD		= (1 << 16),
 	IRQ_PER_CPU_DEVID	= (1 << 17),
@@ -105,7 +103,7 @@ enum {
 
 #define IRQF_MODIFY_MASK	\
 	(IRQ_TYPE_SENSE_MASK | IRQ_NOPROBE | IRQ_NOREQUEST | \
-	 IRQ_NOAUTOEN | IRQ_MOVE_PCNTXT | IRQ_LEVEL | IRQ_NO_BALANCING | \
+	 IRQ_NOAUTOEN | IRQ_LEVEL | IRQ_NO_BALANCING | \
 	 IRQ_PER_CPU | IRQ_NESTED_THREAD | IRQ_NOTHREAD | IRQ_PER_CPU_DEVID | \
 	 IRQ_IS_POLLED | IRQ_DISABLE_UNLAZY | IRQ_HIDDEN)
 
@@ -201,8 +199,6 @@ struct irq_data {
  * IRQD_LEVEL			- Interrupt is level triggered
  * IRQD_WAKEUP_STATE		- Interrupt is configured for wakeup
  *				  from suspend
- * IRQD_MOVE_PCNTXT		- Interrupt can be moved in process
- *				  context
  * IRQD_IRQ_DISABLED		- Disabled state of the interrupt
  * IRQD_IRQ_MASKED		- Masked state of the interrupt
  * IRQD_IRQ_INPROGRESS		- In progress state of the interrupt
@@ -233,7 +229,6 @@ enum {
 	IRQD_AFFINITY_SET		= BIT(12),
 	IRQD_LEVEL			= BIT(13),
 	IRQD_WAKEUP_STATE		= BIT(14),
-	IRQD_MOVE_PCNTXT		= BIT(15),
 	IRQD_IRQ_DISABLED		= BIT(16),
 	IRQD_IRQ_MASKED			= BIT(17),
 	IRQD_IRQ_INPROGRESS		= BIT(18),
@@ -338,11 +333,6 @@ static inline bool irqd_is_wakeup_set(struct irq_data *d)
 	return __irqd_to_state(d) & IRQD_WAKEUP_STATE;
 }
 
-static inline bool irqd_can_move_in_process_context(struct irq_data *d)
-{
-	return __irqd_to_state(d) & IRQD_MOVE_PCNTXT;
-}
-
 static inline bool irqd_irq_disabled(struct irq_data *d)
 {
 	return __irqd_to_state(d) & IRQD_IRQ_DISABLED;
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 7989da2..c901436 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -47,13 +47,6 @@ int irq_set_chip(unsigned int irq, const struct irq_chip *chip)
 		return -EINVAL;
 
 	desc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
-
-	if (IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS) && chip) {
-		if (chip->flags & IRQCHIP_MOVE_DEFERRED)
-			irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
-		else
-			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
-	}
 	irq_put_desc_unlock(desc, flags);
 	/*
 	 * For !CONFIG_SPARSE_IRQ make the irq show up in
@@ -1129,13 +1122,6 @@ void irq_modify_status(unsigned int irq, unsigned long clr, unsigned long set)
 	if (irq_settings_is_level(desc))
 		irqd_set(&desc->irq_data, IRQD_LEVEL);
 
-	/* Keep this around until x86 is converted over */
-	if (!IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS)) {
-		irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
-		if (irq_settings_can_move_pcntxt(desc))
-			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
-	}
-
 	tmp = irq_settings_get_trigger_mask(desc);
 	if (tmp != IRQ_TYPE_NONE)
 		trigger = tmp;
diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c
index 975eb8d..ca142b9 100644
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -109,7 +109,6 @@ static const struct irq_bit_descr irqdata_states[] = {
 	BIT_MASK_DESCR(IRQD_NO_BALANCING),
 
 	BIT_MASK_DESCR(IRQD_SINGLE_TARGET),
-	BIT_MASK_DESCR(IRQD_MOVE_PCNTXT),
 	BIT_MASK_DESCR(IRQD_AFFINITY_SET),
 	BIT_MASK_DESCR(IRQD_SETAFFINITY_PENDING),
 	BIT_MASK_DESCR(IRQD_AFFINITY_MANAGED),
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index b61fc64..a979523 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -421,7 +421,7 @@ irq_init_generic_chip(struct irq_chip_generic *gc, const char *name,
 #ifdef CONFIG_GENERIC_PENDING_IRQ
 static inline bool irq_can_move_pcntxt(struct irq_data *data)
 {
-	return irqd_can_move_in_process_context(data);
+	return !(data->chip->flags & IRQCHIP_MOVE_DEFERRED);
 }
 static inline bool irq_move_pending(struct irq_data *data)
 {
diff --git a/kernel/irq/settings.h b/kernel/irq/settings.h
index 7b7efb1..00b3bd1 100644
--- a/kernel/irq/settings.h
+++ b/kernel/irq/settings.h
@@ -11,7 +11,6 @@ enum {
 	_IRQ_NOREQUEST		= IRQ_NOREQUEST,
 	_IRQ_NOTHREAD		= IRQ_NOTHREAD,
 	_IRQ_NOAUTOEN		= IRQ_NOAUTOEN,
-	_IRQ_MOVE_PCNTXT	= IRQ_MOVE_PCNTXT,
 	_IRQ_NO_BALANCING	= IRQ_NO_BALANCING,
 	_IRQ_NESTED_THREAD	= IRQ_NESTED_THREAD,
 	_IRQ_PER_CPU_DEVID	= IRQ_PER_CPU_DEVID,
@@ -142,11 +141,6 @@ static inline void irq_settings_set_noprobe(struct irq_desc *desc)
 	desc->status_use_accessors |= _IRQ_NOPROBE;
 }
 
-static inline bool irq_settings_can_move_pcntxt(struct irq_desc *desc)
-{
-	return desc->status_use_accessors & _IRQ_MOVE_PCNTXT;
-}
-
 static inline bool irq_settings_can_autoenable(struct irq_desc *desc)
 {
 	return !(desc->status_use_accessors & _IRQ_NOAUTOEN);

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: irq/core] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
  2024-12-10 10:34 ` [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED Thomas Gleixner
  2024-12-11 16:36   ` Steve Wahl
  2024-12-17 19:33   ` Wei Liu
@ 2025-01-15 10:04   ` tip-bot2 for Thomas Gleixner
  2025-01-15 20:47   ` tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-01-15 10:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Steve Wahl, Wei Liu, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     12cbdcb9f05559ff72eb8a04df829852804c0276
Gitweb:        https://git.kernel.org/tip/12cbdcb9f05559ff72eb8a04df829852804c0276
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 10 Dec 2024 11:34:15 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 15 Jan 2025 10:56:22 +01:00

x86/apic: Convert to IRQCHIP_MOVE_DEFERRED

Instead of marking individual interrupts as safe to be migrated in
arbitrary contexts, mark the interrupt chips, which require the interrupt
to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED
flag. This makes more sense because this is a per interrupt chip property
and not restricted to individual interrupts.

That flips the logic from the historical opt-out to an opt-in model. This is
simpler to handle for other architectures, which default to unrestricted
affinity setting. It also allows the redundant core logic to be cleaned up
significantly.

All interrupt chips which belong to a top-level domain sitting directly on
top of the x86 vector domain are marked accordingly, unless the related
setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/all/20241210103335.563277044@linutronix.de

---
 arch/x86/Kconfig                    | 1 +
 arch/x86/hyperv/irqdomain.c         | 2 +-
 arch/x86/kernel/apic/io_apic.c      | 2 +-
 arch/x86/kernel/apic/msi.c          | 3 ++-
 arch/x86/kernel/hpet.c              | 8 --------
 arch/x86/platform/uv/uv_irq.c       | 2 --
 drivers/iommu/amd/init.c            | 2 +-
 drivers/iommu/amd/iommu.c           | 1 -
 drivers/iommu/intel/irq_remapping.c | 1 -
 drivers/pci/controller/pci-hyperv.c | 1 +
 drivers/xen/events/events_base.c    | 6 ------
 11 files changed, 7 insertions(+), 22 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9d7bd0a..df0fd72 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -173,6 +173,7 @@ config X86
 	select GENERIC_IRQ_RESERVATION_MODE
 	select GENERIC_IRQ_SHOW
 	select GENERIC_PENDING_IRQ		if SMP
+	select GENERIC_PENDING_IRQ_CHIPFLAGS	if SMP
 	select GENERIC_PTDUMP
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
diff --git a/arch/x86/hyperv/irqdomain.c b/arch/x86/hyperv/irqdomain.c
index 3215a4a..64b9213 100644
--- a/arch/x86/hyperv/irqdomain.c
+++ b/arch/x86/hyperv/irqdomain.c
@@ -304,7 +304,7 @@ static struct irq_chip hv_pci_msi_controller = {
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_compose_msi_msg	= hv_irq_compose_msi_msg,
 	.irq_set_affinity	= msi_domain_set_affinity,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED,
 };
 
 static struct msi_domain_ops pci_msi_domain_ops = {
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 1029ea4..5d033d9 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1861,7 +1861,7 @@ static struct irq_chip ioapic_chip __read_mostly = {
 	.irq_set_affinity	= ioapic_set_affinity,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
-	.flags			= IRQCHIP_SKIP_SET_WAKE |
+	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED |
 				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 3407692..66bc5d3 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -214,6 +214,7 @@ static bool x86_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
 		if (WARN_ON_ONCE(domain != real_parent))
 			return false;
 		info->chip->irq_set_affinity = msi_set_affinity;
+		info->chip->flags |= IRQCHIP_MOVE_DEFERRED;
 		break;
 	case DOMAIN_BUS_DMAR:
 	case DOMAIN_BUS_AMDVI:
@@ -315,7 +316,7 @@ static struct irq_chip dmar_msi_controller = {
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_compose_msi_msg	= dmar_msi_compose_msg,
 	.irq_write_msi_msg	= dmar_msi_write_msg,
-	.flags			= IRQCHIP_SKIP_SET_WAKE |
+	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED |
 				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index c96ae8f..87b5a50 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -516,22 +516,14 @@ static int hpet_msi_init(struct irq_domain *domain,
 			 struct msi_domain_info *info, unsigned int virq,
 			 irq_hw_number_t hwirq, msi_alloc_info_t *arg)
 {
-	irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
 	irq_domain_set_info(domain, virq, arg->hwirq, info->chip, NULL,
 			    handle_edge_irq, arg->data, "edge");
 
 	return 0;
 }
 
-static void hpet_msi_free(struct irq_domain *domain,
-			  struct msi_domain_info *info, unsigned int virq)
-{
-	irq_clear_status_flags(virq, IRQ_MOVE_PCNTXT);
-}
-
 static struct msi_domain_ops hpet_msi_domain_ops = {
 	.msi_init	= hpet_msi_init,
-	.msi_free	= hpet_msi_free,
 };
 
 static struct msi_domain_info hpet_msi_domain_info = {
diff --git a/arch/x86/platform/uv/uv_irq.c b/arch/x86/platform/uv/uv_irq.c
index a379501..f971ef2 100644
--- a/arch/x86/platform/uv/uv_irq.c
+++ b/arch/x86/platform/uv/uv_irq.c
@@ -92,8 +92,6 @@ static int uv_domain_alloc(struct irq_domain *domain, unsigned int virq,
 	if (ret >= 0) {
 		if (info->uv.limit == UV_AFFINITY_CPU)
 			irq_set_status_flags(virq, IRQ_NO_BALANCING);
-		else
-			irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
 
 		chip_data->pnode = uv_blade_to_pnode(info->uv.blade);
 		chip_data->offset = info->uv.offset;
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 0e0a531..614f216 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2332,7 +2332,7 @@ static struct irq_chip intcapxt_controller = {
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_set_affinity       = intcapxt_set_affinity,
 	.irq_set_wake		= intcapxt_set_wake,
-	.flags			= IRQCHIP_MASK_ON_SUSPEND,
+	.flags			= IRQCHIP_MASK_ON_SUSPEND | IRQCHIP_MOVE_DEFERRED,
 };
 
 static const struct irq_domain_ops intcapxt_domain_ops = {
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3f691e1..b02e631 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3532,7 +3532,6 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
 		irq_data->chip_data = data;
 		irq_data->chip = &amd_ir_chip;
 		irq_remapping_prepare_irte(data, cfg, info, devid, index, i);
-		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
 	}
 
 	return 0;
diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 466c141..f5402df 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1463,7 +1463,6 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
 		else
 			irq_data->chip = &intel_ir_chip;
 		intel_irq_remapping_prepare_irte(ird, irq_cfg, info, index, i);
-		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
 	}
 	return 0;
 
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index cdd5be1..6084b38 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2053,6 +2053,7 @@ static struct irq_chip hv_msi_irq_chip = {
 	.irq_set_affinity	= irq_chip_set_affinity_parent,
 #ifdef CONFIG_X86
 	.irq_ack		= irq_chip_ack_parent,
+	.flags			= IRQCHIP_MOVE_DEFERRED,
 #elif defined(CONFIG_ARM64)
 	.irq_eoi		= irq_chip_eoi_parent,
 #endif
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 985e155..41309d3 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -722,12 +722,6 @@ static struct irq_info *xen_irq_init(unsigned int irq)
 		INIT_RCU_WORK(&info->rwork, delayed_free_irq);
 
 		set_info_for_irq(irq, info);
-		/*
-		 * Interrupt affinity setting can be immediate. No point
-		 * in delaying it until an interrupt is handled.
-		 */
-		irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
-
 		INIT_LIST_HEAD(&info->eoi_list);
 		list_add_tail(&info->list, &xen_irq_list_head);
 	}

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: irq/core] genirq: Provide IRQCHIP_MOVE_DEFERRED
  2024-12-10 10:34 ` [patch 3/5] genirq: Provide IRQCHIP_MOVE_DEFERRED Thomas Gleixner
@ 2025-01-15 10:04   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-01-15 10:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     a648eb3a3f79e9736a59b28783700c2c691db419
Gitweb:        https://git.kernel.org/tip/a648eb3a3f79e9736a59b28783700c2c691db419
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 10 Dec 2024 11:34:14 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 15 Jan 2025 10:56:22 +01:00

genirq: Provide IRQCHIP_MOVE_DEFERRED

The logic of GENERIC_PENDING_IRQ is backwards for historical reasons. Most
interrupt controllers allow the interrupt to be moved from arbitrary
contexts. If GENERIC_PENDING_IRQ is enabled by an architecture to support a
chip which requires the affinity change to happen in interrupt context,
the interrupts of all other chips have to be marked with IRQ_MOVE_PCNTXT.

That's tedious and there is no real good reason for the extra flags in the
irq descriptor and the irq data status fields. In fact the decision whether
interrupts can be moved in arbitrary context or not is a property of the
interrupt chip.

To simplify adoption for RISC-V, provide a new mechanism which is enabled
via a config switch and allows a flag to be added to irq_chip::flags to
request that interrupt affinity changes are deferred. Setting the top level
chip of an interrupt evaluates the flag and maps it into the existing logic.

The config switch and the various PCNTXT flags are temporary until x86 is
converted over to this scheme. This intermediate step also allows trivial
backporting of the mechanism to plug the affinity change race of various
RISC-V interrupt controllers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241210103335.500314436@linutronix.de

---
 include/linux/irq.h  |  2 ++
 kernel/irq/Kconfig   |  4 ++++
 kernel/irq/chip.c    | 18 +++++++++++++++---
 kernel/irq/debugfs.c |  1 +
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 25f51bf..6e02154 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -567,6 +567,7 @@ struct irq_chip {
  *                                    in the suspend path if they are in disabled state
  * IRQCHIP_AFFINITY_PRE_STARTUP:      Default affinity update before startup
  * IRQCHIP_IMMUTABLE:		      Don't ever change anything in this chip
+ * IRQCHIP_MOVE_DEFERRED:	      Move the interrupt in actual interrupt context
  */
 enum {
 	IRQCHIP_SET_TYPE_MASKED			= (1 <<  0),
@@ -581,6 +582,7 @@ enum {
 	IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND	= (1 <<  9),
 	IRQCHIP_AFFINITY_PRE_STARTUP		= (1 << 10),
 	IRQCHIP_IMMUTABLE			= (1 << 11),
+	IRQCHIP_MOVE_DEFERRED			= (1 << 12),
 };
 
 #include <linux/irqdesc.h>
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 875f25e..5432418 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -31,6 +31,10 @@ config GENERIC_IRQ_EFFECTIVE_AFF_MASK
 config GENERIC_PENDING_IRQ
 	bool
 
+# Deduce delayed migration from top-level interrupt chip flags
+config GENERIC_PENDING_IRQ_CHIPFLAGS
+	bool
+
 # Support for generic irq migrating off cpu before the cpu is offline.
 config GENERIC_IRQ_MIGRATION
 	bool
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 271e913..7989da2 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -47,6 +47,13 @@ int irq_set_chip(unsigned int irq, const struct irq_chip *chip)
 		return -EINVAL;
 
 	desc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
+
+	if (IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS) && chip) {
+		if (chip->flags & IRQCHIP_MOVE_DEFERRED)
+			irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
+		else
+			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
+	}
 	irq_put_desc_unlock(desc, flags);
 	/*
 	 * For !CONFIG_SPARSE_IRQ make the irq show up in
@@ -1114,16 +1121,21 @@ void irq_modify_status(unsigned int irq, unsigned long clr, unsigned long set)
 	trigger = irqd_get_trigger_type(&desc->irq_data);
 
 	irqd_clear(&desc->irq_data, IRQD_NO_BALANCING | IRQD_PER_CPU |
-		   IRQD_TRIGGER_MASK | IRQD_LEVEL | IRQD_MOVE_PCNTXT);
+		   IRQD_TRIGGER_MASK | IRQD_LEVEL);
 	if (irq_settings_has_no_balance_set(desc))
 		irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
 	if (irq_settings_is_per_cpu(desc))
 		irqd_set(&desc->irq_data, IRQD_PER_CPU);
-	if (irq_settings_can_move_pcntxt(desc))
-		irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
 	if (irq_settings_is_level(desc))
 		irqd_set(&desc->irq_data, IRQD_LEVEL);
 
+	/* Keep this around until x86 is converted over */
+	if (!IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS)) {
+		irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
+		if (irq_settings_can_move_pcntxt(desc))
+			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
+	}
+
 	tmp = irq_settings_get_trigger_mask(desc);
 	if (tmp != IRQ_TYPE_NONE)
 		trigger = tmp;
diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c
index c6ffb97..975eb8d 100644
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -53,6 +53,7 @@ static const struct irq_bit_descr irqchip_flags[] = {
 	BIT_MASK_DESCR(IRQCHIP_SUPPORTS_NMI),
 	BIT_MASK_DESCR(IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND),
 	BIT_MASK_DESCR(IRQCHIP_IMMUTABLE),
+	BIT_MASK_DESCR(IRQCHIP_MOVE_DEFERRED),
 };
 
 static void

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: irq/core] hexagon: Remove GENERIC_PENDING_IRQ leftover
  2024-12-10 10:34 ` [patch 2/5] hexagon: Remove GENERIC_PENDING_IRQ leftover Thomas Gleixner
  2024-12-10 15:06   ` Brian Cain
@ 2025-01-15 10:04   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-01-15 10:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Brian Cain, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     65d09d269fc15b4d8bbeff950ecdc4dc36a6961a
Gitweb:        https://git.kernel.org/tip/65d09d269fc15b4d8bbeff950ecdc4dc36a6961a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 10 Dec 2024 11:34:12 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 15 Jan 2025 10:56:22 +01:00

hexagon: Remove GENERIC_PENDING_IRQ leftover

Commented out since 2011....

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Brian Cain <bcain@quicinc.com>
Link: https://lore.kernel.org/all/20241210103335.437630614@linutronix.de

---
 arch/hexagon/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 3eb51fb..d987ba3 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -20,7 +20,6 @@ config HEXAGON
 	# select ARCH_HAS_CPU_IDLE_WAIT
 	# select GPIOLIB
 	# select HAVE_CLK
-	# select GENERIC_PENDING_IRQ if SMP
 	select GENERIC_ATOMIC64
 	select HAVE_PERF_EVENTS
 	# GENERIC_ALLOCATOR is used by dma_alloc_coherent()

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: irq/core] ARC: Remove GENERIC_PENDING_IRQ
  2024-12-10 10:34 ` [patch 1/5] ARC: Remove GENERIC_PENDING_IRQ Thomas Gleixner
  2024-12-10 17:22   ` Vineet Gupta
@ 2025-01-15 10:04   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-01-15 10:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Vineet Gupta, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     5d30d6ab8c65b6caf034892aa8ae29285d0a515f
Gitweb:        https://git.kernel.org/tip/5d30d6ab8c65b6caf034892aa8ae29285d0a515f
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 10 Dec 2024 11:34:10 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 15 Jan 2025 10:56:22 +01:00

ARC: Remove GENERIC_PENDING_IRQ

Nothing uses the actual functionality and the MCIP controller sets the
flag which disables the deferred affinity change. The other interrupt
controller does not support affinity setting at all.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Vineet Gupta <vgupta@kernel.org>   # arch/arc/
Link: https://lore.kernel.org/all/20241210103335.373392568@linutronix.de

---
 arch/arc/Kconfig       | 1 -
 arch/arc/kernel/mcip.c | 2 --
 2 files changed, 3 deletions(-)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 5b24881..d1a97fe 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -24,7 +24,6 @@ config ARC
 	# for now, we don't need GENERIC_IRQ_PROBE, CONFIG_GENERIC_IRQ_CHIP
 	select GENERIC_IRQ_SHOW
 	select GENERIC_PCI_IOMAP
-	select GENERIC_PENDING_IRQ if SMP
 	select GENERIC_SCHED_CLOCK
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_IOREMAP
diff --git a/arch/arc/kernel/mcip.c b/arch/arc/kernel/mcip.c
index 55373ca..cdd370e 100644
--- a/arch/arc/kernel/mcip.c
+++ b/arch/arc/kernel/mcip.c
@@ -357,8 +357,6 @@ static void idu_cascade_isr(struct irq_desc *desc)
 static int idu_irq_map(struct irq_domain *d, unsigned int virq, irq_hw_number_t hwirq)
 {
 	irq_set_chip_and_handler(virq, &idu_irq_chip, handle_level_irq);
-	irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
-
 	return 0;
 }
 

* [tip: irq/core] genirq: Remove IRQ_MOVE_PCNTXT and related code
  2024-12-10 10:34 ` [patch 5/5] genirq: Remove IRQ_MOVE_PCNTXT and related code Thomas Gleixner
  2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
@ 2025-01-15 20:47   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-01-15 20:47 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     f94a18249b7f9131f3ca8eacf07f21050747ebd7
Gitweb:        https://git.kernel.org/tip/f94a18249b7f9131f3ca8eacf07f21050747ebd7
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 10 Dec 2024 11:34:17 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 15 Jan 2025 21:38:53 +01:00

genirq: Remove IRQ_MOVE_PCNTXT and related code

Now that x86 is converted over to use the IRQCHIP_MOVE_DEFERRED flag,
remove IRQ*_MOVE_PCNTXT and related code.
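
For illustration only, a minimal user-space model of why the per-interrupt
copy can go away. It is not kernel code: the model_* names and the bit
values are made up for this sketch. model_set_chip() mimics the intermediate
step which mirrored the chip flag into a per-interrupt state bit, and the
two query helpers mimic reading that mirrored bit versus reading the chip
flag directly.

```c
#include <assert.h>	/* only needed by callers which check the result */
#include <stdbool.h>

/* Placeholder bit values for this model; not the kernel's definitions. */
#define MODEL_CHIP_MOVE_DEFERRED	(1UL << 0)
#define MODEL_IRQD_MOVE_PCNTXT		(1UL << 0)

/* Hypothetical stand-in for struct irq_chip: flags only. */
struct model_chip {
	unsigned long flags;
};

/* Hypothetical stand-in for struct irq_data. */
struct model_irq_data {
	const struct model_chip *chip;
	unsigned long state;	/* per-interrupt state bits */
};

/* Intermediate step: mirror the chip flag into a per-interrupt state bit. */
static void model_set_chip(struct model_irq_data *d, const struct model_chip *chip)
{
	d->chip = chip;
	if (chip->flags & MODEL_CHIP_MOVE_DEFERRED)
		d->state &= ~MODEL_IRQD_MOVE_PCNTXT;
	else
		d->state |= MODEL_IRQD_MOVE_PCNTXT;
}

/* Old query: read the mirrored per-interrupt bit. */
static bool model_can_move_old(const struct model_irq_data *d)
{
	return d->state & MODEL_IRQD_MOVE_PCNTXT;
}

/* New query: read the chip flag directly, no per-interrupt copy needed. */
static bool model_can_move_new(const struct model_irq_data *d)
{
	return !(d->chip->flags & MODEL_CHIP_MOVE_DEFERRED);
}

/* Check that both queries agree for a deferred and a non-deferred chip. */
static bool model_queries_agree(void)
{
	static const struct model_chip deferred = { .flags = MODEL_CHIP_MOVE_DEFERRED };
	static const struct model_chip immediate = { .flags = 0 };
	struct model_irq_data a = { 0 }, b = { 0 };

	model_set_chip(&a, &deferred);
	model_set_chip(&b, &immediate);

	return model_can_move_old(&a) == model_can_move_new(&a) &&
	       model_can_move_old(&b) == model_can_move_new(&b);
}
```

In this model the mirrored bit never carries information which the chip
flag does not already provide, so the direct read is sufficient.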

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241210103335.626707225@linutronix.de


---
 arch/x86/Kconfig       |  1 -
 include/linux/irq.h    | 12 +-----------
 kernel/irq/chip.c      | 14 --------------
 kernel/irq/debugfs.c   |  1 -
 kernel/irq/internals.h |  2 +-
 kernel/irq/settings.h  |  6 ------
 6 files changed, 2 insertions(+), 34 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index df0fd72..9d7bd0a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -173,7 +173,6 @@ config X86
 	select GENERIC_IRQ_RESERVATION_MODE
 	select GENERIC_IRQ_SHOW
 	select GENERIC_PENDING_IRQ		if SMP
-	select GENERIC_PENDING_IRQ_CHIPFLAGS	if SMP
 	select GENERIC_PTDUMP
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 6e02154..8daa17f 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -64,7 +64,6 @@ enum irqchip_irq_state;
  * IRQ_NOAUTOEN			- Interrupt is not automatically enabled in
  *				  request/setup_irq()
  * IRQ_NO_BALANCING		- Interrupt cannot be balanced (affinity set)
- * IRQ_MOVE_PCNTXT		- Interrupt can be migrated from process context
  * IRQ_NESTED_THREAD		- Interrupt nests into another thread
  * IRQ_PER_CPU_DEVID		- Dev_id is a per-cpu variable
  * IRQ_IS_POLLED		- Always polled by another interrupt. Exclude
@@ -93,7 +92,6 @@ enum {
 	IRQ_NOREQUEST		= (1 << 11),
 	IRQ_NOAUTOEN		= (1 << 12),
 	IRQ_NO_BALANCING	= (1 << 13),
-	IRQ_MOVE_PCNTXT		= (1 << 14),
 	IRQ_NESTED_THREAD	= (1 << 15),
 	IRQ_NOTHREAD		= (1 << 16),
 	IRQ_PER_CPU_DEVID	= (1 << 17),
@@ -105,7 +103,7 @@ enum {
 
 #define IRQF_MODIFY_MASK	\
 	(IRQ_TYPE_SENSE_MASK | IRQ_NOPROBE | IRQ_NOREQUEST | \
-	 IRQ_NOAUTOEN | IRQ_MOVE_PCNTXT | IRQ_LEVEL | IRQ_NO_BALANCING | \
+	 IRQ_NOAUTOEN | IRQ_LEVEL | IRQ_NO_BALANCING | \
 	 IRQ_PER_CPU | IRQ_NESTED_THREAD | IRQ_NOTHREAD | IRQ_PER_CPU_DEVID | \
 	 IRQ_IS_POLLED | IRQ_DISABLE_UNLAZY | IRQ_HIDDEN)
 
@@ -201,8 +199,6 @@ struct irq_data {
  * IRQD_LEVEL			- Interrupt is level triggered
  * IRQD_WAKEUP_STATE		- Interrupt is configured for wakeup
  *				  from suspend
- * IRQD_MOVE_PCNTXT		- Interrupt can be moved in process
- *				  context
  * IRQD_IRQ_DISABLED		- Disabled state of the interrupt
  * IRQD_IRQ_MASKED		- Masked state of the interrupt
  * IRQD_IRQ_INPROGRESS		- In progress state of the interrupt
@@ -233,7 +229,6 @@ enum {
 	IRQD_AFFINITY_SET		= BIT(12),
 	IRQD_LEVEL			= BIT(13),
 	IRQD_WAKEUP_STATE		= BIT(14),
-	IRQD_MOVE_PCNTXT		= BIT(15),
 	IRQD_IRQ_DISABLED		= BIT(16),
 	IRQD_IRQ_MASKED			= BIT(17),
 	IRQD_IRQ_INPROGRESS		= BIT(18),
@@ -338,11 +333,6 @@ static inline bool irqd_is_wakeup_set(struct irq_data *d)
 	return __irqd_to_state(d) & IRQD_WAKEUP_STATE;
 }
 
-static inline bool irqd_can_move_in_process_context(struct irq_data *d)
-{
-	return __irqd_to_state(d) & IRQD_MOVE_PCNTXT;
-}
-
 static inline bool irqd_irq_disabled(struct irq_data *d)
 {
 	return __irqd_to_state(d) & IRQD_IRQ_DISABLED;
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 7989da2..c901436 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -47,13 +47,6 @@ int irq_set_chip(unsigned int irq, const struct irq_chip *chip)
 		return -EINVAL;
 
 	desc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
-
-	if (IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS) && chip) {
-		if (chip->flags & IRQCHIP_MOVE_DEFERRED)
-			irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
-		else
-			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
-	}
 	irq_put_desc_unlock(desc, flags);
 	/*
 	 * For !CONFIG_SPARSE_IRQ make the irq show up in
@@ -1129,13 +1122,6 @@ void irq_modify_status(unsigned int irq, unsigned long clr, unsigned long set)
 	if (irq_settings_is_level(desc))
 		irqd_set(&desc->irq_data, IRQD_LEVEL);
 
-	/* Keep this around until x86 is converted over */
-	if (!IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS)) {
-		irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
-		if (irq_settings_can_move_pcntxt(desc))
-			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
-	}
-
 	tmp = irq_settings_get_trigger_mask(desc);
 	if (tmp != IRQ_TYPE_NONE)
 		trigger = tmp;
diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c
index 975eb8d..ca142b9 100644
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -109,7 +109,6 @@ static const struct irq_bit_descr irqdata_states[] = {
 	BIT_MASK_DESCR(IRQD_NO_BALANCING),
 
 	BIT_MASK_DESCR(IRQD_SINGLE_TARGET),
-	BIT_MASK_DESCR(IRQD_MOVE_PCNTXT),
 	BIT_MASK_DESCR(IRQD_AFFINITY_SET),
 	BIT_MASK_DESCR(IRQD_SETAFFINITY_PENDING),
 	BIT_MASK_DESCR(IRQD_AFFINITY_MANAGED),
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index b61fc64..a979523 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -421,7 +421,7 @@ irq_init_generic_chip(struct irq_chip_generic *gc, const char *name,
 #ifdef CONFIG_GENERIC_PENDING_IRQ
 static inline bool irq_can_move_pcntxt(struct irq_data *data)
 {
-	return irqd_can_move_in_process_context(data);
+	return !(data->chip->flags & IRQCHIP_MOVE_DEFERRED);
 }
 static inline bool irq_move_pending(struct irq_data *data)
 {
diff --git a/kernel/irq/settings.h b/kernel/irq/settings.h
index 7b7efb1..00b3bd1 100644
--- a/kernel/irq/settings.h
+++ b/kernel/irq/settings.h
@@ -11,7 +11,6 @@ enum {
 	_IRQ_NOREQUEST		= IRQ_NOREQUEST,
 	_IRQ_NOTHREAD		= IRQ_NOTHREAD,
 	_IRQ_NOAUTOEN		= IRQ_NOAUTOEN,
-	_IRQ_MOVE_PCNTXT	= IRQ_MOVE_PCNTXT,
 	_IRQ_NO_BALANCING	= IRQ_NO_BALANCING,
 	_IRQ_NESTED_THREAD	= IRQ_NESTED_THREAD,
 	_IRQ_PER_CPU_DEVID	= IRQ_PER_CPU_DEVID,
@@ -142,11 +141,6 @@ static inline void irq_settings_set_noprobe(struct irq_desc *desc)
 	desc->status_use_accessors |= _IRQ_NOPROBE;
 }
 
-static inline bool irq_settings_can_move_pcntxt(struct irq_desc *desc)
-{
-	return desc->status_use_accessors & _IRQ_MOVE_PCNTXT;
-}
-
 static inline bool irq_settings_can_autoenable(struct irq_desc *desc)
 {
 	return !(desc->status_use_accessors & _IRQ_NOAUTOEN);

* [tip: irq/core] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
  2024-12-10 10:34 ` [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED Thomas Gleixner
                     ` (2 preceding siblings ...)
  2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
@ 2025-01-15 20:47   ` tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-01-15 20:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Steve Wahl, Wei Liu, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     7d04319a05ab17ca3da188d0799af93d3213cb06
Gitweb:        https://git.kernel.org/tip/7d04319a05ab17ca3da188d0799af93d3213cb06
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 10 Dec 2024 11:34:15 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 15 Jan 2025 21:38:53 +01:00

x86/apic: Convert to IRQCHIP_MOVE_DEFERRED

Instead of marking individual interrupts as safe to be migrated in
arbitrary contexts, mark the interrupt chips which require the interrupt
to be moved in actual interrupt context with the new IRQCHIP_MOVE_DEFERRED
flag. This makes more sense because this is a per interrupt chip property
and not restricted to individual interrupts.

That flips the logic from the historical opt-out to an opt-in model. This is
simpler to handle for other architectures, which default to unrestricted
affinity setting. It also allows the redundant core logic to be cleaned up
significantly.

All interrupt chips which belong to a top-level domain sitting directly on
top of the x86 vector domain are marked accordingly, unless the related
setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN.

No functional change intended.
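
As a rough user-space sketch of the opt-in model, not the kernel
implementation: the sketch_* names and the bit values are placeholders
invented for this example. A chip which needs the affinity change to happen
in interrupt context sets the deferred flag. A chip which sets nothing can
be moved from process context.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this sketch; not the kernel's definitions. */
#define SKETCH_CHIP_SKIP_SET_WAKE	(1UL << 0)
#define SKETCH_CHIP_MOVE_DEFERRED	(1UL << 1)

/* Hypothetical stand-in for struct irq_chip: name and flags only. */
struct sketch_chip {
	const char *name;
	unsigned long flags;
};

/* Opt-in: only chips which set MOVE_DEFERRED need the deferred move. */
static bool sketch_can_move_pcntxt(const struct sketch_chip *chip)
{
	return !(chip->flags & SKETCH_CHIP_MOVE_DEFERRED);
}

/* A chip which opts in to deferral next to an unrelated flag. */
static const struct sketch_chip sketch_deferred_chip = {
	.name	= "deferred",
	.flags	= SKETCH_CHIP_SKIP_SET_WAKE | SKETCH_CHIP_MOVE_DEFERRED,
};

/* A chip which does not opt in and gets the unrestricted default. */
static const struct sketch_chip sketch_immediate_chip = {
	.name	= "immediate",
	.flags	= SKETCH_CHIP_SKIP_SET_WAKE,
};

/* Exercise both chips once. */
static void sketch_selftest(void)
{
	assert(!sketch_can_move_pcntxt(&sketch_deferred_chip));
	assert(sketch_can_move_pcntxt(&sketch_immediate_chip));
}
```

The default of the sketch is the unrestricted case, which matches what the
changelog says about architectures other than x86.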

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/all/20241210103335.563277044@linutronix.de


---
 arch/x86/Kconfig                    | 1 +
 arch/x86/hyperv/irqdomain.c         | 2 +-
 arch/x86/kernel/apic/io_apic.c      | 2 +-
 arch/x86/kernel/apic/msi.c          | 3 ++-
 arch/x86/kernel/hpet.c              | 8 --------
 arch/x86/platform/uv/uv_irq.c       | 3 ---
 drivers/iommu/amd/init.c            | 2 +-
 drivers/iommu/amd/iommu.c           | 1 -
 drivers/iommu/intel/irq_remapping.c | 1 -
 drivers/pci/controller/pci-hyperv.c | 1 +
 drivers/xen/events/events_base.c    | 6 ------
 11 files changed, 7 insertions(+), 23 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9d7bd0a..df0fd72 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -173,6 +173,7 @@ config X86
 	select GENERIC_IRQ_RESERVATION_MODE
 	select GENERIC_IRQ_SHOW
 	select GENERIC_PENDING_IRQ		if SMP
+	select GENERIC_PENDING_IRQ_CHIPFLAGS	if SMP
 	select GENERIC_PTDUMP
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
diff --git a/arch/x86/hyperv/irqdomain.c b/arch/x86/hyperv/irqdomain.c
index 3215a4a..64b9213 100644
--- a/arch/x86/hyperv/irqdomain.c
+++ b/arch/x86/hyperv/irqdomain.c
@@ -304,7 +304,7 @@ static struct irq_chip hv_pci_msi_controller = {
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_compose_msi_msg	= hv_irq_compose_msi_msg,
 	.irq_set_affinity	= msi_domain_set_affinity,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED,
 };
 
 static struct msi_domain_ops pci_msi_domain_ops = {
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 1029ea4..5d033d9 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1861,7 +1861,7 @@ static struct irq_chip ioapic_chip __read_mostly = {
 	.irq_set_affinity	= ioapic_set_affinity,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
-	.flags			= IRQCHIP_SKIP_SET_WAKE |
+	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED |
 				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 3407692..66bc5d3 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -214,6 +214,7 @@ static bool x86_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
 		if (WARN_ON_ONCE(domain != real_parent))
 			return false;
 		info->chip->irq_set_affinity = msi_set_affinity;
+		info->chip->flags |= IRQCHIP_MOVE_DEFERRED;
 		break;
 	case DOMAIN_BUS_DMAR:
 	case DOMAIN_BUS_AMDVI:
@@ -315,7 +316,7 @@ static struct irq_chip dmar_msi_controller = {
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_compose_msi_msg	= dmar_msi_compose_msg,
 	.irq_write_msi_msg	= dmar_msi_write_msg,
-	.flags			= IRQCHIP_SKIP_SET_WAKE |
+	.flags			= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MOVE_DEFERRED |
 				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index c96ae8f..87b5a50 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -516,22 +516,14 @@ static int hpet_msi_init(struct irq_domain *domain,
 			 struct msi_domain_info *info, unsigned int virq,
 			 irq_hw_number_t hwirq, msi_alloc_info_t *arg)
 {
-	irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
 	irq_domain_set_info(domain, virq, arg->hwirq, info->chip, NULL,
 			    handle_edge_irq, arg->data, "edge");
 
 	return 0;
 }
 
-static void hpet_msi_free(struct irq_domain *domain,
-			  struct msi_domain_info *info, unsigned int virq)
-{
-	irq_clear_status_flags(virq, IRQ_MOVE_PCNTXT);
-}
-
 static struct msi_domain_ops hpet_msi_domain_ops = {
 	.msi_init	= hpet_msi_init,
-	.msi_free	= hpet_msi_free,
 };
 
 static struct msi_domain_info hpet_msi_domain_info = {
diff --git a/arch/x86/platform/uv/uv_irq.c b/arch/x86/platform/uv/uv_irq.c
index a379501..4f200ac 100644
--- a/arch/x86/platform/uv/uv_irq.c
+++ b/arch/x86/platform/uv/uv_irq.c
@@ -92,8 +92,6 @@ static int uv_domain_alloc(struct irq_domain *domain, unsigned int virq,
 	if (ret >= 0) {
 		if (info->uv.limit == UV_AFFINITY_CPU)
 			irq_set_status_flags(virq, IRQ_NO_BALANCING);
-		else
-			irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
 
 		chip_data->pnode = uv_blade_to_pnode(info->uv.blade);
 		chip_data->offset = info->uv.offset;
@@ -113,7 +111,6 @@ static void uv_domain_free(struct irq_domain *domain, unsigned int virq,
 
 	BUG_ON(nr_irqs != 1);
 	kfree(irq_data->chip_data);
-	irq_clear_status_flags(virq, IRQ_MOVE_PCNTXT);
 	irq_clear_status_flags(virq, IRQ_NO_BALANCING);
 	irq_domain_free_irqs_top(domain, virq, nr_irqs);
 }
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 0e0a531..614f216 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2332,7 +2332,7 @@ static struct irq_chip intcapxt_controller = {
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_set_affinity       = intcapxt_set_affinity,
 	.irq_set_wake		= intcapxt_set_wake,
-	.flags			= IRQCHIP_MASK_ON_SUSPEND,
+	.flags			= IRQCHIP_MASK_ON_SUSPEND | IRQCHIP_MOVE_DEFERRED,
 };
 
 static const struct irq_domain_ops intcapxt_domain_ops = {
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3f691e1..b02e631 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3532,7 +3532,6 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
 		irq_data->chip_data = data;
 		irq_data->chip = &amd_ir_chip;
 		irq_remapping_prepare_irte(data, cfg, info, devid, index, i);
-		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
 	}
 
 	return 0;
diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 466c141..f5402df 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1463,7 +1463,6 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
 		else
 			irq_data->chip = &intel_ir_chip;
 		intel_irq_remapping_prepare_irte(ird, irq_cfg, info, index, i);
-		irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT);
 	}
 	return 0;
 
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index cdd5be1..6084b38 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2053,6 +2053,7 @@ static struct irq_chip hv_msi_irq_chip = {
 	.irq_set_affinity	= irq_chip_set_affinity_parent,
 #ifdef CONFIG_X86
 	.irq_ack		= irq_chip_ack_parent,
+	.flags			= IRQCHIP_MOVE_DEFERRED,
 #elif defined(CONFIG_ARM64)
 	.irq_eoi		= irq_chip_eoi_parent,
 #endif
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 985e155..41309d3 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -722,12 +722,6 @@ static struct irq_info *xen_irq_init(unsigned int irq)
 		INIT_RCU_WORK(&info->rwork, delayed_free_irq);
 
 		set_info_for_irq(irq, info);
-		/*
-		 * Interrupt affinity setting can be immediate. No point
-		 * in delaying it until an interrupt is handled.
-		 */
-		irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
-
 		INIT_LIST_HEAD(&info->eoi_list);
 		list_add_tail(&info->list, &xen_irq_list_head);
 	}

Thread overview: 18+ messages
2024-12-10 10:34 [patch 0/5] genirq, x86: Rework deferred interrupt affinity logic Thomas Gleixner
2024-12-10 10:34 ` [patch 1/5] ARC: Remove GENERIC_PENDING_IRQ Thomas Gleixner
2024-12-10 17:22   ` Vineet Gupta
2024-12-10 22:52     ` Thomas Gleixner
2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2024-12-10 10:34 ` [patch 2/5] hexagon: Remove GENERIC_PENDING_IRQ leftover Thomas Gleixner
2024-12-10 15:06   ` Brian Cain
2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2024-12-10 10:34 ` [patch 3/5] genirq: Provide IRQCHIP_MOVE_DEFERRED Thomas Gleixner
2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2024-12-10 10:34 ` [patch 4/5] x86/apic: Convert to IRQCHIP_MOVE_DEFERRED Thomas Gleixner
2024-12-11 16:36   ` Steve Wahl
2024-12-17 19:33   ` Wei Liu
2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2025-01-15 20:47   ` tip-bot2 for Thomas Gleixner
2024-12-10 10:34 ` [patch 5/5] genirq: Remove IRQ_MOVE_PCNTXT and related code Thomas Gleixner
2025-01-15 10:04   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2025-01-15 20:47   ` tip-bot2 for Thomas Gleixner
