All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch V3 00/14] Improve /proc/interrupts further
@ 2026-03-26 21:56 Thomas Gleixner
  2026-03-26 21:56 ` [patch V3 01/14] x86/irq: Optimize interrupts decimals printing Thomas Gleixner
                   ` (14 more replies)
  0 siblings, 15 replies; 19+ messages in thread
From: Thomas Gleixner @ 2026-03-26 21:56 UTC (permalink / raw)
  To: LKML
  Cc: x86, Michael Kelley, Dmitry Ilvokhin, Radu Rendec, Jan Kiszka,
	Kieran Bingham, Florian Fainelli

This is a follow up to v2 which can be found here:

  https://lore.kernel.org/20260320131108.344376329@kernel.org

The v1 cover letter contains a full analysis, explanation and numbers:

  https://lore.kernel.org/20260303150539.513068586@kernel.org

TLDR:

  - The performance of reading of /proc/interrupts has been improved
    piecewise over the years, but most of the low hanging fruit has been
    left on the table.

Changes vs. V2:

   - Addressed the valuable review comments from Michael, Radu and Dmitry.
     Thanks!

   - Addressed the 0-day fallout (missing #ifdef guards, typos)

   - More updates to the x86 irq stats:

       - Provide a mechanism for interrupts which should never happen to
         skip them by default and only remove the skip condition if one
         occurs (Spurious and ICR read retry)

       - Use the same mechanism to handle the IOAPIC misrouted and the
         PIC/APIC error counts

   - Updated the out of sync GDB script - pointed out by Radu
   
	- Use the new array based x86 stats

	- Ensure visually tabular output

   - Reworked the 'first line' mechanism by using proc_seq_create_private()
     which also simplifies the precision and chip name width adjustments

   - Made the output format visually tabular in /proc/interrupts

   - Picked up tags where appropriate

   - Dropped the binary interface parts as they were RFC and just for
     demonstration. Let's see if anyone cares down the road.

   - Tagged the series so the irq/core branch can be updated without
     losing the submitted content.

Delta patch against v2 (w/o the binary RFC part) is below.

The series applies on top of v7.0-rc3 and is also available via git:

    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git irq-proc-v3

Thanks,

	tglx
---
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index d68f587f0b7d..305774b67995 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -1032,7 +1032,7 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
 	 * Unmasking the LVTPC is not required as the Mask (M) bit of the LVT
 	 * PMI entry is not set by the local APIC when a PMC overflow occurs
 	 */
-	inc_irq_stat(APIC_PERF);
+	inc_perf_irq_stat();
 
 done:
 	cpuc->enabled = pmu_enabled;
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index b22259b54685..0e36f5580e8d 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1403,7 +1403,7 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 	handled += perf_ibs_handle_irq(&perf_ibs_op, regs);
 
 	if (handled)
-		inc_irq_stat(APIC_PERF);
+		inc_perf_irq_stat();
 
 	perf_sample_event_took(sched_clock() - stamp);
 
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index ce9f9d4cd5fd..e1e9aaa4f11a 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1747,7 +1747,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
 	}
 
 	if (handled)
-		inc_irq_stat(APIC_PERF);
+		inc_perf_irq_stat();
 
 	return handled;
 }
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index e5c85c5c8f87..297916f29c09 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3504,7 +3504,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 	int bit;
 	int handled = 0;
 
-	inc_irq_stat(APIC_PERF);
+	inc_perf_irq_stat();
 
 	/*
 	 * Ignore a range of extra bits in status that do not indicate
diff --git a/arch/x86/events/intel/knc.c b/arch/x86/events/intel/knc.c
index 537838404524..e887adc108ac 100644
--- a/arch/x86/events/intel/knc.c
+++ b/arch/x86/events/intel/knc.c
@@ -238,7 +238,7 @@ static int knc_pmu_handle_irq(struct pt_regs *regs)
 		goto done;
 	}
 
-	inc_irq_stat(APIC_PERF);
+	inc_perf_irq_stat();
 
 	for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
 		struct perf_event *event = cpuc->events[bit];
diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
index 1f7621938494..12bf293d42a5 100644
--- a/arch/x86/events/intel/p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -1077,7 +1077,7 @@ static int p4_pmu_handle_irq(struct pt_regs *regs)
 	}
 
 	if (handled)
-		inc_irq_stat(APIC_PERF);
+		inc_perf_irq_stat();
 
 	/*
 	 * When dealing with the unmasking of the LVTPC on P4 perf hw, it has
diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
index f1a5d0347b08..4bc177badac2 100644
--- a/arch/x86/events/zhaoxin/core.c
+++ b/arch/x86/events/zhaoxin/core.c
@@ -373,7 +373,7 @@ static int zhaoxin_pmu_handle_irq(struct pt_regs *regs)
 	else
 		zhaoxin_pmu_ack_status(status);
 
-	inc_irq_stat(APIC_PERF);
+	inc_perf_irq_stat();
 
 	/*
 	 * CondChgd bit 63 doesn't mean any overflow status. Ignore
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index dcc96edb4f82..dea60d66d976 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -4,7 +4,7 @@
 
 #include <linux/threads.h>
 
-enum {
+enum irq_stat_counts {
 	IRQ_COUNT_NMI,
 #ifdef CONFIG_X86_LOCAL_APIC
 	IRQ_COUNT_APIC_TIMER,
@@ -49,6 +49,10 @@ enum {
 #endif
 #ifdef CONFIG_X86_POSTED_MSI
 	IRQ_COUNT_POSTED_MSI_NOTIFICATION,
+#endif
+	IRQ_COUNT_PIC_APIC_ERROR,
+#ifdef CONFIG_X86_IO_APIC
+	IRQ_COUNT_IOAPIC_MISROUTED,
 #endif
 	IRQ_COUNT_MAX,
 };
@@ -68,14 +72,20 @@ DECLARE_PER_CPU_ALIGNED(struct pi_desc, posted_msi_pi_desc);
 #define __ARCH_IRQ_STAT
 
 #define inc_irq_stat(index)	this_cpu_inc(irq_stat.counts[IRQ_COUNT_##index])
+void irq_stat_inc_and_enable(enum irq_stat_counts which);
+
+#ifdef CONFIG_X86_LOCAL_APIC
+#define inc_perf_irq_stat()	inc_irq_stat(APIC_PERF)
+#else
+#define inc_perf_irq_stat()	do { } while (0)
+#endif
 
 extern void ack_bad_irq(unsigned int irq);
 
+#ifdef CONFIG_PROC_FS
 extern u64 arch_irq_stat_cpu(unsigned int cpu);
 #define arch_irq_stat_cpu	arch_irq_stat_cpu
-
-extern u64 arch_irq_stat(void);
-#define arch_irq_stat		arch_irq_stat
+#endif
 
 DECLARE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
 #define local_softirq_pending_ref       __softirq_pending
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index cbe19e669080..47727d0b540b 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -110,10 +110,6 @@ static inline void lock_vector_lock(void) {}
 static inline void unlock_vector_lock(void) {}
 #endif
 
-/* Statistics */
-extern atomic_t irq_err_count;
-extern atomic_t irq_mis_count;
-
 extern void elcr_set_level_irq(unsigned int irq);
 
 extern char irq_entries_start[];
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index e2de95b0862a..e4b19a288ee7 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2108,7 +2108,7 @@ static noinline void handle_spurious_interrupt(u8 vector)
 
 	trace_spurious_apic_entry(vector);
 
-	inc_irq_stat(SPURIOUS);
+	irq_stat_inc_and_enable(IRQ_COUNT_SPURIOUS);
 
 	/*
 	 * If this is a spurious interrupt then do not acknowledge
@@ -2180,7 +2180,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_error_interrupt)
 		apic_write(APIC_ESR, 0);
 	v = apic_read(APIC_ESR);
 	apic_eoi();
-	atomic_inc(&irq_err_count);
+	irq_stat_inc_and_enable(IRQ_COUNT_PIC_APIC_ERROR);
 
 	apic_pr_debug("APIC error on CPU%d: %02x", smp_processor_id(), v);
 
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 352ed5558cbc..7d7175d01228 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1575,8 +1575,6 @@ static unsigned int startup_ioapic_irq(struct irq_data *data)
 	return was_pending;
 }
 
-atomic_t irq_mis_count;
-
 #ifdef CONFIG_GENERIC_PENDING_IRQ
 static bool io_apic_level_ack_pending(struct mp_chip_data *data)
 {
@@ -1713,7 +1711,7 @@ static void ioapic_ack_level(struct irq_data *irq_data)
 	 * at the cpu.
 	 */
 	if (!(v & (1 << (i & 0x1f)))) {
-		atomic_inc(&irq_mis_count);
+		irq_stat_inc_and_enable(IRQ_COUNT_IOAPIC_MISROUTED);
 		eoi_ioapic_pin(cfg->vector, irq_data->chip_data);
 	}
 
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 3635c4d7b7f5..c627bee3b14f 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -120,7 +120,7 @@ u32 apic_mem_wait_icr_idle_timeout(void)
 	for (cnt = 0; cnt < 1000; cnt++) {
 		if (!(apic_read(APIC_ICR) & APIC_ICR_BUSY))
 			return 0;
-		inc_irq_stat(ICR_READ_RETRY);
+		irq_stat_inc_and_enable(IRQ_COUNT_ICR_READ_RETRY);
 		udelay(100);
 	}
 	return APIC_ICR_BUSY;
diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
index f67063df6723..f7a86b94a0dd 100644
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -214,7 +214,7 @@ static void mask_and_ack_8259A(struct irq_data *data)
 			       "spurious 8259A interrupt: IRQ%d.\n", irq);
 			spurious_irq_mask |= irqmask;
 		}
-		atomic_inc(&irq_err_count);
+		irq_stat_inc_and_enable(IRQ_COUNT_PIC_APIC_ERROR);
 		/*
 		 * Theoretically we do not have to handle this IRQ,
 		 * but in Linux this does not cause problems and is
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 2bd8c08f8d91..0b3723cec0b9 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -39,8 +39,6 @@ EXPORT_PER_CPU_SYMBOL(__softirq_pending);
 
 DEFINE_PER_CPU_CACHE_HOT(struct irq_stack *, hardirq_stack_ptr);
 
-atomic_t irq_err_count;
-
 /*
  * 'what should we do if we get a hw irq event on an illegal vector'.
  * each architecture has to answer this themselves.
@@ -68,56 +66,65 @@ struct irq_stat_info {
 	const char	*text;
 };
 
+#define DEFAULT_SUPPRESSED_VECTOR	UINT_MAX
+
 #define ISS(idx, sym, txt) [IRQ_COUNT_##idx] = { .symbol = sym, .text = txt }
 
 #define ITS(idx, sym, txt) [IRQ_COUNT_##idx] =				\
 	{ .skip_vector = idx## _VECTOR, .symbol = sym, .text = txt }
 
+#define IDS(idx, sym, txt) [IRQ_COUNT_##idx] =				\
+	{ .skip_vector = DEFAULT_SUPPRESSED_VECTOR, .symbol = sym, .text = txt }
+
 static struct irq_stat_info irq_stat_info[IRQ_COUNT_MAX] __ro_after_init = {
-	ISS(NMI,			"NMI", " Non-maskable interrupts\n"),
+	ISS(NMI,			"NMI", "  Non-maskable interrupts\n"),
 #ifdef CONFIG_X86_LOCAL_APIC
-	ISS(APIC_TIMER,			"LOC", " Local timer interrupts\n"),
-	ISS(SPURIOUS,			"SPU", " Spurious interrupts\n"),
-	ISS(APIC_PERF,			"PMI", " Performance monitoring interrupts\n"),
-	ISS(IRQ_WORK,			"IWI", " IRQ work interrupts\n"),
-	ISS(ICR_READ_RETRY,		"RTR", " APIC ICR read retries\n"),
-	ISS(X86_PLATFORM_IPI,		"PLT", " Platform interrupts\n"),
+	ISS(APIC_TIMER,			"LOC", "  Local timer interrupts\n"),
+	IDS(SPURIOUS,			"SPU", "  Spurious interrupts\n"),
+	ISS(APIC_PERF,			"PMI", "  Performance monitoring interrupts\n"),
+	ISS(IRQ_WORK,			"IWI", "  IRQ work interrupts\n"),
+	IDS(ICR_READ_RETRY,		"RTR", "  APIC ICR read retries\n"),
+	ISS(X86_PLATFORM_IPI,		"PLT", "  Platform interrupts\n"),
 #endif
 #ifdef CONFIG_SMP
-	ISS(RESCHEDULE,			"RES", " Rescheduling interrupts\n"),
-	ISS(CALL_FUNCTION,		"CAL", " Function call interrupts\n"),
+	ISS(RESCHEDULE,			"RES", "  Rescheduling interrupts\n"),
+	ISS(CALL_FUNCTION,		"CAL", "  Function call interrupts\n"),
 #endif
-	ISS(TLB,			"TLB", " TLB shootdowns\n"),
+	ISS(TLB,			"TLB", "  TLB shootdowns\n"),
 #ifdef CONFIG_X86_THERMAL_VECTOR
-	ISS(THERMAL_APIC,		"TRM", " Thermal event interrupt\n"),
+	ISS(THERMAL_APIC,		"TRM", "  Thermal event interrupt\n"),
 #endif
 #ifdef CONFIG_X86_MCE_THRESHOLD
-	ISS(THRESHOLD_APIC,		"THR", " Threshold APIC interrupts\n"),
+	ISS(THRESHOLD_APIC,		"THR", "  Threshold APIC interrupts\n"),
 #endif
 #ifdef CONFIG_X86_MCE_AMD
-	ISS(DEFERRED_ERROR,		"DFR", " Deferred Error APIC interrupts\n"),
+	ISS(DEFERRED_ERROR,		"DFR", "  Deferred Error APIC interrupts\n"),
 #endif
 #ifdef CONFIG_X86_MCE
-	ISS(MCE_EXCEPTION,		"MCE", " Machine check exceptions\n"),
-	ISS(MCE_POLL,			"MCP", " Machine check polls\n"),
+	ISS(MCE_EXCEPTION,		"MCE", "  Machine check exceptions\n"),
+	ISS(MCE_POLL,			"MCP", "  Machine check polls\n"),
 #endif
 #ifdef CONFIG_X86_HV_CALLBACK_VECTOR
-	ITS(HYPERVISOR_CALLBACK,	"HYP", " Hypervisor callback interrupts\n"),
+	ITS(HYPERVISOR_CALLBACK,	"HYP", "  Hypervisor callback interrupts\n"),
 #endif
 #if IS_ENABLED(CONFIG_HYPERV)
-	ITS(HYPERV_REENLIGHTENMENT,	"HRE", " Hyper-V reenlightment interrupts\n"),
-	ITS(HYPERV_STIMER0,		"HVS", " Hyper-V stimer0 interrupts\n"),
+	ITS(HYPERV_REENLIGHTENMENT,	"HRE", "  Hyper-V reenlightenment interrupts\n"),
+	ITS(HYPERV_STIMER0,		"HVS", "  Hyper-V stimer0 interrupts\n"),
 #endif
 #if IS_ENABLED(CONFIG_KVM)
-	ITS(POSTED_INTR,		"PIN", " Posted-interrupt notification event\n"),
-	ITS(POSTED_INTR_NESTED,		"NPI", " Nested posted-interrupt event\n"),
-	ITS(POSTED_INTR_WAKEUP,		"PIW", " Posted-interrupt wakeup event\n"),
+	ITS(POSTED_INTR,		"PIN", "  Posted-interrupt notification event\n"),
+	ITS(POSTED_INTR_NESTED,		"NPI", "  Nested posted-interrupt event\n"),
+	ITS(POSTED_INTR_WAKEUP,		"PIW", "  Posted-interrupt wakeup event\n"),
 #endif
 #ifdef CONFIG_GUEST_PERF_EVENTS
 	ISS(PERF_GUEST_MEDIATED_PMI,	"VPMI", " Perf Guest Mediated PMI\n"),
 #endif
 #ifdef CONFIG_X86_POSTED_MSI
-	ISS(POSTED_MSI_NOTIFICATION,	"PMN", " Posted MSI notification event\n"),
+	ISS(POSTED_MSI_NOTIFICATION,	"PMN", "  Posted MSI notification event\n"),
+#endif
+	IDS(PIC_APIC_ERROR,		"ERR", "  PIC/APIC error interrupts\n"),
+#ifdef CONFIG_X86_IO_APIC
+	IDS(IOAPIC_MISROUTED,		"MIS", "  Misrouted IO/APIC interrupts\n"),
 #endif
 };
 
@@ -126,19 +133,34 @@ void __init irq_init_stats(void)
 	struct irq_stat_info *info = irq_stat_info;
 
 	for (unsigned int i = 0; i < ARRAY_SIZE(irq_stat_info); i++, info++) {
-		if (info->skip_vector && test_bit(info->skip_vector, system_vectors))
+		if (info->skip_vector && info->skip_vector != DEFAULT_SUPPRESSED_VECTOR &&
+		    test_bit(info->skip_vector, system_vectors))
 			info->skip_vector = 0;
 	}
 
+#ifdef CONFIG_X86_LOCAL_APIC
 	if (!x86_platform_ipi_callback)
 		irq_stat_info[IRQ_COUNT_X86_PLATFORM_IPI].skip_vector = 1;
+#endif
 
 #ifdef CONFIG_X86_POSTED_MSI
 	if (!posted_msi_enabled())
-		irq_stat_info[IRQ_COUNT_X86_POSTED_MSI].skip_vector = 1;
+		irq_stat_info[IRQ_COUNT_POSTED_MSI_NOTIFICATION].skip_vector = 1;
 #endif
 }
 
+/*
+ * Used for default enabled counters to increment the stats and to enable the
+ * entry for /proc/interrupts output.
+ */
+void irq_stat_inc_and_enable(enum irq_stat_counts which)
+{
+	this_cpu_inc(irq_stat.counts[which]);
+	/* Pairs with the READ_ONCE() in arch_show_interrupts() */
+	WRITE_ONCE(irq_stat_info[which].skip_vector, 0);
+}
+
+#ifdef CONFIG_PROC_FS
 /*
  * /proc/interrupts printing for arch specific interrupts
  */
@@ -147,17 +169,13 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 	const struct irq_stat_info *info = irq_stat_info;
 
 	for (unsigned int i = 0; i < ARRAY_SIZE(irq_stat_info); i++, info++) {
-		if (info->skip_vector)
+		if (READ_ONCE(info->skip_vector))
 			continue;
 
 		seq_printf(p, "%*s:", prec, info->symbol);
 		irq_proc_emit_counts(p, &irq_stat.counts[i]);
 		seq_puts(p, info->text);
 	}
-
-	seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count));
-	if (IS_ENABLED(CONFIG_X86_IO_APIC))
-		seq_printf(p, "%*s: %10u\n", prec, "MIS", atomic_read(&irq_mis_count));
 	return 0;
 }
 
@@ -173,12 +191,7 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
 		sum += p->counts[i];
 	return sum;
 }
-
-u64 arch_irq_stat(void)
-{
-	u64 sum = atomic_read(&irq_err_count);
-	return sum;
-}
+#endif /* CONFIG_PROC_FS */
 
 static __always_inline void handle_irq(struct irq_desc *desc,
 				       struct pt_regs *regs)
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 8b444e862319..20c3df9a9b80 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -18,9 +18,6 @@
 #ifndef arch_irq_stat_cpu
 #define arch_irq_stat_cpu(cpu) 0
 #endif
-#ifndef arch_irq_stat
-#define arch_irq_stat() 0
-#endif
 
 u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
 {
@@ -122,7 +119,6 @@ static int show_stat(struct seq_file *p, void *v)
 			sum_softirq += softirq_stat;
 		}
 	}
-	sum += arch_irq_stat();
 
 	seq_put_decimal_ull(p, "cpu  ", nsec_to_clock_t(user));
 	seq_put_decimal_ull(p, " ", nsec_to_clock_t(nice));
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6389a462c731..2809f0fc4175 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -46,9 +46,11 @@ int irq_set_chip(unsigned int irq, const struct irq_chip *chip)
 		scoped_irqdesc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
 		ret = 0;
 	}
-	/* For !CONFIG_SPARSE_IRQ make the irq show up in allocated_irqs. */
-	if (!ret)
+	if (!ret) {
+		/* For !CONFIG_SPARSE_IRQ make the irq show up in allocated_irqs. */
 		irq_mark_irq(irq);
+		irq_proc_update_chip(chip);
+	}
 	return ret;
 }
 EXPORT_SYMBOL(irq_set_chip);
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 37eec0337867..7fbf003c6e93 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -12,6 +12,8 @@
 #include <linux/rcuref.h>
 #include <linux/sched/clock.h>
 
+#include "proc.h"
+
 #ifdef CONFIG_SPARSE_IRQ
 # define MAX_SPARSE_IRQS	INT_MAX
 #else
@@ -149,12 +151,6 @@ static inline void unregister_handler_proc(unsigned int irq,
 static inline void irq_proc_update_valid(struct irq_desc *desc) { }
 #endif
 
-#if defined(CONFIG_PROC_FS) && defined(CONFIG_GENERIC_IRQ_SHOW)
-void irq_proc_calc_prec(void);
-#else
-static inline void irq_proc_calc_prec(void) { }
-#endif
-
 struct irq_desc *irq_find_desc_at_or_after(unsigned int offset);
 
 extern bool irq_can_set_affinity_usr(unsigned int irq);
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 9b9a75dfeebd..80ef4e27dcf4 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -185,6 +185,7 @@ struct irq_desc *irq_find_desc_at_or_after(unsigned int offset)
 {
 	unsigned long index = offset;
 
+	lockdep_assert_in_rcu_read_lock();
 	return mt_find(&sparse_irqs, &index, total_nr_irqs);
 }
 
@@ -930,8 +931,10 @@ EXPORT_SYMBOL_GPL(__irq_alloc_descs);
  */
 unsigned int irq_get_next_irq(unsigned int offset)
 {
-	struct irq_desc *desc = irq_find_desc_at_or_after(offset);
+	struct irq_desc *desc;
 
+	guard(rcu)();
+	desc = irq_find_desc_at_or_after(offset);
 	return desc ? irq_desc_get_irq(desc) : total_nr_irqs;
 }
 
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index cc93abf009e8..9f524ed709b8 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -20,6 +20,8 @@
 #include <linux/smp.h>
 #include <linux/fs.h>
 
+#include "proc.h"
+
 static LIST_HEAD(irq_domain_list);
 static DEFINE_MUTEX(irq_domain_mutex);
 
@@ -1532,6 +1534,7 @@ int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
 	irq_data->chip = (struct irq_chip *)(chip ? chip : &no_irq_chip);
 	irq_data->chip_data = chip_data;
 
+	irq_proc_update_chip(chip);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(irq_domain_set_hwirq_and_chip);
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index f6cdb262d1e7..a62d4694f063 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -443,8 +443,7 @@ void irq_proc_update_valid(struct irq_desc *desc)
 {
 	u32 set = _IRQ_PROC_VALID;
 
-	if (irq_settings_is_hidden(desc) || !desc->action ||
-	    irq_desc_is_chained(desc) || !desc->kstat_irqs)
+	if (irq_settings_is_hidden(desc) || irq_desc_is_chained(desc) || !desc->action)
 		set = 0;
 
 	irq_settings_update_proc_valid(desc, set);
@@ -459,7 +458,16 @@ int __weak arch_show_interrupts(struct seq_file *p, int prec)
 	return 0;
 }
 
-static int irq_num_prec __read_mostly = 3;
+static DEFINE_RAW_SPINLOCK(irq_proc_constraints_lock);
+
+static struct irq_proc_constraints {
+	bool		print_header;
+	unsigned int	num_prec;
+	unsigned int	chip_width;
+} irq_proc_constraints __read_mostly = {
+	.num_prec	= 3,
+	.chip_width	= 8,
+};
 
 #ifndef ACTUAL_NR_IRQS
 # define ACTUAL_NR_IRQS total_nr_irqs
@@ -471,7 +479,23 @@ void irq_proc_calc_prec(void)
 
 	for (prec = 3, n = 1000; prec < 10 && n <= total_nr_irqs; ++prec)
 		n *= 10;
-	WRITE_ONCE(irq_num_prec, prec);
+
+	guard(raw_spinlock_irqsave)(&irq_proc_constraints_lock);
+	if (prec > irq_proc_constraints.num_prec)
+		WRITE_ONCE(irq_proc_constraints.num_prec, prec);
+}
+
+void irq_proc_update_chip(const struct irq_chip *chip)
+{
+	unsigned int len = chip && chip->name ? strlen(chip->name) : 0;
+
+	if (!len || len <= READ_ONCE(irq_proc_constraints.chip_width))
+		return;
+
+	/* Can be invoked from interrupt disabled contexts */
+	guard(raw_spinlock_irqsave)(&irq_proc_constraints_lock);
+	if (len > irq_proc_constraints.chip_width)
+		WRITE_ONCE(irq_proc_constraints.chip_width, len);
 }
 
 #define ZSTR1 "          0"
@@ -512,26 +536,25 @@ void irq_proc_emit_counts(struct seq_file *p, unsigned int __percpu *cnts)
 
 static int irq_seq_show(struct seq_file *p, void *v)
 {
-	int prec = (int)(unsigned long)p->private;
+	struct irq_proc_constraints *constr = p->private;
 	struct irq_desc *desc = v;
 	struct irqaction *action;
 
 	if (desc == ARCH_PROC_IRQDESC)
-		return arch_show_interrupts(p, prec);
+		return arch_show_interrupts(p, constr->num_prec);
 
 	/* print header for the first interrupt indicated by !p>private */
-	if (!prec) {
+	if (constr->print_header) {
 		unsigned int cpu;
 
-		prec = READ_ONCE(irq_num_prec);
-		seq_printf(p, "%*s", prec + 8, "");
+		seq_printf(p, "%*s", constr->num_prec + 8, "");
 		for_each_online_cpu(cpu)
 			seq_printf(p, "CPU%-8d", cpu);
 		seq_putc(p, '\n');
-		p->private = (void *)(unsigned long)prec;
+		constr->print_header = false;
 	}
 
-	seq_put_decimal_ull_width(p, "", irq_desc_get_irq(desc), prec);
+	seq_put_decimal_ull_width(p, "", irq_desc_get_irq(desc), constr->num_prec);
 	seq_putc(p, ':');
 
 	/*
@@ -543,25 +566,27 @@ static int irq_seq_show(struct seq_file *p, void *v)
 		irq_proc_emit_counts(p, &desc->kstat_irqs->cnt);
 	else
 		irq_proc_emit_zero_counts(p, num_online_cpus());
-	seq_putc(p, ' ');
+
+	/* Enforce a visual gap */
+	seq_write(p, "  ", 2);
 
 	guard(raw_spinlock_irq)(&desc->lock);
 	if (desc->irq_data.chip) {
 		if (desc->irq_data.chip->irq_print_chip)
 			desc->irq_data.chip->irq_print_chip(&desc->irq_data, p);
 		else if (desc->irq_data.chip->name)
-			seq_printf(p, "%8s", desc->irq_data.chip->name);
+			seq_printf(p, "%-*s", constr->chip_width, desc->irq_data.chip->name);
 		else
-			seq_printf(p, "%8s", "-");
+			seq_printf(p, "%-*s", constr->chip_width, "-");
 	} else {
-		seq_printf(p, "%8s", "None");
+		seq_printf(p, "%-*s", constr->chip_width, "None");
 	}
 
 	seq_putc(p, ' ');
 	if (desc->irq_data.domain)
-		seq_put_decimal_ull_width(p, "", desc->irq_data.hwirq, prec);
+		seq_put_decimal_ull_width(p, "", desc->irq_data.hwirq, constr->num_prec);
 	else
-		seq_printf(p, " %*s", prec, "");
+		seq_printf(p, " %*s", constr->num_prec, "");
 
 	if (IS_ENABLED(CONFIG_GENERIC_IRQ_SHOW_LEVEL))
 		seq_printf(p, " %-8s", irqd_is_level_type(&desc->irq_data) ? "Level" : "Edge");
@@ -582,14 +607,13 @@ static int irq_seq_show(struct seq_file *p, void *v)
 
 static void *irq_seq_next_desc(loff_t *pos)
 {
-	struct irq_desc *desc;
-
 	if (*pos > total_nr_irqs)
 		return NULL;
 
 	guard(rcu)();
 	for (;;) {
-		desc = irq_find_desc_at_or_after((unsigned int) *pos);
+		struct irq_desc *desc = irq_find_desc_at_or_after((unsigned int) *pos);
+
 		if (desc) {
 			*pos = irq_desc_get_irq(desc);
 			/*
@@ -609,8 +633,13 @@ static void *irq_seq_next_desc(loff_t *pos)
 
 static void *irq_seq_start(struct seq_file *f, loff_t *pos)
 {
-	if (!*pos)
-		f->private = NULL;
+	if (!*pos) {
+		struct irq_proc_constraints *constr = f->private;
+
+		constr->num_prec = READ_ONCE(irq_proc_constraints.num_prec);
+		constr->chip_width = READ_ONCE(irq_proc_constraints.chip_width);
+		constr->print_header = true;
+	}
 	return irq_seq_next_desc(pos);
 }
 
@@ -638,7 +667,8 @@ static const struct seq_operations irq_seq_ops = {
 
 static int __init irq_proc_init(void)
 {
-	proc_create_seq("interrupts", 0, NULL, &irq_seq_ops);
+	proc_create_seq_private("interrupts", 0, NULL, &irq_seq_ops,
+				sizeof(irq_proc_constraints), NULL);
 	return 0;
 }
 fs_initcall(irq_proc_init);
diff --git a/kernel/irq/proc.h b/kernel/irq/proc.h
new file mode 100644
index 000000000000..ec9173d573f9
--- /dev/null
+++ b/kernel/irq/proc.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#if defined(CONFIG_PROC_FS) && defined(CONFIG_GENERIC_IRQ_SHOW)
+void irq_proc_calc_prec(void);
+void irq_proc_update_chip(const struct irq_chip *chip);
+#else
+static inline void irq_proc_calc_prec(void) { }
+static inline void irq_proc_update_chip(const struct irq_chip *chip) { }
+#endif
diff --git a/scripts/gdb/linux/interrupts.py b/scripts/gdb/linux/interrupts.py
index f4f715a8f0e3..6ca7e32f35b0 100644
--- a/scripts/gdb/linux/interrupts.py
+++ b/scripts/gdb/linux/interrupts.py
@@ -20,7 +20,7 @@ def irq_desc_is_chained(desc):
 def irqd_is_level(desc):
     return desc['irq_data']['common']['state_use_accessors'] & constants.LX_IRQD_LEVEL
 
-def show_irq_desc(prec, irq):
+def show_irq_desc(prec, chip_width, irq):
     text = ""
 
     desc = mapletree.mtree_load(gdb.parse_and_eval("&sparse_irqs"), irq)
@@ -48,7 +48,7 @@ def show_irq_desc(prec, irq):
             count = cpus.per_cpu(desc['kstat_irqs'], cpu)['cnt']
         else:
             count = 0
-        text += "%10u" % (count)
+        text += "%10u " % (count)
 
     name = "None"
     if desc['irq_data']['chip']:
@@ -58,7 +58,7 @@ def show_irq_desc(prec, irq):
         else:
             name = "-"
 
-    text += "  %8s" % (name)
+    text += "  %-*s" % (chip_width, name)
 
     if desc['irq_data']['domain']:
         text += "  %*lu" % (prec, desc['irq_data']['hwirq'])
@@ -97,52 +97,26 @@ def show_irq_err_count(prec):
         text += "%*s: %10u\n" % (prec, "ERR", cnt['counter'])
     return text
 
-def x86_show_irqstat(prec, pfx, field, desc):
-    irq_stat = gdb.parse_and_eval("&irq_stat")
+def x86_show_irqstat(prec, pfx, idx, desc):
+    irq_stat = gdb.parse_and_eval("&irq_stat.counts[%d]" %idx)
     text = "%*s: " % (prec, pfx)
     for cpu in cpus.each_online_cpu():
         stat = cpus.per_cpu(irq_stat, cpu)
-        text += "%10u " % (stat[field])
-    text += "  %s\n" % (desc)
-    return text
-
-def x86_show_mce(prec, var, pfx, desc):
-    pvar = gdb.parse_and_eval(var)
-    text = "%*s: " % (prec, pfx)
-    for cpu in cpus.each_online_cpu():
-        text += "%10u " % (cpus.per_cpu(pvar, cpu).dereference())
-    text += "  %s\n" % (desc)
+        text += "%10u " % (stat.dereference())
+    text += desc
     return text
 
 def x86_show_interupts(prec):
-    text = x86_show_irqstat(prec, "NMI", '__nmi_count', 'Non-maskable interrupts')
-
-    if constants.LX_CONFIG_X86_LOCAL_APIC:
-        text += x86_show_irqstat(prec, "LOC", 'apic_timer_irqs', "Local timer interrupts")
-        text += x86_show_irqstat(prec, "SPU", 'irq_spurious_count', "Spurious interrupts")
-        text += x86_show_irqstat(prec, "PMI", 'apic_perf_irqs', "Performance monitoring interrupts")
-        text += x86_show_irqstat(prec, "IWI", 'apic_irq_work_irqs', "IRQ work interrupts")
-        text += x86_show_irqstat(prec, "RTR", 'icr_read_retry_count', "APIC ICR read retries")
-        if utils.gdb_eval_or_none("x86_platform_ipi_callback") is not None:
-            text += x86_show_irqstat(prec, "PLT", 'x86_platform_ipis', "Platform interrupts")
-
-    if constants.LX_CONFIG_SMP:
-        text += x86_show_irqstat(prec, "RES", 'irq_resched_count', "Rescheduling interrupts")
-        text += x86_show_irqstat(prec, "CAL", 'irq_call_count', "Function call interrupts")
-        text += x86_show_irqstat(prec, "TLB", 'irq_tlb_count', "TLB shootdowns")
-
-    if constants.LX_CONFIG_X86_THERMAL_VECTOR:
-        text += x86_show_irqstat(prec, "TRM", 'irq_thermal_count', "Thermal events interrupts")
+    info_type = gdb.lookup_type('struct irq_stat_info')
+    info = gdb.parse_and_eval('irq_stat_info')
 
-    if constants.LX_CONFIG_X86_MCE_THRESHOLD:
-        text += x86_show_irqstat(prec, "THR", 'irq_threshold_count', "Threshold APIC interrupts")
-
-    if constants.LX_CONFIG_X86_MCE_AMD:
-        text += x86_show_irqstat(prec, "DFR", 'irq_deferred_error_count', "Deferred Error APIC interrupts")
-
-    if constants.LX_CONFIG_X86_MCE:
-        text += x86_show_mce(prec, "&mce_exception_count", "MCE", "Machine check exceptions")
-        text += x86_show_mce(prec, "&mce_poll_count", "MCP", "Machine check polls")
+    text = ""
+    for idx in range(int(info.type.sizeof / info_type.sizeof)):
+        if info[idx]['skip_vector']:
+            continue
+        pfx = info[idx]['symbol'].string()
+        desc = info[idx]['text'].string()
+        text += x86_show_irqstat(prec, pfx, idx, desc)
 
     text += show_irq_err_count(prec)
 
@@ -151,11 +125,6 @@ def x86_show_interupts(prec):
         if cnt is not None:
             text += "%*s: %10u\n" % (prec, "MIS", cnt['counter'])
 
-    if constants.LX_CONFIG_KVM:
-        text += x86_show_irqstat(prec, "PIN", 'kvm_posted_intr_ipis', 'Posted-interrupt notification event')
-        text += x86_show_irqstat(prec, "NPI", 'kvm_posted_intr_nested_ipis', 'Nested posted-interrupt event')
-        text += x86_show_irqstat(prec, "PIW", 'kvm_posted_intr_wakeup_ipis', 'Posted-interrupt wakeup event')
-
     return text
 
 def arm_common_show_interrupts(prec):
@@ -209,12 +178,19 @@ class LxInterruptList(gdb.Command):
         super(LxInterruptList, self).__init__("lx-interruptlist", gdb.COMMAND_DATA)
 
     def invoke(self, arg, from_tty):
-        nr_irqs = gdb.parse_and_eval("nr_irqs")
-        prec = 3
-        j = 1000
-        while prec < 10 and j <= nr_irqs:
-            prec += 1
-            j *= 10
+        nr_irqs = gdb.parse_and_eval("total_nr_irqs")
+        constr = utils.gdb_eval_or_none('irq_proc_constraints')
+
+        if constr:
+            prec = int(constr['num_prec'])
+            chip_width = int(constr['chip_width'])
+        else:
+            prec = 3
+            j = 1000
+            while prec < 10 and j <= nr_irqs:
+                prec += 1
+                j *= 10
+            chip_width = 8
 
         gdb.write("%*s" % (prec + 8, ""))
         for cpu in cpus.each_online_cpu():
@@ -225,7 +201,7 @@ class LxInterruptList(gdb.Command):
             raise gdb.GdbError("Unable to find the sparse IRQ tree, is CONFIG_SPARSE_IRQ enabled?")
 
         for irq in range(nr_irqs):
-            gdb.write(show_irq_desc(prec, irq))
+            gdb.write(show_irq_desc(prec, chip_width, irq))
         gdb.write(arch_show_interrupts(prec))
 
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2026-03-30 13:06 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-26 21:56 [patch V3 00/14] Improve /proc/interrupts further Thomas Gleixner
2026-03-26 21:56 ` [patch V3 01/14] x86/irq: Optimize interrupts decimals printing Thomas Gleixner
2026-03-26 22:44   ` David Laight
2026-03-26 21:56 ` [patch V3 02/14] genirq/proc: Avoid formatting zero counts in /proc/interrupts Thomas Gleixner
2026-03-26 21:56 ` [patch V3 03/14] genirq/proc: Utilize irq_desc::tot_count to avoid evaluation Thomas Gleixner
2026-03-26 21:56 ` [patch V3 04/14] x86/irq: Make irqstats array based Thomas Gleixner
2026-03-28 17:04   ` Radu Rendec
2026-03-26 21:56 ` [patch V3 05/14] x86/irq: Suppress unlikely interrupt stats by default Thomas Gleixner
2026-03-26 21:57 ` [patch V3 06/14] x86/irq: Move IOAPIC misrouted and PIC/APIC error counts into irq_stats Thomas Gleixner
2026-03-26 21:57 ` [patch V3 07/14] scripts/gdb: Update x86 interrupts to the array based storage Thomas Gleixner
2026-03-26 22:52   ` Florian Fainelli
2026-03-26 21:57 ` [patch V3 08/14] genirq: Expose nr_irqs in core code Thomas Gleixner
2026-03-26 21:57 ` [patch V3 09/14] genirq: Cache the condition for /proc/interrupts exposure Thomas Gleixner
2026-03-26 21:57 ` [patch V3 10/14] genirq: Calculate precision only when required Thomas Gleixner
2026-03-26 21:57 ` [patch V3 11/14] genirq: Add rcuref count to struct irq_desc Thomas Gleixner
2026-03-26 21:57 ` [patch V3 12/14] genirq: Expose irq_find_desc_at_or_after() in core code Thomas Gleixner
2026-03-26 21:57 ` [patch V3 13/14] genirq/proc: Runtime size the chip name Thomas Gleixner
2026-03-26 21:58 ` [patch V3 14/14] genirq/proc: Speed up /proc/interrupts iteration Thomas Gleixner
2026-03-30 13:06 ` [patch V3 00/14] Improve /proc/interrupts further Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.