From: Thomas Gleixner <tglx@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org, Michael Kelley <mhklinux@outlook.com>,
Dmitry Ilvokhin <d@ilvokhin.com>, Radu Rendec <radu@rendec.net>,
Jan Kiszka <jan.kiszka@siemens.com>,
Kieran Bingham <kbingham@kernel.org>,
Florian Fainelli <florian.fainelli@broadcom.com>
Subject: [patch V3 00/14] Improve /proc/interrupts further
Date: Thu, 26 Mar 2026 22:56:22 +0100
Message-ID: <20260326214345.019130211@kernel.org>

This is a follow-up to V2, which can be found here:

  https://lore.kernel.org/20260320131108.344376329@kernel.org

The V1 cover letter contains a full analysis, explanation and numbers:

  https://lore.kernel.org/20260303150539.513068586@kernel.org

TLDR:

  - The performance of reading /proc/interrupts has been improved
    piecewise over the years, but most of the low hanging fruit has been
    left on the table.

Changes vs. V2:

  - Addressed the valuable review comments from Michael, Radu and Dmitry.
    Thanks!

  - Addressed the 0-day fallout (missing #ifdef guards, typos)

  - More updates to the x86 irq stats:

     - Provide a mechanism for interrupts which should never happen:
       they are skipped by default and the skip condition is only
       removed once one occurs (Spurious and ICR read retry)

     - Use the same mechanism to handle the IOAPIC misrouted and the
       PIC/APIC error counts

  - Updated the out-of-sync GDB script (pointed out by Radu)

     - Use the new array based x86 stats

     - Ensure visually tabular output

  - Reworked the 'first line' mechanism by using proc_create_seq_private(),
    which also simplifies the precision and chip name width adjustments

  - Made the output format visually tabular in /proc/interrupts

  - Picked up tags where appropriate

  - Dropped the binary interface parts as they were RFC and just for
    demonstration. Let's see if anyone cares down the road.

  - Tagged the series so the irq/core branch can be updated without
    losing the submitted content.

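The "skip by default, enable on first occurrence" mechanism from the list above can be illustrated with a small userspace model. This is only a sketch in Python, not the kernel code: the names StatEntry, IrqStats, inc_and_enable() and show() are made up for illustration, and the per-CPU counters as well as the READ_ONCE()/WRITE_ONCE() pairing are reduced to plain Python lists and attributes.

```python
from dataclasses import dataclass, field


@dataclass
class StatEntry:
    """Hypothetical stand-in for one irq_stat_info slot."""
    symbol: str
    text: str
    skip: bool = False                       # models a non-zero skip_vector
    counts: list = field(default_factory=list)


class IrqStats:
    """Userspace model: suppressed entries stay hidden until they fire."""

    def __init__(self, nr_cpus):
        self.nr_cpus = nr_cpus
        self.entries = {}

    def add(self, key, symbol, text, suppressed=False):
        # 'suppressed=True' models an entry that is hidden by default
        self.entries[key] = StatEntry(symbol, text, suppressed,
                                      [0] * self.nr_cpus)

    def inc(self, key, cpu):
        # Plain increment for counters which are always shown
        self.entries[key].counts[cpu] += 1

    def inc_and_enable(self, key, cpu):
        # Increment and clear the skip condition, so the entry shows up
        # in the output from now on
        entry = self.entries[key]
        entry.counts[cpu] += 1
        entry.skip = False

    def show(self, prec=3):
        lines = []
        for entry in self.entries.values():
            if entry.skip:
                continue
            counts = " ".join("%10d" % c for c in entry.counts)
            lines.append("%*s: %s  %s" % (prec, entry.symbol, counts,
                                          entry.text))
        return lines
```

In this model an entry added with suppressed=True does not appear in show() until inc_and_enable() has been called for it at least once. After that it stays visible, even on CPUs where the count is still zero.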
Delta patch against v2 (w/o the binary RFC part) is below.
The series applies on top of v7.0-rc3 and is also available via git:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git irq-proc-v3
Thanks,
tglx
---
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index d68f587f0b7d..305774b67995 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -1032,7 +1032,7 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
* Unmasking the LVTPC is not required as the Mask (M) bit of the LVT
* PMI entry is not set by the local APIC when a PMC overflow occurs
*/
- inc_irq_stat(APIC_PERF);
+ inc_perf_irq_stat();
done:
cpuc->enabled = pmu_enabled;
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index b22259b54685..0e36f5580e8d 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1403,7 +1403,7 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
handled += perf_ibs_handle_irq(&perf_ibs_op, regs);
if (handled)
- inc_irq_stat(APIC_PERF);
+ inc_perf_irq_stat();
perf_sample_event_took(sched_clock() - stamp);
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index ce9f9d4cd5fd..e1e9aaa4f11a 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1747,7 +1747,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
}
if (handled)
- inc_irq_stat(APIC_PERF);
+ inc_perf_irq_stat();
return handled;
}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index e5c85c5c8f87..297916f29c09 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3504,7 +3504,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
int bit;
int handled = 0;
- inc_irq_stat(APIC_PERF);
+ inc_perf_irq_stat();
/*
* Ignore a range of extra bits in status that do not indicate
diff --git a/arch/x86/events/intel/knc.c b/arch/x86/events/intel/knc.c
index 537838404524..e887adc108ac 100644
--- a/arch/x86/events/intel/knc.c
+++ b/arch/x86/events/intel/knc.c
@@ -238,7 +238,7 @@ static int knc_pmu_handle_irq(struct pt_regs *regs)
goto done;
}
- inc_irq_stat(APIC_PERF);
+ inc_perf_irq_stat();
for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
struct perf_event *event = cpuc->events[bit];
diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
index 1f7621938494..12bf293d42a5 100644
--- a/arch/x86/events/intel/p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -1077,7 +1077,7 @@ static int p4_pmu_handle_irq(struct pt_regs *regs)
}
if (handled)
- inc_irq_stat(APIC_PERF);
+ inc_perf_irq_stat();
/*
* When dealing with the unmasking of the LVTPC on P4 perf hw, it has
diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
index f1a5d0347b08..4bc177badac2 100644
--- a/arch/x86/events/zhaoxin/core.c
+++ b/arch/x86/events/zhaoxin/core.c
@@ -373,7 +373,7 @@ static int zhaoxin_pmu_handle_irq(struct pt_regs *regs)
else
zhaoxin_pmu_ack_status(status);
- inc_irq_stat(APIC_PERF);
+ inc_perf_irq_stat();
/*
* CondChgd bit 63 doesn't mean any overflow status. Ignore
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index dcc96edb4f82..dea60d66d976 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -4,7 +4,7 @@
#include <linux/threads.h>
-enum {
+enum irq_stat_counts {
IRQ_COUNT_NMI,
#ifdef CONFIG_X86_LOCAL_APIC
IRQ_COUNT_APIC_TIMER,
@@ -49,6 +49,10 @@ enum {
#endif
#ifdef CONFIG_X86_POSTED_MSI
IRQ_COUNT_POSTED_MSI_NOTIFICATION,
+#endif
+ IRQ_COUNT_PIC_APIC_ERROR,
+#ifdef CONFIG_X86_IO_APIC
+ IRQ_COUNT_IOAPIC_MISROUTED,
#endif
IRQ_COUNT_MAX,
};
@@ -68,14 +72,20 @@ DECLARE_PER_CPU_ALIGNED(struct pi_desc, posted_msi_pi_desc);
#define __ARCH_IRQ_STAT
#define inc_irq_stat(index) this_cpu_inc(irq_stat.counts[IRQ_COUNT_##index])
+void irq_stat_inc_and_enable(enum irq_stat_counts which);
+
+#ifdef CONFIG_X86_LOCAL_APIC
+#define inc_perf_irq_stat() inc_irq_stat(APIC_PERF)
+#else
+#define inc_perf_irq_stat() do { } while (0)
+#endif
extern void ack_bad_irq(unsigned int irq);
+#ifdef CONFIG_PROC_FS
extern u64 arch_irq_stat_cpu(unsigned int cpu);
#define arch_irq_stat_cpu arch_irq_stat_cpu
-
-extern u64 arch_irq_stat(void);
-#define arch_irq_stat arch_irq_stat
+#endif
DECLARE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
#define local_softirq_pending_ref __softirq_pending
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index cbe19e669080..47727d0b540b 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -110,10 +110,6 @@ static inline void lock_vector_lock(void) {}
static inline void unlock_vector_lock(void) {}
#endif
-/* Statistics */
-extern atomic_t irq_err_count;
-extern atomic_t irq_mis_count;
-
extern void elcr_set_level_irq(unsigned int irq);
extern char irq_entries_start[];
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index e2de95b0862a..e4b19a288ee7 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2108,7 +2108,7 @@ static noinline void handle_spurious_interrupt(u8 vector)
trace_spurious_apic_entry(vector);
- inc_irq_stat(SPURIOUS);
+ irq_stat_inc_and_enable(IRQ_COUNT_SPURIOUS);
/*
* If this is a spurious interrupt then do not acknowledge
@@ -2180,7 +2180,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_error_interrupt)
apic_write(APIC_ESR, 0);
v = apic_read(APIC_ESR);
apic_eoi();
- atomic_inc(&irq_err_count);
+ irq_stat_inc_and_enable(IRQ_COUNT_PIC_APIC_ERROR);
apic_pr_debug("APIC error on CPU%d: %02x", smp_processor_id(), v);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 352ed5558cbc..7d7175d01228 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1575,8 +1575,6 @@ static unsigned int startup_ioapic_irq(struct irq_data *data)
return was_pending;
}
-atomic_t irq_mis_count;
-
#ifdef CONFIG_GENERIC_PENDING_IRQ
static bool io_apic_level_ack_pending(struct mp_chip_data *data)
{
@@ -1713,7 +1711,7 @@ static void ioapic_ack_level(struct irq_data *irq_data)
* at the cpu.
*/
if (!(v & (1 << (i & 0x1f)))) {
- atomic_inc(&irq_mis_count);
+ irq_stat_inc_and_enable(IRQ_COUNT_IOAPIC_MISROUTED);
eoi_ioapic_pin(cfg->vector, irq_data->chip_data);
}
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 3635c4d7b7f5..c627bee3b14f 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -120,7 +120,7 @@ u32 apic_mem_wait_icr_idle_timeout(void)
for (cnt = 0; cnt < 1000; cnt++) {
if (!(apic_read(APIC_ICR) & APIC_ICR_BUSY))
return 0;
- inc_irq_stat(ICR_READ_RETRY);
+ irq_stat_inc_and_enable(IRQ_COUNT_ICR_READ_RETRY);
udelay(100);
}
return APIC_ICR_BUSY;
diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
index f67063df6723..f7a86b94a0dd 100644
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -214,7 +214,7 @@ static void mask_and_ack_8259A(struct irq_data *data)
"spurious 8259A interrupt: IRQ%d.\n", irq);
spurious_irq_mask |= irqmask;
}
- atomic_inc(&irq_err_count);
+ irq_stat_inc_and_enable(IRQ_COUNT_PIC_APIC_ERROR);
/*
* Theoretically we do not have to handle this IRQ,
* but in Linux this does not cause problems and is
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 2bd8c08f8d91..0b3723cec0b9 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -39,8 +39,6 @@ EXPORT_PER_CPU_SYMBOL(__softirq_pending);
DEFINE_PER_CPU_CACHE_HOT(struct irq_stack *, hardirq_stack_ptr);
-atomic_t irq_err_count;
-
/*
* 'what should we do if we get a hw irq event on an illegal vector'.
* each architecture has to answer this themselves.
@@ -68,56 +66,65 @@ struct irq_stat_info {
const char *text;
};
+#define DEFAULT_SUPPRESSED_VECTOR UINT_MAX
+
#define ISS(idx, sym, txt) [IRQ_COUNT_##idx] = { .symbol = sym, .text = txt }
#define ITS(idx, sym, txt) [IRQ_COUNT_##idx] = \
{ .skip_vector = idx## _VECTOR, .symbol = sym, .text = txt }
+#define IDS(idx, sym, txt) [IRQ_COUNT_##idx] = \
+ { .skip_vector = DEFAULT_SUPPRESSED_VECTOR, .symbol = sym, .text = txt }
+
static struct irq_stat_info irq_stat_info[IRQ_COUNT_MAX] __ro_after_init = {
- ISS(NMI, "NMI", " Non-maskable interrupts\n"),
+ ISS(NMI, "NMI", " Non-maskable interrupts\n"),
#ifdef CONFIG_X86_LOCAL_APIC
- ISS(APIC_TIMER, "LOC", " Local timer interrupts\n"),
- ISS(SPURIOUS, "SPU", " Spurious interrupts\n"),
- ISS(APIC_PERF, "PMI", " Performance monitoring interrupts\n"),
- ISS(IRQ_WORK, "IWI", " IRQ work interrupts\n"),
- ISS(ICR_READ_RETRY, "RTR", " APIC ICR read retries\n"),
- ISS(X86_PLATFORM_IPI, "PLT", " Platform interrupts\n"),
+ ISS(APIC_TIMER, "LOC", " Local timer interrupts\n"),
+ IDS(SPURIOUS, "SPU", " Spurious interrupts\n"),
+ ISS(APIC_PERF, "PMI", " Performance monitoring interrupts\n"),
+ ISS(IRQ_WORK, "IWI", " IRQ work interrupts\n"),
+ IDS(ICR_READ_RETRY, "RTR", " APIC ICR read retries\n"),
+ ISS(X86_PLATFORM_IPI, "PLT", " Platform interrupts\n"),
#endif
#ifdef CONFIG_SMP
- ISS(RESCHEDULE, "RES", " Rescheduling interrupts\n"),
- ISS(CALL_FUNCTION, "CAL", " Function call interrupts\n"),
+ ISS(RESCHEDULE, "RES", " Rescheduling interrupts\n"),
+ ISS(CALL_FUNCTION, "CAL", " Function call interrupts\n"),
#endif
- ISS(TLB, "TLB", " TLB shootdowns\n"),
+ ISS(TLB, "TLB", " TLB shootdowns\n"),
#ifdef CONFIG_X86_THERMAL_VECTOR
- ISS(THERMAL_APIC, "TRM", " Thermal event interrupt\n"),
+ ISS(THERMAL_APIC, "TRM", " Thermal event interrupt\n"),
#endif
#ifdef CONFIG_X86_MCE_THRESHOLD
- ISS(THRESHOLD_APIC, "THR", " Threshold APIC interrupts\n"),
+ ISS(THRESHOLD_APIC, "THR", " Threshold APIC interrupts\n"),
#endif
#ifdef CONFIG_X86_MCE_AMD
- ISS(DEFERRED_ERROR, "DFR", " Deferred Error APIC interrupts\n"),
+ ISS(DEFERRED_ERROR, "DFR", " Deferred Error APIC interrupts\n"),
#endif
#ifdef CONFIG_X86_MCE
- ISS(MCE_EXCEPTION, "MCE", " Machine check exceptions\n"),
- ISS(MCE_POLL, "MCP", " Machine check polls\n"),
+ ISS(MCE_EXCEPTION, "MCE", " Machine check exceptions\n"),
+ ISS(MCE_POLL, "MCP", " Machine check polls\n"),
#endif
#ifdef CONFIG_X86_HV_CALLBACK_VECTOR
- ITS(HYPERVISOR_CALLBACK, "HYP", " Hypervisor callback interrupts\n"),
+ ITS(HYPERVISOR_CALLBACK, "HYP", " Hypervisor callback interrupts\n"),
#endif
#if IS_ENABLED(CONFIG_HYPERV)
- ITS(HYPERV_REENLIGHTENMENT, "HRE", " Hyper-V reenlightment interrupts\n"),
- ITS(HYPERV_STIMER0, "HVS", " Hyper-V stimer0 interrupts\n"),
+ ITS(HYPERV_REENLIGHTENMENT, "HRE", " Hyper-V reenlightenment interrupts\n"),
+ ITS(HYPERV_STIMER0, "HVS", " Hyper-V stimer0 interrupts\n"),
#endif
#if IS_ENABLED(CONFIG_KVM)
- ITS(POSTED_INTR, "PIN", " Posted-interrupt notification event\n"),
- ITS(POSTED_INTR_NESTED, "NPI", " Nested posted-interrupt event\n"),
- ITS(POSTED_INTR_WAKEUP, "PIW", " Posted-interrupt wakeup event\n"),
+ ITS(POSTED_INTR, "PIN", " Posted-interrupt notification event\n"),
+ ITS(POSTED_INTR_NESTED, "NPI", " Nested posted-interrupt event\n"),
+ ITS(POSTED_INTR_WAKEUP, "PIW", " Posted-interrupt wakeup event\n"),
#endif
#ifdef CONFIG_GUEST_PERF_EVENTS
ISS(PERF_GUEST_MEDIATED_PMI, "VPMI", " Perf Guest Mediated PMI\n"),
#endif
#ifdef CONFIG_X86_POSTED_MSI
- ISS(POSTED_MSI_NOTIFICATION, "PMN", " Posted MSI notification event\n"),
+ ISS(POSTED_MSI_NOTIFICATION, "PMN", " Posted MSI notification event\n"),
+#endif
+ IDS(PIC_APIC_ERROR, "ERR", " PIC/APIC error interrupts\n"),
+#ifdef CONFIG_X86_IO_APIC
+ IDS(IOAPIC_MISROUTED, "MIS", " Misrouted IO/APIC interrupts\n"),
#endif
};
@@ -126,19 +133,34 @@ void __init irq_init_stats(void)
struct irq_stat_info *info = irq_stat_info;
for (unsigned int i = 0; i < ARRAY_SIZE(irq_stat_info); i++, info++) {
- if (info->skip_vector && test_bit(info->skip_vector, system_vectors))
+ if (info->skip_vector && info->skip_vector != DEFAULT_SUPPRESSED_VECTOR &&
+ test_bit(info->skip_vector, system_vectors))
info->skip_vector = 0;
}
+#ifdef CONFIG_X86_LOCAL_APIC
if (!x86_platform_ipi_callback)
irq_stat_info[IRQ_COUNT_X86_PLATFORM_IPI].skip_vector = 1;
+#endif
#ifdef CONFIG_X86_POSTED_MSI
if (!posted_msi_enabled())
- irq_stat_info[IRQ_COUNT_X86_POSTED_MSI].skip_vector = 1;
+ irq_stat_info[IRQ_COUNT_POSTED_MSI_NOTIFICATION].skip_vector = 1;
#endif
}
+/*
+ * Used for default enabled counters to increment the stats and to enable the
+ * entry for /proc/interrupts output.
+ */
+void irq_stat_inc_and_enable(enum irq_stat_counts which)
+{
+ this_cpu_inc(irq_stat.counts[which]);
+ /* Pairs with the READ_ONCE() in arch_show_interrupts() */
+ WRITE_ONCE(irq_stat_info[which].skip_vector, 0);
+}
+
+#ifdef CONFIG_PROC_FS
/*
* /proc/interrupts printing for arch specific interrupts
*/
@@ -147,17 +169,13 @@ int arch_show_interrupts(struct seq_file *p, int prec)
const struct irq_stat_info *info = irq_stat_info;
for (unsigned int i = 0; i < ARRAY_SIZE(irq_stat_info); i++, info++) {
- if (info->skip_vector)
+ if (READ_ONCE(info->skip_vector))
continue;
seq_printf(p, "%*s:", prec, info->symbol);
irq_proc_emit_counts(p, &irq_stat.counts[i]);
seq_puts(p, info->text);
}
-
- seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count));
- if (IS_ENABLED(CONFIG_X86_IO_APIC))
- seq_printf(p, "%*s: %10u\n", prec, "MIS", atomic_read(&irq_mis_count));
return 0;
}
@@ -173,12 +191,7 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
sum += p->counts[i];
return sum;
}
-
-u64 arch_irq_stat(void)
-{
- u64 sum = atomic_read(&irq_err_count);
- return sum;
-}
+#endif /* CONFIG_PROC_FS */
static __always_inline void handle_irq(struct irq_desc *desc,
struct pt_regs *regs)
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 8b444e862319..20c3df9a9b80 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -18,9 +18,6 @@
#ifndef arch_irq_stat_cpu
#define arch_irq_stat_cpu(cpu) 0
#endif
-#ifndef arch_irq_stat
-#define arch_irq_stat() 0
-#endif
u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
{
@@ -122,7 +119,6 @@ static int show_stat(struct seq_file *p, void *v)
sum_softirq += softirq_stat;
}
}
- sum += arch_irq_stat();
seq_put_decimal_ull(p, "cpu ", nsec_to_clock_t(user));
seq_put_decimal_ull(p, " ", nsec_to_clock_t(nice));
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6389a462c731..2809f0fc4175 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -46,9 +46,11 @@ int irq_set_chip(unsigned int irq, const struct irq_chip *chip)
scoped_irqdesc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
ret = 0;
}
- /* For !CONFIG_SPARSE_IRQ make the irq show up in allocated_irqs. */
- if (!ret)
+ if (!ret) {
+ /* For !CONFIG_SPARSE_IRQ make the irq show up in allocated_irqs. */
irq_mark_irq(irq);
+ irq_proc_update_chip(chip);
+ }
return ret;
}
EXPORT_SYMBOL(irq_set_chip);
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 37eec0337867..7fbf003c6e93 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -12,6 +12,8 @@
#include <linux/rcuref.h>
#include <linux/sched/clock.h>
+#include "proc.h"
+
#ifdef CONFIG_SPARSE_IRQ
# define MAX_SPARSE_IRQS INT_MAX
#else
@@ -149,12 +151,6 @@ static inline void unregister_handler_proc(unsigned int irq,
static inline void irq_proc_update_valid(struct irq_desc *desc) { }
#endif
-#if defined(CONFIG_PROC_FS) && defined(CONFIG_GENERIC_IRQ_SHOW)
-void irq_proc_calc_prec(void);
-#else
-static inline void irq_proc_calc_prec(void) { }
-#endif
-
struct irq_desc *irq_find_desc_at_or_after(unsigned int offset);
extern bool irq_can_set_affinity_usr(unsigned int irq);
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 9b9a75dfeebd..80ef4e27dcf4 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -185,6 +185,7 @@ struct irq_desc *irq_find_desc_at_or_after(unsigned int offset)
{
unsigned long index = offset;
+ lockdep_assert_in_rcu_read_lock();
return mt_find(&sparse_irqs, &index, total_nr_irqs);
}
@@ -930,8 +931,10 @@ EXPORT_SYMBOL_GPL(__irq_alloc_descs);
*/
unsigned int irq_get_next_irq(unsigned int offset)
{
- struct irq_desc *desc = irq_find_desc_at_or_after(offset);
+ struct irq_desc *desc;
+ guard(rcu)();
+ desc = irq_find_desc_at_or_after(offset);
return desc ? irq_desc_get_irq(desc) : total_nr_irqs;
}
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index cc93abf009e8..9f524ed709b8 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -20,6 +20,8 @@
#include <linux/smp.h>
#include <linux/fs.h>
+#include "proc.h"
+
static LIST_HEAD(irq_domain_list);
static DEFINE_MUTEX(irq_domain_mutex);
@@ -1532,6 +1534,7 @@ int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
irq_data->chip = (struct irq_chip *)(chip ? chip : &no_irq_chip);
irq_data->chip_data = chip_data;
+ irq_proc_update_chip(chip);
return 0;
}
EXPORT_SYMBOL_GPL(irq_domain_set_hwirq_and_chip);
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index f6cdb262d1e7..a62d4694f063 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -443,8 +443,7 @@ void irq_proc_update_valid(struct irq_desc *desc)
{
u32 set = _IRQ_PROC_VALID;
- if (irq_settings_is_hidden(desc) || !desc->action ||
- irq_desc_is_chained(desc) || !desc->kstat_irqs)
+ if (irq_settings_is_hidden(desc) || irq_desc_is_chained(desc) || !desc->action)
set = 0;
irq_settings_update_proc_valid(desc, set);
@@ -459,7 +458,16 @@ int __weak arch_show_interrupts(struct seq_file *p, int prec)
return 0;
}
-static int irq_num_prec __read_mostly = 3;
+static DEFINE_RAW_SPINLOCK(irq_proc_constraints_lock);
+
+static struct irq_proc_constraints {
+ bool print_header;
+ unsigned int num_prec;
+ unsigned int chip_width;
+} irq_proc_constraints __read_mostly = {
+ .num_prec = 3,
+ .chip_width = 8,
+};
#ifndef ACTUAL_NR_IRQS
# define ACTUAL_NR_IRQS total_nr_irqs
@@ -471,7 +479,23 @@ void irq_proc_calc_prec(void)
for (prec = 3, n = 1000; prec < 10 && n <= total_nr_irqs; ++prec)
n *= 10;
- WRITE_ONCE(irq_num_prec, prec);
+
+ guard(raw_spinlock_irqsave)(&irq_proc_constraints_lock);
+ if (prec > irq_proc_constraints.num_prec)
+ WRITE_ONCE(irq_proc_constraints.num_prec, prec);
+}
+
+void irq_proc_update_chip(const struct irq_chip *chip)
+{
+ unsigned int len = chip && chip->name ? strlen(chip->name) : 0;
+
+ if (!len || len <= READ_ONCE(irq_proc_constraints.chip_width))
+ return;
+
+ /* Can be invoked from interrupt disabled contexts */
+ guard(raw_spinlock_irqsave)(&irq_proc_constraints_lock);
+ if (len > irq_proc_constraints.chip_width)
+ WRITE_ONCE(irq_proc_constraints.chip_width, len);
}
#define ZSTR1 " 0"
@@ -512,26 +536,25 @@ void irq_proc_emit_counts(struct seq_file *p, unsigned int __percpu *cnts)
static int irq_seq_show(struct seq_file *p, void *v)
{
- int prec = (int)(unsigned long)p->private;
+ struct irq_proc_constraints *constr = p->private;
struct irq_desc *desc = v;
struct irqaction *action;
if (desc == ARCH_PROC_IRQDESC)
- return arch_show_interrupts(p, prec);
+ return arch_show_interrupts(p, constr->num_prec);
/* print header for the first interrupt indicated by !p>private */
- if (!prec) {
+ if (constr->print_header) {
unsigned int cpu;
- prec = READ_ONCE(irq_num_prec);
- seq_printf(p, "%*s", prec + 8, "");
+ seq_printf(p, "%*s", constr->num_prec + 8, "");
for_each_online_cpu(cpu)
seq_printf(p, "CPU%-8d", cpu);
seq_putc(p, '\n');
- p->private = (void *)(unsigned long)prec;
+ constr->print_header = false;
}
- seq_put_decimal_ull_width(p, "", irq_desc_get_irq(desc), prec);
+ seq_put_decimal_ull_width(p, "", irq_desc_get_irq(desc), constr->num_prec);
seq_putc(p, ':');
/*
@@ -543,25 +566,27 @@ static int irq_seq_show(struct seq_file *p, void *v)
irq_proc_emit_counts(p, &desc->kstat_irqs->cnt);
else
irq_proc_emit_zero_counts(p, num_online_cpus());
- seq_putc(p, ' ');
+
+ /* Enforce a visual gap */
+ seq_write(p, " ", 2);
guard(raw_spinlock_irq)(&desc->lock);
if (desc->irq_data.chip) {
if (desc->irq_data.chip->irq_print_chip)
desc->irq_data.chip->irq_print_chip(&desc->irq_data, p);
else if (desc->irq_data.chip->name)
- seq_printf(p, "%8s", desc->irq_data.chip->name);
+ seq_printf(p, "%-*s", constr->chip_width, desc->irq_data.chip->name);
else
- seq_printf(p, "%8s", "-");
+ seq_printf(p, "%-*s", constr->chip_width, "-");
} else {
- seq_printf(p, "%8s", "None");
+ seq_printf(p, "%-*s", constr->chip_width, "None");
}
seq_putc(p, ' ');
if (desc->irq_data.domain)
- seq_put_decimal_ull_width(p, "", desc->irq_data.hwirq, prec);
+ seq_put_decimal_ull_width(p, "", desc->irq_data.hwirq, constr->num_prec);
else
- seq_printf(p, " %*s", prec, "");
+ seq_printf(p, " %*s", constr->num_prec, "");
if (IS_ENABLED(CONFIG_GENERIC_IRQ_SHOW_LEVEL))
seq_printf(p, " %-8s", irqd_is_level_type(&desc->irq_data) ? "Level" : "Edge");
@@ -582,14 +607,13 @@ static int irq_seq_show(struct seq_file *p, void *v)
static void *irq_seq_next_desc(loff_t *pos)
{
- struct irq_desc *desc;
-
if (*pos > total_nr_irqs)
return NULL;
guard(rcu)();
for (;;) {
- desc = irq_find_desc_at_or_after((unsigned int) *pos);
+ struct irq_desc *desc = irq_find_desc_at_or_after((unsigned int) *pos);
+
if (desc) {
*pos = irq_desc_get_irq(desc);
/*
@@ -609,8 +633,13 @@ static void *irq_seq_next_desc(loff_t *pos)
static void *irq_seq_start(struct seq_file *f, loff_t *pos)
{
- if (!*pos)
- f->private = NULL;
+ if (!*pos) {
+ struct irq_proc_constraints *constr = f->private;
+
+ constr->num_prec = READ_ONCE(irq_proc_constraints.num_prec);
+ constr->chip_width = READ_ONCE(irq_proc_constraints.chip_width);
+ constr->print_header = true;
+ }
return irq_seq_next_desc(pos);
}
@@ -638,7 +667,8 @@ static const struct seq_operations irq_seq_ops = {
static int __init irq_proc_init(void)
{
- proc_create_seq("interrupts", 0, NULL, &irq_seq_ops);
+ proc_create_seq_private("interrupts", 0, NULL, &irq_seq_ops,
+ sizeof(irq_proc_constraints), NULL);
return 0;
}
fs_initcall(irq_proc_init);
diff --git a/kernel/irq/proc.h b/kernel/irq/proc.h
new file mode 100644
index 000000000000..ec9173d573f9
--- /dev/null
+++ b/kernel/irq/proc.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#if defined(CONFIG_PROC_FS) && defined(CONFIG_GENERIC_IRQ_SHOW)
+void irq_proc_calc_prec(void);
+void irq_proc_update_chip(const struct irq_chip *chip);
+#else
+static inline void irq_proc_calc_prec(void) { }
+static inline void irq_proc_update_chip(const struct irq_chip *chip) { }
+#endif
diff --git a/scripts/gdb/linux/interrupts.py b/scripts/gdb/linux/interrupts.py
index f4f715a8f0e3..6ca7e32f35b0 100644
--- a/scripts/gdb/linux/interrupts.py
+++ b/scripts/gdb/linux/interrupts.py
@@ -20,7 +20,7 @@ def irq_desc_is_chained(desc):
def irqd_is_level(desc):
return desc['irq_data']['common']['state_use_accessors'] & constants.LX_IRQD_LEVEL
-def show_irq_desc(prec, irq):
+def show_irq_desc(prec, chip_width, irq):
text = ""
desc = mapletree.mtree_load(gdb.parse_and_eval("&sparse_irqs"), irq)
@@ -48,7 +48,7 @@ def show_irq_desc(prec, irq):
count = cpus.per_cpu(desc['kstat_irqs'], cpu)['cnt']
else:
count = 0
- text += "%10u" % (count)
+ text += "%10u " % (count)
name = "None"
if desc['irq_data']['chip']:
@@ -58,7 +58,7 @@ def show_irq_desc(prec, irq):
else:
name = "-"
- text += " %8s" % (name)
+ text += " %-*s" % (chip_width, name)
if desc['irq_data']['domain']:
text += " %*lu" % (prec, desc['irq_data']['hwirq'])
@@ -97,52 +97,26 @@ def show_irq_err_count(prec):
text += "%*s: %10u\n" % (prec, "ERR", cnt['counter'])
return text
-def x86_show_irqstat(prec, pfx, field, desc):
- irq_stat = gdb.parse_and_eval("&irq_stat")
+def x86_show_irqstat(prec, pfx, idx, desc):
+ irq_stat = gdb.parse_and_eval("&irq_stat.counts[%d]" %idx)
text = "%*s: " % (prec, pfx)
for cpu in cpus.each_online_cpu():
stat = cpus.per_cpu(irq_stat, cpu)
- text += "%10u " % (stat[field])
- text += " %s\n" % (desc)
- return text
-
-def x86_show_mce(prec, var, pfx, desc):
- pvar = gdb.parse_and_eval(var)
- text = "%*s: " % (prec, pfx)
- for cpu in cpus.each_online_cpu():
- text += "%10u " % (cpus.per_cpu(pvar, cpu).dereference())
- text += " %s\n" % (desc)
+ text += "%10u " % (stat.dereference())
+ text += desc
return text
def x86_show_interupts(prec):
- text = x86_show_irqstat(prec, "NMI", '__nmi_count', 'Non-maskable interrupts')
-
- if constants.LX_CONFIG_X86_LOCAL_APIC:
- text += x86_show_irqstat(prec, "LOC", 'apic_timer_irqs', "Local timer interrupts")
- text += x86_show_irqstat(prec, "SPU", 'irq_spurious_count', "Spurious interrupts")
- text += x86_show_irqstat(prec, "PMI", 'apic_perf_irqs', "Performance monitoring interrupts")
- text += x86_show_irqstat(prec, "IWI", 'apic_irq_work_irqs', "IRQ work interrupts")
- text += x86_show_irqstat(prec, "RTR", 'icr_read_retry_count', "APIC ICR read retries")
- if utils.gdb_eval_or_none("x86_platform_ipi_callback") is not None:
- text += x86_show_irqstat(prec, "PLT", 'x86_platform_ipis', "Platform interrupts")
-
- if constants.LX_CONFIG_SMP:
- text += x86_show_irqstat(prec, "RES", 'irq_resched_count', "Rescheduling interrupts")
- text += x86_show_irqstat(prec, "CAL", 'irq_call_count', "Function call interrupts")
- text += x86_show_irqstat(prec, "TLB", 'irq_tlb_count', "TLB shootdowns")
-
- if constants.LX_CONFIG_X86_THERMAL_VECTOR:
- text += x86_show_irqstat(prec, "TRM", 'irq_thermal_count', "Thermal events interrupts")
+ info_type = gdb.lookup_type('struct irq_stat_info')
+ info = gdb.parse_and_eval('irq_stat_info')
- if constants.LX_CONFIG_X86_MCE_THRESHOLD:
- text += x86_show_irqstat(prec, "THR", 'irq_threshold_count', "Threshold APIC interrupts")
-
- if constants.LX_CONFIG_X86_MCE_AMD:
- text += x86_show_irqstat(prec, "DFR", 'irq_deferred_error_count', "Deferred Error APIC interrupts")
-
- if constants.LX_CONFIG_X86_MCE:
- text += x86_show_mce(prec, "&mce_exception_count", "MCE", "Machine check exceptions")
- text += x86_show_mce(prec, "&mce_poll_count", "MCP", "Machine check polls")
+ text = ""
+ for idx in range(int(info.type.sizeof / info_type.sizeof)):
+ if info[idx]['skip_vector']:
+ continue
+ pfx = info[idx]['symbol'].string()
+ desc = info[idx]['text'].string()
+ text += x86_show_irqstat(prec, pfx, idx, desc)
text += show_irq_err_count(prec)
@@ -151,11 +125,6 @@ def x86_show_interupts(prec):
if cnt is not None:
text += "%*s: %10u\n" % (prec, "MIS", cnt['counter'])
- if constants.LX_CONFIG_KVM:
- text += x86_show_irqstat(prec, "PIN", 'kvm_posted_intr_ipis', 'Posted-interrupt notification event')
- text += x86_show_irqstat(prec, "NPI", 'kvm_posted_intr_nested_ipis', 'Nested posted-interrupt event')
- text += x86_show_irqstat(prec, "PIW", 'kvm_posted_intr_wakeup_ipis', 'Posted-interrupt wakeup event')
-
return text
def arm_common_show_interrupts(prec):
@@ -209,12 +178,19 @@ class LxInterruptList(gdb.Command):
super(LxInterruptList, self).__init__("lx-interruptlist", gdb.COMMAND_DATA)
def invoke(self, arg, from_tty):
- nr_irqs = gdb.parse_and_eval("nr_irqs")
- prec = 3
- j = 1000
- while prec < 10 and j <= nr_irqs:
- prec += 1
- j *= 10
+ nr_irqs = gdb.parse_and_eval("total_nr_irqs")
+ constr = utils.gdb_eval_or_none('irq_proc_constraints')
+
+ if constr:
+ prec = int(constr['num_prec'])
+ chip_width = int(constr['chip_width'])
+ else:
+ prec = 3
+ j = 1000
+ while prec < 10 and j <= nr_irqs:
+ prec += 1
+ j *= 10
+ chip_width = 8
gdb.write("%*s" % (prec + 8, ""))
for cpu in cpus.each_online_cpu():
@@ -225,7 +201,7 @@ class LxInterruptList(gdb.Command):
raise gdb.GdbError("Unable to find the sparse IRQ tree, is CONFIG_SPARSE_IRQ enabled?")
for irq in range(nr_irqs):
- gdb.write(show_irq_desc(prec, irq))
+ gdb.write(show_irq_desc(prec, chip_width, irq))
gdb.write(arch_show_interrupts(prec))
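
The precision and chip name width handling in the kernel/irq/proc.c hunks above can be modeled in userspace as follows. This is a hedged sketch, not the kernel implementation: ProcConstraints, update_prec(), update_chip() and format_chip() are illustrative names, and a threading.Lock stands in for the raw spinlock. The precision loop mirrors the one in irq_proc_calc_prec(), and both widths only ever grow.

```python
import threading


def calc_prec(total_nr_irqs):
    """Digits for the IRQ number column: at least 3, at most 10."""
    prec, n = 3, 1000
    while prec < 10 and n <= total_nr_irqs:
        prec += 1
        n *= 10
    return prec


class ProcConstraints:
    """Grow-only column widths, modeled after struct irq_proc_constraints."""

    def __init__(self):
        self._lock = threading.Lock()
        self.num_prec = 3
        self.chip_width = 8

    def update_prec(self, total_nr_irqs):
        prec = calc_prec(total_nr_irqs)
        with self._lock:
            if prec > self.num_prec:
                self.num_prec = prec

    def update_chip(self, name):
        length = len(name) if name else 0
        # Cheap unlocked check first; most names fit the current width
        if not length or length <= self.chip_width:
            return
        # Re-check under the lock, as a concurrent update may have won
        with self._lock:
            if length > self.chip_width:
                self.chip_width = length


def format_chip(name, chip_width):
    """Left-aligned chip name, padded to the current column width."""
    return "%-*s" % (chip_width, name)
```

A reader would take a snapshot of num_prec and chip_width once at the start of a read, as irq_seq_start() does, so that all lines of one output pass use the same widths even if a longer chip name is registered concurrently.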
Thread overview: 18+ messages
2026-03-26 21:56 Thomas Gleixner [this message]
2026-03-26 21:56 ` [patch V3 01/14] x86/irq: Optimize interrupts decimals printing Thomas Gleixner
2026-03-26 22:44 ` David Laight
2026-03-26 21:56 ` [patch V3 02/14] genirq/proc: Avoid formatting zero counts in /proc/interrupts Thomas Gleixner
2026-03-26 21:56 ` [patch V3 03/14] genirq/proc: Utilize irq_desc::tot_count to avoid evaluation Thomas Gleixner
2026-03-26 21:56 ` [patch V3 04/14] x86/irq: Make irqstats array based Thomas Gleixner
2026-03-28 17:04 ` Radu Rendec
2026-03-26 21:56 ` [patch V3 05/14] x86/irq: Suppress unlikely interrupt stats by default Thomas Gleixner
2026-03-26 21:57 ` [patch V3 06/14] x86/irq: Move IOAPIC misrouted and PIC/APIC error counts into irq_stats Thomas Gleixner
2026-03-26 21:57 ` [patch V3 07/14] scripts/gdb: Update x86 interrupts to the array based storage Thomas Gleixner
2026-03-26 22:52 ` Florian Fainelli
2026-03-26 21:57 ` [patch V3 08/14] genirq: Expose nr_irqs in core code Thomas Gleixner
2026-03-26 21:57 ` [patch V3 09/14] genirq: Cache the condition for /proc/interrupts exposure Thomas Gleixner
2026-03-26 21:57 ` [patch V3 10/14] genirq: Calculate precision only when required Thomas Gleixner
2026-03-26 21:57 ` [patch V3 11/14] genirq: Add rcuref count to struct irq_desc Thomas Gleixner
2026-03-26 21:57 ` [patch V3 12/14] genirq: Expose irq_find_desc_at_or_after() in core code Thomas Gleixner
2026-03-26 21:57 ` [patch V3 13/14] genirq/proc: Runtime size the chip name Thomas Gleixner
2026-03-26 21:58 ` [patch V3 14/14] genirq/proc: Speed up /proc/interrupts iteration Thomas Gleixner