From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-wm1-f43.google.com (mail-wm1-f43.google.com [209.85.128.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14C642F9DA1 for ; Thu, 26 Mar 2026 22:44:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.43 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774565069; cv=none; b=Q1Glm5yC5LLwWwxR/L1Ksrwuk0n3w23+1bfipF963SAWpbeY41WTHfcdifpj5n0/pfA992HgzQipyb/FSTiPrbuEpvzfGsQ17msfhPxz1NzD2YIMK2k00O1EjIKcqh2c3F6lw4hHIZrmBnli0oZKiYUxKRnpwBXH0gHuq6BQAK8= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774565069; c=relaxed/simple; bh=TwRCp77D1VVueVJz/SH958krFueZqJPzmdCA3kWWQI0=; h=Date:From:To:Cc:Subject:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=uYsazeIdoquo07qVnEUW7VYcpsaufoyPEkEkd0BKu/incxR5JbHZTgL+rkUH7bye55EmwSCAUNoD+YGPS02fvTjOOTuDsFKeQOcXJF0RSBDCzge8FARQVkwqJ2gRctFXpFb/fG8s2QxvfahI1TRUOmFQAC//H7LtF6QuyHYZV0I= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=VH2SqKb8; arc=none smtp.client-ip=209.85.128.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="VH2SqKb8" Received: by mail-wm1-f43.google.com with SMTP id 5b1f17b1804b1-486ff3a0fc1so14968565e9.2 for ; Thu, 26 Mar 2026 15:44:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1774565065; x=1775169865; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=0/2i+m6pDcsdXqT7s+wC6zJMgFBtb9/1UvE1NSv23js=; b=VH2SqKb8J1wevyL5rrdogt0H1EA2uZRl7GVxwecPCFhP03IR+QPHSF3l9R0VBAOlJg ngFzMs8p8xI16k8z3AYKtA4GputCaYvIXKB/Dy+qEWp8xwnJVDHDoucmDzshql0bBR6M CHOQitUCwac4kaYw87tV5Krus/UaMyZuh3MbezSm8/WPcL53BArsTs+ztbMLxLXN/qVy OfJBMUvIMDhHAadTEN/JL4MBETeMcgBMzTuUcwO3C0skGY/p7DAoIpZRhUM1HmFL2WxW J3LCmiuBf7JBdeBTveq2SthJnOFDq76TSA258VXocUowP30ROt8EB6Slka5AXstUjpg+ sadw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774565065; x=1775169865; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=0/2i+m6pDcsdXqT7s+wC6zJMgFBtb9/1UvE1NSv23js=; b=QVmrrsWOXNrLwnzG1umcm1kpb7+ySkDRuvdSppPXXBA/4fqZBUDcSygpQ82UQUSEEi E1Q+u3qt6KF31CCWG3tSsKpwcHfjNwbUCii9wBDxYfGULXFdGG/22ggmxbdA53GcECdl IKUTdQa8vZyBjIi9l8fYy+8vHs+mfLix5MLwjcG7uAC07TxXq8Vs3UWzk7gTqHcxWOwv wjynybRL5TDzZ6vx0pdNypGJLC1dGGBPRL6BbwYYKWt3y6SFoXNWyeZvWjSGTMCTFCtl m1V79U0mbXxvlsdam0rZ4iSZckwBDboLZyJgMYYcdvDh0xEJI5MZvrNueDZH1lINzF5g fh7w== X-Gm-Message-State: AOJu0Yyd9GnzHLyUAjHg3/YI/ekYdrgCyZ4oTck5XZoYEtCciFPri+iP GjgvSQmAQyWIuUT8xxpkh9USd8GT6jHcWx9xeKavBj7Tqlqw/ARNgjZM X-Gm-Gg: ATEYQzzvepHZAKMORtyWVZmo8Vj4ksZW4kKnzHt+g2SG9B70C4y8SZ6AqdxYM1GQWNj yH6TeEc09IUzfME3Ey2s4/KAkGaPVWAC+ssfQB6HgWWUJM92JhFdHH77iV+8aVrL1hrzonz1bis HgUvRyz0wUgp7tSeFcqzi5O2oDCcYfP3sUVQSoBNdXKWCoaGYTwLZ7Ex2hw6FUP8oLksZGsyFPk UBp73ZIwBPM1NyfohLEZ9x6ZKBTbNfN/me/1uT+B3igs+zu3rVY2xAQ6i3Vgl5dlHSr8f+C266a t9fhVQEnkrjlZRULvH4YY1Bj06PMP/isZ3weMaXCV5GFyxyojX3qzSZ65kFtiDPJpPZQ88jA0iu D50mg5a6FODc5XwYpfw1x9BOOOAUxp8HTrXJXH2xgXZ+iwZsQgkn+s4EZPzW5pjT7u3PoF9SyJ8 doHXU1hqsACdfvMf7cF+8wybm5yrB4TrqznDsEzxZFDihyjdyC6CcSQpR51h7UQOFl X-Received: by 2002:a05:600c:8b41:b0:485:41c4:e2e5 with SMTP id 5b1f17b1804b1-48727edf577mr5915095e9.27.1774565065299; Thu, 26 Mar 2026 15:44:25 -0700 (PDT) Received: from pumpkin (82-69-66-36.dsl.in-addr.zen.co.uk. [82.69.66.36]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-48722c8cfdcsm80795215e9.8.2026.03.26.15.44.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2026 15:44:25 -0700 (PDT) Date: Thu, 26 Mar 2026 22:44:23 +0000 From: David Laight To: Thomas Gleixner Cc: LKML , x86@kernel.org, Michael Kelley , Dmitry Ilvokhin , Radu Rendec , Jan Kiszka , Kieran Bingham , Florian Fainelli Subject: Re: [patch V3 01/14] x86/irq: Optimize interrupts decimals printing Message-ID: <20260326224423.321e506f@pumpkin> In-Reply-To: <20260326215036.308179281@kernel.org> References: <20260326214345.019130211@kernel.org> <20260326215036.308179281@kernel.org> X-Mailer: Claws Mail 4.1.1 (GTK 3.24.38; arm-unknown-linux-gnueabihf) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit On Thu, 26 Mar 2026 22:56:28 +0100 Thomas Gleixner wrote: > From: Dmitry Ilvokhin > > Monitoring tools periodically scan /proc/interrupts to export metrics as a > timeseries for future analysis and investigation. > > In large fleets, /proc/interrupts is polled (often every few seconds) on > every machine. The cumulative overhead adds up quickly across thousands > of nodes, so reducing the cost of generating these stats does have a > measurable operational impact. With the ongoing trend toward higher core > counts per machine, this cost becomes even more noticeable over time, > since interrupt counters are per-CPU. In Meta's fleet, we have observed > this overhead at scale. > > Although a binary /proc interface would be a better long-term solution > due to lower formatting (kernel side) and parsing (userspace side) > overhead, the text interface will remain in use for some time, even if > better solutions will be available. Optimizing the /proc/interrupts > printing code is therefore still beneficial. > > Function seq_printf() supports rich format string for decimals printing, > but it doesn't required for printing /proc/interrupts per CPU counters, > seq_put_decimal_ull_width() function can be used instead to print per > CPU counters, because very limited formatting is required for this case. > Similar optimization idea is already used in show_interrupts(). Did you consider optimising the underlying snprintf() code instead? I looked at it recently and it looks like it could be optimised somewhat. > As a side effect this aligns the x86 decriptions with the generic > interrupts event descriptions. > > Performance counter stats (truncated) for 'sh -c cat /proc/interrupts > > Before: > > 3.42 msec task-clock # 0.802 CPUs utilized ( +- 0.05% ) > 1 context-switches # 291.991 /sec ( +- 0.74% ) > 0 cpu-migrations # 0.000 /sec > 343 page-faults # 100.153 K/sec ( +- 0.01% ) > 8,932,242 instructions # 1.66 insn per cycle ( +- 0.34% ) > 5,374,427 cycles # 1.569 GHz ( +- 0.04% ) > 1,483,154 branches # 433.068 M/sec ( +- 0.22% ) > 28,768 branch-misses # 1.94% of all branches ( +- 0.31% ) > > 0.00427182 +- 0.00000215 seconds time elapsed ( +- 0.05% ) > > After: > > 2.39 msec task-clock # 0.796 CPUs utilized ( +- 0.06% ) > 1 context-switches # 418.541 /sec ( +- 0.70% ) > 0 cpu-migrations # 0.000 /sec > 343 page-faults # 143.560 K/sec ( +- 0.01% ) > 7,020,982 instructions # 1.30 insn per cycle ( +- 0.52% ) > 5,397,266 cycles # 2.259 GHz ( +- 0.06% ) > 1,569,648 branches # 656.962 M/sec ( +- 0.08% ) > 25,419 branch-misses # 1.62% of all branches ( +- 0.72% ) > > 0.00299996 +- 0.00000206 seconds time elapsed ( +- 0.07% ) > > Relative speed up in time elapsed is around 29%. I'm not sure I completely understand that table. 'cycles' seems to have gone up - which ought to make it slower. But 'GHz' has gone up considerable - which will reduce 'wall time' for a fixed number of cycles. Are you sure that you've not just make the cpu clock faster? It can be hard to lock the clock frequency of Intel cpu in order to get meaningful benchmark times. It also only really makes sense to measure how long the read() itself takes. Everything else is unwanted noise as far as this change is concerned. David > > [ tglx: Fixed it up so it applies to current mainline ] > > Signed-off-by: Dmitry Ilvokhin > Signed-off-by: Thomas Gleixner > Reviewed-by: Thomas Gleixner > Reviewed-by: Radu Rendec > Link: https://patch.msgid.link/aQj5mGZ6_BBlAm3B@shell.ilvokhin.com > > --- > Changes v2: > - Expanded commit message: add more rationale for the proposed change. > - Renamed helper put_spaced_decimal() -> put_decimal() primarely to make > checkpatch.pl --strict pass. > > arch/x86/kernel/irq.c | 112 ++++++++++++++++++++++++++------------------------ > 1 file changed, 59 insertions(+), 53 deletions(-) > --- a/arch/x86/kernel/irq.c > +++ b/arch/x86/kernel/irq.c > @@ -62,6 +62,18 @@ void ack_bad_irq(unsigned int irq) > apic_eoi(); > } > > +/* > + * A helper routine for putting space and decimal number without overhead > + * from rich format of printf(). > + */ > +static void put_decimal(struct seq_file *p, unsigned long long num) > +{ > + const char *delimiter = " "; > + unsigned int width = 10; > + > + seq_put_decimal_ull_width(p, delimiter, num, width); > +} > + > #define irq_stats(x) (&per_cpu(irq_stat, x)) > /* > * /proc/interrupts printing for arch specific interrupts > @@ -70,103 +82,101 @@ int arch_show_interrupts(struct seq_file > { > int j; > > - seq_printf(p, "%*s: ", prec, "NMI"); > + seq_printf(p, "%*s:", prec, "NMI"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->__nmi_count); > + put_decimal(p, irq_stats(j)->__nmi_count); > seq_puts(p, " Non-maskable interrupts\n"); > #ifdef CONFIG_X86_LOCAL_APIC > - seq_printf(p, "%*s: ", prec, "LOC"); > + seq_printf(p, "%*s:", prec, "LOC"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->apic_timer_irqs); > + put_decimal(p, irq_stats(j)->apic_timer_irqs); > seq_puts(p, " Local timer interrupts\n"); > > - seq_printf(p, "%*s: ", prec, "SPU"); > + seq_printf(p, "%*s:", prec, "SPU"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->irq_spurious_count); > + put_decimal(p, irq_stats(j)->irq_spurious_count); > seq_puts(p, " Spurious interrupts\n"); > - seq_printf(p, "%*s: ", prec, "PMI"); > + seq_printf(p, "%*s:", prec, "PMI"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->apic_perf_irqs); > + put_decimal(p, irq_stats(j)->apic_perf_irqs); > seq_puts(p, " Performance monitoring interrupts\n"); > - seq_printf(p, "%*s: ", prec, "IWI"); > + seq_printf(p, "%*s:", prec, "IWI"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->apic_irq_work_irqs); > + put_decimal(p, irq_stats(j)->apic_irq_work_irqs); > seq_puts(p, " IRQ work interrupts\n"); > - seq_printf(p, "%*s: ", prec, "RTR"); > + seq_printf(p, "%*s:", prec, "RTR"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->icr_read_retry_count); > + put_decimal(p, irq_stats(j)->icr_read_retry_count); > seq_puts(p, " APIC ICR read retries\n"); > if (x86_platform_ipi_callback) { > - seq_printf(p, "%*s: ", prec, "PLT"); > + seq_printf(p, "%*s:", prec, "PLT"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->x86_platform_ipis); > + put_decimal(p, irq_stats(j)->x86_platform_ipis); > seq_puts(p, " Platform interrupts\n"); > } > #endif > #ifdef CONFIG_SMP > - seq_printf(p, "%*s: ", prec, "RES"); > + seq_printf(p, "%*s:", prec, "RES"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->irq_resched_count); > + put_decimal(p, irq_stats(j)->irq_resched_count); > seq_puts(p, " Rescheduling interrupts\n"); > - seq_printf(p, "%*s: ", prec, "CAL"); > + seq_printf(p, "%*s:", prec, "CAL"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->irq_call_count); > + put_decimal(p, irq_stats(j)->irq_call_count); > seq_puts(p, " Function call interrupts\n"); > - seq_printf(p, "%*s: ", prec, "TLB"); > + seq_printf(p, "%*s:", prec, "TLB"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->irq_tlb_count); > + put_decimal(p, irq_stats(j)->irq_tlb_count); > seq_puts(p, " TLB shootdowns\n"); > #endif > #ifdef CONFIG_X86_THERMAL_VECTOR > - seq_printf(p, "%*s: ", prec, "TRM"); > + seq_printf(p, "%*s:", prec, "TRM"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->irq_thermal_count); > + put_decimal(p, irq_stats(j)->irq_thermal_count); > seq_puts(p, " Thermal event interrupts\n"); > #endif > #ifdef CONFIG_X86_MCE_THRESHOLD > - seq_printf(p, "%*s: ", prec, "THR"); > + seq_printf(p, "%*s:", prec, "THR"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->irq_threshold_count); > + put_decimal(p, irq_stats(j)->irq_threshold_count); > seq_puts(p, " Threshold APIC interrupts\n"); > #endif > #ifdef CONFIG_X86_MCE_AMD > - seq_printf(p, "%*s: ", prec, "DFR"); > + seq_printf(p, "%*s:", prec, "DFR"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->irq_deferred_error_count); > + put_decimal(p, irq_stats(j)->irq_deferred_error_count); > seq_puts(p, " Deferred Error APIC interrupts\n"); > #endif > #ifdef CONFIG_X86_MCE > - seq_printf(p, "%*s: ", prec, "MCE"); > + seq_printf(p, "%*s:", prec, "MCE"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", per_cpu(mce_exception_count, j)); > + put_decimal(p, per_cpu(mce_exception_count, j)); > seq_puts(p, " Machine check exceptions\n"); > - seq_printf(p, "%*s: ", prec, "MCP"); > + seq_printf(p, "%*s:", prec, "MCP"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", per_cpu(mce_poll_count, j)); > + put_decimal(p, per_cpu(mce_poll_count, j)); > seq_puts(p, " Machine check polls\n"); > #endif > #ifdef CONFIG_X86_HV_CALLBACK_VECTOR > if (test_bit(HYPERVISOR_CALLBACK_VECTOR, system_vectors)) { > - seq_printf(p, "%*s: ", prec, "HYP"); > + seq_printf(p, "%*s:", prec, "HYP"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", > - irq_stats(j)->irq_hv_callback_count); > + put_decimal(p, irq_stats(j)->irq_hv_callback_count); > seq_puts(p, " Hypervisor callback interrupts\n"); > } > #endif > #if IS_ENABLED(CONFIG_HYPERV) > if (test_bit(HYPERV_REENLIGHTENMENT_VECTOR, system_vectors)) { > - seq_printf(p, "%*s: ", prec, "HRE"); > + seq_printf(p, "%*s:", prec, "HRE"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", > - irq_stats(j)->irq_hv_reenlightenment_count); > + put_decimal(p, > + irq_stats(j)->irq_hv_reenlightenment_count); > seq_puts(p, " Hyper-V reenlightenment interrupts\n"); > } > if (test_bit(HYPERV_STIMER0_VECTOR, system_vectors)) { > - seq_printf(p, "%*s: ", prec, "HVS"); > + seq_printf(p, "%*s:", prec, "HVS"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", > - irq_stats(j)->hyperv_stimer0_count); > + put_decimal(p, irq_stats(j)->hyperv_stimer0_count); > seq_puts(p, " Hyper-V stimer0 interrupts\n"); > } > #endif > @@ -175,35 +185,31 @@ int arch_show_interrupts(struct seq_file > seq_printf(p, "%*s: %10u\n", prec, "MIS", atomic_read(&irq_mis_count)); > #endif > #if IS_ENABLED(CONFIG_KVM) > - seq_printf(p, "%*s: ", prec, "PIN"); > + seq_printf(p, "%*s:", prec, "PIN"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", irq_stats(j)->kvm_posted_intr_ipis); > + put_decimal(p, irq_stats(j)->kvm_posted_intr_ipis); > seq_puts(p, " Posted-interrupt notification event\n"); > > - seq_printf(p, "%*s: ", prec, "NPI"); > + seq_printf(p, "%*s:", prec, "NPI"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", > - irq_stats(j)->kvm_posted_intr_nested_ipis); > + put_decimal(p, irq_stats(j)->kvm_posted_intr_nested_ipis); > seq_puts(p, " Nested posted-interrupt event\n"); > > - seq_printf(p, "%*s: ", prec, "PIW"); > + seq_printf(p, "%*s:", prec, "PIW"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", > - irq_stats(j)->kvm_posted_intr_wakeup_ipis); > + put_decimal(p, irq_stats(j)->kvm_posted_intr_wakeup_ipis); > seq_puts(p, " Posted-interrupt wakeup event\n"); > #endif > #ifdef CONFIG_GUEST_PERF_EVENTS > - seq_printf(p, "%*s: ", prec, "VPMI"); > + seq_printf(p, "%*s:", prec, "VPMI"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", > - irq_stats(j)->perf_guest_mediated_pmis); > + put_decimal(p, irq_stats(j)->perf_guest_mediated_pmis); > seq_puts(p, " Perf Guest Mediated PMI\n"); > #endif > #ifdef CONFIG_X86_POSTED_MSI > - seq_printf(p, "%*s: ", prec, "PMN"); > + seq_printf(p, "%*s:", prec, "PMN"); > for_each_online_cpu(j) > - seq_printf(p, "%10u ", > - irq_stats(j)->posted_msi_notification_count); > + put_decimal(p, irq_stats(j)->posted_msi_notification_count); > seq_puts(p, " Posted MSI notification event\n"); > #endif > return 0; > >