public inbox for linux-kernel@vger.kernel.org
* [PATCH RFC] arm64/irqflags: force inline of arch_local_irq_enable()
From: Breno Leitao @ 2026-04-20 12:42 UTC
  To: Catalin Marinas, Will Deacon
  Cc: leo.bras, mark.rutland, leo.yan, linux-arm-kernel, linux-kernel,
	palmer, paulmck, puranjay, usama.arif, kernel-team, Breno Leitao

arch_local_irq_enable() is a small wrapper that dispatches between two
unmask paths: __daif_local_irq_enable() on most systems, and
__pmr_local_irq_enable() on builds that use GIC PMR-based masking
(Pseudo-NMI). Both leaf primitives are already __always_inline; the
wrapper itself is plain "static inline".
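
To illustrate the wrapper shape described above, here is a minimal
user-space sketch. Every model_* name is hypothetical and stands in for
a kernel symbol; the counters replace the real sysreg writes, and a
plain bool replaces the static-key check. It only shows that the hinted
and forced variants behave the same, so the change affects where code
is emitted, not what it does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are kernel symbols. */
#define model_always_inline inline __attribute__((__always_inline__))

static bool model_prio_masking;	/* stands in for system_uses_irq_prio_masking() */
static int model_daif_unmasks;	/* times the DAIF-style path ran */
static int model_pmr_unmasks;	/* times the PMR-style path ran */

static model_always_inline void model_daif_enable(void)
{
	model_daif_unmasks++;
}

static model_always_inline void model_pmr_enable(void)
{
	model_pmr_unmasks++;
}

/* "static inline" is only a hint: the compiler may emit a real function. */
static inline void model_irq_enable_hint(void)
{
	if (model_prio_masking)
		model_pmr_enable();
	else
		model_daif_enable();
}

/* Forced variant: the body is expanded into every caller. */
static model_always_inline void model_irq_enable_forced(void)
{
	if (model_prio_masking)
		model_pmr_enable();
	else
		model_daif_enable();
}

/* Both variants behave identically; only symbol attribution differs. */
void model_irq_enable_demo(void)
{
	model_prio_masking = false;
	model_irq_enable_hint();
	model_irq_enable_forced();
	assert(model_daif_unmasks == 2 && model_pmr_unmasks == 0);

	model_prio_masking = true;
	model_irq_enable_forced();
	assert(model_pmr_unmasks == 1);
}
```

Whether the hinted variant actually gets its own symbol depends on the
compiler and optimization level, which is the unpredictability the
forced variant removes.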

In practice the compiler does not always inline the wrapper. When it
gets emitted out-of-line, samples taken inside it during the post-WFI
IRQ unmask in default_idle_call() show up as arch_local_irq_enable
overhead in profiles, with default_idle_call() lost from the unwound
chain.

This matters most at fleet scale. On a large arm64 fleet, the
aggregate effect is that idle CPUs show up in fleet-wide profilers as
"busy stuck in arch_local_irq_enable" instead of as idle
(default_idle_call / cpu_startup_entry). Engineers looking at
fleet-wide top-symbol dashboards see what looks like significant
CPU-bound work in IRQ unmasking and chase a phantom hot path, when in
fact the cost is the WFI wake-up cycle being attributed to the wrong
function. Tooling has to special-case this symbol to suppress it,
which is fragile across kernel versions. Inlining the wrapper makes
idle CPUs appear idle in profiles - which is what they are.

The same misattribution affects driver stalls. arm64 PMU overflow is
delivered as a regular IRQ (no NMI on default builds), so a driver
that holds local_irq_disable() for milliseconds defers every PMU
sample to the moment it calls local_irq_enable(). With the wrapper
out-of-line, the resulting fat sample is credited to
arch_local_irq_enable rather than to the driver, and the FP-unwinder
points the call chain at the driver's caller instead of the driver
itself (the immediate caller is skipped because arch_local_irq_enable
is a leaf with no saved frame). The driver is still visible in the
profile from its other samples, but the stall cost itself is
mis-attributed and the chain leading to it is one frame off, making
fleet-wide root-cause analysis harder than it needs to be. Inlining
the wrapper attributes the stall sample to the driver function that
actually held IRQs disabled.
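
The one-frame-off chain can be reproduced with a toy model of a
frame-pointer unwinder. This is a sketch under stated assumptions, not
the kernel's unwinder: the structs, the model_* helpers and the
entry/driver_caller/driver_stall call path are all hypothetical, and
symbol names stand in for addresses. The model assumes the unwinder
reports the sampled PC and then follows saved frame records only,
without consulting the live link register.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy frame record, shaped like the AArch64 {fp, lr} pair. Hypothetical. */
struct model_frame_record {
	const struct model_frame_record *prev_fp;	/* caller's frame record */
	const char *return_into;	/* function the saved LR points into */
};

/* What a sampling interrupt captures: the PC and the current frame pointer. */
struct model_sample {
	const char *pc_sym;
	const struct model_frame_record *fp;
};

/* Hypothetical call path: entry -> driver_caller -> driver_stall. */
static const struct model_frame_record model_entry_rec = { NULL, NULL };
static const struct model_frame_record model_driver_caller_rec = {
	&model_entry_rec, "entry"
};
static const struct model_frame_record model_driver_stall_rec = {
	&model_driver_caller_rec, "driver_caller"
};

/* Walk frame records only; the live LR register is never consulted. */
static size_t model_unwind(const struct model_sample *s,
			   const char **out, size_t max)
{
	const struct model_frame_record *fr;
	size_t n = 0;

	if (max == 0)
		return 0;
	out[n++] = s->pc_sym;
	for (fr = s->fp; fr && fr->return_into && n < max; fr = fr->prev_fp)
		out[n++] = fr->return_into;
	return n;
}

/* Returns 1 if sym appears in the unwound chain, 0 otherwise. */
static int model_chain_has(const char **chain, size_t n, const char *sym)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(chain[i], sym) == 0)
			return 1;
	return 0;
}

void model_unwind_demo(void)
{
	const char *chain[8];
	size_t n;
	/* Out-of-line leaf: it pushes no record, so FP is still driver_stall's. */
	struct model_sample leaf = { "arch_local_irq_enable", &model_driver_stall_rec };
	/* Inlined wrapper: the sampled PC lies inside driver_stall itself. */
	struct model_sample inlined = { "driver_stall", &model_driver_stall_rec };

	n = model_unwind(&leaf, chain, 8);
	assert(n == 3 && !model_chain_has(chain, n, "driver_stall"));

	n = model_unwind(&inlined, chain, 8);
	assert(n == 3 && model_chain_has(chain, n, "driver_stall"));
}
```

In the leaf case the model yields arch_local_irq_enable, driver_caller,
entry, with driver_stall missing; in the inlined case the chain starts
at driver_stall.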

Trade-offs:

 - Minor .text effect: every caller now expands the dispatch +
   underlying primitive at its call site. system_uses_irq_prio_masking()
   is a static-key check, so on non-pNMI systems the inlined body
   collapses to a single MSR daifclr; on pNMI systems it collapses to a
   single sysreg write.

 - Loss of a debugging convenience: there is no longer an
   arch_local_irq_enable symbol to set a breakpoint on. Callers must be
   targeted individually.

 - Compiler trust: __always_inline overrides size heuristics. The body
   is small enough that this should be unobjectionable, but it is a
   policy change.

This patch only flips arch_local_irq_enable(). The same reasoning
applies to arch_local_irq_disable()/save()/restore(), which share the
identical static-inline-wrapper-around-__always_inline-primitives
pattern. I am holding those off until profiles motivate the change.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/arm64/include/asm/irqflags.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index d4d7451c2c129..505ef5be53a71 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -40,7 +40,7 @@ static __always_inline void __pmr_local_irq_enable(void)
 	barrier();
 }
 
-static inline void arch_local_irq_enable(void)
+static __always_inline void arch_local_irq_enable(void)
 {
 	if (system_uses_irq_prio_masking()) {
 		__pmr_local_irq_enable();

---
base-commit: 615aad0f61e0c7a898184a394dc895c610100d4f
change-id: 20260420-arm64_always_inline-6bc9dd3c17e6

Best regards,
--  
Breno Leitao <leitao@debian.org>

