From: Breno Leitao <leitao@debian.org>
To: Catalin Marinas <catalin.marinas@arm.com>,
	 Will Deacon <will@kernel.org>,
	mark.rutland@arm.com
Cc: leo.bras@arm.com, leo.yan@arm.com,
	 linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, palmer@dabbelt.com,
	 paulmck@kernel.org, puranjay@kernel.org, usama.arif@linux.dev,
	rmikey@meta.com,  kernel-team@meta.com
Subject: Re: [PATCH v2] arm64/irqflags: __always_inline the arch_local_irq_*() helpers
Date: Thu, 23 Apr 2026 09:45:30 -0700
Message-ID: <aepMaa1AQoJO4lza@gmail.com>
In-Reply-To: <20260421-arm64_always_inline-v2-1-c59d1400514d@debian.org>

On Tue, Apr 21, 2026 at 08:58:57AM -0700, Breno Leitao wrote:
> The arch_local_irq_*() wrappers in <asm/irqflags.h> dispatch between two
> underlying primitives: the __daif_* path on most systems, and the
> __pmr_* path on builds that use GIC PMR-based masking (Pseudo-NMI). The
> leaf primitives are already __always_inline, but the wrappers themselves
> are plain "static inline".
> 
> That is unsafe for noinstr callers: nothing prevents the compiler from
> emitting an out-of-line copy of e.g. arch_local_irq_disable(), and an
> out-of-line copy can be instrumented (ftrace, kcov, sanitizers), which
> breaks the noinstr contract on the entry/idle paths that rely on these
> helpers.
> 
> x86 hit and fixed exactly this class of bug in commit 7a745be1cc90
> ("x86/entry: __always_inline irqflags for noinstr").
> 
> Force-inline all of the arch_local_irq_*() wrappers so they cannot be
> emitted out-of-line:
> 
>   - arch_local_irq_enable()
>   - arch_local_irq_disable()
>   - arch_local_save_flags()
>   - arch_irqs_disabled_flags()
>   - arch_irqs_disabled()
>   - arch_local_irq_save()
>   - arch_local_irq_restore()
> 
> The primary motivation is noinstr safety. There is a useful side effect
> for fleet-wide profiling: when the wrapper is emitted out-of-line,
> samples taken inside it during the post-WFI IRQ unmask in
> default_idle_call() are attributed to arch_local_irq_enable rather than
> default_idle_call(), and the FP-unwinder loses default_idle_call() from
> the chain.

FWIW, I ran scripts/bloat-o-meter on the kernel with and without the
patch, and the code size is mostly the same (net +1032 bytes, +0.00%).
Here is the result:

	add/remove: 4/12 grow/shrink: 40/0 up/down: 1684/-652 (1032)
	Function                                     old     new   delta
	__schedule                                  8892    9024    +132
	irqentry_exit                                816     892     +76
	lockdep_hardirqs_off                         396     452     +56
	lock_is_held_type                            412     468     +56
	ct_idle_exit                                  76     132     +56
	cpu_idle_poll                                304     360     +56
	arch_stack_walk_reliable                    1152    1196     +44
	arch_stack_walk                             1184    1228     +44
	arch_bpf_stack_walk                          996    1040     +44
	lockdep_hardirqs_on                          464     504     +40
	el0_watchpt                                  576     616     +40
	el0_undef                                    560     600     +40
	el0_sys                                      560     600     +40
	el0_sve_acc                                  560     600     +40
	el0_svc                                      600     640     +40
	el0_sp                                       564     604     +40
	el0_softstp                                  728     768     +40
	el0_sme_acc                                  560     600     +40
	el0_pc                                       740     780     +40
	el0_mops                                     560     600     +40
	el0_inv                                      564     604     +40
	el0_interrupt                                656     696     +40
	el0_ia                                       716     756     +40
	el0_gcs                                      560     600     +40
	el0_fpsimd_exc                               560     600     +40
	el0_fpsimd_acc                               560     600     +40
	el0_fpac                                     560     600     +40
	el0_da                                       568     608     +40
	el0_bti                                      552     592     +40
	el0_brk64                                    560     600     +40
	el0_breakpt                                  720     760     +40
	asm_exit_to_user_mode                        416     456     +40
	__el0_error_handler_common                   592     632     +40
	cpuidle_enter_state                         1220    1248     +28
	check_preemption_disabled                    228     252     +24
	default_idle_call                            252     272     +20
	ct_kernel_enter                              388     404     +16
	ct_idle_enter                                 52      68     +16
	look_up_lock_class                           364     376     +12
	check_flags                                  492     504     +12
	__CortexA53843419_FFFF800081146000             -       8      +8
	__CortexA53843419_FFFF8000809C3004             -       8      +8
	__CortexA53843419_FFFF8000809AE000             -       8      +8
	__CortexA53843419_FFFF800080248004             -       8      +8
	__CortexA53843419_FFFF80008100C000             8       -      -8
	__CortexA53843419_FFFF8000809A9000             8       -      -8
	__CortexA53843419_FFFF8000809A8004             8       -      -8
	__CortexA53843419_FFFF800080448008             8       -      -8
	__CortexA53843419_FFFF8000801EE000             8       -      -8
	arch_local_irq_restore                        48       -     -48
	arch_local_save_flags                         80       -     -80
	arch_local_irq_save                           80       -     -80
	arch_local_irq_enable                         84       -     -84
	arch_local_irq_disable                        96       -     -96
	arch_irqs_disabled_flags                      96       -     -96
	arch_irqs_disabled                           128       -    -128
	Total: Before=163062863, After=163063895, chg +0.00%
