From: Yonghong Song <yonghong.song@linux.dev>
To: Puranjay Mohan <puranjay@kernel.org>, bpf@vger.kernel.org
Cc: Puranjay Mohan <puranjay12@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
kernel-team@meta.com, Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH bpf-next v2 2/2] bpf: arm64: Optimize recursion detection by not using atomics
Date: Thu, 18 Dec 2025 09:55:52 -0800
Message-ID: <1728d4e1-ce5c-476e-b057-b8a9a7621e1b@linux.dev>
In-Reply-To: <20251217233608.2374187-3-puranjay@kernel.org>
On 12/17/25 3:35 PM, Puranjay Mohan wrote:
> BPF programs detect recursion using a per-CPU 'active' flag in struct
> bpf_prog. The trampoline currently sets/clears this flag with atomic
> operations.
>
> On some arm64 platforms (e.g., Neoverse V2 with LSE), per-CPU atomic
> operations are relatively slow. Unlike x86_64, where per-CPU updates
> can avoid cross-core atomicity, arm64 LSE atomics are always atomic
> across all cores, which is unnecessary overhead for strictly per-CPU
> state.
>
> This patch removes atomics from the recursion detection path on arm64 by
> changing 'active' to a per-CPU array of four u8 counters, one per
> context: {NMI, hard-irq, soft-irq, normal}. The running context uses a
> non-atomic increment/decrement on its element. After increment,
> recursion is detected by reading the array as a u32 and verifying that
> only the expected element changed; any change in another element
> indicates inter-context recursion, and a value > 1 in the same element
> indicates same-context recursion.
>
> For example, starting from {0,0,0,0}, a normal-context trigger changes
> the array to {0,0,0,1}. If an NMI arrives on the same CPU and triggers
> the program, the array becomes {1,0,0,1}. When the NMI context checks
> the u32 against the expected mask for NMI (0x01000000), it observes
> 0x01000001 and correctly reports recursion. Same-context recursion is
> detected analogously.
>
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
LGTM with a few nits below.
Acked-by: Yonghong Song <yonghong.song@linux.dev>
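For readers following the commit message, the check can be modeled in
user space. This is a minimal sketch, not the kernel code: struct
recursion_state, read_le32(), sim_get_recursion_context() and
sim_put_recursion_context() are hypothetical names, a plain array stands
in for the per-CPU allocation, and the context level (assumed here to be
0 = normal, 1 = softirq, 2 = hardirq, 3 = NMI) is passed in by hand
instead of coming from interrupt_context_level().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CONTEXTS 4 /* assumed order: 0 normal, 1 softirq, 2 hardirq, 3 NMI */

/* Hypothetical stand-in for one CPU's slice of prog->active. */
struct recursion_state {
	uint8_t active[NR_CONTEXTS];
};

/* Assemble the four counters into one u32, with active[0] as the low byte. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Bump the counter for 'level' and compare the whole word with the single
 * bit expected for that level. Returns true when the program may run and
 * false when any other counter is nonzero or this one exceeds 1.
 */
static bool sim_get_recursion_context(struct recursion_state *st,
				      unsigned int level)
{
	assert(level < NR_CONTEXTS);
	st->active[level]++;
	return read_le32(st->active) == (UINT32_C(1) << (level * 8));
}

/* Undo the increment; in this sketch it is called after every get. */
static void sim_put_recursion_context(struct recursion_state *st,
				      unsigned int level)
{
	assert(level < NR_CONTEXTS);
	st->active[level]--;
}
```

Starting from all zeroes, a level-0 call succeeds with the word at
0x00000001. A level-3 call on the same state then sees 0x01000001 rather
than 0x01000000 and is refused, and a second level-0 call sees 0x00000002
and is refused as well.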
> ---
> include/linux/bpf.h | 33 ++++++++++++++++++++++++++++++---
> kernel/bpf/core.c | 3 ++-
> 2 files changed, 32 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 2da986136d26..5ca2a761d9a1 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -31,6 +31,7 @@
> #include <linux/static_call.h>
> #include <linux/memcontrol.h>
> #include <linux/cfi.h>
> +#include <linux/unaligned.h>
> #include <asm/rqspinlock.h>
>
> struct bpf_verifier_env;
> @@ -1746,6 +1747,8 @@ struct bpf_prog_aux {
> struct bpf_map __rcu *st_ops_assoc;
> };
>
> +#define BPF_NR_CONTEXTS 4 /* normal, softirq, hardirq, NMI */
> +
> struct bpf_prog {
> u16 pages; /* Number of allocated pages */
> u16 jited:1, /* Is our filter JIT'ed? */
> @@ -1772,7 +1775,7 @@ struct bpf_prog {
> u8 tag[BPF_TAG_SIZE];
> };
> struct bpf_prog_stats __percpu *stats;
> - int __percpu *active;
> + u8 __percpu *active; /* u8[BPF_NR_CONTEXTS] for recursion protection */
> unsigned int (*bpf_func)(const void *ctx,
> const struct bpf_insn *insn);
> struct bpf_prog_aux *aux; /* Auxiliary fields */
> @@ -2006,12 +2009,36 @@ struct bpf_struct_ops_common_value {
>
> static inline bool bpf_prog_get_recursion_context(struct bpf_prog *prog)
> {
> - return this_cpu_inc_return(*(prog->active)) == 1;
> +#ifdef CONFIG_ARM64
> + u8 rctx = interrupt_context_level();
> + u8 *active = this_cpu_ptr(prog->active);
> + u32 val;
> +
> + preempt_disable();
> + active[rctx]++;
> + val = get_unaligned_le32(active);
'active' is already 8-byte aligned (or 4-byte aligned with my suggestion
below), so the access is never unaligned. get_unaligned_le32() works, but
maybe we could use le32_to_cpu() instead. Or did you pick
get_unaligned_le32() because there is no performance difference between
the two? If get_unaligned_le32() stays, it would be good to explain the
choice in the commit message.
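To make the question concrete, here is a small user-space sketch of the
two ways to read the four bytes. It assumes nothing about the kernel
implementations: load_le32_bytewise() only models what
get_unaligned_le32() computes, and load_le32_aligned() models
le32_to_cpu() applied to a direct 32-bit load. Both names, swap32(),
host_is_little_endian() and union counters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Four u8 counters overlaid on a naturally aligned 32-bit word. */
union counters {
	uint32_t word;
	uint8_t bytes[4];
};

/* Byte-wise little-endian load; correct for any alignment. */
static uint32_t load_le32_bytewise(const uint8_t *b)
{
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Returns 1 when the lowest-addressed byte of an integer is its low byte. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Single native 32-bit load followed by a little-endian-to-host
 * conversion. The pointer must be 4-byte aligned, which the union
 * guarantees in this sketch.
 */
static uint32_t load_le32_aligned(const uint32_t *p)
{
	uint32_t v = *p;

	return host_is_little_endian() ? v : swap32(v);
}
```

For the bytes {1, 2, 3, 4} both helpers return 0x04030201, so on a buffer
that is known to be 4-byte aligned the choice does not change the result,
only possibly the generated code.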
> + preempt_enable();
> + if (val != BIT(rctx * 8))
> + return false;
> +
> + return true;
> +#else
> + return this_cpu_inc_return(*(int __percpu *)(prog->active)) == 1;
> +#endif
> }
>
> static inline void bpf_prog_put_recursion_context(struct bpf_prog *prog)
> {
> - this_cpu_dec(*(prog->active));
> +#ifdef CONFIG_ARM64
> + u8 rctx = interrupt_context_level();
> + u8 *active = this_cpu_ptr(prog->active);
> +
> + preempt_disable();
> + active[rctx]--;
> + preempt_enable();
> +#else
> + this_cpu_dec(*(int __percpu *)(prog->active));
> +#endif
> }
>
> #if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index c66316e32563..b5063acfcf92 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -112,7 +112,8 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
> vfree(fp);
> return NULL;
> }
> - fp->active = alloc_percpu_gfp(int, bpf_memcg_flags(GFP_KERNEL | gfp_extra_flags));
> + fp->active = __alloc_percpu_gfp(sizeof(u8[BPF_NR_CONTEXTS]), 8,
> + bpf_memcg_flags(GFP_KERNEL | gfp_extra_flags));
The alignment here is 8. Can it be 4, since the code above only reads a
32-bit value?
> if (!fp->active) {
> vfree(fp);
> kfree(aux);