* [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs
@ 2026-05-11 5:33 Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 01/24] bpf: Convert bpf_get_spilled_reg macro to static inline function Yonghong Song
` (23 more replies)
0 siblings, 24 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Currently, BPF function calls and kfuncs are limited to 5 register
parameters. For BPF-to-BPF calls with more than 5 parameters,
developers can force inlining or pack the extra parameters into a
struct and pass a pointer to that struct, although both workarounds
have some inconvenience. But there is no workaround for kfuncs if
more than 5 parameters are needed.
This patch set lifts the 5-argument limit by introducing stack-based
argument passing for BPF functions and kfuncs, coordinated with
compiler support in LLVM [1]. The compiler emits loads/stores through
a new BPF register r11 (BPF_REG_PARAMS) to pass arguments beyond
the 5th, keeping the stack arg area separate from the r10-based program
stack. The maximum number of arguments is capped at
MAX_BPF_FUNC_ARGS (12), which is sufficient for the vast majority of
use cases.
All kfunc/bpf-function arguments are caller saved, including stack
arguments. For register arguments (r1-r5), the verifier already marks
them as clobbered after each call. For stack arguments, the verifier
invalidates all outgoing stack arg slots immediately after a call,
requiring the compiler to re-store them before any subsequent call.
This follows the native calling convention where all function
parameters are caller saved.
The x86_64 JIT translates r11-relative accesses into RBP-relative
native instructions. Each function's stack allocation is extended
by 'max_outgoing' bytes to hold the outgoing arg area below the
callee-saved registers. This simplifies the implementation since
r10 can be reused for stack argument access. At both BPF-to-BPF and
kfunc calls, outgoing args are stored directly at the locations the
calling convention expects, so the callee can read its incoming
parameters directly from the caller.
Global subprogs and freplace progs with >5 args are not yet supported.
Only x86_64 and arm64 are supported for now; the same selftests pass
on both architectures. Please see each individual patch for details.
[1] https://github.com/llvm/llvm-project/pull/189060
Changelogs:
v2 -> v3:
- v2: https://lore.kernel.org/bpf/20260507212942.1122000-1-yonghong.song@linux.dev/
- In do_check_common(), for the main prog, if BTF does not match the actual
  parameters, verification continues and arg_cnt is ignored. Set arg_cnt=1
  explicitly to prevent any incoming stack arguments.
- Remove the loop that clears the current frame's stack slots and sets the
  upper-level frame's stack slots; it is not needed unless there is a bug.
  Emit a verifier_bug if that case is ever hit.
- For liveness, avoid mixing r11-based loads/stores with r10-based stack
  tracking. Also print stack arguments properly.
- Pass bpf_subprog_info to the JIT so we can avoid copying bpf_subprog_info
  fields to bpf_prog_aux.
- Fix the missed allocation free for test infra BTF fixup.
- Remove the selftest result for the precision backtracking test since the
  result could change (two possible outputs).
v1 -> v2:
- v1: https://lore.kernel.org/bpf/20260424171433.2034470-1-yonghong.song@linux.dev/
- Several refactorings (convert bpf_get_spilled_reg macro to a static inline func,
  remove copy_register_state(), refactor jmp history, refactor record_call_access(), etc.),
  suggested by Eduard.
- Use incoming_stack_arg_cnt/stack_arg_cnt instead of incoming_stack_arg_depth/stack_arg_depth,
suggested by Eduard.
- Fix a stack arg pruning bug, from Eduard.
- Fix a bug in precision marking and backtracking; basically the callee needs
  to get stack arg values from the caller. With help from Eduard.
- Set sub->arg_cnt earlier in btf_prepare_func_args(), this will avoid having
incoming_stack_arg_cnt in bpf_subprog_info.
- Do stack-arg liveness analysis together with r10 based liveness analysis,
suggested by Eduard.
- Fix a few tests to ensure that r11-based loads cannot come after r11-based
  stores, and r11-based loads cannot come after kfunc/helper/bpf-function calls.
Puranjay Mohan (3):
bpf, arm64: Map BPF_REG_0 to x8 instead of x7
bpf, arm64: Add JIT support for stack arguments
selftests/bpf: Enable stack argument tests for arm64
Yonghong Song (21):
bpf: Convert bpf_get_spilled_reg macro to static inline function
bpf: Remove copy_register_state wrapper function
bpf: Add helper functions for r11-based stack argument insns
bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
bpf: Support stack arguments for bpf functions
bpf: Refactor jmp history to use dedicated spi/frame fields
bpf: Add precision marking and backtracking for stack argument slots
bpf: Refactor record_call_access() to extract per-arg logic
bpf: Extend liveness analysis to track stack argument slots
bpf: Reject stack arguments in non-JITed programs
bpf: Prepare architecture JIT support for stack arguments
bpf: Enable r11 based insns
bpf: Support stack arguments for kfunc calls
bpf: Reject stack arguments if tail call reachable
bpf: Pass bpf_subprog_info to bpf_int_jit_compile()
bpf,x86: Implement JIT support for stack arguments
selftests/bpf: Add tests for BPF function stack arguments
selftests/bpf: Add tests for stack argument validation
selftests/bpf: Add BTF fixup for __naked subprog parameter names
selftests/bpf: Add verifier tests for stack argument validation
selftests/bpf: Add precision backtracking test for stack arguments
arch/arc/net/bpf_jit_core.c | 3 +-
arch/arm/net/bpf_jit_32.c | 3 +-
arch/arm64/net/bpf_jit_comp.c | 95 +++-
arch/arm64/net/bpf_timed_may_goto.S | 8 +-
arch/loongarch/net/bpf_jit.c | 3 +-
arch/mips/net/bpf_jit_comp.c | 3 +-
arch/parisc/net/bpf_jit_core.c | 3 +-
arch/powerpc/net/bpf_jit_comp.c | 3 +-
arch/riscv/net/bpf_jit_core.c | 3 +-
arch/s390/net/bpf_jit_comp.c | 3 +-
arch/sparc/net/bpf_jit_comp_64.c | 3 +-
arch/x86/net/bpf_jit_comp.c | 163 ++++++-
arch/x86/net/bpf_jit_comp32.c | 3 +-
include/linux/bpf_verifier.h | 89 +++-
include/linux/filter.h | 26 +-
kernel/bpf/backtrack.c | 82 +++-
kernel/bpf/btf.c | 20 +-
kernel/bpf/const_fold.c | 8 +
kernel/bpf/core.c | 16 +-
kernel/bpf/fixups.c | 26 +-
kernel/bpf/liveness.c | 178 +++++--
kernel/bpf/states.c | 31 +-
kernel/bpf/verifier.c | 387 ++++++++++++---
.../selftests/bpf/prog_tests/stack_arg.c | 139 ++++++
.../selftests/bpf/prog_tests/stack_arg_fail.c | 10 +
.../bpf/prog_tests/stack_arg_precision.c | 10 +
.../selftests/bpf/prog_tests/verifier.c | 4 +
tools/testing/selftests/bpf/progs/bpf_misc.h | 1 +
.../bpf/progs/btf__stack_arg_precision.c | 24 +
.../bpf/progs/btf__verifier_stack_arg_order.c | 31 ++
tools/testing/selftests/bpf/progs/stack_arg.c | 253 ++++++++++
.../selftests/bpf/progs/stack_arg_fail.c | 114 +++++
.../selftests/bpf/progs/stack_arg_kfunc.c | 164 +++++++
.../selftests/bpf/progs/stack_arg_precision.c | 135 ++++++
.../selftests/bpf/progs/verifier_jit_inline.c | 2 +-
.../selftests/bpf/progs/verifier_ldsx.c | 6 +-
.../bpf/progs/verifier_private_stack.c | 10 +-
.../selftests/bpf/progs/verifier_stack_arg.c | 445 ++++++++++++++++++
.../bpf/progs/verifier_stack_arg_order.c | 87 ++++
.../selftests/bpf/test_kmods/bpf_testmod.c | 72 +++
.../bpf/test_kmods/bpf_testmod_kfunc.h | 26 +
tools/testing/selftests/bpf/test_loader.c | 136 +++++-
42 files changed, 2632 insertions(+), 196 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_fail.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
--
2.53.0-Meta
* [PATCH bpf-next v3 01/24] bpf: Convert bpf_get_spilled_reg macro to static inline function
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 02/24] bpf: Remove copy_register_state wrapper function Yonghong Song
` (22 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Convert the bpf_get_spilled_reg() macro to a static inline function
for better type safety and readability. This also simplifies the
macro definitions in preparation for upcoming stack argument support,
which will introduce additional macros.
No functional change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/bpf_verifier.h | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 976e2b2f40e8..321b9d69cf9c 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -552,10 +552,14 @@ struct bpf_verifier_state {
u32 may_goto_depth;
};
-#define bpf_get_spilled_reg(slot, frame, mask) \
- (((slot < frame->allocated_stack / BPF_REG_SIZE) && \
- ((1 << frame->stack[slot].slot_type[BPF_REG_SIZE - 1]) & (mask))) \
- ? &frame->stack[slot].spilled_ptr : NULL)
+static inline struct bpf_reg_state *
+bpf_get_spilled_reg(int slot, struct bpf_func_state *frame, u32 mask)
+{
+ if (slot < frame->allocated_stack / BPF_REG_SIZE &&
+ (1 << frame->stack[slot].slot_type[BPF_REG_SIZE - 1]) & mask)
+ return &frame->stack[slot].spilled_ptr;
+ return NULL;
+}
/* Iterate over 'frame', setting 'reg' to either NULL or a spilled register. */
#define bpf_for_each_spilled_reg(iter, frame, reg, mask) \
--
2.53.0-Meta
* [PATCH bpf-next v3 02/24] bpf: Remove copy_register_state wrapper function
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 01/24] bpf: Convert bpf_get_spilled_reg macro to static inline function Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 03/24] bpf: Add helper functions for r11-based stack argument insns Yonghong Song
` (21 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Remove the copy_register_state() helper which was just a plain struct
assignment wrapper and replace all call sites with direct struct
assignment. This simplifies the code in preparation for upcoming stack
argument support.
No functional change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 44 +++++++++++++++++++------------------------
1 file changed, 19 insertions(+), 25 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 11054ad89c14..3bafb7ad2ba7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3410,12 +3410,6 @@ static void assign_scalar_id_before_mov(struct bpf_verifier_env *env,
src_reg->id = ++env->id_gen;
}
-/* Copy src state preserving dst->parent and dst->live fields */
-static void copy_register_state(struct bpf_reg_state *dst, const struct bpf_reg_state *src)
-{
- *dst = *src;
-}
-
static void save_register_state(struct bpf_verifier_env *env,
struct bpf_func_state *state,
int spi, struct bpf_reg_state *reg,
@@ -3423,7 +3417,7 @@ static void save_register_state(struct bpf_verifier_env *env,
{
int i;
- copy_register_state(&state->stack[spi].spilled_ptr, reg);
+ state->stack[spi].spilled_ptr = *reg;
for (i = BPF_REG_SIZE; i > BPF_REG_SIZE - size; i--)
state->stack[spi].slot_type[i - 1] = STACK_SPILL;
@@ -3822,7 +3816,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
* with the destination register on fill.
*/
assign_scalar_id_before_mov(env, reg);
- copy_register_state(&state->regs[dst_regno], reg);
+ state->regs[dst_regno] = *reg;
state->regs[dst_regno].subreg_def = subreg_def;
/* Break the relation on a narrowing fill.
@@ -3877,7 +3871,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
* with the destination register on fill.
*/
assign_scalar_id_before_mov(env, reg);
- copy_register_state(&state->regs[dst_regno], reg);
+ state->regs[dst_regno] = *reg;
/* mark reg as written since spilled pointer state likely
* has its liveness marks cleared by is_state_visited()
* which resets stack/reg liveness for state transitions
@@ -6031,7 +6025,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, struct b
size);
return -EACCES;
}
- copy_register_state(®s[value_regno], reg);
+ regs[value_regno] = *reg;
add_scalar_to_reg(®s[value_regno], off);
regs[value_regno].type = PTR_TO_INSN;
} else {
@@ -13248,7 +13242,7 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
*/
if (!ptr_is_dst_reg) {
tmp = *dst_reg;
- copy_register_state(dst_reg, ptr_reg);
+ *dst_reg = *ptr_reg;
}
err = sanitize_speculative_path(env, NULL, env->insn_idx + 1, env->insn_idx);
if (err < 0)
@@ -14698,7 +14692,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
* copy register state to dest reg
*/
assign_scalar_id_before_mov(env, src_reg);
- copy_register_state(dst_reg, src_reg);
+ *dst_reg = *src_reg;
dst_reg->subreg_def = DEF_NOT_SUBREG;
} else {
/* case: R1 = (s8, s16 s32)R2 */
@@ -14713,7 +14707,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
no_sext = reg_umax(src_reg) < (1ULL << (insn->off - 1));
if (no_sext)
assign_scalar_id_before_mov(env, src_reg);
- copy_register_state(dst_reg, src_reg);
+ *dst_reg = *src_reg;
if (!no_sext)
clear_scalar_id(dst_reg);
coerce_reg_to_size_sx(dst_reg, insn->off >> 3);
@@ -14735,7 +14729,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
if (is_src_reg_u32)
assign_scalar_id_before_mov(env, src_reg);
- copy_register_state(dst_reg, src_reg);
+ *dst_reg = *src_reg;
/* Make sure ID is cleared if src_reg is not in u32
* range otherwise dst_reg min/max could be incorrectly
* propagated into src_reg by sync_linked_regs()
@@ -14749,7 +14743,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
if (no_sext)
assign_scalar_id_before_mov(env, src_reg);
- copy_register_state(dst_reg, src_reg);
+ *dst_reg = *src_reg;
if (!no_sext)
clear_scalar_id(dst_reg);
dst_reg->subreg_def = env->insn_idx + 1;
@@ -15629,7 +15623,7 @@ static void sync_linked_regs(struct bpf_verifier_env *env, struct bpf_verifier_s
reg->delta == known_reg->delta) {
s32 saved_subreg_def = reg->subreg_def;
- copy_register_state(reg, known_reg);
+ *reg = *known_reg;
reg->subreg_def = saved_subreg_def;
} else {
s32 saved_subreg_def = reg->subreg_def;
@@ -15640,7 +15634,7 @@ static void sync_linked_regs(struct bpf_verifier_env *env, struct bpf_verifier_s
__mark_reg_known(&fake_reg, (s64)reg->delta - (s64)known_reg->delta);
/* reg = known_reg; reg += delta */
- copy_register_state(reg, known_reg);
+ *reg = *known_reg;
/*
* Must preserve off, id and subreg_def flag,
* otherwise another sync_linked_regs() will be incorrect.
@@ -15743,10 +15737,10 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
}
is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32;
- copy_register_state(&env->false_reg1, dst_reg);
- copy_register_state(&env->false_reg2, src_reg);
- copy_register_state(&env->true_reg1, dst_reg);
- copy_register_state(&env->true_reg2, src_reg);
+ env->false_reg1 = *dst_reg;
+ env->false_reg2 = *src_reg;
+ env->true_reg1 = *dst_reg;
+ env->true_reg2 = *src_reg;
pred = is_branch_taken(env, dst_reg, src_reg, opcode, is_jmp32);
if (pred >= 0) {
/* If we get here with a dst_reg pointer type it is because
@@ -15815,11 +15809,11 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
if (err)
return err;
- copy_register_state(dst_reg, &env->false_reg1);
- copy_register_state(src_reg, &env->false_reg2);
- copy_register_state(&other_branch_regs[insn->dst_reg], &env->true_reg1);
+ *dst_reg = env->false_reg1;
+ *src_reg = env->false_reg2;
+ other_branch_regs[insn->dst_reg] = env->true_reg1;
if (BPF_SRC(insn->code) == BPF_X)
- copy_register_state(&other_branch_regs[insn->src_reg], &env->true_reg2);
+ other_branch_regs[insn->src_reg] = env->true_reg2;
if (BPF_SRC(insn->code) == BPF_X &&
src_reg->type == SCALAR_VALUE && src_reg->id &&
--
2.53.0-Meta
* [PATCH bpf-next v3 03/24] bpf: Add helper functions for r11-based stack argument insns
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 01/24] bpf: Convert bpf_get_spilled_reg macro to static inline function Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 02/24] bpf: Remove copy_register_state wrapper function Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 04/24] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args() Yonghong Song
` (20 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Add three static inline helper functions — is_stack_arg_ldx(),
is_stack_arg_st(), and is_stack_arg_stx() — that identify r11-based
(BPF_REG_PARAMS) instructions used for stack argument passing. These
helpers encapsulate the detailed encoding requirements (operand size,
register, offset alignment and sign) and hide raw BPF_REG_PARAMS usage
from the verifier, making call sites more readable and explicit.
A later patch ("bpf: Enable r11 based insns") will wire these helpers
into the verifier. Until then, check_and_resolve_insns() rejects any
instruction that uses r11.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/filter.h | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index b77d0b06db6e..918d9b34eac6 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -749,6 +749,27 @@ static inline u32 bpf_prog_run_pin_on_cpu(const struct bpf_prog *prog,
return ret;
}
+static inline bool is_stack_arg_ldx(const struct bpf_insn *insn)
+{
+ return insn->code == (BPF_LDX | BPF_MEM | BPF_DW) &&
+ insn->src_reg == BPF_REG_PARAMS &&
+ insn->off > 0 && insn->off % 8 == 0;
+}
+
+static inline bool is_stack_arg_st(const struct bpf_insn *insn)
+{
+ return insn->code == (BPF_ST | BPF_MEM | BPF_DW) &&
+ insn->dst_reg == BPF_REG_PARAMS &&
+ insn->off < 0 && insn->off % 8 == 0;
+}
+
+static inline bool is_stack_arg_stx(const struct bpf_insn *insn)
+{
+ return insn->code == (BPF_STX | BPF_MEM | BPF_DW) &&
+ insn->dst_reg == BPF_REG_PARAMS &&
+ insn->off < 0 && insn->off % 8 == 0;
+}
+
#define BPF_SKB_CB_LEN QDISC_CB_PRIV_LEN
struct bpf_skb_data_end {
--
2.53.0-Meta
* [PATCH bpf-next v3 04/24] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (2 preceding siblings ...)
2026-05-11 5:33 ` [PATCH bpf-next v3 03/24] bpf: Add helper functions for r11-based stack argument insns Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 5:33 ` [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions Yonghong Song
` (19 subsequent siblings)
23 siblings, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Move the "sub->arg_cnt = nargs" assignment to immediately after
nargs is computed from btf_type_vlen(), instead of at the end of
btf_prepare_func_args().
btf_prepare_func_args() can return -EINVAL early in several cases,
e.g. when a static function has some non-int/enum arguments.
Since -EINVAL from btf_prepare_func_args() does not immediately
reject verification, arg_cnt remains zero after the early return.
This causes later stack-argument load/store insns to
incorrectly assume the function has no arguments.
Setting arg_cnt right after nargs ensures it is available regardless
of which path btf_prepare_func_args() takes.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/btf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 77af44d8a3ad..a33a5b4122f8 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7880,6 +7880,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
}
args = (const struct btf_param *)(t + 1);
nargs = btf_type_vlen(t);
+ sub->arg_cnt = nargs;
if (nargs > MAX_BPF_FUNC_REG_ARGS) {
if (!is_global)
return -EINVAL;
@@ -8067,7 +8068,6 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
return -EINVAL;
}
- sub->arg_cnt = nargs;
sub->args_cached = true;
return 0;
--
2.53.0-Meta
* [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (3 preceding siblings ...)
2026-05-11 5:33 ` [PATCH bpf-next v3 04/24] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args() Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 5:33 ` [PATCH bpf-next v3 06/24] bpf: Refactor jmp history to use dedicated spi/frame fields Yonghong Song
` (18 subsequent siblings)
23 siblings, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Currently BPF functions (subprogs) are limited to 5 register arguments.
With [1], the compiler can emit code that passes additional arguments
via a dedicated stack area through bpf register BPF_REG_PARAMS (r11),
introduced in an earlier patch ([2]).
The compiler uses positive r11 offsets for incoming (callee-side) args
and negative r11 offsets for outgoing (caller-side) args, following the
x86_64/arm64 calling convention direction. There is an 8-byte gap at
offset 0 separating two regions:
Incoming (callee reads): r11+8 (arg6), r11+16 (arg7), ...
Outgoing (caller writes): r11-8 (arg6), r11-16 (arg7), ...
The following is an example to show how stack arguments are saved
and transferred between caller and callee:
int foo(int a1, int a2, int a3, int a4, int a5, int a6, int a7) {
...
bar(a1, a2, a3, a4, a5, a6, a7, a8);
...
}
Caller (foo) Callee (bar)
============ ============
Incoming (positive offsets): Incoming (positive offsets):
r11+8: [incoming arg 6] r11+8: [incoming arg 6] <-+
r11+16: [incoming arg 7] r11+16: [incoming arg 7] <-|+
r11+24: [incoming arg 8] <-||+
Outgoing (negative offsets): |||
r11-8: [outgoing arg 6 to bar] -------->-------------------------+||
r11-16: [outgoing arg 7 to bar] -------->--------------------------+|
r11-24: [outgoing arg 8 to bar] -------->---------------------------+
If the bpf function has more than one call:
int foo(int a1, int a2, int a3, int a4, int a5, int a6, int a7) {
...
bar1(a1, a2, a3, a4, a5, a6, a7, a8);
...
bar2(a1, a2, a3, a4, a5, a6, a7, a8, a9);
...
}
Caller (foo) Callee (bar2)
============ ==============
Incoming (positive offsets): Incoming (positive offsets):
r11+8: [incoming arg 6] r11+8: [incoming arg 6] <+
r11+16: [incoming arg 7] r11+16: [incoming arg 7] <|+
r11+24: [incoming arg 8] <||+
Outgoing for bar2 (negative offsets): r11+32: [incoming arg 9] <|||+
r11-8: [outgoing arg 6] ---->----------->-------------------------+|||
r11-16: [outgoing arg 7] ---->----------->--------------------------+||
r11-24: [outgoing arg 8] ---->----------->---------------------------+|
r11-32: [outgoing arg 9] ---->----------->----------------------------+
The verifier tracks outgoing stack arguments in stack_arg_regs[] and
out_stack_arg_cnt in bpf_func_state, separately from the regular
r10 stack. The callee does not copy incoming args — it reads them
directly from the caller's outgoing slots at positive r11 offsets.
Similar to stacksafe(), introduce stack_arg_safe() to do pruning
check.
Outgoing stack arg slots are invalidated when the callee returns
(in prepare_func_exit), not at call time. This allows the callee to
read incoming args from the caller's outgoing slots during
verification. The following are a few examples.
Example 1:
*(u64 *)(r11 - 8) = r6;
*(u64 *)(r11 - 16) = r7;
call bar1; // arg6 = r6, arg7 = r7
call bar2; // bar2 expects 2 stack args, but the slots were invalidated after bar1 returned: rejected
Example 2:
To fix Example 1, re-store the stack args before the second call:
*(u64 *)(r11 - 8) = r6;
*(u64 *)(r11 - 16) = r7;
call bar1; // arg6 = r6, arg7 = r7
*(u64 *)(r11 - 8) = r8;
*(u64 *)(r11 - 16) = r9;
call bar2; // arg6 = r8, arg7 = r9
Example 3:
The compiler can hoist the shared stack arg stores above the branch:
*(u64 *)(r11 - 16) = r7;
if cond goto else;
*(u64 *)(r11 - 8) = r8;
call bar1; // arg6 = r8, arg7 = r7
goto end;
else:
*(u64 *)(r11 - 8) = r9;
call bar2; // arg6 = r9, arg7 = r7
end:
Example 4:
Within a loop:
loop:
*(u64 *)(r11 - 8) = r6; // arg6, before loop
call bar; // reuses arg6 each iteration
if ... goto loop;
A separate max_out_stack_arg_cnt field in bpf_subprog_info tracks
the deepest outgoing slot actually written. This is used to
reject programs that write to slots beyond what any callee expects,
and the JIT needs it to size the outgoing arg area.
Similar to typical compiler-generated code, enforce the following
orderings:
- all stack arg reads must come before any stack arg write
- all stack arg reads must come before any bpf function, kfunc or
helper call
This is needed because the JIT may emit 'mov' insns that use the same
register for a read and a write, and because bpf functions, kfuncs and
helpers invalidate all argument slots immediately after a call.
For callback functions with stack arguments, the kernel must set up
the parameter types (including stack parameters) properly so that the
callback function can retrieve this information for verification.
Global subprogs and freplace with >5 args are not yet supported.
[1] https://github.com/llvm/llvm-project/pull/189060
[2] https://lore.kernel.org/bpf/20260423033506.2542005-1-yonghong.song@linux.dev/
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/bpf_verifier.h | 37 ++++++-
kernel/bpf/btf.c | 14 ++-
kernel/bpf/fixups.c | 16 ++-
kernel/bpf/states.c | 29 +++++
kernel/bpf/verifier.c | 198 ++++++++++++++++++++++++++++++++++-
5 files changed, 282 insertions(+), 12 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 321b9d69cf9c..f9020a4ea005 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -402,6 +402,7 @@ struct bpf_func_state {
bool in_callback_fn;
bool in_async_callback_fn;
bool in_exception_callback_fn;
+ bool no_stack_arg_load;
/* For callback calling functions that limit number of possible
* callback executions (e.g. bpf_loop) keeps track of current
* simulated iteration number.
@@ -427,6 +428,9 @@ struct bpf_func_state {
* `stack`. allocated_stack is always a multiple of BPF_REG_SIZE.
*/
int allocated_stack;
+
+ u16 out_stack_arg_cnt; /* Number of outgoing on-stack argument slots */
+ struct bpf_reg_state *stack_arg_regs; /* Outgoing on-stack arguments */
};
#define MAX_CALL_FRAMES 8
@@ -561,12 +565,27 @@ bpf_get_spilled_reg(int slot, struct bpf_func_state *frame, u32 mask)
return NULL;
}
+static inline struct bpf_reg_state *
+bpf_get_spilled_stack_arg(int slot, struct bpf_func_state *frame, u32 mask)
+{
+ if (slot < frame->out_stack_arg_cnt &&
+ frame->stack_arg_regs[slot].type != NOT_INIT)
+ return &frame->stack_arg_regs[slot];
+ return NULL;
+}
+
/* Iterate over 'frame', setting 'reg' to either NULL or a spilled register. */
#define bpf_for_each_spilled_reg(iter, frame, reg, mask) \
for (iter = 0, reg = bpf_get_spilled_reg(iter, frame, mask); \
iter < frame->allocated_stack / BPF_REG_SIZE; \
iter++, reg = bpf_get_spilled_reg(iter, frame, mask))
+/* Iterate over 'frame', setting 'reg' to either NULL or a spilled stack arg. */
+#define bpf_for_each_spilled_stack_arg(iter, frame, reg, mask) \
+ for (iter = 0, reg = bpf_get_spilled_stack_arg(iter, frame, mask); \
+ iter < frame->out_stack_arg_cnt; \
+ iter++, reg = bpf_get_spilled_stack_arg(iter, frame, mask))
+
#define bpf_for_each_reg_in_vstate_mask(__vst, __state, __reg, __mask, __expr) \
({ \
struct bpf_verifier_state *___vstate = __vst; \
@@ -584,6 +603,11 @@ bpf_get_spilled_reg(int slot, struct bpf_func_state *frame, u32 mask)
continue; \
(void)(__expr); \
} \
+ bpf_for_each_spilled_stack_arg(___j, __state, __reg, __mask) { \
+ if (!__reg) \
+ continue; \
+ (void)(__expr); \
+ } \
} \
})
@@ -799,12 +823,21 @@ struct bpf_subprog_info {
bool keep_fastcall_stack: 1;
bool changes_pkt_data: 1;
bool might_sleep: 1;
- u8 arg_cnt:3;
+ u8 arg_cnt:4;
enum priv_stack_mode priv_stack_mode;
- struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
+ struct bpf_subprog_arg_info args[MAX_BPF_FUNC_ARGS];
+ u16 stack_arg_cnt; /* incoming + max outgoing */
+ u16 max_out_stack_arg_cnt;
};
+static inline u16 bpf_in_stack_arg_cnt(struct bpf_subprog_info *sub)
+{
+ if (sub->arg_cnt > MAX_BPF_FUNC_REG_ARGS)
+ return sub->arg_cnt - MAX_BPF_FUNC_REG_ARGS;
+ return 0;
+}
+
struct bpf_verifier_env;
struct backtrack_state {
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index a33a5b4122f8..ec3fb8c8f4ee 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7881,10 +7881,16 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
args = (const struct btf_param *)(t + 1);
nargs = btf_type_vlen(t);
sub->arg_cnt = nargs;
- if (nargs > MAX_BPF_FUNC_REG_ARGS) {
- if (!is_global)
- return -EINVAL;
- bpf_log(log, "Global function %s() with %d > %d args. Buggy compiler.\n",
+ if (nargs > MAX_BPF_FUNC_ARGS) {
+ bpf_log(log, "kernel supports at most %d parameters, function %s has %d\n",
+ MAX_BPF_FUNC_ARGS, tname, nargs);
+ return -EFAULT;
+ }
+ if (nargs > MAX_BPF_FUNC_REG_ARGS)
+ sub->stack_arg_cnt = nargs - MAX_BPF_FUNC_REG_ARGS;
+
+ if (is_global && nargs > MAX_BPF_FUNC_REG_ARGS) {
+ bpf_log(log, "global function %s has %d > %d args, stack args not supported\n",
tname, nargs, MAX_BPF_FUNC_REG_ARGS);
return -EINVAL;
}
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index fba9e8c00878..ba86039789fd 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -1378,9 +1378,21 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
struct bpf_prog *prog = env->prog;
struct bpf_insn *insn = prog->insnsi;
bool has_kfunc_call = bpf_prog_has_kfunc_call(prog);
- int i, depth;
+ int depth;
#endif
- int err = 0;
+ int i, err = 0;
+
+ for (i = 0; i < env->subprog_cnt; i++) {
+ struct bpf_subprog_info *subprog = &env->subprog_info[i];
+ u16 outgoing = subprog->stack_arg_cnt - bpf_in_stack_arg_cnt(subprog);
+
+ if (subprog->max_out_stack_arg_cnt > outgoing) {
+ verbose(env,
+ "func#%d writes %u stack arg slots, but calls only require %u\n",
+ i, subprog->max_out_stack_arg_cnt, outgoing);
+ return -EINVAL;
+ }
+ }
if (env->prog->jit_requested &&
!bpf_prog_is_offloaded(env->prog->aux)) {
diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
index bd9c22945050..c249eb40c6d6 100644
--- a/kernel/bpf/states.c
+++ b/kernel/bpf/states.c
@@ -833,6 +833,32 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
return true;
}
+/*
+ * Compare stack arg slots between old and current states.
+ * Outgoing stack args are path-local state and must agree for pruning.
+ */
+static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
+ struct bpf_func_state *cur, struct bpf_idmap *idmap,
+ enum exact_level exact)
+{
+ int i, nslots;
+
+ nslots = max(old->out_stack_arg_cnt, cur->out_stack_arg_cnt);
+ for (i = 0; i < nslots; i++) {
+ struct bpf_reg_state *old_arg, *cur_arg;
+ struct bpf_reg_state not_init = { .type = NOT_INIT };
+
+ old_arg = i < old->out_stack_arg_cnt ?
+ &old->stack_arg_regs[i] : &not_init;
+ cur_arg = i < cur->out_stack_arg_cnt ?
+ &cur->stack_arg_regs[i] : &not_init;
+ if (!regsafe(env, old_arg, cur_arg, idmap, exact))
+ return false;
+ }
+
+ return true;
+}
+
static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur,
struct bpf_idmap *idmap)
{
@@ -924,6 +950,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
return false;
+ if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
+ return false;
+
return true;
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3bafb7ad2ba7..4ba7510bc87c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1362,6 +1362,18 @@ static int copy_stack_state(struct bpf_func_state *dst, const struct bpf_func_st
return -ENOMEM;
dst->allocated_stack = src->allocated_stack;
+
+ /* copy stack args state */
+ n = src->out_stack_arg_cnt;
+ if (n) {
+ dst->stack_arg_regs = copy_array(dst->stack_arg_regs, src->stack_arg_regs, n,
+ sizeof(struct bpf_reg_state),
+ GFP_KERNEL_ACCOUNT);
+ if (!dst->stack_arg_regs)
+ return -ENOMEM;
+ }
+
+ dst->out_stack_arg_cnt = src->out_stack_arg_cnt;
return 0;
}
@@ -1403,6 +1415,23 @@ static int grow_stack_state(struct bpf_verifier_env *env, struct bpf_func_state
return 0;
}
+static int grow_stack_arg_slots(struct bpf_verifier_env *env,
+ struct bpf_func_state *state, int cnt)
+{
+ size_t old_n = state->out_stack_arg_cnt;
+
+ if (old_n >= cnt)
+ return 0;
+
+ state->stack_arg_regs = realloc_array(state->stack_arg_regs, old_n, cnt,
+ sizeof(struct bpf_reg_state));
+ if (!state->stack_arg_regs)
+ return -ENOMEM;
+
+ state->out_stack_arg_cnt = cnt;
+ return 0;
+}
+
/* Acquire a pointer id from the env and update the state->refs to include
* this new pointer reference.
* On success, returns a valid pointer id to associate with the register
@@ -1565,6 +1594,7 @@ static void free_func_state(struct bpf_func_state *state)
{
if (!state)
return;
+ kfree(state->stack_arg_regs);
kfree(state->stack);
kfree(state);
}
@@ -4050,6 +4080,103 @@ static int check_stack_write(struct bpf_verifier_env *env,
return err;
}
+/*
+ * Write a value to the outgoing stack arg area.
+ * off is a negative offset from r11 (e.g. -8 for arg6, -16 for arg7).
+ */
+static int check_stack_arg_write(struct bpf_verifier_env *env, struct bpf_func_state *state,
+ int off, struct bpf_reg_state *value_reg)
+{
+ int max_stack_arg_regs = MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS;
+ struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
+ int spi = -off / BPF_REG_SIZE - 1;
+ struct bpf_reg_state *arg;
+ int err;
+
+ if (spi >= max_stack_arg_regs) {
+ verbose(env, "stack arg write offset %d exceeds max %d stack args\n",
+ off, max_stack_arg_regs);
+ return -EINVAL;
+ }
+
+ err = grow_stack_arg_slots(env, state, spi + 1);
+ if (err)
+ return err;
+
+ /* Track the max outgoing stack arg slot count. */
+ if (spi + 1 > subprog->max_out_stack_arg_cnt)
+ subprog->max_out_stack_arg_cnt = spi + 1;
+
+ if (value_reg) {
+ state->stack_arg_regs[spi] = *value_reg;
+ } else {
+ /* BPF_ST: store immediate, treat as scalar */
+ arg = &state->stack_arg_regs[spi];
+ arg->type = SCALAR_VALUE;
+ __mark_reg_known(arg, env->prog->insnsi[env->insn_idx].imm);
+ }
+ state->no_stack_arg_load = true;
+ return 0;
+}
+
+/*
+ * Read a value from the incoming stack arg area.
+ * off is a positive offset from r11 (e.g. +8 for arg6, +16 for arg7).
+ */
+static int check_stack_arg_read(struct bpf_verifier_env *env, struct bpf_func_state *state,
+ int off, int dst_regno)
+{
+ struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
+ struct bpf_verifier_state *vstate = env->cur_state;
+ int spi = off / BPF_REG_SIZE - 1;
+ struct bpf_func_state *caller, *cur;
+ struct bpf_reg_state *arg;
+
+ if (state->no_stack_arg_load) {
+ verbose(env, "r11 load must be before any r11 store or call insn\n");
+ return -EINVAL;
+ }
+
+ if (spi + 1 > bpf_in_stack_arg_cnt(subprog)) {
+ verbose(env, "invalid read from stack arg off %d depth %d\n",
+ off, bpf_in_stack_arg_cnt(subprog) * BPF_REG_SIZE);
+ return -EACCES;
+ }
+
+ caller = vstate->frame[vstate->curframe - 1];
+ arg = &caller->stack_arg_regs[spi];
+ cur = vstate->frame[vstate->curframe];
+ cur->regs[dst_regno] = *arg;
+ return 0;
+}
+
+static int check_outgoing_stack_args(struct bpf_verifier_env *env, struct bpf_func_state *caller,
+ int nargs)
+{
+ int i, spi;
+
+ for (i = MAX_BPF_FUNC_REG_ARGS; i < nargs; i++) {
+ spi = i - MAX_BPF_FUNC_REG_ARGS;
+ if (spi >= caller->out_stack_arg_cnt ||
+ caller->stack_arg_regs[spi].type == NOT_INIT) {
+ verbose(env, "callee expects %d args, stack arg%d is not initialized\n",
+ nargs, spi + 1);
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static struct bpf_reg_state *get_func_arg_reg(struct bpf_func_state *caller,
+ struct bpf_reg_state *regs, int arg)
+{
+ if (arg < MAX_BPF_FUNC_REG_ARGS)
+ return &regs[arg + 1];
+
+ return &caller->stack_arg_regs[arg - MAX_BPF_FUNC_REG_ARGS];
+}
+
static int check_map_access_type(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
int off, int size, enum bpf_access_type type)
{
@@ -6217,10 +6344,20 @@ static int check_load_mem(struct bpf_verifier_env *env, struct bpf_insn *insn,
bool strict_alignment_once, bool is_ldsx,
bool allow_trust_mismatch, const char *ctx)
{
+ struct bpf_verifier_state *vstate = env->cur_state;
+ struct bpf_func_state *state = vstate->frame[vstate->curframe];
struct bpf_reg_state *regs = cur_regs(env);
enum bpf_reg_type src_reg_type;
int err;
+ /* Handle stack arg read */
+ if (is_stack_arg_ldx(insn)) {
+ err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK);
+ if (err)
+ return err;
+ return check_stack_arg_read(env, state, insn->off, insn->dst_reg);
+ }
+
/* check src operand */
err = check_reg_arg(env, insn->src_reg, SRC_OP);
if (err)
@@ -6249,10 +6386,20 @@ static int check_load_mem(struct bpf_verifier_env *env, struct bpf_insn *insn,
static int check_store_reg(struct bpf_verifier_env *env, struct bpf_insn *insn,
bool strict_alignment_once)
{
+ struct bpf_verifier_state *vstate = env->cur_state;
+ struct bpf_func_state *state = vstate->frame[vstate->curframe];
struct bpf_reg_state *regs = cur_regs(env);
enum bpf_reg_type dst_reg_type;
int err;
+ /* Handle stack arg write */
+ if (is_stack_arg_stx(insn)) {
+ err = check_reg_arg(env, insn->src_reg, SRC_OP);
+ if (err)
+ return err;
+ return check_stack_arg_write(env, state, insn->off, regs + insn->src_reg);
+ }
+
/* check src1 operand */
err = check_reg_arg(env, insn->src_reg, SRC_OP);
if (err)
@@ -8860,6 +9007,14 @@ static void clear_caller_saved_regs(struct bpf_verifier_env *env,
}
}
+static void invalidate_outgoing_stack_args(struct bpf_func_state *state)
+{
+ int i, nslots = state->out_stack_arg_cnt;
+
+ for (i = 0; i < nslots; i++)
+ state->stack_arg_regs[i].type = NOT_INIT;
+}
+
typedef int (*set_callee_state_fn)(struct bpf_verifier_env *env,
struct bpf_func_state *caller,
struct bpf_func_state *callee,
@@ -8922,6 +9077,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
struct bpf_reg_state *regs)
{
struct bpf_subprog_info *sub = subprog_info(env, subprog);
+ struct bpf_func_state *caller = cur_func(env);
struct bpf_verifier_log *log = &env->log;
u32 i;
int ret;
@@ -8930,13 +9086,16 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
if (ret)
return ret;
+ ret = check_outgoing_stack_args(env, caller, sub->arg_cnt);
+ if (ret)
+ return ret;
+
/* check that BTF function arguments match actual types that the
* verifier sees.
*/
for (i = 0; i < sub->arg_cnt; i++) {
argno_t argno = argno_from_arg(i + 1);
- u32 regno = i + 1;
- struct bpf_reg_state *reg = &regs[regno];
+ struct bpf_reg_state *reg = get_func_arg_reg(caller, regs, i);
struct bpf_subprog_arg_info *arg = &sub->args[i];
if (arg->arg_type == ARG_ANYTHING) {
@@ -9124,6 +9283,8 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
int *insn_idx)
{
struct bpf_verifier_state *state = env->cur_state;
+ struct bpf_subprog_info *caller_info;
+ u16 callee_incoming, stack_arg_cnt;
struct bpf_func_state *caller;
int err, subprog, target_insn;
@@ -9177,6 +9338,16 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
return 0;
}
+ /*
+ * Track caller's total stack arg count (incoming + max outgoing).
+ * This is needed so the JIT knows how much stack arg space to allocate.
+ */
+ caller_info = &env->subprog_info[caller->subprogno];
+ callee_incoming = bpf_in_stack_arg_cnt(&env->subprog_info[subprog]);
+ stack_arg_cnt = bpf_in_stack_arg_cnt(caller_info) + callee_incoming;
+ if (stack_arg_cnt > caller_info->stack_arg_cnt)
+ caller_info->stack_arg_cnt = stack_arg_cnt;
+
/* for regular function entry setup new frame and continue
* from that frame.
*/
@@ -9534,6 +9705,7 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
* bpf_throw, this will be done by copy_verifier_state for extra frames. */
free_func_state(callee);
state->frame[state->curframe--] = NULL;
+ invalidate_outgoing_stack_args(caller);
/* for callbacks widen imprecise scalars to make programs like below verify:
*
@@ -16961,6 +17133,14 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
return check_store_reg(env, insn, false);
case BPF_ST: {
+ /* Handle stack arg write (store immediate) */
+ if (is_stack_arg_st(insn)) {
+ struct bpf_verifier_state *vstate = env->cur_state;
+ struct bpf_func_state *state = vstate->frame[vstate->curframe];
+
+ return check_stack_arg_write(env, state, insn->off, NULL);
+ }
+
enum bpf_reg_type dst_reg_type;
err = check_reg_arg(env, insn->dst_reg, SRC_OP);
@@ -16995,6 +17175,7 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
}
}
mark_reg_scratched(env, BPF_REG_0);
+ cur_func(env)->no_stack_arg_load = true;
if (insn->src_reg == BPF_PSEUDO_CALL)
return check_func_call(env, insn, &env->insn_idx);
if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL)
@@ -18110,7 +18291,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
goto out;
}
}
- for (i = BPF_REG_1; i <= sub->arg_cnt; i++) {
+ for (i = BPF_REG_1; i <= min_t(u32, sub->arg_cnt, MAX_BPF_FUNC_REG_ARGS); i++) {
arg = &sub->args[i - BPF_REG_1];
reg = &regs[i];
@@ -18153,6 +18334,12 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
goto out;
}
}
+ if (env->prog->type == BPF_PROG_TYPE_EXT && sub->arg_cnt > MAX_BPF_FUNC_REG_ARGS) {
+ verbose(env, "freplace programs with >%d args not supported yet\n",
+ MAX_BPF_FUNC_REG_ARGS);
+ ret = -EINVAL;
+ goto out;
+ }
} else {
/* if main BPF program has associated BTF info, validate that
* it's matching expected signature, and otherwise mark BTF
@@ -18160,8 +18347,11 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
*/
if (env->prog->aux->func_info_aux) {
ret = btf_prepare_func_args(env, 0);
- if (ret || sub->arg_cnt != 1 || sub->args[0].arg_type != ARG_PTR_TO_CTX)
+ if (ret || sub->arg_cnt != 1 || sub->args[0].arg_type != ARG_PTR_TO_CTX) {
env->prog->aux->func_info_aux[0].unreliable = true;
+ sub->arg_cnt = 1;
+ sub->stack_arg_cnt = 0;
+ }
}
/* 1st arg to a function */
--
2.53.0-Meta
* [PATCH bpf-next v3 06/24] bpf: Refactor jmp history to use dedicated spi/frame fields
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 16:17 ` Alexei Starovoitov
2026-05-11 5:33 ` [PATCH bpf-next v3 07/24] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Move stack slot index (spi) and frame number out of the flags field
in bpf_jmp_history_entry into dedicated bitfields. This simplifies
the encoding and makes room for new flags.
Previously, spi and frame were packed into the lower 9 bits of the
12-bit flags field (3 bits frame + 6 bits spi), with INSN_F_STACK_ACCESS
at BIT(9) and INSN_F_DST/SRC_REG_STACK at BIT(10)/BIT(11). This left no
room for a new INSN_F_* flag for stack arguments.
To make room, the idx field of bpf_jmp_history_entry is narrowed to
20 bits (sufficient for insn indices up to 1M), and the freed bits hold
spi (6 bits) and frame (3 bits) as dedicated struct fields. The flags
enum is simplified accordingly:
INSN_F_STACK_ACCESS -> BIT(0)
INSN_F_DST_REG_STACK -> BIT(1)
INSN_F_SRC_REG_STACK -> BIT(2)
which allows more room for additional INSN_F_* flags.
bpf_push_jmp_history() now takes explicit spi and frame parameters
instead of encoding them into flags. The insn_stack_access_flags(),
insn_stack_access_spi(), and insn_stack_access_frameno() helpers are
removed.
No functional change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/bpf_verifier.h | 34 ++++++++++++++--------------------
kernel/bpf/backtrack.c | 24 +++++++++---------------
kernel/bpf/states.c | 2 +-
kernel/bpf/verifier.c | 23 +++++++++++------------
4 files changed, 35 insertions(+), 48 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index f9020a4ea005..adf00585a627 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -435,31 +435,22 @@ struct bpf_func_state {
#define MAX_CALL_FRAMES 8
-/* instruction history flags, used in bpf_jmp_history_entry.flags field */
+/* instruction history flags, used in bpf_jmp_history_entry.flags field.
+ * Frame number and SPI are stored in dedicated fields of bpf_jmp_history_entry.
+ */
enum {
- /* instruction references stack slot through PTR_TO_STACK register;
- * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8)
- * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512,
- * 8 bytes per slot, so slot index (spi) is [0, 63])
- */
- INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */
-
- INSN_F_SPI_MASK = 0x3f, /* 6 bits */
- INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */
+ INSN_F_STACK_ACCESS = BIT(0),
- INSN_F_STACK_ACCESS = BIT(9),
-
- INSN_F_DST_REG_STACK = BIT(10), /* dst_reg is PTR_TO_STACK */
- INSN_F_SRC_REG_STACK = BIT(11), /* src_reg is PTR_TO_STACK */
- /* total 12 bits are used now. */
+ INSN_F_DST_REG_STACK = BIT(1), /* dst_reg is PTR_TO_STACK */
+ INSN_F_SRC_REG_STACK = BIT(2), /* src_reg is PTR_TO_STACK */
};
-static_assert(INSN_F_FRAMENO_MASK + 1 >= MAX_CALL_FRAMES);
-static_assert(INSN_F_SPI_MASK + 1 >= MAX_BPF_STACK / 8);
-
struct bpf_jmp_history_entry {
- u32 idx;
/* insn idx can't be bigger than 1 million */
+ u32 idx : 20;
+ u32 frame : 3; /* stack access frame number */
+ u32 spi : 6; /* stack slot index (0..63) */
+ u32 : 3;
u32 prev_idx : 20;
/* special INSN_F_xxx flags */
u32 flags : 12;
@@ -469,6 +460,9 @@ struct bpf_jmp_history_entry {
u64 linked_regs;
};
+static_assert(MAX_CALL_FRAMES <= (1 << 3));
+static_assert(MAX_BPF_STACK / 8 <= (1 << 6));
+
/* Maximum number of register states that can exist at once */
#define BPF_ID_MAP_SIZE ((MAX_BPF_REG + MAX_BPF_STACK / BPF_REG_SIZE) * MAX_CALL_FRAMES)
struct bpf_verifier_state {
@@ -1180,7 +1174,7 @@ struct list_head *bpf_explored_state(struct bpf_verifier_env *env, int idx);
void bpf_free_verifier_state(struct bpf_verifier_state *state, bool free_self);
void bpf_free_backedges(struct bpf_scc_visit *visit);
int bpf_push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state *cur,
- int insn_flags, u64 linked_regs);
+ int insn_flags, int spi, int frame, u64 linked_regs);
void bpf_bt_sync_linked_regs(struct backtrack_state *bt, struct bpf_jmp_history_entry *hist);
void bpf_mark_reg_not_init(const struct bpf_verifier_env *env,
struct bpf_reg_state *reg);
diff --git a/kernel/bpf/backtrack.c b/kernel/bpf/backtrack.c
index 854731dc93fe..5e93e57fb7ae 100644
--- a/kernel/bpf/backtrack.c
+++ b/kernel/bpf/backtrack.c
@@ -9,7 +9,7 @@
/* for any branch, call, exit record the history of jmps in the given state */
int bpf_push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state *cur,
- int insn_flags, u64 linked_regs)
+ int insn_flags, int spi, int frame, u64 linked_regs)
{
u32 cnt = cur->jmp_history_cnt;
struct bpf_jmp_history_entry *p;
@@ -25,6 +25,8 @@ int bpf_push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state
env, "insn history: insn_idx %d cur flags %x new flags %x",
env->insn_idx, env->cur_hist_ent->flags, insn_flags);
env->cur_hist_ent->flags |= insn_flags;
+ env->cur_hist_ent->spi = spi;
+ env->cur_hist_ent->frame = frame;
verifier_bug_if(env->cur_hist_ent->linked_regs != 0, env,
"insn history: insn_idx %d linked_regs: %#llx",
env->insn_idx, env->cur_hist_ent->linked_regs);
@@ -43,6 +45,8 @@ int bpf_push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state
p->idx = env->insn_idx;
p->prev_idx = env->prev_insn_idx;
p->flags = insn_flags;
+ p->spi = spi;
+ p->frame = frame;
p->linked_regs = linked_regs;
cur->jmp_history_cnt = cnt;
env->cur_hist_ent = p;
@@ -64,16 +68,6 @@ static bool is_atomic_fetch_insn(const struct bpf_insn *insn)
(insn->imm & BPF_FETCH);
}
-static int insn_stack_access_spi(int insn_flags)
-{
- return (insn_flags >> INSN_F_SPI_SHIFT) & INSN_F_SPI_MASK;
-}
-
-static int insn_stack_access_frameno(int insn_flags)
-{
- return insn_flags & INSN_F_FRAMENO_MASK;
-}
-
/* Backtrack one insn at a time. If idx is not at the top of recorded
* history then previous instruction came from straight line execution.
* Return -ENOENT if we exhausted all instructions within given state.
@@ -353,8 +347,8 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
* that [fp - off] slot contains scalar that needs to be
* tracked with precision
*/
- spi = insn_stack_access_spi(hist->flags);
- fr = insn_stack_access_frameno(hist->flags);
+ spi = hist->spi;
+ fr = hist->frame;
bpf_bt_set_frame_slot(bt, fr, spi);
} else if (class == BPF_STX || class == BPF_ST) {
if (bt_is_reg_set(bt, dreg))
@@ -366,8 +360,8 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
/* scalars can only be spilled into stack */
if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
return 0;
- spi = insn_stack_access_spi(hist->flags);
- fr = insn_stack_access_frameno(hist->flags);
+ spi = hist->spi;
+ fr = hist->frame;
if (!bt_is_frame_slot_set(bt, fr, spi))
return 0;
bt_clear_frame_slot(bt, fr, spi);
diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
index c249eb40c6d6..45d86bfe3b68 100644
--- a/kernel/bpf/states.c
+++ b/kernel/bpf/states.c
@@ -1400,7 +1400,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
*/
err = 0;
if (bpf_is_jmp_point(env, env->insn_idx))
- err = bpf_push_jmp_history(env, cur, 0, 0);
+ err = bpf_push_jmp_history(env, cur, 0, 0, 0, 0);
err = err ? : propagate_precision(env, &sl->state, cur, NULL);
if (err)
return err;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4ba7510bc87c..3e65dd0edbf9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3198,11 +3198,6 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
return __check_reg_arg(env, state->regs, regno, t);
}
-static int insn_stack_access_flags(int frameno, int spi)
-{
- return INSN_F_STACK_ACCESS | (spi << INSN_F_SPI_SHIFT) | frameno;
-}
-
static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
{
env->insn_aux_data[idx].indirect_target = true;
@@ -3517,7 +3512,8 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
int i, slot = -off - 1, spi = slot / BPF_REG_SIZE, err;
struct bpf_insn *insn = &env->prog->insnsi[insn_idx];
struct bpf_reg_state *reg = NULL;
- int insn_flags = insn_stack_access_flags(state->frameno, spi);
+ int insn_flags = INSN_F_STACK_ACCESS;
+ int hist_spi = spi, hist_frame = state->frameno;
/* caller checked that off % size == 0 and -MAX_BPF_STACK <= off < 0,
* so it's aligned access and [off, off + size) are within stack limits
@@ -3613,7 +3609,8 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
}
if (insn_flags)
- return bpf_push_jmp_history(env, env->cur_state, insn_flags, 0);
+ return bpf_push_jmp_history(env, env->cur_state, insn_flags,
+ hist_spi, hist_frame, 0);
return 0;
}
@@ -3809,7 +3806,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
int i, slot = -off - 1, spi = slot / BPF_REG_SIZE;
struct bpf_reg_state *reg;
u8 *stype, type;
- int insn_flags = insn_stack_access_flags(reg_state->frameno, spi);
+ int insn_flags = INSN_F_STACK_ACCESS;
+ int hist_spi = spi, hist_frame = reg_state->frameno;
stype = reg_state->stack[spi].slot_type;
reg = &reg_state->stack[spi].spilled_ptr;
@@ -3940,7 +3938,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
insn_flags = 0; /* we are not restoring spilled register */
}
if (insn_flags)
- return bpf_push_jmp_history(env, env->cur_state, insn_flags, 0);
+ return bpf_push_jmp_history(env, env->cur_state, insn_flags,
+ hist_spi, hist_frame, 0);
return 0;
}
@@ -15903,7 +15902,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
}
if (insn_flags) {
- err = bpf_push_jmp_history(env, this_branch, insn_flags, 0);
+ err = bpf_push_jmp_history(env, this_branch, insn_flags, 0, 0, 0);
if (err)
return err;
}
@@ -15967,7 +15966,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
if (dst_reg->type == SCALAR_VALUE && dst_reg->id)
collect_linked_regs(env, this_branch, dst_reg->id, &linked_regs);
if (linked_regs.cnt > 1) {
- err = bpf_push_jmp_history(env, this_branch, 0, linked_regs_pack(&linked_regs));
+ err = bpf_push_jmp_history(env, this_branch, 0, 0, 0, linked_regs_pack(&linked_regs));
if (err)
return err;
}
@@ -17273,7 +17272,7 @@ static int do_check(struct bpf_verifier_env *env)
}
if (bpf_is_jmp_point(env, env->insn_idx)) {
- err = bpf_push_jmp_history(env, state, 0, 0);
+ err = bpf_push_jmp_history(env, state, 0, 0, 0, 0);
if (err)
return err;
}
--
2.53.0-Meta
* [PATCH bpf-next v3 07/24] bpf: Add precision marking and backtracking for stack argument slots
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 06/24] bpf: Refactor jmp history to use dedicated spi/frame fields Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 5:33 ` [PATCH bpf-next v3 08/24] bpf: Refactor record_call_access() to extract per-arg logic Yonghong Song
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Extend the precision marking and backtracking infrastructure to
support stack argument slots (r11-based accesses). Without this,
precision demands for scalar values passed through stack arguments
are silently dropped, which could allow the verifier to incorrectly
prune states with different constant values in stack arg slots.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/bpf_verifier.h | 8 +++++
kernel/bpf/backtrack.c | 58 +++++++++++++++++++++++++++++++++++-
kernel/bpf/verifier.c | 32 ++++++++++++++++----
3 files changed, 92 insertions(+), 6 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index adf00585a627..338e54011d9d 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -443,6 +443,8 @@ enum {
INSN_F_DST_REG_STACK = BIT(1), /* dst_reg is PTR_TO_STACK */
INSN_F_SRC_REG_STACK = BIT(2), /* src_reg is PTR_TO_STACK */
+
+ INSN_F_STACK_ARG_ACCESS = BIT(3),
};
struct bpf_jmp_history_entry {
@@ -839,6 +841,7 @@ struct backtrack_state {
u32 frame;
u32 reg_masks[MAX_CALL_FRAMES];
u64 stack_masks[MAX_CALL_FRAMES];
+ u8 stack_arg_masks[MAX_CALL_FRAMES];
};
struct bpf_id_pair {
@@ -1237,6 +1240,11 @@ static inline void bpf_bt_set_frame_slot(struct backtrack_state *bt, u32 frame,
bt->stack_masks[frame] |= 1ull << slot;
}
+static inline void bt_set_frame_stack_arg_slot(struct backtrack_state *bt, u32 frame, u32 slot)
+{
+ bt->stack_arg_masks[frame] |= 1 << slot;
+}
+
static inline bool bt_is_frame_reg_set(struct backtrack_state *bt, u32 frame, u32 reg)
{
return bt->reg_masks[frame] & (1 << reg);
diff --git a/kernel/bpf/backtrack.c b/kernel/bpf/backtrack.c
index 5e93e57fb7ae..2e4ae0ef0860 100644
--- a/kernel/bpf/backtrack.c
+++ b/kernel/bpf/backtrack.c
@@ -129,11 +129,21 @@ static inline u32 bt_empty(struct backtrack_state *bt)
int i;
for (i = 0; i <= bt->frame; i++)
- mask |= bt->reg_masks[i] | bt->stack_masks[i];
+ mask |= bt->reg_masks[i] | bt->stack_masks[i] | bt->stack_arg_masks[i];
return mask == 0;
}
+static inline void bt_clear_frame_stack_arg_slot(struct backtrack_state *bt, u32 frame, u32 slot)
+{
+ bt->stack_arg_masks[frame] &= ~(1 << slot);
+}
+
+static inline bool bt_is_frame_stack_arg_slot_set(struct backtrack_state *bt, u32 frame, u32 slot)
+{
+ return bt->stack_arg_masks[frame] & (1 << slot);
+}
+
static inline int bt_subprog_enter(struct backtrack_state *bt)
{
if (bt->frame == MAX_CALL_FRAMES - 1) {
@@ -194,6 +204,11 @@ static inline u64 bt_stack_mask(struct backtrack_state *bt)
return bt->stack_masks[bt->frame];
}
+static inline u8 bt_stack_arg_mask(struct backtrack_state *bt)
+{
+ return bt->stack_arg_masks[bt->frame];
+}
+
static inline bool bt_is_reg_set(struct backtrack_state *bt, u32 reg)
{
return bt->reg_masks[bt->frame] & (1 << reg);
@@ -335,6 +350,19 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
return 0;
bt_clear_reg(bt, load_reg);
+ if (hist && hist->flags & INSN_F_STACK_ARG_ACCESS) {
+ spi = hist->spi;
+ /*
+ * Stack arg read: callee reads from r11+off, but
+ * the data lives in the caller's stack_arg_regs.
+ * Set the mask in the caller frame so precision
+ * is marked in the caller's slot at the callee
+ * entry checkpoint.
+ */
+ bt_set_frame_stack_arg_slot(bt, bt->frame - 1, spi);
+ return 0;
+ }
+
/* scalars can only be spilled into stack w/o losing precision.
* Load from any other memory can be zero extended.
* The desire to keep that precision is already indicated
@@ -357,6 +385,17 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
* encountered a case of pointer subtraction.
*/
return -ENOTSUPP;
+
+ if (hist && hist->flags & INSN_F_STACK_ARG_ACCESS) {
+ spi = hist->spi;
+ if (!bt_is_frame_stack_arg_slot_set(bt, bt->frame, spi))
+ return 0;
+ bt_clear_frame_stack_arg_slot(bt, bt->frame, spi);
+ if (class == BPF_STX)
+ bt_set_reg(bt, sreg);
+ return 0;
+ }
+
/* scalars can only be spilled into stack */
if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
return 0;
@@ -425,6 +464,12 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
bpf_bt_set_frame_reg(bt, bt->frame - 1, i);
}
}
+ if (bt_stack_arg_mask(bt)) {
+ verifier_bug(env,
+ "static subprog leftover stack arg slots %x",
+ bt_stack_arg_mask(bt));
+ return -EFAULT;
+ }
if (bt_subprog_exit(bt))
return -EFAULT;
return 0;
@@ -895,6 +940,17 @@ int bpf_mark_chain_precision(struct bpf_verifier_env *env,
*changed = true;
}
}
+ for (i = 0; i < func->out_stack_arg_cnt; i++) {
+ if (!bt_is_frame_stack_arg_slot_set(bt, fr, i))
+ continue;
+ reg = &func->stack_arg_regs[i];
+ if (reg->type != SCALAR_VALUE || reg->precise) {
+ bt_clear_frame_stack_arg_slot(bt, fr, i);
+ } else {
+ reg->precise = true;
+ *changed = true;
+ }
+ }
if (env->log.level & BPF_LOG_LEVEL2) {
fmt_reg_mask(env->tmp_str_buf, TMP_STR_BUF_LEN,
bt_frame_reg_mask(bt, fr));
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3e65dd0edbf9..0a0157b0972a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -292,6 +292,11 @@ static int arg_from_argno(argno_t a)
return -1;
}
+static int arg_idx_from_argno(argno_t a)
+{
+ return arg_from_argno(a) - 1;
+}
+
static const char *btf_type_name(const struct btf *btf, u32 id)
{
return btf_name_by_offset(btf, btf_type_by_id(btf, id)->name_off);
@@ -4115,7 +4120,8 @@ static int check_stack_arg_write(struct bpf_verifier_env *env, struct bpf_func_s
__mark_reg_known(arg, env->prog->insnsi[env->insn_idx].imm);
}
state->no_stack_arg_load = true;
- return 0;
+ return bpf_push_jmp_history(env, env->cur_state,
+ INSN_F_STACK_ARG_ACCESS, spi, 0, 0);
}
/*
@@ -4146,7 +4152,17 @@ static int check_stack_arg_read(struct bpf_verifier_env *env, struct bpf_func_st
arg = &caller->stack_arg_regs[spi];
cur = vstate->frame[vstate->curframe];
cur->regs[dst_regno] = *arg;
- return 0;
+ return bpf_push_jmp_history(env, env->cur_state,
+ INSN_F_STACK_ARG_ACCESS, spi, 0, 0);
+}
+
+static int mark_stack_arg_precision(struct bpf_verifier_env *env, int arg_idx)
+{
+ struct bpf_func_state *caller = cur_func(env);
+ int spi = arg_idx - MAX_BPF_FUNC_REG_ARGS;
+
+ bt_set_frame_stack_arg_slot(&env->bt, caller->frameno, spi);
+ return mark_chain_precision_batch(env, env->cur_state);
}
static int check_outgoing_stack_args(struct bpf_verifier_env *env, struct bpf_func_state *caller,
@@ -6875,8 +6891,14 @@ static int check_mem_size_reg(struct bpf_verifier_env *env,
}
err = check_helper_mem_access(env, mem_reg, mem_argno, reg_umax(size_reg),
access_type, zero_size_allowed, meta);
- if (!err)
- err = mark_chain_precision(env, reg_from_argno(size_argno));
+ if (!err) {
+ int regno = reg_from_argno(size_argno);
+
+ if (regno >= 0)
+ err = mark_chain_precision(env, regno);
+ else
+ err = mark_stack_arg_precision(env, arg_idx_from_argno(size_argno));
+ }
return err;
}
@@ -7325,7 +7347,7 @@ static int process_iter_arg(struct bpf_verifier_env *env, struct bpf_reg_state *
struct bpf_kfunc_call_arg_meta *meta)
{
const struct btf_type *t;
- u32 arg_idx = arg_from_argno(argno) - 1;
+ u32 arg_idx = arg_idx_from_argno(argno);
int spi, err, i, nr_slots, btf_id;
if (reg->type != PTR_TO_STACK) {
--
2.53.0-Meta
* [PATCH bpf-next v3 08/24] bpf: Refactor record_call_access() to extract per-arg logic
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 07/24] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Extract the per-argument FP-derived pointer handling from
record_call_access() into a new record_arg_access() helper.
The existing loop body — checking arg_is_fp, querying stack access
bytes, and calling record_stack_access/record_imprecise — will be
reused for stack argument slots in the next patch. Factoring it out
now avoids duplicating the logic.
No functional change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/liveness.c | 65 +++++++++++++++++++++++++------------------
1 file changed, 38 insertions(+), 27 deletions(-)
diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
index 58197d73b120..c81337dfbfc7 100644
--- a/kernel/bpf/liveness.c
+++ b/kernel/bpf/liveness.c
@@ -1343,6 +1343,42 @@ static int record_load_store_access(struct bpf_verifier_env *env,
return 0;
}
+static int record_arg_access(struct bpf_verifier_env *env,
+ struct func_instance *instance,
+ struct bpf_insn *insn,
+ struct arg_track *at, int arg_idx,
+ int insn_idx)
+{
+ int depth = instance->depth;
+ int frame = at->frame;
+ int err = 0;
+ s64 bytes;
+
+ if (!arg_is_fp(at))
+ return 0;
+
+ if (bpf_helper_call(insn)) {
+ bytes = bpf_helper_stack_access_bytes(env, insn, arg_idx, insn_idx);
+ } else if (bpf_pseudo_kfunc_call(insn)) {
+ bytes = bpf_kfunc_stack_access_bytes(env, insn, arg_idx, insn_idx);
+ } else {
+ for (int f = 0; f <= depth; f++) {
+ err = mark_stack_read(instance, f, insn_idx, SPIS_ALL);
+ if (err)
+ return err;
+ }
+ return 0;
+ }
+ if (bytes == 0)
+ return 0;
+
+ if (frame >= 0 && frame <= depth)
+ err = record_stack_access(instance, at, bytes, frame, insn_idx);
+ else if (frame == ARG_IMPRECISE)
+ err = record_imprecise(instance, at->mask, insn_idx);
+ return err;
+}
+
/* Record stack access for a given 'at' state of helper/kfunc 'insn' */
static int record_call_access(struct bpf_verifier_env *env,
struct func_instance *instance,
@@ -1350,9 +1386,8 @@ static int record_call_access(struct bpf_verifier_env *env,
int insn_idx)
{
struct bpf_insn *insn = &env->prog->insnsi[insn_idx];
- int depth = instance->depth;
struct bpf_call_summary cs;
- int r, err = 0, num_params = 5;
+ int r, err, num_params = 5;
if (bpf_pseudo_call(insn))
return 0;
@@ -1361,31 +1396,7 @@ static int record_call_access(struct bpf_verifier_env *env,
num_params = cs.num_params;
for (r = BPF_REG_1; r < BPF_REG_1 + num_params; r++) {
- int frame = at[r].frame;
- s64 bytes;
-
- if (!arg_is_fp(&at[r]))
- continue;
-
- if (bpf_helper_call(insn)) {
- bytes = bpf_helper_stack_access_bytes(env, insn, r - 1, insn_idx);
- } else if (bpf_pseudo_kfunc_call(insn)) {
- bytes = bpf_kfunc_stack_access_bytes(env, insn, r - 1, insn_idx);
- } else {
- for (int f = 0; f <= depth; f++) {
- err = mark_stack_read(instance, f, insn_idx, SPIS_ALL);
- if (err)
- return err;
- }
- return 0;
- }
- if (bytes == 0)
- continue;
-
- if (frame >= 0 && frame <= depth)
- err = record_stack_access(instance, &at[r], bytes, frame, insn_idx);
- else if (frame == ARG_IMPRECISE)
- err = record_imprecise(instance, at[r].mask, insn_idx);
+ err = record_arg_access(env, instance, insn, &at[r], r - 1, insn_idx);
if (err)
return err;
}
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (7 preceding siblings ...)
2026-05-11 5:33 ` [PATCH bpf-next v3 08/24] bpf: Refactor record_call_access() to extract per-arg logic Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 16:34 ` Alexei Starovoitov
2026-05-11 5:33 ` [PATCH bpf-next v3 10/24] bpf: Reject stack arguments in non-JITed programs Yonghong Song
` (14 subsequent siblings)
23 siblings, 2 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
BPF_REG_PARAMS (R11) is at index MAX_BPF_REG, which is beyond the
register tracking arrays in const_fold.c and liveness.c. Handle it
explicitly to avoid out-of-bounds accesses.
Extend the arg tracking dataflow to cover stack arg slots. Otherwise,
pointers passed through stack args are invisible to liveness, causing
the pointed-to stack slots to be incorrectly poisoned.
Extend the at_out tracking array to MAX_AT_TRACK_REGS (registers
plus stack arg slots) so that outgoing stack arg stores are tracked
alongside registers. Add a separate at_stack_arg_entry array in
arg_track_xfer() to restore FP-derived values on incoming stack arg
reads.
Extend record_call_access() to check stack arg slots for FP-derived
pointers at kfunc call sites, reusing the record_arg_access() helper
extracted in the previous patch. Pass stack arg state from caller to
callee in analyze_subprog() so that callees can track pointers received
through stack args and thus avoid poisoning.
Skip stack arg instructions in record_load_store_access(). Stack arg
STX uses dst_reg=BPF_REG_PARAMS (index 11), but at[11] is repurposed
to track the value stored in stack arg slot 0. Without the skip, if a
prior stack arg STX stored an FP-derived pointer (e.g., fp-64) into
slot 0, a subsequent stack arg STX would read that FP-derived value as
the base pointer and spuriously mark a regular stack slot (e.g., fp-72
from -64 + -8) as accessed in the liveness bitmap.
Extend arg_track_log() to log state transitions for outgoing stack arg
slots at indices MAX_BPF_REG through MAX_AT_TRACK_REGS-1. Without this,
changes to at_out[11..17] caused by stack arg store instructions are
silently omitted from BPF_LOG_LEVEL2 output. For example, when a
caller passes fp-64 through a stack argument:
subprog#0:
10: (bf) r6 = r10
11: (07) r6 += -64
12: (7b) *(u64 *)(r11 -8) = r6
sa0: none -> fp0-64
13: (85) call pc+5
Without the fix, the "sa0: none -> fp0-64" transition at insn 12
would not appear.
Extend print_subprog_arg_access() to include stack arg slots in the
per-instruction FP-derived state dump. For example:
subprog#0:
12: (7b) *(u64 *)(r11 - 8) = r6 // r6=fp0-64
13: (85) call pc+5 // r6=fp0-64 sa0=fp0-64
Without the fix, the "sa0=fp0-64" annotation at insn 13 would not
appear, making it harder to debug liveness analysis for programs
that pass FP-derived pointers through stack arguments.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/const_fold.c | 8 +++
kernel/bpf/liveness.c | 115 +++++++++++++++++++++++++++++++++++-----
2 files changed, 109 insertions(+), 14 deletions(-)
diff --git a/kernel/bpf/const_fold.c b/kernel/bpf/const_fold.c
index db73c4740b1e..b2a19acadb91 100644
--- a/kernel/bpf/const_fold.c
+++ b/kernel/bpf/const_fold.c
@@ -58,6 +58,14 @@ static void const_reg_xfer(struct bpf_verifier_env *env, struct const_arg_info *
u8 opcode = BPF_OP(insn->code) | BPF_SRC(insn->code);
int r;
+ /* Stack arg stores (r11-based) are outside the tracked register set. */
+ if (is_stack_arg_st(insn) || is_stack_arg_stx(insn))
+ return;
+ if (is_stack_arg_ldx(insn)) {
+ ci_out[insn->dst_reg] = unknown;
+ return;
+ }
+
switch (class) {
case BPF_ALU:
case BPF_ALU64:
diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
index c81337dfbfc7..6527631de758 100644
--- a/kernel/bpf/liveness.c
+++ b/kernel/bpf/liveness.c
@@ -610,6 +610,24 @@ enum arg_track_state {
/* Track callee stack slots fp-8 through fp-512 (64 slots of 8 bytes each) */
#define MAX_ARG_SPILL_SLOTS 64
+/* Track stack arg slots: outgoing starts at -(i+1)*8, incoming at +(i+1)*8 */
+#define MAX_STACK_ARG_SLOTS (MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS)
+
+/*
+ * Combined register + stack arg tracking: R0-R10 at indices 0-10,
+ * outgoing stack arg slots at indices MAX_BPF_REG..MAX_BPF_REG+6.
+ */
+#define MAX_AT_TRACK_REGS (MAX_BPF_REG + MAX_STACK_ARG_SLOTS)
+
+static int stack_arg_off_to_slot(s16 off)
+{
+ int aoff = off < 0 ? -off : off;
+
+ if (aoff / 8 > MAX_STACK_ARG_SLOTS)
+ return -1;
+ return aoff / 8 - 1;
+}
+
static bool arg_is_visited(const struct arg_track *at)
{
return at->frame != ARG_UNVISITED;
@@ -1032,6 +1050,21 @@ static void arg_track_log(struct bpf_verifier_env *env, struct bpf_insn *insn, i
verbose(env, "\tr%d: ", i); verbose_arg_track(env, &at_in[i]);
verbose(env, " -> "); verbose_arg_track(env, &at_out[i]);
}
+ /* Log outgoing stack arg slot transitions at indices MAX_BPF_REG..MAX_AT_TRACK_REGS-1 */
+ for (i = 0; i < MAX_STACK_ARG_SLOTS; i++) {
+ int ai = MAX_BPF_REG + i;
+
+ if (arg_track_eq(&at_out[ai], &at_in[ai]))
+ continue;
+ if (!printed) {
+ verbose(env, "%3d: ", idx);
+ bpf_verbose_insn(env, insn);
+ bpf_vlog_reset(&env->log, env->log.end_pos - 1);
+ printed = true;
+ }
+ verbose(env, "\tsa%d: ", i); verbose_arg_track(env, &at_in[ai]);
+ verbose(env, " -> "); verbose_arg_track(env, &at_out[ai]);
+ }
for (i = 0; i < MAX_ARG_SPILL_SLOTS; i++) {
if (arg_track_eq(&at_stack_out[i], &at_stack_in[i]))
continue;
@@ -1062,6 +1095,7 @@ static bool can_be_local_fp(int depth, int regno, struct arg_track *at)
static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
int insn_idx,
struct arg_track *at_out, struct arg_track *at_stack_out,
+ const struct arg_track *at_stack_arg_entry,
struct func_instance *instance,
u32 *callsites)
{
@@ -1071,8 +1105,24 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
struct arg_track *dst = &at_out[insn->dst_reg];
struct arg_track *src = &at_out[insn->src_reg];
struct arg_track none = { .frame = ARG_NONE };
- int r;
-
+ int r, slot;
+
+ /* Handle stack arg stores and loads. */
+ if (is_stack_arg_st(insn) || is_stack_arg_stx(insn)) {
+ slot = stack_arg_off_to_slot(insn->off);
+ if (slot >= 0) {
+ if (is_stack_arg_stx(insn))
+ at_out[MAX_BPF_REG + slot] = at_out[insn->src_reg];
+ else
+ at_out[MAX_BPF_REG + slot] = none;
+ }
+ return;
+ }
+ if (is_stack_arg_ldx(insn)) {
+ slot = stack_arg_off_to_slot(insn->off);
+ at_out[insn->dst_reg] = (slot >= 0) ? at_stack_arg_entry[slot] : none;
+ return;
+ }
if (class == BPF_ALU64 && BPF_SRC(insn->code) == BPF_K) {
if (code == BPF_MOV) {
*dst = none;
@@ -1297,6 +1347,14 @@ static int record_load_store_access(struct bpf_verifier_env *env,
struct arg_track resolved, *ptr;
int oi;
+ /*
+ * Stack arg insns use dst_reg=BPF_REG_PARAMS(11), but at[11] tracks
+ * the value stored in stack arg slot 0, not a memory base pointer.
+ * Skip to avoid misinterpreting that value as an FP-derived pointer.
+ */
+ if (is_stack_arg_stx(insn) || is_stack_arg_st(insn) || is_stack_arg_ldx(insn))
+ return 0;
+
switch (class) {
case BPF_LDX:
ptr = &at[insn->src_reg];
@@ -1395,11 +1453,18 @@ static int record_call_access(struct bpf_verifier_env *env,
if (bpf_get_call_summary(env, insn, &cs))
num_params = cs.num_params;
- for (r = BPF_REG_1; r < BPF_REG_1 + num_params; r++) {
+ for (r = BPF_REG_1; r < BPF_REG_1 + min(num_params, MAX_BPF_FUNC_REG_ARGS); r++) {
err = record_arg_access(env, instance, insn, &at[r], r - 1, insn_idx);
if (err)
return err;
}
+
+ for (r = 0; r < MAX_STACK_ARG_SLOTS && r < num_params - MAX_BPF_FUNC_REG_ARGS; r++) {
+ err = record_arg_access(env, instance, insn, &at[MAX_BPF_REG + r],
+ r + MAX_BPF_FUNC_REG_ARGS, insn_idx);
+ if (err)
+ return err;
+ }
return 0;
}
@@ -1456,7 +1521,7 @@ static int find_callback_subprog(struct bpf_verifier_env *env,
/* Per-subprog intermediate state kept alive across analysis phases */
struct subprog_at_info {
- struct arg_track (*at_in)[MAX_BPF_REG];
+ struct arg_track (*at_in)[MAX_AT_TRACK_REGS];
int len;
};
@@ -1490,6 +1555,9 @@ static void print_subprog_arg_access(struct bpf_verifier_env *env,
for (r = 0; r < MAX_BPF_REG - 1; r++)
if (arg_is_fp(&info->at_in[i][r]))
has_extra = true;
+ for (r = 0; r < MAX_STACK_ARG_SLOTS; r++)
+ if (arg_is_fp(&info->at_in[i][MAX_BPF_REG + r]))
+ has_extra = true;
}
if (is_ldx_stx_call) {
for (r = 0; r < MAX_ARG_SPILL_SLOTS; r++)
@@ -1514,6 +1582,12 @@ static void print_subprog_arg_access(struct bpf_verifier_env *env,
verbose(env, " r%d=", r);
verbose_arg_track(env, &info->at_in[i][r]);
}
+ for (r = 0; r < MAX_STACK_ARG_SLOTS; r++) {
+ if (!arg_is_fp(&info->at_in[i][MAX_BPF_REG + r]))
+ continue;
+ verbose(env, " sa%d=", r);
+ verbose_arg_track(env, &info->at_in[i][MAX_BPF_REG + r]);
+ }
}
if (is_ldx_stx_call) {
@@ -1554,10 +1628,11 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
int end = env->subprog_info[subprog + 1].start;
int po_end = env->subprog_info[subprog + 1].postorder_start;
int len = end - start;
- struct arg_track (*at_in)[MAX_BPF_REG] = NULL;
- struct arg_track at_out[MAX_BPF_REG];
+ struct arg_track (*at_in)[MAX_AT_TRACK_REGS] = NULL;
+ struct arg_track at_out[MAX_AT_TRACK_REGS];
struct arg_track (*at_stack_in)[MAX_ARG_SPILL_SLOTS] = NULL;
struct arg_track *at_stack_out = NULL;
+ struct arg_track at_stack_arg_entry[MAX_STACK_ARG_SLOTS];
struct arg_track unvisited = { .frame = ARG_UNVISITED };
struct arg_track none = { .frame = ARG_NONE };
bool changed;
@@ -1576,19 +1651,19 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
goto err_free;
for (i = 0; i < len; i++) {
- for (r = 0; r < MAX_BPF_REG; r++)
+ for (r = 0; r < MAX_AT_TRACK_REGS; r++)
at_in[i][r] = unvisited;
for (r = 0; r < MAX_ARG_SPILL_SLOTS; r++)
at_stack_in[i][r] = unvisited;
}
- for (r = 0; r < MAX_BPF_REG; r++)
+ for (r = 0; r < MAX_AT_TRACK_REGS; r++)
at_in[0][r] = none;
/* Entry: R10 is always precisely the current frame's FP */
at_in[0][BPF_REG_FP] = arg_single(depth, 0);
- /* R1-R5: from caller or ARG_NONE for main */
+ /* R1-R5 and outgoing stack args: from caller or ARG_NONE for main */
if (callee_entry) {
for (r = BPF_REG_1; r <= BPF_REG_5; r++)
at_in[0][r] = callee_entry[r];
@@ -1598,6 +1673,10 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
for (r = 0; r < MAX_ARG_SPILL_SLOTS; r++)
at_stack_in[0][r] = none;
+ /* Entry: incoming stack args from caller, or ARG_NONE for main */
+ for (r = 0; r < MAX_STACK_ARG_SLOTS; r++)
+ at_stack_arg_entry[r] = callee_entry ? callee_entry[MAX_BPF_REG + r] : none;
+
if (env->log.level & BPF_LOG_LEVEL2)
verbose(env, "subprog#%d: analyzing (depth %d)...\n", subprog, depth);
@@ -1616,7 +1695,8 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
memcpy(at_out, at_in[i], sizeof(at_out));
memcpy(at_stack_out, at_stack_in[i], MAX_ARG_SPILL_SLOTS * sizeof(*at_stack_out));
- arg_track_xfer(env, insn, idx, at_out, at_stack_out, instance, callsites);
+ arg_track_xfer(env, insn, idx, at_out, at_stack_out,
+ at_stack_arg_entry, instance, callsites);
arg_track_log(env, insn, idx, at_in[i], at_stack_in[i], at_out, at_stack_out);
/* Propagate to successors within this subprogram */
@@ -1630,7 +1710,7 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
continue;
ti = target - start;
- for (r = 0; r < MAX_BPF_REG; r++)
+ for (r = 0; r < MAX_AT_TRACK_REGS; r++)
changed |= arg_track_join(env, idx, target, r,
&at_in[ti][r], at_out[r]);
@@ -1685,12 +1765,15 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
return err;
}
-/* Return true if any of R1-R5 is derived from a frame pointer. */
+/* Return true if any of R1-R5 or stack args is derived from a frame pointer. */
static bool has_fp_args(struct arg_track *args)
{
for (int r = BPF_REG_1; r <= BPF_REG_5; r++)
if (args[r].frame != ARG_NONE)
return true;
+ for (int r = 0; r < MAX_STACK_ARG_SLOTS; r++)
+ if (arg_is_fp(&args[MAX_BPF_REG + r]))
+ return true;
return false;
}
@@ -1814,7 +1897,7 @@ static int analyze_subprog(struct bpf_verifier_env *env,
/* For each reachable call site in the subprog, recurse into callees */
for (int p = po_start; p < po_end; p++) {
int idx = env->cfg.insn_postorder[p];
- struct arg_track callee_args[BPF_REG_5 + 1];
+ struct arg_track callee_args[MAX_AT_TRACK_REGS] = {};
struct arg_track none = { .frame = ARG_NONE };
struct bpf_insn *insn = &insns[idx];
struct func_instance *callee_instance;
@@ -1829,9 +1912,11 @@ static int analyze_subprog(struct bpf_verifier_env *env,
if (callee < 0)
continue;
- /* Build entry args: R1-R5 from at_in at call site */
+ /* Build entry args: R1-R5 and stack args from at_in at call site */
for (int r = BPF_REG_1; r <= BPF_REG_5; r++)
callee_args[r] = info[subprog].at_in[j][r];
+ for (int r = 0; r < MAX_STACK_ARG_SLOTS; r++)
+ callee_args[MAX_BPF_REG + r] = info[subprog].at_in[j][MAX_BPF_REG + r];
} else if (bpf_calls_callback(env, idx)) {
callee = find_callback_subprog(env, insn, idx, &caller_reg, &cb_callee_reg);
if (callee == -2) {
@@ -1853,6 +1938,8 @@ static int analyze_subprog(struct bpf_verifier_env *env,
for (int r = BPF_REG_1; r <= BPF_REG_5; r++)
callee_args[r] = none;
+ for (int r = 0; r < MAX_STACK_ARG_SLOTS; r++)
+ callee_args[MAX_BPF_REG + r] = none;
callee_args[cb_callee_reg] = info[subprog].at_in[j][caller_reg];
} else {
continue;
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH bpf-next v3 10/24] bpf: Reject stack arguments in non-JITed programs
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (8 preceding siblings ...)
2026-05-11 5:33 ` [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 5:33 ` [PATCH bpf-next v3 11/24] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
` (13 subsequent siblings)
23 siblings, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
The interpreter does not understand the BPF register r11
(BPF_REG_PARAMS) used for stack arguments, so reject interpreter
usage if stack arguments are used in either the main program or
any subprogram.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/core.c | 2 +-
kernel/bpf/fixups.c | 6 ++++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index ae10b9ca018d..958d86f0beac 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2599,7 +2599,7 @@ struct bpf_prog *__bpf_prog_select_runtime(struct bpf_verifier_env *env, struct
goto finalize;
if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) ||
- bpf_prog_has_kfunc_call(fp))
+ bpf_prog_has_kfunc_call(fp) || (env && env->subprog_info[0].stack_arg_cnt))
jit_needed = true;
if (!bpf_prog_select_interpreter(fp))
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index ba86039789fd..19056016eed8 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -1407,6 +1407,12 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
verbose(env, "calling kernel functions are not allowed in non-JITed programs\n");
return -EINVAL;
}
+ for (i = 1; i < env->subprog_cnt; i++) {
+ if (bpf_in_stack_arg_cnt(&env->subprog_info[i])) {
+ verbose(env, "stack args are not supported in non-JITed programs\n");
+ return -EINVAL;
+ }
+ }
if (env->subprog_cnt > 1 && env->prog->aux->tail_call_reachable) {
/* When JIT fails the progs with bpf2bpf calls and tail_calls
* have to be rejected, since interpreter doesn't support them yet.
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH bpf-next v3 11/24] bpf: Prepare architecture JIT support for stack arguments
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (9 preceding siblings ...)
2026-05-11 5:33 ` [PATCH bpf-next v3 10/24] bpf: Reject stack arguments in non-JITed programs Yonghong Song
@ 2026-05-11 5:33 ` Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 12/24] bpf: Enable r11 based insns Yonghong Song
` (12 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:33 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
Add bpf_jit_supports_stack_args() as a weak function defaulting to
false. Architectures that implement JIT support for stack arguments
override it to return true.
Reject BPF functions with more than 5 parameters at verification
time if the architecture does not support stack arguments.
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/filter.h | 1 +
kernel/bpf/btf.c | 8 +++++++-
kernel/bpf/core.c | 5 +++++
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 918d9b34eac6..a515a9769078 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1184,6 +1184,7 @@ bool bpf_jit_inlines_helper_call(s32 imm);
bool bpf_jit_supports_subprog_tailcalls(void);
bool bpf_jit_supports_percpu_insn(void);
bool bpf_jit_supports_kfunc_call(void);
+bool bpf_jit_supports_stack_args(void);
bool bpf_jit_supports_far_kfunc_call(void);
bool bpf_jit_supports_exceptions(void);
bool bpf_jit_supports_ptr_xchg(void);
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index ec3fb8c8f4ee..3d8080eba544 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7886,8 +7886,14 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
MAX_BPF_FUNC_ARGS, tname, nargs);
return -EFAULT;
}
- if (nargs > MAX_BPF_FUNC_REG_ARGS)
+ if (nargs > MAX_BPF_FUNC_REG_ARGS) {
+ if (!bpf_jit_supports_stack_args()) {
+ bpf_log(log, "JIT does not support function %s() with %d args\n",
+ tname, nargs);
+ return -EFAULT;
+ }
sub->stack_arg_cnt = nargs - MAX_BPF_FUNC_REG_ARGS;
+ }
if (is_global && nargs > MAX_BPF_FUNC_REG_ARGS) {
bpf_log(log, "global function %s has %d > %d args, stack args not supported\n",
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 958d86f0beac..e6b836f846eb 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -3217,6 +3217,11 @@ bool __weak bpf_jit_supports_kfunc_call(void)
return false;
}
+bool __weak bpf_jit_supports_stack_args(void)
+{
+ return false;
+}
+
bool __weak bpf_jit_supports_far_kfunc_call(void)
{
return false;
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH bpf-next v3 12/24] bpf: Enable r11 based insns
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (10 preceding siblings ...)
2026-05-11 5:33 ` [PATCH bpf-next v3 11/24] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 13/24] bpf: Support stack arguments for kfunc calls Yonghong Song
` (11 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
BPF_REG_PARAMS (r11) is used for stack argument accesses, and
the following are the only insns in which r11 appears:
- load incoming stack arg
- store register to outgoing stack arg
- store immediate to outgoing stack arg
The detailed insn format can be found in the is_stack_arg_ldx/st/stx()
helpers. After this patch, stack arg ldx/st/stx insns become valid
for the kernel and can be properly checked by the verifier.
The LLVM compiler [1] implements the above BPF_REG_PARAMS insns.
[1] https://github.com/llvm/llvm-project/pull/189060
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 0a0157b0972a..5c07f54679fe 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -18001,11 +18001,12 @@ static int check_and_resolve_insns(struct bpf_verifier_env *env)
return err;
for (i = 0; i < insn_cnt; i++, insn++) {
- if (insn->dst_reg >= MAX_BPF_REG) {
+ if (insn->dst_reg >= MAX_BPF_REG &&
+ !is_stack_arg_st(insn) && !is_stack_arg_stx(insn)) {
verbose(env, "R%d is invalid\n", insn->dst_reg);
return -EINVAL;
}
- if (insn->src_reg >= MAX_BPF_REG) {
+ if (insn->src_reg >= MAX_BPF_REG && !is_stack_arg_ldx(insn)) {
verbose(env, "R%d is invalid\n", insn->src_reg);
return -EINVAL;
}
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH bpf-next v3 13/24] bpf: Support stack arguments for kfunc calls
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (11 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 12/24] bpf: Enable r11 based insns Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 14/24] bpf: Reject stack arguments if tail call reachable Yonghong Song
` (10 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Extend the stack argument mechanism to kfunc calls, allowing kfuncs
with more than 5 parameters to receive additional arguments via the
r11-based stack arg area.
For kfuncs, the caller is a BPF program and the callee is a kernel
function. The BPF program writes outgoing args at negative r11
offsets, following the same convention as BPF-to-BPF calls:
Outgoing: r11 - 8 (arg6), ..., r11 - N*8 (last arg)
The following is an example:
int foo(int a1, int a2, int a3, int a4, int a5, int a6, int a7) {
...
kfunc1(a1, a2, a3, a4, a5, a6, a7, a8);
...
kfunc2(a1, a2, a3, a4, a5, a6, a7, a8, a9);
...
}
Caller (foo), generated by llvm
===============================
Incoming (positive offsets):
r11+8: [incoming arg 6]
r11+16: [incoming arg 7]
Outgoing for kfunc1 (negative offsets):
r11-8: [outgoing arg 6]
r11-16: [outgoing arg 7]
r11-24: [outgoing arg 8]
Outgoing for kfunc2 (negative offsets):
r11-8: [outgoing arg 6]
r11-16: [outgoing arg 7]
r11-24: [outgoing arg 8]
r11-32: [outgoing arg 9]
Later, the JIT will marshal the outgoing arguments into the native
calling convention for kfunc1() and kfunc2().
For kfunc calls where stack args are used as constant or size
parameters, a mark_stack_arg_precision() helper is used to propagate
precision and do proper backtracking.
There are two places where meta->release_regno must record regno so
the reference can be released later; 'cur_aux(env)->arg_prog = regno'
likewise records regno for a later fixup. Since stack arguments don't
have a valid register number (regno is negative), these three cases
are rejected for now if the argument is on the stack.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 78 +++++++++++++++++++++++++++++++++----------
1 file changed, 61 insertions(+), 17 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5c07f54679fe..d596e6bd9a81 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11157,14 +11157,12 @@ bool bpf_is_kfunc_pkt_changing(struct bpf_kfunc_call_arg_meta *meta)
}
static enum kfunc_ptr_arg_type
-get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
- struct bpf_kfunc_call_arg_meta *meta,
+get_kfunc_ptr_arg_type(struct bpf_verifier_env *env, struct bpf_func_state *caller,
+ struct bpf_reg_state *regs, struct bpf_kfunc_call_arg_meta *meta,
const struct btf_type *t, const struct btf_type *ref_t,
const char *ref_tname, const struct btf_param *args,
int arg, int nargs, argno_t argno, struct bpf_reg_state *reg)
{
- u32 regno = arg + 1;
- struct bpf_reg_state *regs = cur_regs(env);
bool arg_mem_size = false;
if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
@@ -11173,8 +11171,8 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
return KF_ARG_PTR_TO_CTX;
if (arg + 1 < nargs &&
- (is_kfunc_arg_mem_size(meta->btf, &args[arg + 1], &regs[regno + 1]) ||
- is_kfunc_arg_const_mem_size(meta->btf, &args[arg + 1], &regs[regno + 1])))
+ (is_kfunc_arg_mem_size(meta->btf, &args[arg + 1], get_func_arg_reg(caller, regs, arg + 1)) ||
+ is_kfunc_arg_const_mem_size(meta->btf, &args[arg + 1], get_func_arg_reg(caller, regs, arg + 1))))
arg_mem_size = true;
/* In this function, we verify the kfunc's BTF as per the argument type,
@@ -11839,6 +11837,8 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
int insn_idx)
{
const char *func_name = meta->func_name, *ref_tname;
+ struct bpf_func_state *caller = cur_func(env);
+ struct bpf_reg_state *regs = cur_regs(env);
const struct btf *btf = meta->btf;
const struct btf_param *args;
struct btf_record *rec;
@@ -11847,21 +11847,31 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
args = (const struct btf_param *)(meta->func_proto + 1);
nargs = btf_type_vlen(meta->func_proto);
- if (nargs > MAX_BPF_FUNC_REG_ARGS) {
+ if (nargs > MAX_BPF_FUNC_ARGS) {
verbose(env, "Function %s has %d > %d args\n", func_name, nargs,
- MAX_BPF_FUNC_REG_ARGS);
+ MAX_BPF_FUNC_ARGS);
return -EINVAL;
}
+ if (nargs > MAX_BPF_FUNC_REG_ARGS && !bpf_jit_supports_stack_args()) {
+ verbose(env, "JIT does not support kfunc %s() with %d args\n",
+ func_name, nargs);
+ return -ENOTSUPP;
+ }
+
+ ret = check_outgoing_stack_args(env, caller, nargs);
+ if (ret)
+ return ret;
/* Check that BTF function arguments match actual types that the
* verifier sees.
*/
for (i = 0; i < nargs; i++) {
- struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[i + 1];
+ struct bpf_reg_state *reg = get_func_arg_reg(caller, regs, i);
const struct btf_type *t, *ref_t, *resolve_ret;
enum bpf_arg_type arg_type = ARG_DONTCARE;
argno_t argno = argno_from_arg(i + 1);
- u32 regno = i + 1, ref_id, type_size;
+ int regno = reg_from_argno(argno);
+ u32 ref_id, type_size;
bool is_ret_buf_sz = false;
int kf_arg_type;
@@ -11871,6 +11881,11 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
verifier_bug(env, "Only 1 prog->aux argument supported per-kfunc");
return -EFAULT;
}
+ if (regno < 0) {
+ verbose(env, "%s prog->aux cannot be a stack argument\n",
+ reg_arg_name(env, argno));
+ return -EINVAL;
+ }
meta->arg_prog = true;
cur_aux(env)->arg_prog = regno;
continue;
@@ -11897,7 +11912,10 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
reg_arg_name(env, argno));
return -EINVAL;
}
- ret = mark_chain_precision(env, regno);
+ if (regno >= 0)
+ ret = mark_chain_precision(env, regno);
+ else
+ ret = mark_stack_arg_precision(env, i);
if (ret < 0)
return ret;
meta->arg_constant.found = true;
@@ -11922,7 +11940,10 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
}
meta->r0_size = reg->var_off.value;
- ret = mark_chain_precision(env, regno);
+ if (regno >= 0)
+ ret = mark_chain_precision(env, regno);
+ else
+ ret = mark_stack_arg_precision(env, i);
if (ret)
return ret;
}
@@ -11950,14 +11971,21 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
return -EFAULT;
}
meta->ref_obj_id = reg->ref_obj_id;
- if (is_kfunc_release(meta))
+ if (is_kfunc_release(meta)) {
+ if (regno < 0) {
+ verbose(env, "%s release arg cannot be a stack argument\n",
+ reg_arg_name(env, argno));
+ return -EINVAL;
+ }
meta->release_regno = regno;
+ }
}
ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id);
ref_tname = btf_name_by_offset(btf, ref_t->name_off);
- kf_arg_type = get_kfunc_ptr_arg_type(env, meta, t, ref_t, ref_tname, args, i, nargs, argno, reg);
+ kf_arg_type = get_kfunc_ptr_arg_type(env, caller, regs, meta, t, ref_t, ref_tname,
+ args, i, nargs, argno, reg);
if (kf_arg_type < 0)
return kf_arg_type;
@@ -12107,6 +12135,11 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
dynptr_arg_type |= DYNPTR_TYPE_FILE;
} else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_file_discard]) {
dynptr_arg_type |= DYNPTR_TYPE_FILE | OBJ_RELEASE;
+ if (regno < 0) {
+ verbose(env, "%s release arg cannot be a stack argument\n",
+ reg_arg_name(env, argno));
+ return -EINVAL;
+ }
meta->release_regno = regno;
} else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_clone] &&
(dynptr_arg_type & MEM_UNINIT)) {
@@ -12261,9 +12294,9 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
break;
case KF_ARG_PTR_TO_MEM_SIZE:
{
- struct bpf_reg_state *buff_reg = &regs[regno];
+ struct bpf_reg_state *buff_reg = reg;
const struct btf_param *buff_arg = &args[i];
- struct bpf_reg_state *size_reg = &regs[regno + 1];
+ struct bpf_reg_state *size_reg = get_func_arg_reg(caller, regs, i + 1);
const struct btf_param *size_arg = &args[i + 1];
argno_t next_argno = argno_from_arg(i + 2);
@@ -13167,8 +13200,19 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
clear_all_pkt_pointers(env);
nargs = btf_type_vlen(meta.func_proto);
+ if (nargs > MAX_BPF_FUNC_REG_ARGS) {
+ struct bpf_func_state *caller = cur_func(env);
+ struct bpf_subprog_info *caller_info = &env->subprog_info[caller->subprogno];
+ u16 out_stack_arg_cnt = nargs - MAX_BPF_FUNC_REG_ARGS;
+ u16 stack_arg_cnt = bpf_in_stack_arg_cnt(caller_info) + out_stack_arg_cnt;
+
+ if (stack_arg_cnt > caller_info->stack_arg_cnt)
+ caller_info->stack_arg_cnt = stack_arg_cnt;
+ invalidate_outgoing_stack_args(caller);
+ }
+
args = (const struct btf_param *)(meta.func_proto + 1);
- for (i = 0; i < nargs; i++) {
+ for (i = 0; i < min_t(int, nargs, MAX_BPF_FUNC_REG_ARGS); i++) {
u32 regno = i + 1;
t = btf_type_skip_modifiers(desc_btf, args[i].type, NULL);
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH bpf-next v3 14/24] bpf: Reject stack arguments if tail call reachable
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (12 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 13/24] bpf: Support stack arguments for kfunc calls Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 5:34 ` [PATCH bpf-next v3 15/24] bpf: Pass bpf_subprog_info to bpf_int_jit_compile() Yonghong Song
` (9 subsequent siblings)
23 siblings, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Tail calls are deprecated and will be replaced by indirect calls
in the future. Reject programs that combine tail calls with stack
arguments rather than adding complexity for a deprecated feature.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d596e6bd9a81..2f2814035f37 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5267,14 +5267,23 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
* this info will be utilized by JIT so that we will be preserving the
* tail call counter throughout bpf2bpf calls combined with tailcalls
*/
- if (tail_call_reachable)
+ if (tail_call_reachable) {
for (tmp = idx; tmp >= 0; tmp = dinfo[tmp].caller) {
if (subprog[tmp].is_exception_cb) {
verbose(env, "cannot tail call within exception cb\n");
return -EINVAL;
}
+ if (bpf_in_stack_arg_cnt(&subprog[tmp])) {
+ verbose(env, "tail_calls are not allowed in programs with stack args\n");
+ return -EINVAL;
+ }
subprog[tmp].tail_call_reachable = true;
}
+ } else if (!idx && subprog[0].has_tail_call && bpf_in_stack_arg_cnt(&subprog[0])) {
+ verbose(env, "tail_calls are not allowed in programs with stack args\n");
+ return -EINVAL;
+ }
+
if (subprog[0].tail_call_reachable)
env->prog->aux->tail_call_reachable = true;
--
2.53.0-Meta
* [PATCH bpf-next v3 15/24] bpf: Pass bpf_subprog_info to bpf_int_jit_compile()
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (13 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 14/24] bpf: Reject stack arguments if tail call reachable Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 16:38 ` Alexei Starovoitov
2026-05-11 5:34 ` [PATCH bpf-next v3 16/24] bpf,x86: Implement JIT support for stack arguments Yonghong Song
` (8 subsequent siblings)
23 siblings, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Add a bpf_subprog_info parameter to bpf_int_jit_compile() so that
JIT backends can access per-subprog metadata directly, such as
stack_arg_cnt and arg_cnt, without encoding them into bpf_prog_aux.
Previously, subprog properties needed by the JIT were copied into
bpf_prog_aux fields (e.g., out_stack_arg_cnt). Passing
bpf_subprog_info directly avoids adding more fields to bpf_prog_aux
and gives JIT backends a single place to query subprog metadata.
In jit_subprogs(), each subprog's corresponding bpf_subprog_info
is passed to bpf_int_jit_compile(). For the single-subprog path
in bpf_prog_jit_compile(), &env->subprog_info[0] is passed when
env is available, and NULL otherwise. env is NULL for internal
kernel BPF programs created via bpf_prog_create() (e.g., the PTP
packet classifier in net/core/ptp_classifier.c), which bypass the
verifier and never use stack arguments.
All architecture JIT implementations are updated to accept the new
parameter. No functional change for architectures that do not yet
use subprog_info.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
arch/arc/net/bpf_jit_core.c | 3 ++-
arch/arm/net/bpf_jit_32.c | 3 ++-
arch/arm64/net/bpf_jit_comp.c | 3 ++-
arch/loongarch/net/bpf_jit.c | 3 ++-
arch/mips/net/bpf_jit_comp.c | 3 ++-
arch/parisc/net/bpf_jit_core.c | 3 ++-
arch/powerpc/net/bpf_jit_comp.c | 3 ++-
arch/riscv/net/bpf_jit_core.c | 3 ++-
arch/s390/net/bpf_jit_comp.c | 3 ++-
arch/sparc/net/bpf_jit_comp_64.c | 3 ++-
arch/x86/net/bpf_jit_comp.c | 3 ++-
arch/x86/net/bpf_jit_comp32.c | 3 ++-
include/linux/filter.h | 4 +++-
kernel/bpf/core.c | 9 ++++++---
kernel/bpf/fixups.c | 4 ++--
15 files changed, 35 insertions(+), 18 deletions(-)
diff --git a/arch/arc/net/bpf_jit_core.c b/arch/arc/net/bpf_jit_core.c
index 639a2736f029..771de9a8147d 100644
--- a/arch/arc/net/bpf_jit_core.c
+++ b/arch/arc/net/bpf_jit_core.c
@@ -1400,7 +1400,8 @@ static struct bpf_prog *do_extra_pass(struct bpf_prog *prog)
* (re)locations involved that their addresses are not known
* during the first run.
*/
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
vm_dump(prog);
diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 9ede81afbc50..17692af0f11f 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -2148,7 +2148,8 @@ bool bpf_jit_needs_zext(void)
return true;
}
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
struct bpf_binary_header *header;
struct jit_ctx ctx;
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 0816c40fc7af..c9bdeef31ab9 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -2003,7 +2003,8 @@ struct arm64_jit_data {
struct jit_ctx ctx;
};
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
int image_size, prog_size, extable_size, extable_align, extable_offset;
struct bpf_binary_header *header;
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 24913dc7f4e8..416dabe9947f 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -2166,7 +2166,8 @@ int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
return ret < 0 ? ret : ret * LOONGARCH_INSN_SIZE;
}
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
bool extra_pass = false;
u8 *image_ptr, *ro_image_ptr;
diff --git a/arch/mips/net/bpf_jit_comp.c b/arch/mips/net/bpf_jit_comp.c
index 6ee4abe6a1f7..2f85c90bba07 100644
--- a/arch/mips/net/bpf_jit_comp.c
+++ b/arch/mips/net/bpf_jit_comp.c
@@ -909,7 +909,8 @@ bool bpf_jit_needs_zext(void)
return true;
}
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
struct bpf_binary_header *header = NULL;
struct jit_context ctx;
diff --git a/arch/parisc/net/bpf_jit_core.c b/arch/parisc/net/bpf_jit_core.c
index 172770132440..b24275aef8db 100644
--- a/arch/parisc/net/bpf_jit_core.c
+++ b/arch/parisc/net/bpf_jit_core.c
@@ -41,7 +41,8 @@ bool bpf_jit_needs_zext(void)
return true;
}
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
unsigned int prog_size = 0, extable_size = 0;
bool extra_pass = false;
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 53ab97ad6074..5a414da47e4e 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -162,7 +162,8 @@ static void priv_stack_check_guard(void __percpu *priv_stack_ptr, int alloc_size
}
}
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *fp)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *fp,
+ struct bpf_subprog_info *subprog_info)
{
u32 proglen;
u32 alloclen;
diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c
index 4365d07aaf54..3ae8adf0f697 100644
--- a/arch/riscv/net/bpf_jit_core.c
+++ b/arch/riscv/net/bpf_jit_core.c
@@ -41,7 +41,8 @@ bool bpf_jit_needs_zext(void)
return true;
}
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
unsigned int prog_size = 0, extable_size = 0;
bool extra_pass = false;
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 14eaaa5b2185..bc29f405161a 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -2337,7 +2337,8 @@ static struct bpf_binary_header *bpf_jit_alloc(struct bpf_jit *jit,
/*
* Compile eBPF program "fp"
*/
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *fp)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *fp,
+ struct bpf_subprog_info *subprog_info)
{
struct bpf_binary_header *header;
struct s390_jit_data *jit_data;
diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c
index 2fa0e9375127..90bc01d069fe 100644
--- a/arch/sparc/net/bpf_jit_comp_64.c
+++ b/arch/sparc/net/bpf_jit_comp_64.c
@@ -1477,7 +1477,8 @@ struct sparc64_jit_data {
struct jit_ctx ctx;
};
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
struct sparc64_jit_data *jit_data;
struct bpf_binary_header *header;
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index ea9e707e8abf..e9718faa0124 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -3715,7 +3715,8 @@ struct x64_jit_data {
#define MAX_PASSES 20
#define PADDING_PASSES (MAX_PASSES - 5)
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
struct bpf_binary_header *rw_header = NULL;
struct bpf_binary_header *header = NULL;
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 852baf2e4db4..e89558a27834 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -2518,7 +2518,8 @@ bool bpf_jit_needs_zext(void)
return true;
}
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
struct bpf_binary_header *header = NULL;
int proglen, oldproglen = 0;
diff --git a/include/linux/filter.h b/include/linux/filter.h
index a515a9769078..3576ee523bee 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -37,6 +37,7 @@ struct xdp_buff;
struct sock_reuseport;
struct ctl_table;
struct ctl_table_header;
+struct bpf_subprog_info;
/* ArgX, context and stack frame pointer register positions. Note,
* Arg1, Arg2, Arg3, etc are used as argument mappings of function
@@ -1177,7 +1178,8 @@ u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
((u64 (*)(u64, u64, u64, u64, u64, const struct bpf_insn *)) \
(void *)__bpf_call_base)
-struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog);
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info);
void bpf_jit_compile(struct bpf_prog *prog);
bool bpf_jit_needs_zext(void);
bool bpf_jit_inlines_helper_call(s32 imm);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index e6b836f846eb..dcba6f166a91 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2547,7 +2547,8 @@ static struct bpf_prog *bpf_prog_jit_compile(struct bpf_verifier_env *env, struc
struct bpf_insn_aux_data *orig_insn_aux;
if (!bpf_prog_need_blind(prog))
- return bpf_int_jit_compile(env, prog);
+ return bpf_int_jit_compile(env, prog,
+ env ? &env->subprog_info[0] : NULL);
if (env) {
/*
@@ -2569,7 +2570,8 @@ static struct bpf_prog *bpf_prog_jit_compile(struct bpf_verifier_env *env, struc
if (IS_ERR(prog))
goto out_restore;
- prog = bpf_int_jit_compile(env, prog);
+ prog = bpf_int_jit_compile(env, prog,
+ env ? &env->subprog_info[0] : NULL);
if (prog->jited) {
bpf_jit_prog_release_other(prog, orig_prog);
if (env)
@@ -3145,7 +3147,8 @@ const struct bpf_func_proto bpf_tail_call_proto = {
* It is encouraged to implement bpf_int_jit_compile() instead, so that
* eBPF and implicitly also cBPF can get JITed!
*/
-struct bpf_prog * __weak bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
+struct bpf_prog * __weak bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog,
+ struct bpf_subprog_info *subprog_info)
{
return prog;
}
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index 19056016eed8..62f1823f089e 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -1162,7 +1162,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
func[i]->aux->token = prog->aux->token;
if (!i)
func[i]->aux->exception_boundary = env->seen_exception;
- func[i] = bpf_int_jit_compile(env, func[i]);
+ func[i] = bpf_int_jit_compile(env, func[i], &env->subprog_info[i]);
if (!func[i]->jited) {
err = -ENOTSUPP;
goto out_free;
@@ -1206,7 +1206,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
}
for (i = 0; i < env->subprog_cnt; i++) {
old_bpf_func = func[i]->bpf_func;
- tmp = bpf_int_jit_compile(env, func[i]);
+ tmp = bpf_int_jit_compile(env, func[i], &env->subprog_info[i]);
if (tmp != func[i] || func[i]->bpf_func != old_bpf_func) {
verbose(env, "JIT doesn't support bpf-to-bpf calls\n");
err = -ENOTSUPP;
--
2.53.0-Meta
* [PATCH bpf-next v3 16/24] bpf,x86: Implement JIT support for stack arguments
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (14 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 15/24] bpf: Pass bpf_subprog_info to bpf_int_jit_compile() Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 16:39 ` Alexei Starovoitov
2026-05-11 5:34 ` [PATCH bpf-next v3 17/24] selftests/bpf: Add tests for BPF function " Yonghong Song
` (7 subsequent siblings)
23 siblings, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
Add x86_64 JIT support for BPF functions and kfuncs with more than
5 arguments. The extra arguments are passed through a stack area
addressed by register r11 (BPF_REG_PARAMS) in BPF bytecode,
which the JIT translates to native code.
The JIT follows the x86-64 calling convention for both BPF-to-BPF
and kfunc calls:
- Arg 6 is passed in the R9 register
- Args 7+ are passed on the stack
Incoming arg 6 (BPF r11+8) is translated to a MOV from R9 rather
than a memory load. Incoming args 7+ (BPF r11+16, r11+24, ...) map
directly to [rbp + 16], [rbp + 24], ..., matching the x86-64 stack
layout after CALL + PUSH RBP, so no offset adjustment is needed.
The verifier rejects tail-call-reachable programs that use stack
args, and the JIT disables priv_stack when stack args exist, so R9
is always available. When BPF bytecode writes to the arg-6 stack
slot (offset -8), the JIT emits a MOV into R9 instead of a memory store.
Outgoing args 7+ are placed at [rsp] in a pre-allocated area below
callee-saved registers, using:
native_off = outgoing_arg_base - outgoing_rsp - bpf_off - 16
The native x86_64 stack layout with stack arguments:
high address
+-------------------------+
| incoming stack arg N | [rbp + 16 + (N-7)*8] (from caller)
| ... |
| incoming stack arg 7 | [rbp + 16]
+-------------------------+
| return address | [rbp + 8]
| saved rbp | [rbp]
+-------------------------+
| BPF program stack | (round_up(stack_depth, 8) bytes)
+-------------------------+
| callee-saved regs | (r12, rbx, r13, r14, r15 as needed)
+-------------------------+
| outgoing arg M | [rsp + (M-7)*8]
| ... |
| outgoing arg 7 | [rsp]
+-------------------------+ rsp
low address
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
arch/x86/net/bpf_jit_comp.c | 160 ++++++++++++++++++++++++++++++++++--
1 file changed, 152 insertions(+), 8 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index e9718faa0124..900ff5318729 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -10,6 +10,7 @@
#include <linux/if_vlan.h>
#include <linux/bitfield.h>
#include <linux/bpf.h>
+#include <linux/bpf_verifier.h>
#include <linux/memory.h>
#include <linux/sort.h>
#include <asm/extable.h>
@@ -390,6 +391,34 @@ static void pop_callee_regs(u8 **pprog, bool *callee_regs_used)
*pprog = prog;
}
+/* add rsp, depth */
+static void emit_add_rsp(u8 **pprog, u16 depth)
+{
+ u8 *prog = *pprog;
+
+ if (!depth)
+ return;
+ if (is_imm8(depth))
+ EMIT4(0x48, 0x83, 0xC4, depth); /* add rsp, imm8 */
+ else
+ EMIT3_off32(0x48, 0x81, 0xC4, depth); /* add rsp, imm32 */
+ *pprog = prog;
+}
+
+/* sub rsp, depth */
+static void emit_sub_rsp(u8 **pprog, u16 depth)
+{
+ u8 *prog = *pprog;
+
+ if (!depth)
+ return;
+ if (is_imm8(depth))
+ EMIT4(0x48, 0x83, 0xEC, depth); /* sub rsp, imm8 */
+ else
+ EMIT3_off32(0x48, 0x81, 0xEC, depth); /* sub rsp, imm32 */
+ *pprog = prog;
+}
+
static void emit_nops(u8 **pprog, int len)
{
u8 *prog = *pprog;
@@ -1649,7 +1678,8 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
return 0;
}
-static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *addrs, u8 *image,
+static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog,
+ struct bpf_subprog_info *subprog_info, int *addrs, u8 *image,
u8 *rw_image, int oldproglen, struct jit_context *ctx, bool jmp_padding)
{
bool tail_call_reachable = bpf_prog->aux->tail_call_reachable;
@@ -1659,21 +1689,48 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
bool seen_exit = false;
u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
void __percpu *priv_frame_ptr = NULL;
+ u16 out_stack_arg_cnt, outgoing_rsp;
u64 arena_vm_start, user_vm_start;
void __percpu *priv_stack_ptr;
int i, excnt = 0;
int ilen, proglen = 0;
u8 *ip, *prog = temp;
u32 stack_depth;
+ int callee_saved_size;
+ s32 outgoing_arg_base;
int err;
stack_depth = bpf_prog->aux->stack_depth;
+ out_stack_arg_cnt = subprog_info ?
+ subprog_info->stack_arg_cnt - bpf_in_stack_arg_cnt(subprog_info) : 0;
priv_stack_ptr = bpf_prog->aux->priv_stack_ptr;
if (priv_stack_ptr) {
priv_frame_ptr = priv_stack_ptr + PRIV_STACK_GUARD_SZ + round_up(stack_depth, 8);
stack_depth = 0;
}
+ /*
+ * Follow x86-64 calling convention for both BPF-to-BPF and
+ * kfunc calls:
+ * - Arg 6 is passed in R9 register
+ * - Args 7+ are passed on the stack at [rsp]
+ *
+ * Incoming arg 6 is read from R9 (BPF r11+8 → MOV from R9).
+ * Incoming args 7+ are read from [rbp + 16], [rbp + 24], ...
+ * (BPF r11+16, r11+24, ... map directly with no offset change).
+ *
+ * tail_call_reachable is rejected by the verifier and priv_stack
+ * is disabled by the JIT when stack args exist, so R9 is always
+ * available.
+ *
+ * Stack layout (high to low):
+ * [rbp + 16 + ...] incoming stack args 7+ (from caller)
+ * [rbp + 8] return address
+ * [rbp] saved rbp
+ * [rbp - prog_stack] program stack
+ * [below] callee-saved regs
+ * [below] outgoing args 7+ (= rsp)
+ */
arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);
@@ -1700,6 +1757,42 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
push_r12(&prog);
push_callee_regs(&prog, callee_regs_used);
}
+
+ /* Compute callee-saved register area size. */
+ callee_saved_size = 0;
+ if (bpf_prog->aux->exception_boundary || arena_vm_start)
+ callee_saved_size += 8; /* r12 */
+ if (bpf_prog->aux->exception_boundary) {
+ callee_saved_size += 4 * 8; /* rbx, r13, r14, r15 */
+ } else {
+ int j;
+
+ for (j = 0; j < 4; j++)
+ if (callee_regs_used[j])
+ callee_saved_size += 8;
+ }
+ /*
+ * Base offset from rbp for translating BPF outgoing args 7+
+ * to native offsets. BPF uses negative offsets from r11
+ * (r11-8 for arg6, r11-16 for arg7, ...) while x86 uses
+ * positive offsets from rsp ([rsp+0] for arg7, [rsp+8] for
+ * arg8, ...). Arg 6 goes to R9 directly.
+ *
+ * The translation reverses direction:
+ * native_off = outgoing_arg_base - outgoing_rsp - bpf_off - 16
+ *
+ * Note that tail_call_reachable is guaranteed to be false when
+ * stack args exist, so tcc pushes need not be accounted for.
+ */
+ outgoing_arg_base = -(round_up(stack_depth, 8) + callee_saved_size);
+
+ /*
+ * Allocate outgoing stack arg area for args 7+ only.
+ * Arg 6 goes into r9 register, not on stack.
+ */
+ outgoing_rsp = out_stack_arg_cnt > 1 ? (out_stack_arg_cnt - 1) * 8 : 0;
+ emit_sub_rsp(&prog, outgoing_rsp);
+
if (arena_vm_start)
emit_mov_imm64(&prog, X86_REG_R12,
arena_vm_start >> 32, (u32) arena_vm_start);
@@ -1721,7 +1814,7 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
u8 b2 = 0, b3 = 0;
u8 *start_of_ldx;
s64 jmp_offset;
- s16 insn_off;
+ s32 insn_off;
u8 jmp_cond;
u8 *func;
int nops;
@@ -2134,12 +2227,27 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
EMIT1(0xC7);
goto st;
case BPF_ST | BPF_MEM | BPF_DW:
+ if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
+ /* Arg 6: store immediate in r9 register */
+ emit_mov_imm64(&prog, X86_REG_R9, imm32 >> 31, (u32)imm32);
+ break;
+ }
EMIT2(add_1mod(0x48, dst_reg), 0xC7);
-st: if (is_imm8(insn->off))
- EMIT2(add_1reg(0x40, dst_reg), insn->off);
+st: insn_off = insn->off;
+ if (dst_reg == BPF_REG_PARAMS) {
+ /*
+ * Args 7+: reverse BPF negative offsets to
+ * x86 positive rsp offsets.
+ * BPF off=-16 → [rsp+0], off=-24 → [rsp+8], ...
+ */
+ insn_off = outgoing_arg_base - outgoing_rsp - insn_off - 16;
+ dst_reg = BPF_REG_FP;
+ }
+ if (is_imm8(insn_off))
+ EMIT2(add_1reg(0x40, dst_reg), insn_off);
else
- EMIT1_off32(add_1reg(0x80, dst_reg), insn->off);
+ EMIT1_off32(add_1reg(0x80, dst_reg), insn_off);
EMIT(imm32, bpf_size_to_x86_bytes(BPF_SIZE(insn->code)));
break;
@@ -2149,7 +2257,17 @@ st: if (is_imm8(insn->off))
case BPF_STX | BPF_MEM | BPF_H:
case BPF_STX | BPF_MEM | BPF_W:
case BPF_STX | BPF_MEM | BPF_DW:
- emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
+ if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
+ /* Arg 6: store register value in r9 */
+ EMIT_mov(X86_REG_R9, src_reg);
+ break;
+ }
+ insn_off = insn->off;
+ if (dst_reg == BPF_REG_PARAMS) {
+ insn_off = outgoing_arg_base - outgoing_rsp - insn_off - 16;
+ dst_reg = BPF_REG_FP;
+ }
+ emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn_off);
break;
case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
@@ -2248,6 +2366,19 @@ st: if (is_imm8(insn->off))
case BPF_LDX | BPF_PROBE_MEMSX | BPF_H:
case BPF_LDX | BPF_PROBE_MEMSX | BPF_W:
insn_off = insn->off;
+ if (src_reg == BPF_REG_PARAMS) {
+ if (insn_off == 8) {
+ /* Incoming arg 6: read from r9 */
+ EMIT_mov(dst_reg, X86_REG_R9);
+ break;
+ }
+ src_reg = BPF_REG_FP;
+ /*
+ * Incoming args 7+: native_off == bpf_off
+ * (r11+16 → [rbp+16], r11+24 → [rbp+24], ...)
+ * No offset adjustment needed.
+ */
+ }
if (BPF_MODE(insn->code) == BPF_PROBE_MEM ||
BPF_MODE(insn->code) == BPF_PROBE_MEMSX) {
@@ -2736,6 +2867,8 @@ st: if (is_imm8(insn->off))
if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog))
return -EINVAL;
}
+ /* Deallocate outgoing args 7+ area. */
+ emit_add_rsp(&prog, outgoing_rsp);
if (bpf_prog->aux->exception_boundary) {
pop_callee_regs(&prog, all_callee_regs_used);
pop_r12(&prog);
@@ -3744,7 +3877,12 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
prog->aux->jit_data = jit_data;
}
priv_stack_ptr = prog->aux->priv_stack_ptr;
- if (!priv_stack_ptr && prog->aux->jits_use_priv_stack) {
+ /*
+ * x86-64 uses R9 for both private stack frame pointer and arg 6,
+ * so disable private stack when stack args are present.
+ */
+ if (!priv_stack_ptr && prog->aux->jits_use_priv_stack &&
+ (!subprog_info || subprog_info->stack_arg_cnt == bpf_in_stack_arg_cnt(subprog_info))) {
/* Allocate actual private stack size with verifier-calculated
* stack size plus two memory guards to protect overflow and
* underflow.
@@ -3794,7 +3932,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
for (pass = 0; pass < MAX_PASSES || image; pass++) {
if (!padding && pass >= PADDING_PASSES)
padding = true;
- proglen = do_jit(env, prog, addrs, image, rw_image, oldproglen, &ctx, padding);
+ proglen = do_jit(env, prog, subprog_info, addrs, image, rw_image, oldproglen,
+ &ctx, padding);
if (proglen <= 0) {
out_image:
image = NULL;
@@ -3911,6 +4050,11 @@ bool bpf_jit_supports_kfunc_call(void)
return true;
}
+bool bpf_jit_supports_stack_args(void)
+{
+ return true;
+}
+
void *bpf_arch_text_copy(void *dst, void *src, size_t len)
{
if (text_poke_copy(dst, src, len) == NULL)
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 50+ messages in thread

* [PATCH bpf-next v3 17/24] selftests/bpf: Add tests for BPF function stack arguments
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (15 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 16/24] bpf,x86: Implement JIT support for stack arguments Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 18/24] selftests/bpf: Add tests for stack argument validation Yonghong Song
` (6 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
Add selftests covering stack argument passing for both BPF-to-BPF
subprog calls and kfunc calls with more than 5 arguments. All tests
are guarded by __BPF_FEATURE_STACK_ARGUMENT and __TARGET_ARCH_x86.
BPF-to-BPF subprog call tests (stack_arg.c):
- Scalar stack args
- Pointer stack args
- Mixed pointer/scalar stack args
- Nested calls
- Dynptr stack arg
- Two callees with different stack arg counts
- Async callback
Kfunc call tests (stack_arg_kfunc.c, with bpf_testmod kfuncs):
- Scalar stack args
- Pointer stack args
- Mixed pointer/scalar stack args
- Dynptr stack arg
- Memory buffer + size pair
- Iterator
- Const string pointer
- Timer pointer
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../selftests/bpf/prog_tests/stack_arg.c | 139 ++++++++++
tools/testing/selftests/bpf/progs/stack_arg.c | 252 ++++++++++++++++++
.../selftests/bpf/progs/stack_arg_kfunc.c | 163 +++++++++++
.../selftests/bpf/test_kmods/bpf_testmod.c | 65 +++++
.../bpf/test_kmods/bpf_testmod_kfunc.h | 20 +-
5 files changed, 638 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
diff --git a/tools/testing/selftests/bpf/prog_tests/stack_arg.c b/tools/testing/selftests/bpf/prog_tests/stack_arg.c
new file mode 100644
index 000000000000..d61bac33f809
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/stack_arg.c
@@ -0,0 +1,139 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include <network_helpers.h>
+#include "stack_arg.skel.h"
+#include "stack_arg_kfunc.skel.h"
+
+static void run_subtest(struct bpf_program *prog, int expected)
+{
+ int err, prog_fd;
+ LIBBPF_OPTS(bpf_test_run_opts, topts,
+ .data_in = &pkt_v4,
+ .data_size_in = sizeof(pkt_v4),
+ .repeat = 1,
+ );
+
+ prog_fd = bpf_program__fd(prog);
+ err = bpf_prog_test_run_opts(prog_fd, &topts);
+ ASSERT_OK(err, "test_run");
+ ASSERT_EQ(topts.retval, expected, "retval");
+}
+
+static void test_global_many(void)
+{
+ struct stack_arg *skel;
+
+ skel = stack_arg__open();
+ if (!ASSERT_OK_PTR(skel, "open"))
+ return;
+
+ if (!skel->rodata->has_stack_arg) {
+ test__skip();
+ goto out;
+ }
+
+ if (!ASSERT_OK(stack_arg__load(skel), "load"))
+ goto out;
+
+ run_subtest(skel->progs.test_global_many_args, 36);
+
+out:
+ stack_arg__destroy(skel);
+}
+
+static void test_async_cb_many(void)
+{
+ struct stack_arg *skel;
+
+ skel = stack_arg__open();
+ if (!ASSERT_OK_PTR(skel, "open"))
+ return;
+
+ if (!skel->rodata->has_stack_arg) {
+ test__skip();
+ goto out;
+ }
+
+ if (!ASSERT_OK(stack_arg__load(skel), "load"))
+ goto out;
+
+ run_subtest(skel->progs.test_async_cb_many_args, 0);
+
+ /* Wait for the timer callback to fire and verify the result.
+ * 10+20+30+40+50+60+70+80 = 360
+ */
+ usleep(50);
+ ASSERT_EQ(skel->bss->timer_result, 360, "timer_result");
+
+out:
+ stack_arg__destroy(skel);
+}
+
+static void test_bpf2bpf(void)
+{
+ struct stack_arg *skel;
+
+ skel = stack_arg__open();
+ if (!ASSERT_OK_PTR(skel, "open"))
+ return;
+
+ if (!skel->rodata->has_stack_arg) {
+ test__skip();
+ goto out;
+ }
+
+ if (!ASSERT_OK(stack_arg__load(skel), "load"))
+ goto out;
+
+ run_subtest(skel->progs.test_bpf2bpf_ptr_stack_arg, 45);
+ run_subtest(skel->progs.test_bpf2bpf_mix_stack_args, 51);
+ run_subtest(skel->progs.test_bpf2bpf_nesting_stack_arg, 50);
+ run_subtest(skel->progs.test_bpf2bpf_dynptr_stack_arg, 69);
+ run_subtest(skel->progs.test_two_callees, 91);
+
+out:
+ stack_arg__destroy(skel);
+}
+
+static void test_kfunc(void)
+{
+ struct stack_arg_kfunc *skel;
+
+ skel = stack_arg_kfunc__open();
+ if (!ASSERT_OK_PTR(skel, "open"))
+ return;
+
+ if (!skel->rodata->has_stack_arg) {
+ test__skip();
+ goto out;
+ }
+
+ if (!ASSERT_OK(stack_arg_kfunc__load(skel), "load"))
+ goto out;
+
+ run_subtest(skel->progs.test_stack_arg_scalar, 36);
+ run_subtest(skel->progs.test_stack_arg_ptr, 45);
+ run_subtest(skel->progs.test_stack_arg_mix, 51);
+ run_subtest(skel->progs.test_stack_arg_dynptr, 69);
+ run_subtest(skel->progs.test_stack_arg_mem, 151);
+ run_subtest(skel->progs.test_stack_arg_iter, 115);
+ run_subtest(skel->progs.test_stack_arg_const_str, 15);
+ run_subtest(skel->progs.test_stack_arg_timer, 15);
+
+out:
+ stack_arg_kfunc__destroy(skel);
+}
+
+void test_stack_arg(void)
+{
+ if (test__start_subtest("global_many_args"))
+ test_global_many();
+ if (test__start_subtest("async_cb_many_args"))
+ test_async_cb_many();
+ if (test__start_subtest("bpf2bpf"))
+ test_bpf2bpf();
+ if (test__start_subtest("kfunc"))
+ test_kfunc();
+}
diff --git a/tools/testing/selftests/bpf/progs/stack_arg.c b/tools/testing/selftests/bpf/progs/stack_arg.c
new file mode 100644
index 000000000000..ab6240b997c5
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/stack_arg.c
@@ -0,0 +1,252 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <stdbool.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_kfuncs.h"
+
+#define CLOCK_MONOTONIC 1
+
+struct timer_elem {
+ struct bpf_timer timer;
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(max_entries, 1);
+ __type(key, int);
+ __type(value, struct timer_elem);
+} timer_map SEC(".maps");
+
+int timer_result;
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+const volatile bool has_stack_arg = true;
+
+__noinline static int static_func_many_args(int a, int b, int c, int d,
+ int e, int f, int g, int h)
+{
+ return a + b + c + d + e + f + g + h;
+}
+
+__noinline int global_calls_many_args(int a, int b, int c)
+{
+ return static_func_many_args(a, b, c, 4, 5, 6, 7, 8);
+}
+
+SEC("tc")
+int test_global_many_args(void)
+{
+ return global_calls_many_args(1, 2, 3);
+}
+
+struct test_data {
+ long x;
+ long y;
+};
+
+/* 1 + 2 + 3 + 4 + 5 + 10 + 20 = 45 */
+__noinline static long func_with_ptr_stack_arg(long a, long b, long c, long d,
+ long e, struct test_data *p)
+{
+ return a + b + c + d + e + p->x + p->y;
+}
+
+__noinline long global_ptr_stack_arg(long a, long b, long c, long d, long e)
+{
+ struct test_data data = { .x = 10, .y = 20 };
+
+ return func_with_ptr_stack_arg(a, b, c, d, e, &data);
+}
+
+SEC("tc")
+int test_bpf2bpf_ptr_stack_arg(void)
+{
+ return global_ptr_stack_arg(1, 2, 3, 4, 5);
+}
+
+/* 1 + 2 + 3 + 4 + 5 + 10 + 6 + 20 = 51 */
+__noinline static long func_with_mix_stack_args(long a, long b, long c, long d,
+ long e, struct test_data *p,
+ long f, struct test_data *q)
+{
+ return a + b + c + d + e + p->x + f + q->y;
+}
+
+__noinline long global_mix_stack_args(long a, long b, long c, long d, long e)
+{
+ struct test_data p = { .x = 10 };
+ struct test_data q = { .y = 20 };
+
+ return func_with_mix_stack_args(a, b, c, d, e, &p, e + 1, &q);
+}
+
+SEC("tc")
+int test_bpf2bpf_mix_stack_args(void)
+{
+ return global_mix_stack_args(1, 2, 3, 4, 5);
+}
+
+/*
+ * Nesting test: func_outer calls func_inner, both with struct pointer
+ * as stack arg.
+ *
+ * func_inner: (a+1) + (b+1) + (c+1) + (d+1) + (e+1) + p->x + p->y
+ * = 2 + 3 + 4 + 5 + 6 + 10 + 20 = 50
+ */
+__noinline static long func_inner_ptr(long a, long b, long c, long d,
+ long e, struct test_data *p)
+{
+ return a + b + c + d + e + p->x + p->y;
+}
+
+__noinline static long func_outer_ptr(long a, long b, long c, long d,
+ long e, struct test_data *p)
+{
+ return func_inner_ptr(a + 1, b + 1, c + 1, d + 1, e + 1, p);
+}
+
+__noinline long global_nesting_ptr(long a, long b, long c, long d, long e)
+{
+ struct test_data data = { .x = 10, .y = 20 };
+
+ return func_outer_ptr(a, b, c, d, e, &data);
+}
+
+SEC("tc")
+int test_bpf2bpf_nesting_stack_arg(void)
+{
+ return global_nesting_ptr(1, 2, 3, 4, 5);
+}
+
+/* 1 + 2 + 3 + 4 + 5 + sizeof(pkt_v4) = 15 + 54 = 69 */
+__noinline static long func_with_dynptr(long a, long b, long c, long d,
+ long e, struct bpf_dynptr *ptr)
+{
+ return a + b + c + d + e + bpf_dynptr_size(ptr);
+}
+
+__noinline long global_dynptr_stack_arg(void *ctx __arg_ctx, long a, long b,
+ long c, long d)
+{
+ struct bpf_dynptr ptr;
+
+ bpf_dynptr_from_skb(ctx, 0, &ptr);
+ return func_with_dynptr(a, b, c, d, d + 1, &ptr);
+}
+
+SEC("tc")
+int test_bpf2bpf_dynptr_stack_arg(struct __sk_buff *skb)
+{
+ return global_dynptr_stack_arg(skb, 1, 2, 3, 4);
+}
+
+/* foo1: a+b+c+d+e+f+g+h */
+__noinline static int foo1(int a, int b, int c, int d,
+ int e, int f, int g, int h)
+{
+ return a + b + c + d + e + f + g + h;
+}
+
+/* foo2: a+b+c+d+e+f+g+h+i+j */
+__noinline static int foo2(int a, int b, int c, int d, int e,
+ int f, int g, int h, int i, int j)
+{
+ return a + b + c + d + e + f + g + h + i + j;
+}
+
+/* global_two_callees calls foo1 (3 stack args) and foo2 (5 stack args).
+ * The outgoing stack arg area is sized for foo2 (the larger callee).
+ * Stores for foo1 are a subset of the area used by foo2.
+ * Result: foo1(1,2,3,4,5,6,7,8) + foo2(1,2,3,4,5,6,7,8,9,10) = 36 + 55 = 91
+ *
+ * Pass a-e through so the compiler can't constant-fold the stack args away.
+ */
+__noinline int global_two_callees(int a, int b, int c, int d, int e)
+{
+ int ret;
+
+ ret = foo1(a, b, c, d, e, a + 5, a + 6, a + 7);
+ ret += foo2(a, b, c, d, e, a + 5, a + 6, a + 7, a + 8, a + 9);
+ return ret;
+}
+
+SEC("tc")
+int test_two_callees(void)
+{
+ return global_two_callees(1, 2, 3, 4, 5);
+}
+
+static int timer_cb_many_args(void *map, int *key, struct bpf_timer *timer)
+{
+ timer_result = static_func_many_args(10, 20, 30, 40, 50, 60, 70, 80);
+ return 0;
+}
+
+SEC("tc")
+int test_async_cb_many_args(void)
+{
+ struct timer_elem *elem;
+ int key = 0;
+
+ elem = bpf_map_lookup_elem(&timer_map, &key);
+ if (!elem)
+ return -1;
+
+ bpf_timer_init(&elem->timer, &timer_map, CLOCK_MONOTONIC);
+ bpf_timer_set_callback(&elem->timer, timer_cb_many_args);
+ bpf_timer_start(&elem->timer, 1, 0);
+ return 0;
+}
+
+#else
+
+const volatile bool has_stack_arg = false;
+
+SEC("tc")
+int test_global_many_args(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_bpf2bpf_ptr_stack_arg(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_bpf2bpf_mix_stack_args(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_bpf2bpf_nesting_stack_arg(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_bpf2bpf_dynptr_stack_arg(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_two_callees(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_async_cb_many_args(void)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c b/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
new file mode 100644
index 000000000000..fa9def876ea5
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_kfuncs.h"
+#include "../test_kmods/bpf_testmod_kfunc.h"
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+const volatile bool has_stack_arg = true;
+
+struct bpf_iter_testmod_seq {
+ u64 :64;
+ u64 :64;
+};
+
+extern int bpf_iter_testmod_seq_new(struct bpf_iter_testmod_seq *it, s64 value, int cnt) __ksym;
+extern void bpf_iter_testmod_seq_destroy(struct bpf_iter_testmod_seq *it) __ksym;
+
+struct timer_map_value {
+ struct bpf_timer timer;
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(max_entries, 1);
+ __type(key, int);
+ __type(value, struct timer_map_value);
+} kfunc_timer_map SEC(".maps");
+
+SEC("tc")
+int test_stack_arg_scalar(struct __sk_buff *skb)
+{
+ return bpf_kfunc_call_stack_arg(1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+SEC("tc")
+int test_stack_arg_ptr(struct __sk_buff *skb)
+{
+ struct prog_test_pass1 p = { .x0 = 10, .x1 = 20 };
+
+ return bpf_kfunc_call_stack_arg_ptr(1, 2, 3, 4, 5, &p);
+}
+
+SEC("tc")
+int test_stack_arg_mix(struct __sk_buff *skb)
+{
+ struct prog_test_pass1 p = { .x0 = 10 };
+ struct prog_test_pass1 q = { .x1 = 20 };
+
+ return bpf_kfunc_call_stack_arg_mix(1, 2, 3, 4, 5, &p, 6, &q);
+}
+
+/* 1 + 2 + 3 + 4 + 5 + sizeof(pkt_v4) = 15 + 54 = 69 */
+SEC("tc")
+int test_stack_arg_dynptr(struct __sk_buff *skb)
+{
+ struct bpf_dynptr ptr;
+
+ bpf_dynptr_from_skb(skb, 0, &ptr);
+ return bpf_kfunc_call_stack_arg_dynptr(1, 2, 3, 4, 5, &ptr);
+}
+
+/* 1 + 2 + 3 + 4 + 5 + (1 + 2 + ... + 16) = 15 + 136 = 151 */
+SEC("tc")
+int test_stack_arg_mem(struct __sk_buff *skb)
+{
+ char buf[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
+
+ return bpf_kfunc_call_stack_arg_mem(1, 2, 3, 4, 5, buf, sizeof(buf));
+}
+
+/* 1 + 2 + 3 + 4 + 5 + 100 = 115 */
+SEC("tc")
+int test_stack_arg_iter(struct __sk_buff *skb)
+{
+ struct bpf_iter_testmod_seq it;
+ u64 ret;
+
+ bpf_iter_testmod_seq_new(&it, 100, 10);
+ ret = bpf_kfunc_call_stack_arg_iter(1, 2, 3, 4, 5, &it);
+ bpf_iter_testmod_seq_destroy(&it);
+ return ret;
+}
+
+const char cstr[] = "hello";
+
+/* 1 + 2 + 3 + 4 + 5 = 15 */
+SEC("tc")
+int test_stack_arg_const_str(struct __sk_buff *skb)
+{
+ return bpf_kfunc_call_stack_arg_const_str(1, 2, 3, 4, 5, cstr);
+}
+
+/* 1 + 2 + 3 + 4 + 5 = 15 */
+SEC("tc")
+int test_stack_arg_timer(struct __sk_buff *skb)
+{
+ struct timer_map_value *val;
+ int key = 0;
+
+ val = bpf_map_lookup_elem(&kfunc_timer_map, &key);
+ if (!val)
+ return 0;
+ return bpf_kfunc_call_stack_arg_timer(1, 2, 3, 4, 5, &val->timer);
+}
+
+#else
+
+const volatile bool has_stack_arg = false;
+
+SEC("tc")
+int test_stack_arg_scalar(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_ptr(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_mix(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_dynptr(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_mem(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_iter(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_const_str(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_timer(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index d876314a4d67..aef2f68b7e83 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -825,6 +825,63 @@ __bpf_kfunc int bpf_kfunc_call_test5(u8 a, u16 b, u32 c)
return 0;
}
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg(u64 a, u64 b, u64 c, u64 d,
+ u64 e, u64 f, u64 g, u64 h)
+{
+ return a + b + c + d + e + f + g + h;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_ptr(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct prog_test_pass1 *p)
+{
+ return a + b + c + d + e + p->x0 + p->x1;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_mix(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct prog_test_pass1 *p, u64 f,
+ struct prog_test_pass1 *q)
+{
+ return a + b + c + d + e + p->x0 + f + q->x1;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_dynptr(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct bpf_dynptr *ptr)
+{
+ const struct bpf_dynptr_kern *kern_ptr = (void *)ptr;
+
+ return a + b + c + d + e + (kern_ptr->size & 0xFFFFFF);
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_mem(u64 a, u64 b, u64 c, u64 d, u64 e,
+ void *mem, int mem__sz)
+{
+ const unsigned char *p = mem;
+ u64 sum = a + b + c + d + e;
+ int i;
+
+ for (i = 0; i < mem__sz; i++)
+ sum += p[i];
+ return sum;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_iter(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct bpf_iter_testmod_seq *it__iter)
+{
+ return a + b + c + d + e + it__iter->value;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_const_str(u64 a, u64 b, u64 c, u64 d, u64 e,
+ const char *str__str)
+{
+ return a + b + c + d + e;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_timer(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct bpf_timer *timer)
+{
+ return a + b + c + d + e;
+}
+
static struct prog_test_ref_kfunc prog_test_struct = {
.a = 42,
.b = 108,
@@ -1288,6 +1345,14 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_test2)
BTF_ID_FLAGS(func, bpf_kfunc_call_test3)
BTF_ID_FLAGS(func, bpf_kfunc_call_test4)
BTF_ID_FLAGS(func, bpf_kfunc_call_test5)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_ptr)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_mix)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_dynptr)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_mem)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_iter)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_const_str)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_timer)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail1)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail2)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_acquire, KF_ACQUIRE | KF_RET_NULL)
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
index aa0b8d41e71b..2c1cb118f886 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
@@ -26,6 +26,8 @@ struct prog_test_ref_kfunc {
};
#endif
+struct bpf_iter_testmod_seq;
+
struct prog_test_pass1 {
int x0;
struct {
@@ -111,7 +113,23 @@ int bpf_kfunc_call_test2(struct sock *sk, __u32 a, __u32 b) __ksym;
struct sock *bpf_kfunc_call_test3(struct sock *sk) __ksym;
long bpf_kfunc_call_test4(signed char a, short b, int c, long d) __ksym;
int bpf_kfunc_call_test5(__u8 a, __u16 b, __u32 c) __ksym;
-
+__u64 bpf_kfunc_call_stack_arg(__u64 a, __u64 b, __u64 c, __u64 d,
+ __u64 e, __u64 f, __u64 g, __u64 h) __ksym;
+__u64 bpf_kfunc_call_stack_arg_ptr(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct prog_test_pass1 *p) __ksym;
+__u64 bpf_kfunc_call_stack_arg_mix(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct prog_test_pass1 *p, __u64 f,
+ struct prog_test_pass1 *q) __ksym;
+__u64 bpf_kfunc_call_stack_arg_dynptr(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct bpf_dynptr *ptr) __ksym;
+__u64 bpf_kfunc_call_stack_arg_mem(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ void *mem, int mem__sz) __ksym;
+__u64 bpf_kfunc_call_stack_arg_iter(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct bpf_iter_testmod_seq *it__iter) __ksym;
+__u64 bpf_kfunc_call_stack_arg_const_str(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ const char *str__str) __ksym;
+__u64 bpf_kfunc_call_stack_arg_timer(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct bpf_timer *timer) __ksym;
void bpf_kfunc_call_test_pass_ctx(struct __sk_buff *skb) __ksym;
void bpf_kfunc_call_test_pass1(struct prog_test_pass1 *p) __ksym;
void bpf_kfunc_call_test_pass2(struct prog_test_pass2 *p) __ksym;
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH bpf-next v3 18/24] selftests/bpf: Add tests for stack argument validation
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (16 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 17/24] selftests/bpf: Add tests for BPF function " Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 19/24] selftests/bpf: Add BTF fixup for __naked subprog parameter names Yonghong Song
` (5 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Add negative tests that exercise both the kfunc path (rejecting a
kfunc call that passes a struct larger than 8 bytes by value as a
stack argument) and the verifier itself (rejecting invalid uses of
r11 for stack arguments).
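For illustration (a sketch; the actual tests live in
stack_arg_fail.c below), each stack arg slot holds a single 8-byte
doubleword, so a 16-byte struct passed by value cannot occupy one
slot and the call is rejected:

	struct prog_test_big_arg s = { .a = 1, .b = 2 };  /* 16 bytes */
	/* verifier: "Unrecognized *(R11-8) type STRUCT" */
	bpf_kfunc_call_stack_arg_big(1, 2, 3, 4, 5, s);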
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../selftests/bpf/prog_tests/stack_arg_fail.c | 10 ++
.../selftests/bpf/progs/stack_arg_fail.c | 114 ++++++++++++++++++
.../selftests/bpf/test_kmods/bpf_testmod.c | 7 ++
.../bpf/test_kmods/bpf_testmod_kfunc.h | 8 ++
4 files changed, 139 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_fail.c
diff --git a/tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c b/tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c
new file mode 100644
index 000000000000..090af1330953
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "stack_arg_fail.skel.h"
+
+void test_stack_arg_fail(void)
+{
+ RUN_TESTS(stack_arg_fail);
+}
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_fail.c b/tools/testing/selftests/bpf/progs/stack_arg_fail.c
new file mode 100644
index 000000000000..ad9d4bfe15dc
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/stack_arg_fail.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "../test_kmods/bpf_testmod_kfunc.h"
+#include "bpf_misc.h"
+
+#if defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+SEC("tc")
+__failure __msg("Unrecognized *(R11-8) type STRUCT")
+int test_stack_arg_big(struct __sk_buff *skb)
+{
+ struct prog_test_big_arg s = { .a = 1, .b = 2 };
+
+ return bpf_kfunc_call_stack_arg_big(1, 2, 3, 4, 5, s);
+}
+
+SEC("socket")
+__description("r11 in ALU instruction")
+__failure __msg("R11 is invalid")
+__naked void r11_alu_reject(void)
+{
+ asm volatile (
+ "r11 += 1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 store with non-DW size")
+__failure __msg("R11 is invalid")
+__naked void r11_store_non_dw(void)
+{
+ asm volatile (
+ "*(u32 *)(r11 - 8) = r1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 store with unaligned offset")
+__failure __msg("R11 is invalid")
+__naked void r11_store_unaligned(void)
+{
+ asm volatile (
+ "*(u64 *)(r11 - 4) = r1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 store with positive offset")
+__failure __msg("R11 is invalid")
+__naked void r11_store_positive_off(void)
+{
+ asm volatile (
+ "*(u64 *)(r11 + 8) = r1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 load with negative offset")
+__failure __msg("R11 is invalid")
+__naked void r11_load_negative_off(void)
+{
+ asm volatile (
+ "r0 = *(u64 *)(r11 - 8);"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 load with non-DW size")
+__failure __msg("R11 is invalid")
+__naked void r11_load_non_dw(void)
+{
+ asm volatile (
+ "r0 = *(u32 *)(r11 + 8);"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 store with zero offset")
+__failure __msg("R11 is invalid")
+__naked void r11_store_zero_off(void)
+{
+ asm volatile (
+ "*(u64 *)(r11 + 0) = r1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+#else
+
+SEC("tc")
+__description("stack_arg_fail: not supported, dummy test")
+__success
+int test_stack_arg_big(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index aef2f68b7e83..0be918fe3021 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -882,6 +882,12 @@ __bpf_kfunc u64 bpf_kfunc_call_stack_arg_timer(u64 a, u64 b, u64 c, u64 d, u64 e
return a + b + c + d + e;
}
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_big(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct prog_test_big_arg s)
+{
+ return a + b + c + d + e + s.a + s.b;
+}
+
static struct prog_test_ref_kfunc prog_test_struct = {
.a = 42,
.b = 108,
@@ -1353,6 +1359,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_mem)
BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_iter)
BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_const_str)
BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_timer)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_big)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail1)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail2)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_acquire, KF_ACQUIRE | KF_RET_NULL)
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
index 2c1cb118f886..2edc36b66de9 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
@@ -50,6 +50,11 @@ struct prog_test_pass2 {
} x;
};
+struct prog_test_big_arg {
+ __u64 a;
+ __u64 b;
+};
+
struct prog_test_fail1 {
void *p;
int x;
@@ -130,6 +135,9 @@ __u64 bpf_kfunc_call_stack_arg_const_str(__u64 a, __u64 b, __u64 c, __u64 d, __u
const char *str__str) __ksym;
__u64 bpf_kfunc_call_stack_arg_timer(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
struct bpf_timer *timer) __ksym;
+__u64 bpf_kfunc_call_stack_arg_big(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct prog_test_big_arg s) __ksym;
+
void bpf_kfunc_call_test_pass_ctx(struct __sk_buff *skb) __ksym;
void bpf_kfunc_call_test_pass1(struct prog_test_pass1 *p) __ksym;
void bpf_kfunc_call_test_pass2(struct prog_test_pass2 *p) __ksym;
--
2.53.0-Meta
* [PATCH bpf-next v3 19/24] selftests/bpf: Add BTF fixup for __naked subprog parameter names
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (17 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 18/24] selftests/bpf: Add tests for stack argument validation Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 20/24] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
` (4 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
When __naked subprogs are used in verifier tests, clang drops
parameter names from their BTF FUNC_PROTO entries. This prevents
the verifier from resolving stack argument slots by name.
Add a __btf_func_path(path) annotation that points to a separate
BTF file containing properly-named FUNC entries. The test_loader
matches FUNC entries by name, detects anonymous parameters, and
replaces the FUNC_PROTO with a new one that carries parameter
names from the custom file while preserving the original type IDs.
The custom BTF file also serves as btf_custom_path for kfunc
resolution when no separate btf_custom_path is specified.
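As an illustrative sketch (file and function names here are
hypothetical, not from this series), a verifier test file pairs with
its companion BTF file like:

	/* in progs/verifier_foo.c: __naked subprog, params unnamed in BTF */
	SEC("socket")
	__btf_func_path("btf__verifier_foo.bpf.o")
	__naked void foo_test(void)
	{
		...
	}

where progs/btf__verifier_foo.c defines the same subprogs as plain C
functions with named parameters, from which test_loader copies the
names.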
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
tools/testing/selftests/bpf/progs/bpf_misc.h | 1 +
tools/testing/selftests/bpf/test_loader.c | 136 ++++++++++++++++++-
2 files changed, 136 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index a0d7b15a24b1..9eeb5b0b63d6 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -152,6 +152,7 @@
#define __auxiliary __test_tag("test_auxiliary")
#define __auxiliary_unpriv __test_tag("test_auxiliary_unpriv")
#define __btf_path(path) __test_tag("test_btf_path=" path)
+#define __btf_func_path(path) __test_tag("test_btf_func_path=" path)
#define __arch(arch) __test_tag("test_arch=" arch)
#define __arch_x86_64 __arch("X86_64")
#define __arch_arm64 __arch("ARM64")
diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index ee637809a1d4..abdb9e6e3713 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -63,6 +63,7 @@ struct test_spec {
struct test_subspec priv;
struct test_subspec unpriv;
const char *btf_custom_path;
+ const char *btf_custom_func_path;
int log_level;
int prog_flags;
int mode_mask;
@@ -590,6 +591,8 @@ static int parse_test_spec(struct test_loader *tester,
jit_on_next_line = true;
} else if ((val = str_has_pfx(s, "test_btf_path="))) {
spec->btf_custom_path = val;
+ } else if ((val = str_has_pfx(s, "test_btf_func_path="))) {
+ spec->btf_custom_func_path = val;
} else if ((val = str_has_pfx(s, "test_caps_unpriv="))) {
err = parse_caps(val, &spec->unpriv.caps, "test caps");
if (err)
@@ -1175,6 +1178,123 @@ static int get_stream(int stream_id, int prog_fd, char *text, size_t text_sz)
return ret;
}
+/*
+ * Fix up the program's BTF using BTF from a separate file.
+ *
+ * For __naked subprogs, clang drops parameter names from BTF. Find FUNC
+ * entries with anonymous parameters and replace their FUNC_PROTO with the
+ * properly-named version from the custom file.
+ */
+static int fixup_btf_from_path(struct bpf_object *obj, const char *path)
+{
+ struct btf *prog_btf, *custom_btf;
+ __u32 i, j, cnt, custom_cnt;
+ int err = 0;
+
+ prog_btf = bpf_object__btf(obj);
+ if (!prog_btf)
+ return 0;
+
+ custom_btf = btf__parse(path, NULL);
+ if (!ASSERT_OK_PTR(custom_btf, "parse_custom_btf"))
+ return -EINVAL;
+
+ cnt = btf__type_cnt(prog_btf);
+ custom_cnt = btf__type_cnt(custom_btf);
+
+ /* Fix up FUNC entries with anonymous params.
+ * Save all data from prog_btf BEFORE calling btf__add_*,
+ * since those calls may reallocate the BTF data buffer
+ * and invalidate any pointers obtained from btf__type_by_id.
+ */
+ for (i = 1; i < cnt; i++) {
+ const struct btf_type *t = btf__type_by_id(prog_btf, i);
+ const struct btf_type *fp, *custom_t, *custom_fp;
+ const struct btf_param *params, *custom_params;
+ __u32 ret_type_id, vlen;
+ __u32 *prog_param_types = NULL;
+ const char *name;
+ int new_proto_id;
+
+ if (!btf_is_func(t))
+ continue;
+
+ fp = btf__type_by_id(prog_btf, t->type);
+ if (!fp || !btf_is_func_proto(fp) || btf_vlen(fp) == 0)
+ continue;
+
+ /* Check if any param is anonymous */
+ params = btf_params(fp);
+ if (params[0].name_off != 0)
+ continue;
+
+ /* Find matching FUNC by name in custom BTF */
+ name = btf__name_by_offset(prog_btf, t->name_off);
+ if (!name)
+ continue;
+
+ for (j = 1; j < custom_cnt; j++) {
+ const char *cname;
+
+ custom_t = btf__type_by_id(custom_btf, j);
+ if (!btf_is_func(custom_t))
+ continue;
+ cname = btf__name_by_offset(custom_btf, custom_t->name_off);
+ if (cname && strcmp(name, cname) == 0)
+ break;
+ }
+ if (j >= custom_cnt)
+ continue;
+
+ custom_fp = btf__type_by_id(custom_btf, custom_t->type);
+ if (!custom_fp || !btf_is_func_proto(custom_fp))
+ continue;
+
+ vlen = btf_vlen(fp);
+ if (vlen != btf_vlen(custom_fp))
+ continue;
+
+ /* Save data before btf__add_* calls invalidate pointers */
+ ret_type_id = fp->type;
+ prog_param_types = malloc(vlen * sizeof(*prog_param_types));
+ if (!prog_param_types) {
+ err = -ENOMEM;
+ break;
+ }
+ for (j = 0; j < vlen; j++)
+ prog_param_types[j] = params[j].type;
+
+ /* Add a new FUNC_PROTO: param names from custom, types from prog */
+ new_proto_id = btf__add_func_proto(prog_btf, ret_type_id);
+ if (new_proto_id < 0) {
+ err = new_proto_id;
+ free(prog_param_types);
+ break;
+ }
+
+ custom_params = btf_params(custom_fp);
+ for (j = 0; j < vlen; j++) {
+ const char *pname;
+
+ pname = btf__name_by_offset(custom_btf, custom_params[j].name_off);
+ err = btf__add_func_param(prog_btf, pname ?: "", prog_param_types[j]);
+ if (err)
+ break;
+ }
+ free(prog_param_types);
+ if (err)
+ break;
+
+ /* Update the FUNC to point to the new FUNC_PROTO (re-fetch
+ * since btf__add_* may have reallocated the data buffer).
+ */
+ ((struct btf_type *)btf__type_by_id(prog_btf, i))->type = new_proto_id;
+ }
+
+ btf__free(custom_btf);
+ return err;
+}
+
/* this function is forced noinline and has short generic name to look better
* in test_progs output (in case of a failure)
*/
@@ -1231,13 +1351,27 @@ void run_subtest(struct test_loader *tester,
}
}
- /* Implicitly reset to NULL if next test case doesn't specify */
+ /* Implicitly reset to NULL if next test case doesn't specify.
+ * btf_custom_func_path also serves as btf_custom_path for kfunc resolution.
+ */
open_opts->btf_custom_path = spec->btf_custom_path;
+ if (!open_opts->btf_custom_path)
+ open_opts->btf_custom_path = spec->btf_custom_func_path;
tobj = bpf_object__open_mem(obj_bytes, obj_byte_cnt, open_opts);
if (!ASSERT_OK_PTR(tobj, "obj_open_mem")) /* shouldn't happen */
goto subtest_cleanup;
+ /* Fix up __naked subprog BTF using a separate file with named params */
+ if (spec->btf_custom_func_path) {
+ err = fixup_btf_from_path(tobj, spec->btf_custom_func_path);
+ if (err) {
+ PRINT_FAIL("failed to fixup BTF from %s: %d\n",
+ spec->btf_custom_func_path, err);
+ goto tobj_cleanup;
+ }
+ }
+
i = 0;
bpf_object__for_each_program(tprog_iter, tobj) {
spec_iter = &specs[i++];
--
2.53.0-Meta
* [PATCH bpf-next v3 20/24] selftests/bpf: Add verifier tests for stack argument validation
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (18 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 19/24] selftests/bpf: Add BTF fixup for __naked subprog parameter names Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 5:34 ` [PATCH bpf-next v3 21/24] selftests/bpf: Add precision backtracking test for stack arguments Yonghong Song
` (3 subsequent siblings)
23 siblings, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Add inline-asm based verifier tests that exercise stack argument
validation logic directly.
Positive tests:
- Subprog call with 6 args
- Two sequential calls to different subprogs (6-arg and 7-arg)
- Share an r11 store between both branches
Negative tests — verifier rejection:
- Read from uninitialized incoming stack arg slot
- Gap in outgoing slots: only r11-16 written, r11-8 missing
- Write at r11-80, exceeding max 7 stack args
- Missing store on one branch with a shared store
- First call sets up proper stack arguments; the second call
attempts to reuse them without re-storing, which is rejected
- r11 load ordering issue
Negative tests — pointer/ref tracking:
- Pruning type mismatch: one branch stores PTR_TO_STACK, the
other stores a scalar, callee dereferences — must not prune
- Release invalidation: bpf_sk_release invalidates a socket
pointer stored in a stack arg slot
- Packet pointer invalidation: bpf_skb_pull_data invalidates
a packet pointer stored in a stack arg slot
- Null propagation: PTR_TO_MAP_VALUE_OR_NULL stored in stack
arg slot, null branch attempts dereference via callee
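The basic positive pattern under test looks roughly like this
(a sketch; the exact asm is in verifier_stack_arg.c):

	r1 = 1; r2 = 2; r3 = 3; r4 = 4; r5 = 5;
	r6 = 6;
	*(u64 *)(r11 - 8) = r6;    /* 6th arg: first outgoing slot */
	call subprog_6args;

Since outgoing slots are invalidated after a call, a second call
must re-store its stack arguments rather than reuse the old slots.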
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../selftests/bpf/prog_tests/verifier.c | 4 +
.../bpf/progs/btf__verifier_stack_arg_order.c | 30 ++
.../selftests/bpf/progs/verifier_stack_arg.c | 444 ++++++++++++++++++
.../bpf/progs/verifier_stack_arg_order.c | 86 ++++
4 files changed, 564 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index a96b25ebff23..ee3d929fac8a 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -91,6 +91,8 @@
#include "verifier_sockmap_mutate.skel.h"
#include "verifier_spill_fill.skel.h"
#include "verifier_spin_lock.skel.h"
+#include "verifier_stack_arg.skel.h"
+#include "verifier_stack_arg_order.skel.h"
#include "verifier_stack_ptr.skel.h"
#include "verifier_store_release.skel.h"
#include "verifier_subprog_precision.skel.h"
@@ -238,6 +240,8 @@ void test_verifier_sock_addr(void) { RUN(verifier_sock_addr); }
void test_verifier_sockmap_mutate(void) { RUN(verifier_sockmap_mutate); }
void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); }
void test_verifier_spin_lock(void) { RUN(verifier_spin_lock); }
+void test_verifier_stack_arg(void) { RUN(verifier_stack_arg); }
+void test_verifier_stack_arg_order(void) { RUN(verifier_stack_arg_order); }
void test_verifier_stack_ptr(void) { RUN(verifier_stack_ptr); }
void test_verifier_store_release(void) { RUN(verifier_store_release); }
void test_verifier_subprog_precision(void) { RUN(verifier_subprog_precision); }
diff --git a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
new file mode 100644
index 000000000000..2d5ddb24e241
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+int subprog_bad_order_6args(int a, int b, int c, int d, int e, int f)
+{
+ return a + b + c + d + e + f;
+}
+
+int subprog_call_before_load_6args(int a, int b, int c, int d, int e, int f)
+{
+ return a + b + c + d + e + f;
+}
+
+#else
+
+int subprog_bad_order_6args(void)
+{
+ return 0;
+}
+
+int subprog_call_before_load_6args(void)
+{
+ return 0;
+}
+
+#endif
diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
new file mode 100644
index 000000000000..d38beba6b5e9
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
@@ -0,0 +1,444 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __uint(max_entries, 1);
+ __type(key, long long);
+ __type(value, long long);
+} map_hash_8b SEC(".maps");
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+__noinline __used
+static int subprog_6args(int a, int b, int c, int d, int e, int f)
+{
+ return a + b + c + d + e + f;
+}
+
+__noinline __used
+static int subprog_7args(int a, int b, int c, int d, int e, int f, int g)
+{
+ return a + b + c + d + e + f + g;
+}
+
+__noinline __used
+static long subprog_deref_arg6(long a, long b, long c, long d, long e, long *f)
+{
+ return *f;
+}
+
+SEC("tc")
+__description("stack_arg: subprog with 6 args")
+__success __retval(21)
+__naked void stack_arg_6args(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 6;"
+ "call subprog_6args;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: two subprogs with >5 args")
+__success __retval(90)
+__naked void stack_arg_two_subprogs(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 10;"
+ "call subprog_6args;"
+ "r6 = r0;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 16) = 30;"
+ "*(u64 *)(r11 - 8) = 20;"
+ "call subprog_7args;"
+ "r0 += r6;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: read from uninitialized stack arg slot")
+__failure
+__msg("invalid read from stack arg off 8 depth 0")
+__naked void stack_arg_read_uninitialized(void)
+{
+ asm volatile (
+ "r0 = *(u64 *)(r11 + 8);"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: gap at offset -8, only wrote -16")
+__failure
+__msg("caller expects 7 args, stack arg1 is not initialized")
+__naked void stack_arg_gap_at_minus8(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 16) = 30;"
+ "call subprog_7args;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: pruning with different stack arg types")
+__failure
+__flag(BPF_F_TEST_STATE_FREQ)
+__msg("R{{[0-9]}} invalid mem access 'scalar'")
+__naked void stack_arg_pruning_type_mismatch(void)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "r6 = r0;"
+ /* local = 0 on program stack */
+ "r7 = 0;"
+ "*(u64 *)(r10 - 8) = r7;"
+ /* Branch based on random value */
+ "if r6 s> 3 goto l0_%=;"
+ /* Path 1: store stack pointer to outgoing arg6 */
+ "r1 = r10;"
+ "r1 += -8;"
+ "*(u64 *)(r11 - 8) = r1;"
+ "goto l1_%=;"
+ "l0_%=:"
+ /* Path 2: store scalar to outgoing arg6 */
+ "*(u64 *)(r11 - 8) = 42;"
+ "l1_%=:"
+ /* Call subprog that dereferences arg6 */
+ "r1 = r6;"
+ "r2 = 0;"
+ "r3 = 0;"
+ "r4 = 0;"
+ "r5 = 0;"
+ "call subprog_deref_arg6;"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: release_reference invalidates stack arg slot")
+__failure
+__msg("R{{[0-9]}} invalid mem access 'scalar'")
+__naked void stack_arg_release_ref(void)
+{
+ asm volatile (
+ "r6 = r1;"
+ /* struct bpf_sock_tuple tuple = {} */
+ "r2 = 0;"
+ "*(u32 *)(r10 - 8) = r2;"
+ "*(u64 *)(r10 - 16) = r2;"
+ "*(u64 *)(r10 - 24) = r2;"
+ "*(u64 *)(r10 - 32) = r2;"
+ "*(u64 *)(r10 - 40) = r2;"
+ "*(u64 *)(r10 - 48) = r2;"
+ /* sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof(tuple), 0, 0) */
+ "r1 = r6;"
+ "r2 = r10;"
+ "r2 += -48;"
+ "r3 = %[sizeof_bpf_sock_tuple];"
+ "r4 = 0;"
+ "r5 = 0;"
+ "call %[bpf_sk_lookup_tcp];"
+ /* r0 = sk (PTR_TO_SOCK_OR_NULL) */
+ "if r0 == 0 goto l0_%=;"
+ /* Store sock ref to outgoing arg6 slot */
+ "*(u64 *)(r11 - 8) = r0;"
+ /* Release the reference — invalidates the stack arg slot */
+ "r1 = r0;"
+ "call %[bpf_sk_release];"
+ /* Call subprog that dereferences arg6 — should fail */
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_deref_arg6;"
+ "l0_%=:"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_sk_lookup_tcp),
+ __imm(bpf_sk_release),
+ __imm_const(sizeof_bpf_sock_tuple, sizeof(struct bpf_sock_tuple))
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: pkt pointer in stack arg slot invalidated after pull_data")
+__failure
+__msg("R{{[0-9]}} invalid mem access 'scalar'")
+__naked void stack_arg_stale_pkt_ptr(void)
+{
+ asm volatile (
+ "r6 = r1;"
+ "r7 = *(u32 *)(r6 + %[__sk_buff_data]);"
+ "r8 = *(u32 *)(r6 + %[__sk_buff_data_end]);"
+ /* check pkt has at least 8 bytes */
+ "r0 = r7;"
+ "r0 += 8;"
+ "if r0 > r8 goto l0_%=;"
+ /* Store valid pkt pointer to outgoing arg6 slot */
+ "*(u64 *)(r11 - 8) = r7;"
+ /* bpf_skb_pull_data invalidates all pkt pointers */
+ "r1 = r6;"
+ "r2 = 0;"
+ "call %[bpf_skb_pull_data];"
+ /* Call subprog that dereferences arg6 — should fail */
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_deref_arg6;"
+ "l0_%=:"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_skb_pull_data),
+ __imm_const(__sk_buff_data, offsetof(struct __sk_buff, data)),
+ __imm_const(__sk_buff_data_end, offsetof(struct __sk_buff, data_end))
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: null propagation rejects deref on null branch")
+__failure
+__msg("R{{[0-9]}} invalid mem access 'scalar'")
+__naked void stack_arg_null_propagation_fail(void)
+{
+ asm volatile (
+ "r1 = 0;"
+ "*(u64 *)(r10 - 8) = r1;"
+ /* r0 = bpf_map_lookup_elem(&map_hash_8b, &key) */
+ "r2 = r10;"
+ "r2 += -8;"
+ "r1 = %[map_hash_8b] ll;"
+ "call %[bpf_map_lookup_elem];"
+ /* Store PTR_TO_MAP_VALUE_OR_NULL to outgoing arg6 slot */
+ "*(u64 *)(r11 - 8) = r0;"
+ /* null check on r0 */
+ "if r0 != 0 goto l0_%=;"
+ /*
+ * On null branch, outgoing slot is SCALAR(0).
+ * Call subprog that dereferences arg6 — should fail.
+ */
+ "r1 = 0;"
+ "r2 = 0;"
+ "r3 = 0;"
+ "r4 = 0;"
+ "r5 = 0;"
+ "call subprog_deref_arg6;"
+ "l0_%=:"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_map_lookup_elem),
+ __imm_addr(map_hash_8b)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: missing store on one branch")
+__failure
+__msg("caller expects 7 args, stack arg1 is not initialized")
+__naked void stack_arg_missing_store_one_branch(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ /* Write arg7 (r11-16) before branch */
+ "*(u64 *)(r11 - 16) = 20;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 > 0 goto l0_%=;"
+ /* Path 1: write arg6 and call */
+ "*(u64 *)(r11 - 8) = 10;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "goto l1_%=;"
+ "l0_%=:"
+ /* Path 2: missing arg6 store, call should fail */
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "l1_%=:"
+ "r0 = 0;"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: share a store for both branches")
+__success __retval(0)
+__naked void stack_arg_shared_store(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ /* Write arg7 (r11-16) before branch */
+ "*(u64 *)(r11 - 16) = 20;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 > 0 goto l0_%=;"
+ /* Path 1: write arg6 and call */
+ "*(u64 *)(r11 - 8) = 10;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "goto l1_%=;"
+ "l0_%=:"
+ /* Path 2: also write arg6 and call */
+ "*(u64 *)(r11 - 8) = 30;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "l1_%=:"
+ "r0 = 0;"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: write beyond max outgoing depth")
+__failure
+__msg("stack arg write offset -80 exceeds max 7 stack args")
+__naked void stack_arg_write_beyond_max(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ /* Write to offset -80, way beyond any callee's needs */
+ "*(u64 *)(r11 - 80) = 99;"
+ "*(u64 *)(r11 - 16) = 20;"
+ "*(u64 *)(r11 - 8) = 10;"
+ "call subprog_7args;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: write unused stack arg slot")
+__failure
+__msg("func#0 writes 5 stack arg slots, but calls only require 2")
+__naked void stack_arg_write_unused_slot(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ /* Write to offset -40, unused for the callee */
+ "*(u64 *)(r11 - 40) = 99;"
+ "*(u64 *)(r11 - 16) = 20;"
+ "*(u64 *)(r11 - 8) = 10;"
+ "call subprog_7args;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: sequential calls reuse slots")
+__failure
+__msg("caller expects 7 args, stack arg1 is not initialized")
+__naked void stack_arg_sequential_calls(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 6;"
+ "*(u64 *)(r11 - 16) = 7;"
+ "call subprog_7args;"
+ "r6 = r0;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "r0 += r6;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+#else
+
+SEC("socket")
+__description("stack_arg is not supported by compiler or jit, use a dummy test")
+__success
+int dummy_test(void)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
new file mode 100644
index 000000000000..671c79969c6c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+__noinline __used __naked
+static int subprog_bad_order_6args(int a, int b, int c, int d, int e, int f)
+{
+ asm volatile (
+ "*(u64 *)(r11 - 8) = r1;"
+ "r0 = *(u64 *)(r11 + 8);"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: r11 load after r11 store")
+__failure
+__msg("r11 load must be before any r11 store or call insn")
+__btf_func_path("btf__verifier_stack_arg_order.bpf.o")
+__naked void stack_arg_load_after_store(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 6;"
+ "call subprog_bad_order_6args;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+__noinline __used __naked
+static int subprog_call_before_load_6args(int a, int b, int c, int d, int e,
+ int f)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "r0 = *(u64 *)(r11 + 8);"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: r11 load after a call")
+__failure
+__msg("r11 load must be before any r11 store or call insn")
+__btf_func_path("btf__verifier_stack_arg_order.bpf.o")
+__naked void stack_arg_load_after_call(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 6;"
+ "call subprog_call_before_load_6args;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+#else
+
+SEC("socket")
+__description("stack_arg order is not supported by compiler or jit, use a dummy test")
+__success
+int dummy_test(void)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
--
2.53.0-Meta
* [PATCH bpf-next v3 21/24] selftests/bpf: Add precision backtracking test for stack arguments
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (19 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 20/24] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
@ 2026-05-11 5:34 ` Yonghong Song
2026-05-11 5:35 ` [PATCH bpf-next v3 22/24] bpf, arm64: Map BPF_REG_0 to x8 instead of x7 Yonghong Song
` (2 subsequent siblings)
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:34 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Add a test that verifies precision backtracking works correctly
across BPF-to-BPF calls when stack arguments are involved.
The test passes a size value as incoming stack arg (arg6) to a
subprog, which forwards it as the mem__sz parameter (outgoing arg7)
to bpf_kfunc_call_stack_arg_mem. The expected __msg annotations
verify that precision propagates from the kfunc's mem__sz argument
back through the subprog frame to the caller's outgoing stack arg
store.
A companion BTF file (btf__stack_arg_precision.c) provides named
parameter BTF for the __naked subprog via __btf_func_path.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../bpf/prog_tests/stack_arg_precision.c | 10 ++
.../bpf/progs/btf__stack_arg_precision.c | 23 +++
.../selftests/bpf/progs/stack_arg_precision.c | 134 ++++++++++++++++++
3 files changed, 167 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_precision.c
diff --git a/tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c b/tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c
new file mode 100644
index 000000000000..1ab041d66de3
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "stack_arg_precision.skel.h"
+
+void test_stack_arg_precision(void)
+{
+ RUN_TESTS(stack_arg_precision);
+}
diff --git a/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c b/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
new file mode 100644
index 000000000000..296fddfe6804
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "../test_kmods/bpf_testmod_kfunc.h"
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+long subprog_call_mem_kfunc(long a, long b, long c, long d, long e, long size)
+{
+ char buf[8] = {};
+
+ return bpf_kfunc_call_stack_arg_mem(a, b, c, d, e, buf, size);
+}
+
+#else
+
+long subprog_call_mem_kfunc(void)
+{
+ return 0;
+}
+
+#endif
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_precision.c b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
new file mode 100644
index 000000000000..2a0a344c83ca
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "../test_kmods/bpf_testmod_kfunc.h"
+#include "bpf_misc.h"
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+/* Force kfunc extern BTF generation for inline asm call below.
+ * Uses its own SEC so it's not included as a .text subprog.
+ * The '?' prefix sets autoload=false so libbpf won't load it.
+ */
+SEC("?tc")
+int __btf_kfunc_gen(struct __sk_buff *ctx)
+{
+ char buf[8] = {};
+
+ return bpf_kfunc_call_stack_arg_mem(0, 0, 0, 0, 0, buf, sizeof(buf));
+}
+
+/*
+ * Test precision backtracking across bpf-to-bpf call for kfunc stack arg.
+ * subprog_call_mem_kfunc receives a size as incoming stack arg (arg6)
+ * and forwards it as mem__sz (arg7) to bpf_kfunc_call_stack_arg_mem.
+ */
+__naked __noinline __used
+static long subprog_call_mem_kfunc(long a, long b, long c, long d, long e, long size)
+{
+ asm volatile (
+ "r1 = *(u64 *)(r11 + 8);" /* r1 = incoming arg6 (size) */
+ "r2 = 0x0807060504030201 ll;" /* r2 = buf contents */
+ "*(u64 *)(r10 - 8) = r2;" /* store buf to stack */
+ "r2 = r10;"
+ "r2 += -8;" /* r2 = &buf */
+ "*(u64 *)(r11 - 8) = r2;" /* outgoing arg6 = buf */
+ "*(u64 *)(r11 - 16) = r1;" /* outgoing arg7 = size */
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call %[bpf_kfunc_call_stack_arg_mem];"
+ "exit;"
+ :
+ : __imm(bpf_kfunc_call_stack_arg_mem)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: precision backtracking across bpf2bpf call for kfunc")
+__success
+__log_level(2)
+__flag(BPF_F_TEST_STATE_FREQ)
+__btf_func_path("btf__stack_arg_precision.bpf.o")
+__msg("mark_precise: frame1: last_idx 26 first_idx 13 subseq_idx -1")
+__msg("mark_precise: frame1: regs= stack= before 25: (b7) r5 = 5")
+__msg("mark_precise: frame1: regs= stack= before 24: (b7) r4 = 4")
+__msg("mark_precise: frame1: regs= stack= before 23: (b7) r3 = 3")
+__msg("mark_precise: frame1: regs= stack= before 22: (b7) r2 = 2")
+__msg("mark_precise: frame1: regs= stack= before 21: (b7) r1 = 1")
+__msg("mark_precise: frame1: regs= stack= before 20: (7b) *(u64 *)(r11 -16) = r1")
+__msg("mark_precise: frame1: regs=r1 stack= before 19: (7b) *(u64 *)(r11 -8) = r2")
+__msg("mark_precise: frame1: regs=r1 stack= before 18: (07) r2 += -8")
+__msg("mark_precise: frame1: regs=r1 stack= before 17: (bf) r2 = r10")
+__msg("mark_precise: frame1: regs=r1 stack= before 16: (7b) *(u64 *)(r10 -8) = r2")
+__msg("mark_precise: frame1: regs=r1 stack= before 14: (18) r2 = 0x807060504030201")
+__msg("mark_precise: frame1: regs=r1 stack= before 13: (79) r1 = *(u64 *)(r11 +8)")
+__msg("mark_precise: frame1: parent state regs= stack=: frame1: R10=fp0")
+__msg("mark_precise: frame0: parent state regs= stack=: R10=fp0")
+__msg("mark_precise: frame1: last_idx 11 first_idx 11 subseq_idx 13")
+__msg("mark_precise: frame1: regs= stack= before 11: (85) call pc+1")
+__msg("mark_precise: frame0: parent state regs= stack=: R1=1 R2=2 R3=3 R4=4 R5=5 R10=fp0")
+__msg("mark_precise: frame0: last_idx 9 first_idx 7 subseq_idx 11")
+__msg("mark_precise: frame0: regs= stack= before 9: (05) goto pc+1")
+__msg("mark_precise: frame0: regs= stack= before 8: (7a) *(u64 *)(r11 -8) = 4")
+__msg("mark_precise: frame1: last_idx 26 first_idx 13 subseq_idx -1 ")
+__msg("mark_precise: frame1: regs= stack= before 25: (b7) r5 = 5")
+__msg("mark_precise: frame1: regs= stack= before 24: (b7) r4 = 4")
+__msg("mark_precise: frame1: regs= stack= before 23: (b7) r3 = 3")
+__msg("mark_precise: frame1: regs= stack= before 22: (b7) r2 = 2")
+__msg("mark_precise: frame1: regs= stack= before 21: (b7) r1 = 1")
+__msg("mark_precise: frame1: regs= stack= before 20: (7b) *(u64 *)(r11 -16) = r1")
+__msg("mark_precise: frame1: regs=r1 stack= before 19: (7b) *(u64 *)(r11 -8) = r2")
+__msg("mark_precise: frame1: regs=r1 stack= before 18: (07) r2 += -8")
+__msg("mark_precise: frame1: regs=r1 stack= before 17: (bf) r2 = r10")
+__msg("mark_precise: frame1: regs=r1 stack= before 16: (7b) *(u64 *)(r10 -8) = r2")
+__msg("mark_precise: frame1: regs=r1 stack= before 14: (18) r2 = 0x807060504030201")
+__msg("mark_precise: frame1: regs=r1 stack= before 13: (79) r1 = *(u64 *)(r11 +8)")
+__msg("mark_precise: frame1: parent state regs= stack=: frame1: R10=fp0")
+__msg("mark_precise: frame0: parent state regs= stack=: R10=fp0")
+__msg("mark_precise: frame1: last_idx 11 first_idx 11 subseq_idx 13 ")
+__msg("mark_precise: frame1: regs= stack= before 11: (85) call pc+1")
+__msg("mark_precise: frame0: parent state regs= stack=: R1=1 R2=2 R3=3 R4=4 R5=5 R10=fp0")
+__msg("mark_precise: frame0: last_idx 10 first_idx 10 subseq_idx 11 ")
+__msg("mark_precise: frame0: regs= stack= before 10: (7a) *(u64 *)(r11 -8) = 6")
+__naked void stack_arg_precision_bpf2bpf(void)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "r6 = r0;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "if r6 < 2 goto l0_%=;"
+ "*(u64 *)(r11 - 8) = 4;"
+ "goto l1_%=;"
+ "l0_%=:"
+ "*(u64 *)(r11 - 8) = 6;"
+ "l1_%=:"
+ "call subprog_call_mem_kfunc;"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+#else
+
+SEC("socket")
+__description("stack_arg_precision: not supported, dummy test")
+__success
+int dummy_test(void)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
--
2.53.0-Meta
* [PATCH bpf-next v3 22/24] bpf, arm64: Map BPF_REG_0 to x8 instead of x7
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (20 preceding siblings ...)
2026-05-11 5:34 ` [PATCH bpf-next v3 21/24] selftests/bpf: Add precision backtracking test for stack arguments Yonghong Song
@ 2026-05-11 5:35 ` Yonghong Song
2026-05-11 5:35 ` [PATCH bpf-next v3 23/24] bpf, arm64: Add JIT support for stack arguments Yonghong Song
2026-05-11 5:35 ` [PATCH bpf-next v3 24/24] selftests/bpf: Enable stack argument tests for arm64 Yonghong Song
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:35 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
From: Puranjay Mohan <puranjay@kernel.org>
Move the BPF return value register from x7 to x8, freeing x7 for use
as an argument register. AAPCS64 designates x8 as the indirect result
location register; it is caller-saved and not used for argument
passing, making it a suitable home for BPF_REG_0.
This is a prerequisite for stack argument support, which needs x5-x7
to pass arguments 6-8 to native kfuncs following the AAPCS64 calling
convention.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
arch/arm64/net/bpf_jit_comp.c | 4 ++--
arch/arm64/net/bpf_timed_may_goto.S | 8 ++++----
.../testing/selftests/bpf/progs/verifier_jit_inline.c | 2 +-
tools/testing/selftests/bpf/progs/verifier_ldsx.c | 6 +++---
.../selftests/bpf/progs/verifier_private_stack.c | 10 +++++-----
5 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index c9bdeef31ab9..b7bf3476e2ad 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -47,7 +47,7 @@
/* Map BPF registers to A64 registers */
static const int bpf2a64[] = {
/* return value from in-kernel function, and exit value from eBPF */
- [BPF_REG_0] = A64_R(7),
+ [BPF_REG_0] = A64_R(8),
/* arguments from eBPF program to in-kernel function */
[BPF_REG_1] = A64_R(0),
[BPF_REG_2] = A64_R(1),
@@ -1048,7 +1048,7 @@ static void build_epilogue(struct jit_ctx *ctx, bool was_classic)
/* Restore FP/LR registers */
emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
- /* Move the return value from bpf:r0 (aka x7) to x0 */
+ /* Move the return value from bpf:r0 (aka x8) to x0 */
emit(A64_MOV(1, A64_R(0), r0), ctx);
/* Authenticate lr */
diff --git a/arch/arm64/net/bpf_timed_may_goto.S b/arch/arm64/net/bpf_timed_may_goto.S
index 894cfcd7b241..a9a802711a7f 100644
--- a/arch/arm64/net/bpf_timed_may_goto.S
+++ b/arch/arm64/net/bpf_timed_may_goto.S
@@ -8,8 +8,8 @@ SYM_FUNC_START(arch_bpf_timed_may_goto)
stp x29, x30, [sp, #-64]!
mov x29, sp
- /* Save BPF registers R0 - R5 (x7, x0-x4)*/
- stp x7, x0, [sp, #16]
+ /* Save BPF registers R0 - R5 (x8, x0-x4)*/
+ stp x8, x0, [sp, #16]
stp x1, x2, [sp, #32]
stp x3, x4, [sp, #48]
@@ -28,8 +28,8 @@ SYM_FUNC_START(arch_bpf_timed_may_goto)
/* BPF_REG_AX(x9) will be stored into count, so move return value to it. */
mov x9, x0
- /* Restore BPF registers R0 - R5 (x7, x0-x4) */
- ldp x7, x0, [sp, #16]
+ /* Restore BPF registers R0 - R5 (x8, x0-x4) */
+ ldp x8, x0, [sp, #16]
ldp x1, x2, [sp, #32]
ldp x3, x4, [sp, #48]
diff --git a/tools/testing/selftests/bpf/progs/verifier_jit_inline.c b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c
index 4ea254063646..885ff69a3a62 100644
--- a/tools/testing/selftests/bpf/progs/verifier_jit_inline.c
+++ b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c
@@ -9,7 +9,7 @@ __success __retval(0)
__arch_x86_64
__jited(" addq %gs:{{.*}}, %rax")
__arch_arm64
-__jited(" mrs x7, SP_EL0")
+__jited(" mrs x8, SP_EL0")
int inline_bpf_get_current_task(void)
{
bpf_get_current_task();
diff --git a/tools/testing/selftests/bpf/progs/verifier_ldsx.c b/tools/testing/selftests/bpf/progs/verifier_ldsx.c
index 1026524a1983..41340877dc9d 100644
--- a/tools/testing/selftests/bpf/progs/verifier_ldsx.c
+++ b/tools/testing/selftests/bpf/progs/verifier_ldsx.c
@@ -274,11 +274,11 @@ __jited("movslq 0x10(%rdi,%r12), %r15")
__jited("movswq 0x18(%rdi,%r12), %r15")
__jited("movsbq 0x20(%rdi,%r12), %r15")
__arch_arm64
-__jited("add x11, x7, x28")
+__jited("add x11, x8, x28")
__jited("ldrsw x21, [x11, #0x10]")
-__jited("add x11, x7, x28")
+__jited("add x11, x8, x28")
__jited("ldrsh x21, [x11, #0x18]")
-__jited("add x11, x7, x28")
+__jited("add x11, x8, x28")
__jited("ldrsb x21, [x11, #0x20]")
__jited("add x11, x0, x28")
__jited("ldrsw x22, [x11, #0x10]")
diff --git a/tools/testing/selftests/bpf/progs/verifier_private_stack.c b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
index 646e8ef82051..c5078face38d 100644
--- a/tools/testing/selftests/bpf/progs/verifier_private_stack.c
+++ b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
@@ -170,12 +170,12 @@ __jited(" mrs x10, TPIDR_EL{{[0-1]}}")
__jited(" add x27, x27, x10")
__jited(" add x25, x27, {{.*}}")
__jited(" bl 0x{{.*}}")
-__jited(" mov x7, x0")
+__jited(" mov x8, x0")
__jited(" mov x0, #0x2a")
__jited(" str x0, [x27]")
__jited(" bl 0x{{.*}}")
-__jited(" mov x7, x0")
-__jited(" mov x7, #0x0")
+__jited(" mov x8, x0")
+__jited(" mov x8, #0x0")
__jited(" ldp x25, x27, [sp], {{.*}}")
__naked void private_stack_callback(void)
{
@@ -220,7 +220,7 @@ __jited(" mov x0, #0x2a")
__jited(" str x0, [x27]")
__jited(" mov x0, #0x0")
__jited(" bl 0x{{.*}}")
-__jited(" mov x7, x0")
+__jited(" mov x8, x0")
__jited(" ldp x27, x28, [sp], #0x10")
int private_stack_exception_main_prog(void)
{
@@ -258,7 +258,7 @@ __jited(" add x25, x27, {{.*}}")
__jited(" mov x0, #0x2a")
__jited(" str x0, [x27]")
__jited(" bl 0x{{.*}}")
-__jited(" mov x7, x0")
+__jited(" mov x8, x0")
__jited(" ldp x27, x28, [sp], #0x10")
int private_stack_exception_sub_prog(void)
{
--
2.53.0-Meta
* [PATCH bpf-next v3 23/24] bpf, arm64: Add JIT support for stack arguments
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (21 preceding siblings ...)
2026-05-11 5:35 ` [PATCH bpf-next v3 22/24] bpf, arm64: Map BPF_REG_0 to x8 instead of x7 Yonghong Song
@ 2026-05-11 5:35 ` Yonghong Song
2026-05-11 5:35 ` [PATCH bpf-next v3 24/24] selftests/bpf: Enable stack argument tests for arm64 Yonghong Song
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:35 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
From: Puranjay Mohan <puranjay@kernel.org>
Implement stack argument passing for BPF-to-BPF and kfunc calls with
more than 5 parameters on arm64, following the AAPCS64 calling
convention.
BPF R1-R5 already map to x0-x4. With BPF_REG_0 moved to x8 by the
previous commit, x5-x7 are free for arguments 6-8. Arguments 9-12
spill onto the stack at [SP+0], [SP+8], ... and the callee reads
them from [FP+16], [FP+24], ... (above the saved FP/LR pair).
BPF convention uses fixed offsets from BPF_REG_PARAMS (r11): off=-8 is
always arg 6, off=-16 arg 7, etc. The verifier invalidates all outgoing
stack arg slots after each call, so the compiler must re-store before
every call. This means x5-x7 don't need to be saved on the stack.
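The slot-to-location mapping described above can be sketched as follows
(illustrative helpers of our own, not kernel code; the actual emitters
are emit_stack_arg_load()/emit_stack_arg_store() in the diff):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sketch: BPF stack arg slot -> AAPCS64 location on arm64.
 * Slots 0-2 (args 6-8) map to x5-x7; later slots go on the
 * native stack. Helper names are ours, for illustration only.
 */
#define NR_STACK_ARG_REGS 3

/* Caller side: r11-relative store offset (-8, -16, ...) -> location. */
static void outgoing_loc(int bpf_off, char *buf)
{
	int idx = -bpf_off / 8 - 1;

	if (idx < NR_STACK_ARG_REGS)
		sprintf(buf, "x%d", 5 + idx);
	else
		sprintf(buf, "[SP + %d]", (idx - NR_STACK_ARG_REGS) * 8);
}

/* Callee side: r11-relative load offset (+8, +16, ...) -> location,
 * with args 9+ above the saved FP/LR pair.
 */
static void incoming_loc(int bpf_off, char *buf)
{
	int idx = bpf_off / 8 - 1;

	if (idx < NR_STACK_ARG_REGS)
		sprintf(buf, "x%d", 5 + idx);
	else
		sprintf(buf, "[FP + %d]", (idx - NR_STACK_ARG_REGS) * 8 + 16);
}
```

Under this mapping arg 6 (r11 - 8) lands in x5, arg 8 (r11 - 24) in x7,
and arg 9 (r11 - 32) at [SP + 0], which the callee reads at [FP + 16].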
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
arch/arm64/net/bpf_jit_comp.c | 88 ++++++++++++++++++++++++++++++++++-
1 file changed, 87 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index b7bf3476e2ad..4e98a0f0b468 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -10,6 +10,7 @@
#include <linux/arm-smccc.h>
#include <linux/bitfield.h>
#include <linux/bpf.h>
+#include <linux/bpf_verifier.h>
#include <linux/cfi.h>
#include <linux/filter.h>
#include <linux/memory.h>
@@ -86,6 +87,7 @@ struct jit_ctx {
__le32 *image;
__le32 *ro_image;
u32 stack_size;
+ u16 stack_arg_size;
u64 user_vm_start;
u64 arena_vm_start;
bool fp_used;
@@ -533,13 +535,19 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
* | |
* +-----+ <= (BPF_FP - prog->aux->stack_depth)
* |RSVD | padding
- * current A64_SP => +-----+ <= (BPF_FP - ctx->stack_size)
+ * +-----+ <= (BPF_FP - ctx->stack_size)
+ * | |
+ * | ... | outgoing stack args (9+, if any)
+ * | |
+ * current A64_SP => +-----+
* | |
* | ... | Function call stack
* | |
* +-----+
* low
*
+ * Stack args 6-8 are passed in x5-x7, args 9+ at [SP].
+ * Incoming args 9+ are at [FP + 16], [FP + 24], ...
*/
emit_kcfi(is_main_prog ? cfi_bpf_hash : cfi_bpf_subprog_hash, ctx);
@@ -613,6 +621,9 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
if (ctx->stack_size && !ctx->priv_sp_used)
emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
+ if (ctx->stack_arg_size)
+ emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_arg_size), ctx);
+
if (ctx->arena_vm_start)
emit_a64_mov_i64(arena_vm_base, ctx->arena_vm_start, ctx);
@@ -673,6 +684,9 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
/* Update tail_call_cnt if the slot is populated. */
emit(A64_STR64I(tcc, ptr, 0), ctx);
+ if (ctx->stack_arg_size)
+ emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_arg_size), ctx);
+
/* restore SP */
if (ctx->stack_size && !ctx->priv_sp_used)
emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
@@ -1034,6 +1048,9 @@ static void build_epilogue(struct jit_ctx *ctx, bool was_classic)
const u8 r0 = bpf2a64[BPF_REG_0];
const u8 ptr = bpf2a64[TCCNT_PTR];
+ if (ctx->stack_arg_size)
+ emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_arg_size), ctx);
+
/* We're done with BPF stack */
if (ctx->stack_size && !ctx->priv_sp_used)
emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
@@ -1191,6 +1208,41 @@ static int add_exception_handler(const struct bpf_insn *insn,
return 0;
}
+static const u8 stack_arg_reg[] = { A64_R(5), A64_R(6), A64_R(7) };
+
+#define NR_STACK_ARG_REGS ARRAY_SIZE(stack_arg_reg)
+
+static void emit_stack_arg_load(u8 dst, s16 bpf_off, struct jit_ctx *ctx)
+{
+ int idx = bpf_off / sizeof(u64) - 1;
+
+ if (idx < NR_STACK_ARG_REGS)
+ emit(A64_MOV(1, dst, stack_arg_reg[idx]), ctx);
+ else
+ emit(A64_LDR64I(dst, A64_FP, (idx - NR_STACK_ARG_REGS) * sizeof(u64) + 16), ctx);
+}
+
+static void emit_stack_arg_store(u8 src_a64, s16 bpf_off, struct jit_ctx *ctx)
+{
+ int idx = -bpf_off / sizeof(u64) - 1;
+
+ if (idx < NR_STACK_ARG_REGS)
+ emit(A64_MOV(1, stack_arg_reg[idx], src_a64), ctx);
+ else
+ emit(A64_STR64I(src_a64, A64_SP, (idx - NR_STACK_ARG_REGS) * sizeof(u64)), ctx);
+}
+
+static void emit_stack_arg_store_imm(s32 imm, s16 bpf_off, const u8 tmp, struct jit_ctx *ctx)
+{
+ int idx = -bpf_off / sizeof(u64) - 1;
+
+ emit_a64_mov_i(1, tmp, imm, ctx);
+ if (idx < NR_STACK_ARG_REGS)
+ emit(A64_MOV(1, stack_arg_reg[idx], tmp), ctx);
+ else
+ emit(A64_STR64I(tmp, A64_SP, (idx - NR_STACK_ARG_REGS) * sizeof(u64)), ctx);
+}
+
/* JITs an eBPF instruction.
* Returns:
* 0 - successfully JITed an 8-byte eBPF instruction.
@@ -1646,6 +1698,11 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
case BPF_LDX | BPF_MEM | BPF_H:
case BPF_LDX | BPF_MEM | BPF_B:
case BPF_LDX | BPF_MEM | BPF_DW:
+ if (insn->src_reg == BPF_REG_PARAMS) {
+ emit_stack_arg_load(dst, off, ctx);
+ break;
+ }
+ fallthrough;
case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
case BPF_LDX | BPF_PROBE_MEM | BPF_W:
case BPF_LDX | BPF_PROBE_MEM | BPF_H:
@@ -1672,6 +1729,8 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
if (src == fp) {
src_adj = ctx->priv_sp_used ? priv_sp : A64_SP;
off_adj = off + ctx->stack_size;
+ if (!ctx->priv_sp_used)
+ off_adj += ctx->stack_arg_size;
} else {
src_adj = src;
off_adj = off;
@@ -1752,6 +1811,11 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
case BPF_ST | BPF_MEM | BPF_H:
case BPF_ST | BPF_MEM | BPF_B:
case BPF_ST | BPF_MEM | BPF_DW:
+ if (insn->dst_reg == BPF_REG_PARAMS) {
+ emit_stack_arg_store_imm(imm, off, tmp, ctx);
+ break;
+ }
+ fallthrough;
case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
case BPF_ST | BPF_PROBE_MEM32 | BPF_H:
case BPF_ST | BPF_PROBE_MEM32 | BPF_W:
@@ -1763,6 +1827,8 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
if (dst == fp) {
dst_adj = ctx->priv_sp_used ? priv_sp : A64_SP;
off_adj = off + ctx->stack_size;
+ if (!ctx->priv_sp_used)
+ off_adj += ctx->stack_arg_size;
} else {
dst_adj = dst;
off_adj = off;
@@ -1814,6 +1880,11 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
case BPF_STX | BPF_MEM | BPF_H:
case BPF_STX | BPF_MEM | BPF_B:
case BPF_STX | BPF_MEM | BPF_DW:
+ if (insn->dst_reg == BPF_REG_PARAMS) {
+ emit_stack_arg_store(src, off, ctx);
+ break;
+ }
+ fallthrough;
case BPF_STX | BPF_PROBE_MEM32 | BPF_B:
case BPF_STX | BPF_PROBE_MEM32 | BPF_H:
case BPF_STX | BPF_PROBE_MEM32 | BPF_W:
@@ -1825,6 +1896,8 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
if (dst == fp) {
dst_adj = ctx->priv_sp_used ? priv_sp : A64_SP;
off_adj = off + ctx->stack_size;
+ if (!ctx->priv_sp_used)
+ off_adj += ctx->stack_arg_size;
} else {
dst_adj = dst;
off_adj = off;
@@ -2066,6 +2139,14 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
ctx.user_vm_start = bpf_arena_get_user_vm_start(prog->aux->arena);
ctx.arena_vm_start = bpf_arena_get_kern_vm_start(prog->aux->arena);
+ if (subprog_info && subprog_info->stack_arg_cnt > bpf_in_stack_arg_cnt(subprog_info)) {
+ int out_cnt = subprog_info->stack_arg_cnt - bpf_in_stack_arg_cnt(subprog_info);
+ int nr_on_stack = out_cnt - NR_STACK_ARG_REGS;
+
+ if (nr_on_stack > 0)
+ ctx.stack_arg_size = round_up(nr_on_stack * sizeof(u64), 16);
+ }
+
if (priv_stack_ptr)
ctx.priv_sp_used = true;
@@ -2230,6 +2311,11 @@ bool bpf_jit_supports_kfunc_call(void)
return true;
}
+bool bpf_jit_supports_stack_args(void)
+{
+ return true;
+}
+
void *bpf_arch_text_copy(void *dst, void *src, size_t len)
{
if (!aarch64_insn_copy(dst, src, len))
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH bpf-next v3 24/24] selftests/bpf: Enable stack argument tests for arm64
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (22 preceding siblings ...)
2026-05-11 5:35 ` [PATCH bpf-next v3 23/24] bpf, arm64: Add JIT support for stack arguments Yonghong Song
@ 2026-05-11 5:35 ` Yonghong Song
23 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 5:35 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
From: Puranjay Mohan <puranjay@kernel.org>
Now that arm64 supports stack arguments, enable the existing stack_arg,
stack_arg_kfunc and verifier_stack_arg tests for __TARGET_ARCH_arm64.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c | 3 ++-
.../selftests/bpf/progs/btf__verifier_stack_arg_order.c | 3 ++-
tools/testing/selftests/bpf/progs/stack_arg.c | 3 ++-
tools/testing/selftests/bpf/progs/stack_arg_kfunc.c | 3 ++-
tools/testing/selftests/bpf/progs/stack_arg_precision.c | 3 ++-
tools/testing/selftests/bpf/progs/verifier_stack_arg.c | 3 ++-
tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c | 3 ++-
7 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c b/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
index 296fddfe6804..8d38aafe66a2 100644
--- a/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
+++ b/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
@@ -4,7 +4,8 @@
#include <bpf/bpf_helpers.h>
#include "../test_kmods/bpf_testmod_kfunc.h"
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
long subprog_call_mem_kfunc(long a, long b, long c, long d, long e, long size)
{
diff --git a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
index 2d5ddb24e241..9a05bbecd170 100644
--- a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
+++ b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
@@ -3,7 +3,8 @@
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
int subprog_bad_order_6args(int a, int b, int c, int d, int e, int f)
{
diff --git a/tools/testing/selftests/bpf/progs/stack_arg.c b/tools/testing/selftests/bpf/progs/stack_arg.c
index ab6240b997c5..b5e9929a4d63 100644
--- a/tools/testing/selftests/bpf/progs/stack_arg.c
+++ b/tools/testing/selftests/bpf/progs/stack_arg.c
@@ -21,7 +21,8 @@ struct {
int timer_result;
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
const volatile bool has_stack_arg = true;
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c b/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
index fa9def876ea5..da0d4f91d273 100644
--- a/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
+++ b/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
@@ -6,7 +6,8 @@
#include "bpf_kfuncs.h"
#include "../test_kmods/bpf_testmod_kfunc.h"
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
const volatile bool has_stack_arg = true;
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_precision.c b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
index 2a0a344c83ca..bee2eeec021d 100644
--- a/tools/testing/selftests/bpf/progs/stack_arg_precision.c
+++ b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
@@ -6,7 +6,8 @@
#include "../test_kmods/bpf_testmod_kfunc.h"
#include "bpf_misc.h"
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
/* Force kfunc extern BTF generation for inline asm call below.
* Uses its own SEC so it's not included as a .text subprog.
diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
index d38beba6b5e9..1939c931b6f3 100644
--- a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
+++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
@@ -12,7 +12,8 @@ struct {
__type(value, long long);
} map_hash_8b SEC(".maps");
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
__noinline __used
static int subprog_6args(int a, int b, int c, int d, int e, int f)
diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
index 671c79969c6c..16b3eb5f51a2 100644
--- a/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
+++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
@@ -5,7 +5,8 @@
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
__noinline __used __naked
static int subprog_bad_order_6args(int a, int b, int c, int d, int e, int f)
--
2.53.0-Meta
* Re: [PATCH bpf-next v3 20/24] selftests/bpf: Add verifier tests for stack argument validation
2026-05-11 5:34 ` [PATCH bpf-next v3 20/24] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
@ 2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 16:49 ` Yonghong Song
0 siblings, 1 reply; 50+ messages in thread
From: bot+bpf-ci @ 2026-05-11 6:19 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
> diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> index a96b25ebff23..ee3d929fac8a 100644
> --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> @@ -91,6 +91,8 @@
> #include "verifier_sockmap_mutate.skel.h"
> #include "verifier_spill_fill.skel.h"
> #include "verifier_spin_lock.skel.h"
> +#include "verifier_stack_arg.skel.h"
> +#include "verifier_stack_arg_order.skel.h"
> #include "verifier_stack_ptr.skel.h"
> #include "verifier_store_release.skel.h"
> #include "verifier_subprog_precision.skel.h"
> @@ -238,6 +240,8 @@ void test_verifier_sock_addr(void) { RUN(verifier_sock_addr); }
> void test_verifier_sockmap_mutate(void) { RUN(verifier_sockmap_mutate); }
> void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); }
> void test_verifier_spin_lock(void) { RUN(verifier_spin_lock); }
> +void test_verifier_stack_arg(void) { RUN(verifier_stack_arg); }
> +void test_verifier_stack_arg_order(void) { RUN(verifier_stack_arg_order); }
> void test_verifier_stack_ptr(void) { RUN(verifier_stack_ptr); }
> void test_verifier_store_release(void) { RUN(verifier_store_release); }
> void test_verifier_subprog_precision(void) { RUN(verifier_subprog_precision); }
In v2 (May 7), sashiko-bot raised a concern about whether the tests in
verifier_stack_arg_order.c are actually being executed:
The tests in verifier_stack_arg_order.c are never executed because
the file is not registered in the selftests runner. [...] Did you
intend to include verifier_stack_arg_order.skel.h here as well?
[...] Could this be missing a RUN(verifier_stack_arg_order) entry?
Reference: https://lore.kernel.org/bpf/20260509013809.18E7DC2BCB0@smtp.kernel.org/
You responded in v3 (May 10) saying "Will fix," but this commit includes
both verifier_stack_arg_order.skel.h and RUN(verifier_stack_arg_order),
which appears to address the concern. Was the registration added after the
discussion, or is there still something missing?
[ ... ]
> diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
> new file mode 100644
> index 000000000000..d38beba6b5e9
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
[ ... ]
> +__noinline __used
> +static long subprog_deref_arg6(long a, long b, long c, long d, long e, long *f)
> +{
> + return *f;
> +}
[ ... ]
> +SEC("tc")
> +__description("stack_arg: pruning with different stack arg types")
> +__failure
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__msg("R{{[0-9]}} invalid mem access 'scalar'")
> +__naked void stack_arg_pruning_type_mismatch(void)
> +{
> + asm volatile (
> + "call %[bpf_get_prandom_u32];"
> + "r6 = r0;"
> + /* local = 0 on program stack */
> + "r7 = 0;"
> + "*(u64 *)(r10 - 8) = r7;"
> + /* Branch based on random value */
> + "if r6 s> 3 goto l0_%=;"
> + /* Path 1: store stack pointer to outgoing arg6 */
> + "r1 = r10;"
> + "r1 += -8;"
> + "*(u64 *)(r11 - 8) = r1;"
> + "goto l1_%=;"
> + "l0_%=:"
> + /* Path 2: store scalar to outgoing arg6 */
> + "*(u64 *)(r11 - 8) = 42;"
> + "l1_%=:"
> + /* Call subprog that dereferences arg6 */
> + "r1 = r6;"
> + "r2 = 0;"
> + "r3 = 0;"
> + "r4 = 0;"
> + "r5 = 0;"
> + "call subprog_deref_arg6;"
> + "exit;"
> + :: __imm(bpf_get_prandom_u32)
> + : __clobber_all
> + );
> +}
[ ... ]
> +SEC("tc")
> +__description("stack_arg: release_reference invalidates stack arg slot")
> +__failure
> +__msg("R{{[0-9]}} invalid mem access 'scalar'")
> +__naked void stack_arg_release_ref(void)
[ ... ]
> +SEC("tc")
> +__description("stack_arg: pkt pointer in stack arg slot invalidated after pull_data")
> +__failure
> +__msg("R{{[0-9]}} invalid mem access 'scalar'")
> +__naked void stack_arg_stale_pkt_ptr(void)
[ ... ]
> +SEC("tc")
> +__description("stack_arg: null propagation rejects deref on null branch")
> +__failure
> +__msg("R{{[0-9]}} invalid mem access 'scalar'")
> +__naked void stack_arg_null_propagation_fail(void)
Multiple reviewers (sashiko-bot in v6, the bpf-ci bot on April 24 and in v2)
raised concerns about the fragility of these tests
(stack_arg_pruning_type_mismatch, stack_arg_release_ref,
stack_arg_stale_pkt_ptr, and stack_arg_null_propagation_fail).
Since subprog_deref_arg6 is standard C (not __naked), LLVM determines which
register is used for dereferencing. The reviewers noted that if a future
LLVM version chooses a different register, the verifier error will mention
that register instead, causing the test to fail.
Reference: https://lore.kernel.org/bpf/b03ce38af2e3d574b5a79963da1a4db70e896e6712b2bf2e9fce69178433540d@mail.kernel.org/
You responded in v6 that you would use a regex pattern like
R{{[0-9]}}, and in v2 (May 7) you stated you prefer to remove the register
prefix entirely. The current tests use __msg("R{{[0-9]}} invalid mem access
'scalar'"), which matches any register number.
Is the regex pattern R{{[0-9]}} sufficient to address the fragility
concern, or would it be safer to make subprog_deref_arg6 a __naked
function with explicit register usage?
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
* Re: [PATCH bpf-next v3 10/24] bpf: Reject stack arguments in non-JITed programs
2026-05-11 5:33 ` [PATCH bpf-next v3 10/24] bpf: Reject stack arguments in non-JITed programs Yonghong Song
@ 2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 16:42 ` Yonghong Song
0 siblings, 1 reply; 50+ messages in thread
From: bot+bpf-ci @ 2026-05-11 6:19 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
> bpf: Reject stack arguments in non-JITed programs
This isn't a bug, but should this commit include a Fixes: tag? This
appears to be fixing an issue introduced by commit 3330ee8db989
("bpf: Support stack arguments for bpf functions").
> The interpreter does not understand the bpf register r11
> (BPF_REG_PARAMS) used for stack arguments. So reject interpreter
> usage if stack arguments are used either in the main program or
> any subprogram.
>
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index ae10b9ca018d..958d86f0beac 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -2599,7 +2599,7 @@ struct bpf_prog *__bpf_prog_select_runtime(struct bpf_verifier_env *env, struct
> goto finalize;
>
> if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) ||
> - bpf_prog_has_kfunc_call(fp))
> + bpf_prog_has_kfunc_call(fp) || (env && env->subprog_info[0].stack_arg_cnt))
> jit_needed = true;
>
> if (!bpf_prog_select_interpreter(fp))
> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
> index ba86039789fd..19056016eed8 100644
> --- a/kernel/bpf/fixups.c
> +++ b/kernel/bpf/fixups.c
> @@ -1407,6 +1407,12 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
> verbose(env, "calling kernel functions are not allowed in non-JITed programs\n");
> return -EINVAL;
> }
> + for (i = 1; i < env->subprog_cnt; i++) {
> + if (bpf_in_stack_arg_cnt(&env->subprog_info[i])) {
> + verbose(env, "stack args are not supported in non-JITed programs\n");
> + return -EINVAL;
> + }
> + }
> if (env->subprog_cnt > 1 && env->prog->aux->tail_call_reachable) {
> /* When JIT fails the progs with bpf2bpf calls and tail_calls
> * have to be rejected, since interpreter doesn't support them yet.
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
* Re: [PATCH bpf-next v3 04/24] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
2026-05-11 5:33 ` [PATCH bpf-next v3 04/24] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args() Yonghong Song
@ 2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 16:29 ` Yonghong Song
0 siblings, 1 reply; 50+ messages in thread
From: bot+bpf-ci @ 2026-05-11 6:19 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 77af44d8a3ad..a33a5b4122f8 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -7880,6 +7880,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
> }
> args = (const struct btf_param *)(t + 1);
> nargs = btf_type_vlen(t);
> + sub->arg_cnt = nargs;
> if (nargs > MAX_BPF_FUNC_REG_ARGS) {
> if (!is_global)
> return -EINVAL;
> @@ -8067,7 +8068,6 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
> return -EINVAL;
> }
>
> - sub->arg_cnt = nargs;
> sub->args_cached = true;
>
> return 0;
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
* Re: [PATCH bpf-next v3 14/24] bpf: Reject stack arguments if tail call reachable
2026-05-11 5:34 ` [PATCH bpf-next v3 14/24] bpf: Reject stack arguments if tail call reachable Yonghong Song
@ 2026-05-11 6:19 ` bot+bpf-ci
0 siblings, 0 replies; 50+ messages in thread
From: bot+bpf-ci @ 2026-05-11 6:19 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
> commit 442f18d6c7d955b76622d405f1eddaf4f2545a7d
> Author: Yonghong Song <yonghong.song@linux.dev>
> Date: Sun May 10 22:34:13 2026 -0700
>
> bpf: Reject stack arguments if tail call reachable
>
> Tail calls are deprecated and will be replaced by indirect calls
> in the future. Reject programs that combine tail calls with stack
> arguments rather than adding complexity for a deprecated feature.
Multiple reviewers across v4, v5, v6, v1, and v2 noted that the commit
message inaccurately states tail calls are deprecated when the actual
technical reason is that tail calls and stack arguments require incompatible
stack frame handling.
Looking at the mailing list discussion on the v1 series, Alexei's guidance
was about disallowing mixing stack arguments and tail calls due to stack
frame layout incompatibility, not because tail calls are being deprecated.
Could the commit message more precisely describe the actual technical
constraint rather than framing it as deprecation?
>
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index d596e6bd9a81..2f2814035f37 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5267,14 +5267,23 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
> * this info will be utilized by JIT so that we will be preserving the
> * tail call counter throughout bpf2bpf calls combined with tailcalls
> */
> - if (tail_call_reachable)
> + if (tail_call_reachable) {
> for (tmp = idx; tmp >= 0; tmp = dinfo[tmp].caller) {
> if (subprog[tmp].is_exception_cb) {
> verbose(env, "cannot tail call within exception cb\n");
> return -EINVAL;
> }
> + if (bpf_in_stack_arg_cnt(&subprog[tmp])) {
> + verbose(env, "tail_calls are not allowed in programs with stack args\n");
> + return -EINVAL;
> + }
> subprog[tmp].tail_call_reachable = true;
> }
> + } else if (!idx && subprog[0].has_tail_call && bpf_in_stack_arg_cnt(&subprog[0])) {
> + verbose(env, "tail_calls are not allowed in programs with stack args\n");
> + return -EINVAL;
> + }
When the traversal pops a frame to return to the caller and jumps back
to continue_func (not shown in this diff), the tail_call_reachable
variable is not reset to false.
Consider this scenario:
1. Verifier traverses subprog A with a tail call, setting
tail_call_reachable=true
2. Traversal returns to the parent (pops frame):
idx = dinfo[idx].caller;
goto continue_func;
3. Verifier then traverses unrelated sibling subprog B that uses stack
arguments
At step 3, tail_call_reachable is still true from subprog A, so the code
would incorrectly reject the valid sibling subprog B.
Is there a state leak in this DFS traversal that could cause false positive
rejections of valid BPF programs? The author acknowledged this issue in v5
and v6, noting it would be fixed in the next revision.
> +
> if (subprog[0].tail_call_reachable)
> env->prog->aux->tail_call_reachable = true;
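The suspected leak in the scenario above can be modeled with a toy traversal
(hypothetical code, not the verifier's actual DFS): once the flag is set while
visiting subprog A, an unrelated sibling B that only uses stack args is
rejected.

```c
#include <stdbool.h>

/* Minimal model of the concern: tail_call_reachable set while visiting
 * subprog A is never reset when the walk pops back to the caller.
 */
struct subprog {
	bool has_tail_call;
	bool has_stack_args;
};

/* Visit subprogs in DFS order without recomputing the flag per path;
 * returns the index of the first (wrongly) rejected subprog, or -1.
 */
static int first_rejected(const struct subprog *sp, int n)
{
	bool tail_call_reachable = false;
	int i;

	for (i = 0; i < n; i++) {
		if (sp[i].has_tail_call)
			tail_call_reachable = true;	/* sticky: the leak */
		if (tail_call_reachable && sp[i].has_stack_args)
			return i;	/* false positive for siblings */
	}
	return -1;
}
```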
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
* Re: [PATCH bpf-next v3 07/24] bpf: Add precision marking and backtracking for stack argument slots
2026-05-11 5:33 ` [PATCH bpf-next v3 07/24] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
@ 2026-05-11 6:19 ` bot+bpf-ci
0 siblings, 0 replies; 50+ messages in thread
From: bot+bpf-ci @ 2026-05-11 6:19 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index adf00585a627..338e54011d9d 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
[ ... ]
> diff --git a/kernel/bpf/backtrack.c b/kernel/bpf/backtrack.c
> index 5e93e57fb7ae..2e4ae0ef0860 100644
> --- a/kernel/bpf/backtrack.c
> +++ b/kernel/bpf/backtrack.c
[ ... ]
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 3e65dd0edbf9..0a0157b0972a 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
* Re: [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots
2026-05-11 5:33 ` [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
@ 2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 16:35 ` Yonghong Song
2026-05-11 16:34 ` Alexei Starovoitov
1 sibling, 1 reply; 50+ messages in thread
From: bot+bpf-ci @ 2026-05-11 6:19 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
> diff --git a/kernel/bpf/const_fold.c b/kernel/bpf/const_fold.c
> --- a/kernel/bpf/const_fold.c
> +++ b/kernel/bpf/const_fold.c
> @@ -58,6 +58,14 @@ struct const_arg_info unknown = { .state = CONST_ARG_UNKNOWN, .val = 0 };
> u8 opcode = BPF_OP(insn->code) | BPF_SRC(insn->code);
> int r;
>
> + /* Stack arg stores (r11-based) are outside the tracked register set. */
> + if (is_stack_arg_st(insn) || is_stack_arg_stx(insn))
> + return;
> + if (is_stack_arg_ldx(insn)) {
> + ci_out[insn->dst_reg] = unknown;
> + return;
> + }
> +
> switch (class) {
> case BPF_ALU:
> case BPF_ALU64:
> diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
> --- a/kernel/bpf/liveness.c
> +++ b/kernel/bpf/liveness.c
[ ... ]
> @@ -1576,19 +1651,19 @@ struct arg_track none = { .frame = ARG_NONE };
> goto err_free;
>
> for (i = 0; i < len; i++) {
> - for (r = 0; r < MAX_BPF_REG; r++)
> + for (r = 0; r < MAX_AT_TRACK_REGS; r++)
> at_in[i][r] = unvisited;
> for (r = 0; r < MAX_ARG_SPILL_SLOTS; r++)
> at_stack_in[i][r] = unvisited;
> }
>
> - for (r = 0; r < MAX_BPF_REG; r++)
> + for (r = 0; r < MAX_AT_TRACK_REGS; r++)
> at_in[0][r] = none;
>
> /* Entry: R10 is always precisely the current frame's FP */
> at_in[0][BPF_REG_FP] = arg_single(depth, 0);
>
> - /* R1-R5: from caller or ARG_NONE for main */
> + /* R1-R5 and outgoing stack args: from caller or ARG_NONE for main */
> if (callee_entry) {
> for (r = BPF_REG_1; r <= BPF_REG_5; r++)
> at_in[0][r] = callee_entry[r];
The comment says "R1-R5 and outgoing stack args" but the following code
only copies R1-R5 from callee_entry to at_in[0]. The outgoing stack arg
slots at at_in[0][MAX_BPF_REG..MAX_AT_TRACK_REGS-1] remain at ARG_NONE
(set by the earlier loop).
Looking further down in the function, the incoming stack args from the
caller are actually placed in the separate at_stack_arg_entry[] array:
/* Entry: incoming stack args from caller, or ARG_NONE for main */
for (r = 0; r < MAX_STACK_ARG_SLOTS; r++)
at_stack_arg_entry[r] = callee_entry ? callee_entry[MAX_BPF_REG + r] : none;
The comment conflates the outgoing stack arg state carried in
at_in[0][MAX_BPF_REG + r] (which always starts at ARG_NONE and gets
updated later by arg_track_xfer when the subprog performs stack-arg
stores) with the incoming stack arg state (stored in
at_stack_arg_entry[], read back via stack-arg ldx).
Should the comment be revised to only mention R1-R5, since the incoming
stack args are handled separately below?
> @@ -1598,6 +1673,10 @@ struct arg_track none = { .frame = ARG_NONE };
> for (r = 0; r < MAX_ARG_SPILL_SLOTS; r++)
> at_stack_in[0][r] = none;
>
> + /* Entry: incoming stack args from caller, or ARG_NONE for main */
> + for (r = 0; r < MAX_STACK_ARG_SLOTS; r++)
> + at_stack_arg_entry[r] = callee_entry ? callee_entry[MAX_BPF_REG + r] : none;
> +
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
* Re: [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions
2026-05-11 5:33 ` [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions Yonghong Song
@ 2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 15:46 ` Yonghong Song
0 siblings, 1 reply; 50+ messages in thread
From: bot+bpf-ci @ 2026-05-11 6:19 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 321b9d69cf9c..f9020a4ea005 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -402,6 +402,7 @@ struct bpf_func_state {
> bool in_callback_fn;
> bool in_async_callback_fn;
> bool in_exception_callback_fn;
> + bool no_stack_arg_load;
[ ... ]
> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
> index bd9c22945050..c249eb40c6d6 100644
> --- a/kernel/bpf/states.c
> +++ b/kernel/bpf/states.c
> @@ -833,6 +833,32 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> return true;
> }
>
> +/*
> + * Compare stack arg slots between old and current states.
> + * Outgoing stack args are path-local state and must agree for pruning.
> + */
> +static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> + struct bpf_func_state *cur, struct bpf_idmap *idmap,
> + enum exact_level exact)
> +{
> + int i, nslots;
> +
> + nslots = max(old->out_stack_arg_cnt, cur->out_stack_arg_cnt);
> + for (i = 0; i < nslots; i++) {
> + struct bpf_reg_state *old_arg, *cur_arg;
> + struct bpf_reg_state not_init = { .type = NOT_INIT };
> +
> + old_arg = i < old->out_stack_arg_cnt ?
> + &old->stack_arg_regs[i] : &not_init;
> + cur_arg = i < cur->out_stack_arg_cnt ?
> + &cur->stack_arg_regs[i] : &not_init;
> + if (!regsafe(env, old_arg, cur_arg, idmap, exact))
> + return false;
> + }
> +
> + return true;
> +}
> +
> static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur,
> struct bpf_idmap *idmap)
> {
> @@ -924,6 +950,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
> return false;
>
> + if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
> + return false;
> +
> return true;
> }
In v2 (May 7), bot+bpf-ci@kernel.org raised a concern that
no_stack_arg_load is not compared during state pruning in
func_states_equal():
"The same issue applies to state pruning: func_states_equal/
stack_arg_safe does not compare no_stack_arg_load either, so a
state with no_stack_arg_load=true can be pruned to an old state
with no_stack_arg_load=false incorrectly. Should no_stack_arg_load
be placed before stack (to be memcpy'd) or should copy_stack_state()
explicitly assign it?"
The field placement was fixed (moving no_stack_arg_load before 'stack'
so it gets copied by memcpy), but func_states_equal() still does not
explicitly compare this field.
Could this allow incorrect pruning? If a BPF program writes to stack
args and calls a function on one path (setting no_stack_arg_load=true),
then branches and tries to read from stack args on another path, the
verifier might incorrectly prune to the old state with
no_stack_arg_load=false, missing the enforcement that stack arg reads
must happen before writes/calls.
Looking at func_states_equal():
kernel/bpf/states.c:func_states_equal() {
if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
return false;
if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
return false;
return true;
}
Should no_stack_arg_load be compared here, similar to how other boolean
flags in bpf_func_state are handled during state comparison?
Reference: https://lore.kernel.org/bpf/11c51daec78a68837f719172ae1c21db8b3e98e0a76aa5bcd59c8b089760c40d@mail.kernel.org/
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
* Re: [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions
2026-05-11 6:19 ` bot+bpf-ci
@ 2026-05-11 15:46 ` Yonghong Song
2026-05-11 16:05 ` Alexei Starovoitov
0 siblings, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 15:46 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index 321b9d69cf9c..f9020a4ea005 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -402,6 +402,7 @@ struct bpf_func_state {
>> bool in_callback_fn;
>> bool in_async_callback_fn;
>> bool in_exception_callback_fn;
>> + bool no_stack_arg_load;
> [ ... ]
>
>> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
>> index bd9c22945050..c249eb40c6d6 100644
>> --- a/kernel/bpf/states.c
>> +++ b/kernel/bpf/states.c
>> @@ -833,6 +833,32 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>> return true;
>> }
>>
>> +/*
>> + * Compare stack arg slots between old and current states.
>> + * Outgoing stack args are path-local state and must agree for pruning.
>> + */
>> +static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>> + struct bpf_func_state *cur, struct bpf_idmap *idmap,
>> + enum exact_level exact)
>> +{
>> + int i, nslots;
>> +
>> + nslots = max(old->out_stack_arg_cnt, cur->out_stack_arg_cnt);
>> + for (i = 0; i < nslots; i++) {
>> + struct bpf_reg_state *old_arg, *cur_arg;
>> + struct bpf_reg_state not_init = { .type = NOT_INIT };
>> +
>> + old_arg = i < old->out_stack_arg_cnt ?
>> + &old->stack_arg_regs[i] : &not_init;
>> + cur_arg = i < cur->out_stack_arg_cnt ?
>> + &cur->stack_arg_regs[i] : &not_init;
>> + if (!regsafe(env, old_arg, cur_arg, idmap, exact))
>> + return false;
>> + }
>> +
>> + return true;
>> +}
>> +
>> static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur,
>> struct bpf_idmap *idmap)
>> {
>> @@ -924,6 +950,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
>> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>> return false;
>>
>> + if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>> + return false;
>> +
>> return true;
>> }
> In v2 (May 7), bot+bpf-ci@kernel.org raised a concern that
> no_stack_arg_load is not compared during state pruning in
> func_states_equal():
>
> "The same issue applies to state pruning: func_states_equal/
> stack_arg_safe does not compare no_stack_arg_load either, so a
> state with no_stack_arg_load=true can be pruned to an old state
> with no_stack_arg_load=false incorrectly. Should no_stack_arg_load
> be placed before stack (to be memcpy'd) or should copy_stack_state()
> explicitly assign it?"
>
> The field placement was fixed (moving no_stack_arg_load before 'stack'
> so it gets copied by memcpy), but func_states_equal() still does not
> explicitly compare this field.
>
> Could this allow incorrect pruning? If a BPF program writes to stack
> args and calls a function on one path (setting no_stack_arg_load=true),
> then branches and tries to read from stack args on another path, the
> verifier might incorrectly prune to the old state with
> no_stack_arg_load=false, missing the enforcement that stack arg reads
> must happen before writes/calls.
>
> Looking at func_states_equal():
>
> kernel/bpf/states.c:func_states_equal() {
> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
> return false;
>
> if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
> return false;
>
> return true;
> }
>
> Should no_stack_arg_load be compared here, similar to how other boolean
> flags in bpf_func_state are handled during state comparison?
You are right. The following is an example:
/* subprog with incoming stack arg */
r1 = *(r11+8); /* read incoming arg, no_stack_arg_load = false */
if cond goto path2;
path1:
call some_helper; /* sets no_stack_arg_load = true */
goto join;
path2:
/* no call, no_stack_arg_load = false */
goto join;
join:
r2 = *(r11+8); /* read incoming arg again */
In the above case, at 'join' point, we have no_stack_arg_load = true and false
respectively. In this case, we cannot do pruning.
Will fix.
>
> Reference: https://lore.kernel.org/bpf/11c51daec78a68837f719172ae1c21db8b3e98e0a76aa5bcd59c8b089760c40d@mail.kernel.org/
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
* Re: [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions
2026-05-11 15:46 ` Yonghong Song
@ 2026-05-11 16:05 ` Alexei Starovoitov
2026-05-11 16:21 ` Yonghong Song
2026-05-12 4:17 ` Yonghong Song
0 siblings, 2 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2026-05-11 16:05 UTC (permalink / raw)
To: Yonghong Song, bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On Mon May 11, 2026 at 8:46 AM PDT, Yonghong Song wrote:
>
>
> On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>>> index 321b9d69cf9c..f9020a4ea005 100644
>>> --- a/include/linux/bpf_verifier.h
>>> +++ b/include/linux/bpf_verifier.h
>>> @@ -402,6 +402,7 @@ struct bpf_func_state {
>>> bool in_callback_fn;
>>> bool in_async_callback_fn;
>>> bool in_exception_callback_fn;
>>> + bool no_stack_arg_load;
>> [ ... ]
>>
>>> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
>>> index bd9c22945050..c249eb40c6d6 100644
>>> --- a/kernel/bpf/states.c
>>> +++ b/kernel/bpf/states.c
>>> @@ -833,6 +833,32 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>>> return true;
>>> }
>>>
>>> +/*
>>> + * Compare stack arg slots between old and current states.
>>> + * Outgoing stack args are path-local state and must agree for pruning.
>>> + */
>>> +static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>>> + struct bpf_func_state *cur, struct bpf_idmap *idmap,
>>> + enum exact_level exact)
>>> +{
>>> + int i, nslots;
>>> +
>>> + nslots = max(old->out_stack_arg_cnt, cur->out_stack_arg_cnt);
>>> + for (i = 0; i < nslots; i++) {
>>> + struct bpf_reg_state *old_arg, *cur_arg;
>>> + struct bpf_reg_state not_init = { .type = NOT_INIT };
>>> +
>>> + old_arg = i < old->out_stack_arg_cnt ?
>>> + &old->stack_arg_regs[i] : &not_init;
>>> + cur_arg = i < cur->out_stack_arg_cnt ?
>>> + &cur->stack_arg_regs[i] : &not_init;
>>> + if (!regsafe(env, old_arg, cur_arg, idmap, exact))
>>> + return false;
>>> + }
>>> +
>>> + return true;
>>> +}
>>> +
>>> static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur,
>>> struct bpf_idmap *idmap)
>>> {
>>> @@ -924,6 +950,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
>>> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>>> return false;
>>>
>>> + if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>>> + return false;
>>> +
>>> return true;
>>> }
>> In v2 (May 7), bot+bpf-ci@kernel.org raised a concern that
>> no_stack_arg_load is not compared during state pruning in
>> func_states_equal():
>>
>> "The same issue applies to state pruning: func_states_equal/
>> stack_arg_safe does not compare no_stack_arg_load either, so a
>> state with no_stack_arg_load=true can be pruned to an old state
>> with no_stack_arg_load=false incorrectly. Should no_stack_arg_load
>> be placed before stack (to be memcpy'd) or should copy_stack_state()
>> explicitly assign it?"
>>
>> The field placement was fixed (moving no_stack_arg_load before 'stack'
>> so it gets copied by memcpy), but func_states_equal() still does not
>> explicitly compare this field.
>>
>> Could this allow incorrect pruning? If a BPF program writes to stack
>> args and calls a function on one path (setting no_stack_arg_load=true),
>> then branches and tries to read from stack args on another path, the
>> verifier might incorrectly prune to the old state with
>> no_stack_arg_load=false, missing the enforcement that stack arg reads
>> must happen before writes/calls.
>>
>> Looking at func_states_equal():
>>
>> kernel/bpf/states.c:func_states_equal() {
>> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>> return false;
>>
>> if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>> return false;
>>
>> return true;
>> }
>>
>> Should no_stack_arg_load be compared here, similar to how other boolean
>> flags in bpf_func_state are handled during state comparison?
>
> You are right. The following is an example:
>
> /* subprog with incoming stack arg */
> r1 = *(r11+8); /* read incoming arg, no_stack_arg_load = false */
>
> if cond goto path2;
>
> path1:
> call some_helper; /* sets no_stack_arg_load = true */
> goto join;
>
> path2:
> /* no call, no_stack_arg_load = false */
> goto join;
>
> join:
> r2 = *(r11+8); /* read incoming arg again */
>
> In the above case, at 'join' point, we have no_stack_arg_load = true and false
> respectively. In this case, we cannot do pruning.
>
> Will fix.
Hold on. Didn't we agree that any call should scratch all arg slots?
In the above example, the call to some_helper will scratch it, and the last read shouldn't be allowed.
* Re: [PATCH bpf-next v3 06/24] bpf: Refactor jmp history to use dedicated spi/frame fields
2026-05-11 5:33 ` [PATCH bpf-next v3 06/24] bpf: Refactor jmp history to use dedicated spi/frame fields Yonghong Song
@ 2026-05-11 16:17 ` Alexei Starovoitov
2026-05-11 16:33 ` Yonghong Song
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2026-05-11 16:17 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On Sun May 10, 2026 at 10:33 PM PDT, Yonghong Song wrote:
> Move stack slot index (spi) and frame number out of the flags field
> in bpf_jmp_history_entry into dedicated bitfields. This simplifies
> the encoding and makes room for new flags.
>
> Previously, spi and frame were packed into the lower 9 bits of the
> 12-bit flags field (3 bits frame + 6 bits spi), with INSN_F_STACK_ACCESS
> at BIT(9) and INSN_F_DST/SRC_REG_STACK at BIT(10)/BIT(11).
> But this has no room for an INSN_F_* flag for stack arguments.
>
> To resolve this issue, bpf_jmp_history_entry field idx is narrowed to
> 20 bits (sufficient for insn indices up to 1M), and the freed bits hold
> spi (6 bits) and frame (3 bits) as dedicated struct fields. The flags
> enum is simplified accordingly:
> INSN_F_STACK_ACCESS -> BIT(0)
> INSN_F_DST_REG_STACK -> BIT(1)
> INSN_F_SRC_REG_STACK -> BIT(2)
> which allows more room for additional INSN_F_* flags.
>
> bpf_push_jmp_history() now takes explicit spi and frame parameters
> instead of encoding them into flags. The insn_stack_access_flags(),
> insn_stack_access_spi(), and insn_stack_access_frameno() helpers are
> removed.
>
> No functional change.
>
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> ---
> include/linux/bpf_verifier.h | 34 ++++++++++++++--------------------
> kernel/bpf/backtrack.c | 24 +++++++++---------------
> kernel/bpf/states.c | 2 +-
> kernel/bpf/verifier.c | 23 +++++++++++------------
> 4 files changed, 35 insertions(+), 48 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index f9020a4ea005..adf00585a627 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -435,31 +435,22 @@ struct bpf_func_state {
>
> #define MAX_CALL_FRAMES 8
>
> -/* instruction history flags, used in bpf_jmp_history_entry.flags field */
> +/* instruction history flags, used in bpf_jmp_history_entry.flags field.
> + * Frame number and SPI are stored in dedicated fields of bpf_jmp_history_entry.
> + */
> enum {
> - /* instruction references stack slot through PTR_TO_STACK register;
> - * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8)
> - * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512,
> - * 8 bytes per slot, so slot index (spi) is [0, 63])
> - */
> - INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */
> -
> - INSN_F_SPI_MASK = 0x3f, /* 6 bits */
> - INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */
> + INSN_F_STACK_ACCESS = BIT(0),
>
> - INSN_F_STACK_ACCESS = BIT(9),
> -
> - INSN_F_DST_REG_STACK = BIT(10), /* dst_reg is PTR_TO_STACK */
> - INSN_F_SRC_REG_STACK = BIT(11), /* src_reg is PTR_TO_STACK */
> - /* total 12 bits are used now. */
> + INSN_F_DST_REG_STACK = BIT(1), /* dst_reg is PTR_TO_STACK */
> + INSN_F_SRC_REG_STACK = BIT(2), /* src_reg is PTR_TO_STACK */
> };
>
> -static_assert(INSN_F_FRAMENO_MASK + 1 >= MAX_CALL_FRAMES);
> -static_assert(INSN_F_SPI_MASK + 1 >= MAX_BPF_STACK / 8);
> -
> struct bpf_jmp_history_entry {
> - u32 idx;
> /* insn idx can't be bigger than 1 million */
> + u32 idx : 20;
> + u32 frame : 3; /* stack access frame number */
> + u32 spi : 6; /* stack slot index (0..63) */
> + u32 : 3;
> u32 prev_idx : 20;
> /* special INSN_F_xxx flags */
> u32 flags : 12;
If so, should 'flags' width be reduced as well?
We don't need to burn 12 bits after this conversion?
3 bits for flags will do?
* Re: [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions
2026-05-11 16:05 ` Alexei Starovoitov
@ 2026-05-11 16:21 ` Yonghong Song
2026-05-12 4:17 ` Yonghong Song
1 sibling, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 16:21 UTC (permalink / raw)
To: Alexei Starovoitov, bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/11/26 6:05 PM, Alexei Starovoitov wrote:
> On Mon May 11, 2026 at 8:46 AM PDT, Yonghong Song wrote:
>>
>> On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>>>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>>>> index 321b9d69cf9c..f9020a4ea005 100644
>>>> --- a/include/linux/bpf_verifier.h
>>>> +++ b/include/linux/bpf_verifier.h
>>>> @@ -402,6 +402,7 @@ struct bpf_func_state {
>>>> bool in_callback_fn;
>>>> bool in_async_callback_fn;
>>>> bool in_exception_callback_fn;
>>>> + bool no_stack_arg_load;
>>> [ ... ]
>>>
>>>> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
>>>> index bd9c22945050..c249eb40c6d6 100644
>>>> --- a/kernel/bpf/states.c
>>>> +++ b/kernel/bpf/states.c
>>>> @@ -833,6 +833,32 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>>>> return true;
>>>> }
>>>>
>>>> +/*
>>>> + * Compare stack arg slots between old and current states.
>>>> + * Outgoing stack args are path-local state and must agree for pruning.
>>>> + */
>>>> +static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>>>> + struct bpf_func_state *cur, struct bpf_idmap *idmap,
>>>> + enum exact_level exact)
>>>> +{
>>>> + int i, nslots;
>>>> +
>>>> + nslots = max(old->out_stack_arg_cnt, cur->out_stack_arg_cnt);
>>>> + for (i = 0; i < nslots; i++) {
>>>> + struct bpf_reg_state *old_arg, *cur_arg;
>>>> + struct bpf_reg_state not_init = { .type = NOT_INIT };
>>>> +
>>>> + old_arg = i < old->out_stack_arg_cnt ?
>>>> + &old->stack_arg_regs[i] : &not_init;
>>>> + cur_arg = i < cur->out_stack_arg_cnt ?
>>>> + &cur->stack_arg_regs[i] : &not_init;
>>>> + if (!regsafe(env, old_arg, cur_arg, idmap, exact))
>>>> + return false;
>>>> + }
>>>> +
>>>> + return true;
>>>> +}
>>>> +
>>>> static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur,
>>>> struct bpf_idmap *idmap)
>>>> {
>>>> @@ -924,6 +950,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
>>>> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>>>> return false;
>>>>
>>>> + if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>>>> + return false;
>>>> +
>>>> return true;
>>>> }
>>> In v2 (May 7), bot+bpf-ci@kernel.org raised a concern that
>>> no_stack_arg_load is not compared during state pruning in
>>> func_states_equal():
>>>
>>> "The same issue applies to state pruning: func_states_equal/
>>> stack_arg_safe does not compare no_stack_arg_load either, so a
>>> state with no_stack_arg_load=true can be pruned to an old state
>>> with no_stack_arg_load=false incorrectly. Should no_stack_arg_load
>>> be placed before stack (to be memcpy'd) or should copy_stack_state()
>>> explicitly assign it?"
>>>
>>> The field placement was fixed (moving no_stack_arg_load before 'stack'
>>> so it gets copied by memcpy), but func_states_equal() still does not
>>> explicitly compare this field.
>>>
>>> Could this allow incorrect pruning? If a BPF program writes to stack
>>> args and calls a function on one path (setting no_stack_arg_load=true),
>>> then branches and tries to read from stack args on another path, the
>>> verifier might incorrectly prune to the old state with
>>> no_stack_arg_load=false, missing the enforcement that stack arg reads
>>> must happen before writes/calls.
>>>
>>> Looking at func_states_equal():
>>>
>>> kernel/bpf/states.c:func_states_equal() {
>>> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>>> return false;
>>>
>>> if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>>> return false;
>>>
>>> return true;
>>> }
>>>
>>> Should no_stack_arg_load be compared here, similar to how other boolean
>>> flags in bpf_func_state are handled during state comparison?
>> You are right. The following is an example:
>>
>> /* subprog with incoming stack arg */
>> r1 = *(r11+8); /* read incoming arg, no_stack_arg_load = false */
>>
>> if cond goto path2;
>>
>> path1:
>> call some_helper; /* sets no_stack_arg_load = true */
>> goto join;
>>
>> path2:
>> /* no call, no_stack_arg_load = false */
>> goto join;
>>
>> join:
>> r2 = *(r11+8); /* read incoming arg again */
>>
>> In the above case, at 'join' point, we have no_stack_arg_load = true and false
>> respectively. In this case, we cannot do pruning.
>>
>> Will fix.
> Hold on. Didn't we agree that any call should scratch all arg slots?
> In the above example, the call to some_helper will scratch it, and the last read shouldn't be allowed.
You are right.
join:
r2 = *(r11+8); /* read incoming arg again */
is not allowed for path1. We can avoid the no_stack_arg_load
comparison in stack_arg_safe(); the verifier will later reject the
load at the above join point along path1.
* Re: [PATCH bpf-next v3 04/24] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
2026-05-11 6:19 ` bot+bpf-ci
@ 2026-05-11 16:29 ` Yonghong Song
2026-05-11 17:18 ` Yonghong Song
0 siblings, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 16:29 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>> index 77af44d8a3ad..a33a5b4122f8 100644
>> --- a/kernel/bpf/btf.c
>> +++ b/kernel/bpf/btf.c
>> @@ -7880,6 +7880,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
>> }
>> args = (const struct btf_param *)(t + 1);
>> nargs = btf_type_vlen(t);
>> + sub->arg_cnt = nargs;
>> if (nargs > MAX_BPF_FUNC_REG_ARGS) {
>> if (!is_global)
>> return -EINVAL;
>> @@ -8067,7 +8068,6 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
>> return -EINVAL;
>> }
>>
>> - sub->arg_cnt = nargs;
>> sub->args_cached = true;
>>
>> return 0;
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
In v3, for the *main* program, we have the following:
/* if main BPF program has associated BTF info, validate that
* it's matching expected signature, and otherwise mark BTF
* info for main program as unreliable
*/
if (env->prog->aux->func_info_aux) {
ret = btf_prepare_func_args(env, 0);
if (ret || sub->arg_cnt != 1 || sub->args[0].arg_type != ARG_PTR_TO_CTX) {
env->prog->aux->func_info_aux[0].unreliable = true;
sub->arg_cnt = 1;
sub->stack_arg_cnt = 0;
}
}
Since sub->arg_cnt and sub->stack_arg_cnt are set here for the main
program, patch #4 is no longer needed.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
* Re: [PATCH bpf-next v3 06/24] bpf: Refactor jmp history to use dedicated spi/frame fields
2026-05-11 16:17 ` Alexei Starovoitov
@ 2026-05-11 16:33 ` Yonghong Song
0 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 16:33 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On 5/11/26 6:17 PM, Alexei Starovoitov wrote:
> On Sun May 10, 2026 at 10:33 PM PDT, Yonghong Song wrote:
>> Move stack slot index (spi) and frame number out of the flags field
>> in bpf_jmp_history_entry into dedicated bitfields. This simplifies
>> the encoding and makes room for new flags.
>>
>> Previously, spi and frame were packed into the lower 9 bits of the
>> 12-bit flags field (3 bits frame + 6 bits spi), with INSN_F_STACK_ACCESS
>> at BIT(9) and INSN_F_DST/SRC_REG_STACK at BIT(10)/BIT(11).
>> But this has no room for an INSN_F_* flag for stack arguments.
>>
>> To resolve this issue, bpf_jmp_history_entry field idx is narrowed to
>> 20 bits (sufficient for insn indices up to 1M), and the freed bits hold
>> spi (6 bits) and frame (3 bits) as dedicated struct fields. The flags
>> enum is simplified accordingly:
>> INSN_F_STACK_ACCESS -> BIT(0)
>> INSN_F_DST_REG_STACK -> BIT(1)
>> INSN_F_SRC_REG_STACK -> BIT(2)
>> which allows more room for additional INSN_F_* flags.
>>
>> bpf_push_jmp_history() now takes explicit spi and frame parameters
>> instead of encoding them into flags. The insn_stack_access_flags(),
>> insn_stack_access_spi(), and insn_stack_access_frameno() helpers are
>> removed.
>>
>> No functional change.
>>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>> include/linux/bpf_verifier.h | 34 ++++++++++++++--------------------
>> kernel/bpf/backtrack.c | 24 +++++++++---------------
>> kernel/bpf/states.c | 2 +-
>> kernel/bpf/verifier.c | 23 +++++++++++------------
>> 4 files changed, 35 insertions(+), 48 deletions(-)
>>
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index f9020a4ea005..adf00585a627 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -435,31 +435,22 @@ struct bpf_func_state {
>>
>> #define MAX_CALL_FRAMES 8
>>
>> -/* instruction history flags, used in bpf_jmp_history_entry.flags field */
>> +/* instruction history flags, used in bpf_jmp_history_entry.flags field.
>> + * Frame number and SPI are stored in dedicated fields of bpf_jmp_history_entry.
>> + */
>> enum {
>> - /* instruction references stack slot through PTR_TO_STACK register;
>> - * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8)
>> - * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512,
>> - * 8 bytes per slot, so slot index (spi) is [0, 63])
>> - */
>> - INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */
>> -
>> - INSN_F_SPI_MASK = 0x3f, /* 6 bits */
>> - INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */
>> + INSN_F_STACK_ACCESS = BIT(0),
>>
>> - INSN_F_STACK_ACCESS = BIT(9),
>> -
>> - INSN_F_DST_REG_STACK = BIT(10), /* dst_reg is PTR_TO_STACK */
>> - INSN_F_SRC_REG_STACK = BIT(11), /* src_reg is PTR_TO_STACK */
>> - /* total 12 bits are used now. */
>> + INSN_F_DST_REG_STACK = BIT(1), /* dst_reg is PTR_TO_STACK */
>> + INSN_F_SRC_REG_STACK = BIT(2), /* src_reg is PTR_TO_STACK */
>> };
>>
>> -static_assert(INSN_F_FRAMENO_MASK + 1 >= MAX_CALL_FRAMES);
>> -static_assert(INSN_F_SPI_MASK + 1 >= MAX_BPF_STACK / 8);
>> -
>> struct bpf_jmp_history_entry {
>> - u32 idx;
>> /* insn idx can't be bigger than 1 million */
>> + u32 idx : 20;
>> + u32 frame : 3; /* stack access frame number */
>> + u32 spi : 6; /* stack slot index (0..63) */
>> + u32 : 3;
>> u32 prev_idx : 20;
>> /* special INSN_F_xxx flags */
>> u32 flags : 12;
> If so, should 'flags' width be reduced as well?
> We don't need to burn 12 bits after this conversion?
> 3 bits for flags will do?
Right, the next patch will add a flag for STACK_ARG, so flags
will need 4 bits in total. Will make the change.
* Re: [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots
2026-05-11 5:33 ` [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
@ 2026-05-11 16:34 ` Alexei Starovoitov
2026-05-11 16:40 ` Yonghong Song
1 sibling, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2026-05-11 16:34 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On Sun May 10, 2026 at 10:33 PM PDT, Yonghong Song wrote:
> @@ -1071,8 +1105,24 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
> struct arg_track *dst = &at_out[insn->dst_reg];
> struct arg_track *src = &at_out[insn->src_reg];
> struct arg_track none = { .frame = ARG_NONE };
> - int r;
> -
> + int r, slot;
> +
> + /* Handle stack arg stores and loads. */
> + if (is_stack_arg_st(insn) || is_stack_arg_stx(insn)) {
> + slot = stack_arg_off_to_slot(insn->off);
> + if (slot >= 0) {
> + if (is_stack_arg_stx(insn))
> + at_out[MAX_BPF_REG + slot] = at_out[insn->src_reg];
> + else
> + at_out[MAX_BPF_REG + slot] = none;
> + }
> + return;
> + }
> + if (is_stack_arg_ldx(insn)) {
> + slot = stack_arg_off_to_slot(insn->off);
> + at_out[insn->dst_reg] = (slot >= 0) ? at_stack_arg_entry[slot] : none;
> + return;
> + }
> if (class == BPF_ALU64 && BPF_SRC(insn->code) == BPF_K) {
claude doesn't have taste.
Please use 'else if' like the rest of the function and remove both 'return' statements.
> if (code == BPF_MOV) {
> *dst = none;
> @@ -1297,6 +1347,14 @@ static int record_load_store_access(struct bpf_verifier_env *env,
> struct arg_track resolved, *ptr;
> int oi;
>
> + /*
> + * Stack arg insns use dst_reg=BPF_REG_PARAMS(11), but at[11] tracks
> + * the value stored in stack arg slot 0, not a memory base pointer.
> + * Skip to avoid misinterpreting that value as an FP-derived pointer.
The comment is confusing. 'not a memory base pointer'? What does it mean?
> + */
> + if (is_stack_arg_stx(insn) || is_stack_arg_st(insn) || is_stack_arg_ldx(insn))
> + return 0;
> +
> -/* Return true if any of R1-R5 is derived from a frame pointer. */
> +/* Return true if any of R1-R5 or stack args is derived from a frame pointer. */
> static bool has_fp_args(struct arg_track *args)
> {
> for (int r = BPF_REG_1; r <= BPF_REG_5; r++)
> if (args[r].frame != ARG_NONE)
Let's make it consistent with below and use arg_is_fp() here?
> return true;
> + for (int r = 0; r < MAX_STACK_ARG_SLOTS; r++)
> + if (arg_is_fp(&args[MAX_BPF_REG + r]))
> + return true;
> return false;
> }
* Re: [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots
2026-05-11 6:19 ` bot+bpf-ci
@ 2026-05-11 16:35 ` Yonghong Song
0 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 16:35 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/const_fold.c b/kernel/bpf/const_fold.c
>> --- a/kernel/bpf/const_fold.c
>> +++ b/kernel/bpf/const_fold.c
>> @@ -58,6 +58,14 @@ struct const_arg_info unknown = { .state = CONST_ARG_UNKNOWN, .val = 0 };
>> u8 opcode = BPF_OP(insn->code) | BPF_SRC(insn->code);
>> int r;
>>
>> + /* Stack arg stores (r11-based) are outside the tracked register set. */
>> + if (is_stack_arg_st(insn) || is_stack_arg_stx(insn))
>> + return;
>> + if (is_stack_arg_ldx(insn)) {
>> + ci_out[insn->dst_reg] = unknown;
>> + return;
>> + }
>> +
>> switch (class) {
>> case BPF_ALU:
>> case BPF_ALU64:
>> diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
>> --- a/kernel/bpf/liveness.c
>> +++ b/kernel/bpf/liveness.c
> [ ... ]
>
>> @@ -1576,19 +1651,19 @@ struct arg_track none = { .frame = ARG_NONE };
>> goto err_free;
>>
>> for (i = 0; i < len; i++) {
>> - for (r = 0; r < MAX_BPF_REG; r++)
>> + for (r = 0; r < MAX_AT_TRACK_REGS; r++)
>> at_in[i][r] = unvisited;
>> for (r = 0; r < MAX_ARG_SPILL_SLOTS; r++)
>> at_stack_in[i][r] = unvisited;
>> }
>>
>> - for (r = 0; r < MAX_BPF_REG; r++)
>> + for (r = 0; r < MAX_AT_TRACK_REGS; r++)
>> at_in[0][r] = none;
>>
>> /* Entry: R10 is always precisely the current frame's FP */
>> at_in[0][BPF_REG_FP] = arg_single(depth, 0);
>>
>> - /* R1-R5: from caller or ARG_NONE for main */
>> + /* R1-R5 and outgoing stack args: from caller or ARG_NONE for main */
>> if (callee_entry) {
>> for (r = BPF_REG_1; r <= BPF_REG_5; r++)
>> at_in[0][r] = callee_entry[r];
> The comment says "R1-R5 and outgoing stack args" but the following code
> only copies R1-R5 from callee_entry to at_in[0]. The outgoing stack arg
> slots at at_in[0][MAX_BPF_REG..MAX_AT_TRACK_REGS-1] remain at ARG_NONE
> (set by the earlier loop).
>
> Looking further down in the function, the incoming stack args from the
> caller are actually placed in the separate at_stack_arg_entry[] array:
>
> /* Entry: incoming stack args from caller, or ARG_NONE for main */
> for (r = 0; r < MAX_STACK_ARG_SLOTS; r++)
> at_stack_arg_entry[r] = callee_entry ? callee_entry[MAX_BPF_REG + r] : none;
>
> The comment conflates the outgoing stack arg state carried in
> at_in[0][MAX_BPF_REG + r] (which always starts at ARG_NONE and gets
> updated later by arg_track_xfer when the subprog performs stack-arg
> stores) with the incoming stack arg state (stored in
> at_stack_arg_entry[], read back via stack-arg ldx).
>
> Should the comment be revised to only mention R1-R5, since the incoming
> stack args are handled separately below?
Yes. Will fix the comment. The old comment is a leftover from v2.
>
>> @@ -1598,6 +1673,10 @@ struct arg_track none = { .frame = ARG_NONE };
>> for (r = 0; r < MAX_ARG_SPILL_SLOTS; r++)
>> at_stack_in[0][r] = none;
>>
>> + /* Entry: incoming stack args from caller, or ARG_NONE for main */
>> + for (r = 0; r < MAX_STACK_ARG_SLOTS; r++)
>> + at_stack_arg_entry[r] = callee_entry ? callee_entry[MAX_BPF_REG + r] : none;
>> +
> [ ... ]
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH bpf-next v3 15/24] bpf: Pass bpf_subprog_info to bpf_int_jit_compile()
2026-05-11 5:34 ` [PATCH bpf-next v3 15/24] bpf: Pass bpf_subprog_info to bpf_int_jit_compile() Yonghong Song
@ 2026-05-11 16:38 ` Alexei Starovoitov
2026-05-11 16:47 ` Yonghong Song
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2026-05-11 16:38 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On Sun May 10, 2026 at 10:34 PM PDT, Yonghong Song wrote:
> - func[i] = bpf_int_jit_compile(env, func[i]);
> + func[i] = bpf_int_jit_compile(env, func[i], &env->subprog_info[i]);
Touching all JITs is too much churn. env already holds subprog_info.
Figure out how JITs should see which subprog they're processing.
See bpf_is_subprog(), for example.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH bpf-next v3 16/24] bpf,x86: Implement JIT support for stack arguments
2026-05-11 5:34 ` [PATCH bpf-next v3 16/24] bpf,x86: Implement JIT support for stack arguments Yonghong Song
@ 2026-05-11 16:39 ` Alexei Starovoitov
2026-05-11 16:47 ` Yonghong Song
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2026-05-11 16:39 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
On Sun May 10, 2026 at 10:34 PM PDT, Yonghong Song wrote:
> -static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *addrs, u8 *image,
> +static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog,
> + struct bpf_subprog_info *subprog_info, int *addrs, u8 *image,
Same issue. Do not add subprog_info.
env and bpf_prog is enough.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots
2026-05-11 16:34 ` Alexei Starovoitov
@ 2026-05-11 16:40 ` Yonghong Song
0 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 16:40 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On 5/11/26 6:34 PM, Alexei Starovoitov wrote:
> On Sun May 10, 2026 at 10:33 PM PDT, Yonghong Song wrote:
>> @@ -1071,8 +1105,24 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
>> struct arg_track *dst = &at_out[insn->dst_reg];
>> struct arg_track *src = &at_out[insn->src_reg];
>> struct arg_track none = { .frame = ARG_NONE };
>> - int r;
>> -
>> + int r, slot;
>> +
>> + /* Handle stack arg stores and loads. */
>> + if (is_stack_arg_st(insn) || is_stack_arg_stx(insn)) {
>> + slot = stack_arg_off_to_slot(insn->off);
>> + if (slot >= 0) {
>> + if (is_stack_arg_stx(insn))
>> + at_out[MAX_BPF_REG + slot] = at_out[insn->src_reg];
>> + else
>> + at_out[MAX_BPF_REG + slot] = none;
>> + }
>> + return;
>> + }
>> + if (is_stack_arg_ldx(insn)) {
>> + slot = stack_arg_off_to_slot(insn->off);
>> + at_out[insn->dst_reg] = (slot >= 0) ? at_stack_arg_entry[slot] : none;
>> + return;
>> + }
>> if (class == BPF_ALU64 && BPF_SRC(insn->code) == BPF_K) {
> claude doesn't have a taste.
> Please use 'else if' like the rest of the function and remove both 'return'.
Ack.
>
>> if (code == BPF_MOV) {
>> *dst = none;
>> @@ -1297,6 +1347,14 @@ static int record_load_store_access(struct bpf_verifier_env *env,
>> struct arg_track resolved, *ptr;
>> int oi;
>>
>> + /*
>> + * Stack arg insns use dst_reg=BPF_REG_PARAMS(11), but at[11] tracks
>> + * the value stored in stack arg slot 0, not a memory base pointer.
>> + * Skip to avoid misinterpreting that value as an FP-derived pointer.
> The comment is confusing. 'not a memory base pointer'? what does it mean?
at[11] is intended to track stack arg slot 0, but index '11' also
corresponds to r11. I will rewrite the comment to make this clear.
>
>> + */
>> + if (is_stack_arg_stx(insn) || is_stack_arg_st(insn) || is_stack_arg_ldx(insn))
>> + return 0;
>> +
>> -/* Return true if any of R1-R5 is derived from a frame pointer. */
>> +/* Return true if any of R1-R5 or stack args is derived from a frame pointer. */
>> static bool has_fp_args(struct arg_track *args)
>> {
>> for (int r = BPF_REG_1; r <= BPF_REG_5; r++)
>> if (args[r].frame != ARG_NONE)
> let's make it consistent with below and use arg_is_fp here?
Sure.
>
>> return true;
>> + for (int r = 0; r < MAX_STACK_ARG_SLOTS; r++)
>> + if (arg_is_fp(&args[MAX_BPF_REG + r]))
>> + return true;
>> return false;
>> }
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH bpf-next v3 10/24] bpf: Reject stack arguments in non-JITed programs
2026-05-11 6:19 ` bot+bpf-ci
@ 2026-05-11 16:42 ` Yonghong Song
0 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 16:42 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>> bpf: Reject stack arguments in non-JITed programs
> This isn't a bug, but should this commit include a Fixes: tag? This
> appears to be fixing an issue introduced by commit 3330ee8db989
> ("bpf: Support stack arguments for bpf functions").
There is no need for a Fixes tag. At this point, r11-based
insns are rejected by the verifier, so there is nothing to fix.
>
>> The interpreter does not understand the bpf register r11
>> (BPF_REG_PARAMS) used for stack arguments. So reject interpreter
>> usage if stack arguments are used either in the main program or
>> any subprogram.
>>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
>> index ae10b9ca018d..958d86f0beac 100644
>> --- a/kernel/bpf/core.c
>> +++ b/kernel/bpf/core.c
>> @@ -2599,7 +2599,7 @@ struct bpf_prog *__bpf_prog_select_runtime(struct bpf_verifier_env *env, struct
>> goto finalize;
>>
>> if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) ||
>> - bpf_prog_has_kfunc_call(fp))
>> + bpf_prog_has_kfunc_call(fp) || (env && env->subprog_info[0].stack_arg_cnt))
>> jit_needed = true;
>>
>> if (!bpf_prog_select_interpreter(fp))
>> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
>> index ba86039789fd..19056016eed8 100644
>> --- a/kernel/bpf/fixups.c
>> +++ b/kernel/bpf/fixups.c
>> @@ -1407,6 +1407,12 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
>> verbose(env, "calling kernel functions are not allowed in non-JITed programs\n");
>> return -EINVAL;
>> }
>> + for (i = 1; i < env->subprog_cnt; i++) {
>> + if (bpf_in_stack_arg_cnt(&env->subprog_info[i])) {
>> + verbose(env, "stack args are not supported in non-JITed programs\n");
>> + return -EINVAL;
>> + }
>> + }
>> if (env->subprog_cnt > 1 && env->prog->aux->tail_call_reachable) {
>> /* When JIT fails the progs with bpf2bpf calls and tail_calls
>> * have to be rejected, since interpreter doesn't support them yet.
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH bpf-next v3 15/24] bpf: Pass bpf_subprog_info to bpf_int_jit_compile()
2026-05-11 16:38 ` Alexei Starovoitov
@ 2026-05-11 16:47 ` Yonghong Song
0 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 16:47 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On 5/11/26 6:38 PM, Alexei Starovoitov wrote:
> On Sun May 10, 2026 at 10:34 PM PDT, Yonghong Song wrote:
>> - func[i] = bpf_int_jit_compile(env, func[i]);
>> + func[i] = bpf_int_jit_compile(env, func[i], &env->subprog_info[i]);
> Touching all JITs is too much churn. env already holds subprog_info.
> Figure out how JITs should see which subprog they're processing.
> See bpf_is_subprog(), for example.
Good point.
func[i]->aux->func_idx = i, so we can get func_idx from prog->aux->func_idx
and use it to index env->subprog_info[].
Indeed, this is much simpler.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH bpf-next v3 16/24] bpf,x86: Implement JIT support for stack arguments
2026-05-11 16:39 ` Alexei Starovoitov
@ 2026-05-11 16:47 ` Yonghong Song
0 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 16:47 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
On 5/11/26 6:39 PM, Alexei Starovoitov wrote:
> On Sun May 10, 2026 at 10:34 PM PDT, Yonghong Song wrote:
>> -static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *addrs, u8 *image,
>> +static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog,
>> + struct bpf_subprog_info *subprog_info, int *addrs, u8 *image,
> Same issue. Do not add subprog_info.
> env and bpf_prog is enough.
Ack.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH bpf-next v3 20/24] selftests/bpf: Add verifier tests for stack argument validation
2026-05-11 6:19 ` bot+bpf-ci
@ 2026-05-11 16:49 ` Yonghong Song
0 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 16:49 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>> diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
>> index a96b25ebff23..ee3d929fac8a 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
>> @@ -91,6 +91,8 @@
>> #include "verifier_sockmap_mutate.skel.h"
>> #include "verifier_spill_fill.skel.h"
>> #include "verifier_spin_lock.skel.h"
>> +#include "verifier_stack_arg.skel.h"
>> +#include "verifier_stack_arg_order.skel.h"
>> #include "verifier_stack_ptr.skel.h"
>> #include "verifier_store_release.skel.h"
>> #include "verifier_subprog_precision.skel.h"
>> @@ -238,6 +240,8 @@ void test_verifier_sock_addr(void) { RUN(verifier_sock_addr); }
>> void test_verifier_sockmap_mutate(void) { RUN(verifier_sockmap_mutate); }
>> void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); }
>> void test_verifier_spin_lock(void) { RUN(verifier_spin_lock); }
>> +void test_verifier_stack_arg(void) { RUN(verifier_stack_arg); }
>> +void test_verifier_stack_arg_order(void) { RUN(verifier_stack_arg_order); }
>> void test_verifier_stack_ptr(void) { RUN(verifier_stack_ptr); }
>> void test_verifier_store_release(void) { RUN(verifier_store_release); }
>> void test_verifier_subprog_precision(void) { RUN(verifier_subprog_precision); }
> In v2 (May 7), sashiko-bot raised a concern about whether the tests in
> verifier_stack_arg_order.c are actually being executed:
>
> The tests in verifier_stack_arg_order.c are never executed because
> the file is not registered in the selftests runner. [...] Did you
> intend to include verifier_stack_arg_order.skel.h here as well?
> [...] Could this be missing a RUN(verifier_stack_arg_order) entry?
>
> Reference: https://lore.kernel.org/bpf/20260509013809.18E7DC2BCB0@smtp.kernel.org/
>
> You responded in v3 (May 10) saying "Will fix," but this commit includes
> both verifier_stack_arg_order.skel.h and RUN(verifier_stack_arg_order),
> which appears to address the concern. Was the registration added after the
> discussion, or is there still something missing?
The registration was added after that discussion, so it is covered now.
>
> [ ... ]
>
>> diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
>> new file mode 100644
>> index 000000000000..d38beba6b5e9
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
> [ ... ]
>
>> +__noinline __used
>> +static long subprog_deref_arg6(long a, long b, long c, long d, long e, long *f)
>> +{
>> + return *f;
>> +}
> [ ... ]
>
>> +SEC("tc")
>> +__description("stack_arg: pruning with different stack arg types")
>> +__failure
>> +__flag(BPF_F_TEST_STATE_FREQ)
>> +__msg("R{{[0-9]}} invalid mem access 'scalar'")
>> +__naked void stack_arg_pruning_type_mismatch(void)
>> +{
>> + asm volatile (
>> + "call %[bpf_get_prandom_u32];"
>> + "r6 = r0;"
>> + /* local = 0 on program stack */
>> + "r7 = 0;"
>> + "*(u64 *)(r10 - 8) = r7;"
>> + /* Branch based on random value */
>> + "if r6 s> 3 goto l0_%=;"
>> + /* Path 1: store stack pointer to outgoing arg6 */
>> + "r1 = r10;"
>> + "r1 += -8;"
>> + "*(u64 *)(r11 - 8) = r1;"
>> + "goto l1_%=;"
>> + "l0_%=:"
>> + /* Path 2: store scalar to outgoing arg6 */
>> + "*(u64 *)(r11 - 8) = 42;"
>> + "l1_%=:"
>> + /* Call subprog that dereferences arg6 */
>> + "r1 = r6;"
>> + "r2 = 0;"
>> + "r3 = 0;"
>> + "r4 = 0;"
>> + "r5 = 0;"
>> + "call subprog_deref_arg6;"
>> + "exit;"
>> + :: __imm(bpf_get_prandom_u32)
>> + : __clobber_all
>> + );
>> +}
> [ ... ]
>
>> +SEC("tc")
>> +__description("stack_arg: release_reference invalidates stack arg slot")
>> +__failure
>> +__msg("R{{[0-9]}} invalid mem access 'scalar'")
>> +__naked void stack_arg_release_ref(void)
> [ ... ]
>
>> +SEC("tc")
>> +__description("stack_arg: pkt pointer in stack arg slot invalidated after pull_data")
>> +__failure
>> +__msg("R{{[0-9]}} invalid mem access 'scalar'")
>> +__naked void stack_arg_stale_pkt_ptr(void)
> [ ... ]
>
>> +SEC("tc")
>> +__description("stack_arg: null propagation rejects deref on null branch")
>> +__failure
>> +__msg("R{{[0-9]}} invalid mem access 'scalar'")
>> +__naked void stack_arg_null_propagation_fail(void)
> Multiple reviewers (sashiko-bot in v6, bpf-ci bot in April 24 and v2)
> raised concerns about test fragility in these tests
> (stack_arg_pruning_type_mismatch, stack_arg_release_ref,
> stack_arg_stale_pkt_ptr, and stack_arg_null_propagation_fail).
>
> Since subprog_deref_arg6 is standard C (not __naked), LLVM determines which
> register is used for dereferencing. The reviewers noted that if a future
> LLVM version chooses a different register, the verifier error will mention
> that register instead, causing the test to fail.
>
> Reference: https://lore.kernel.org/bpf/b03ce38af2e3d574b5a79963da1a4db70e896e6712b2bf2e9fce69178433540d@mail.kernel.org/
>
> You responded in v6 that you would use a regex pattern like
> R{{[0-9]}}, and in v2 (May 7) you stated you prefer to remove the register
> prefix entirely. The current tests use __msg("R{{[0-9]}} invalid mem access
> 'scalar'"), which matches any register number.
>
> Is the regex pattern R{{[0-9]}} sufficient to address the fragility
> concern, or would it be safer to make subprog_deref_arg6 a __naked
> function with explicit register usage?
I think R{{[0-9]}} is sufficient.
>
> [ ... ]
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25652623893
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH bpf-next v3 04/24] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
2026-05-11 16:29 ` Yonghong Song
@ 2026-05-11 17:18 ` Yonghong Song
0 siblings, 0 replies; 50+ messages in thread
From: Yonghong Song @ 2026-05-11 17:18 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/11/26 6:29 PM, Yonghong Song wrote:
>
>
> On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>>> index 77af44d8a3ad..a33a5b4122f8 100644
>>> --- a/kernel/bpf/btf.c
>>> +++ b/kernel/bpf/btf.c
>>> @@ -7880,6 +7880,7 @@ int btf_prepare_func_args(struct
>>> bpf_verifier_env *env, int subprog)
>>> }
>>> args = (const struct btf_param *)(t + 1);
>>> nargs = btf_type_vlen(t);
>>> + sub->arg_cnt = nargs;
>>> if (nargs > MAX_BPF_FUNC_REG_ARGS) {
>>> if (!is_global)
>>> return -EINVAL;
>>> @@ -8067,7 +8068,6 @@ int btf_prepare_func_args(struct
>>> bpf_verifier_env *env, int subprog)
>>> return -EINVAL;
>>> }
>>>
>>> - sub->arg_cnt = nargs;
>>> sub->args_cached = true;
>>>
>>> return 0;
>>
>> ---
>> AI reviewed your patch. Please fix the bug or email reply why it's
>> not a bug.
>
> In v3, for the *main* program, we have the following:
>
> /* if main BPF program has associated BTF info,
> validate that
> * it's matching expected signature, and otherwise
> mark BTF
> * info for main program as unreliable
> */
> if (env->prog->aux->func_info_aux) {
> ret = btf_prepare_func_args(env, 0);
> if (ret || sub->arg_cnt != 1 ||
> sub->args[0].arg_type != ARG_PTR_TO_CTX) {
> env->prog->aux->func_info_aux[0].unreliable = true;
> sub->arg_cnt = 1;
> sub->stack_arg_cnt = 0;
> }
> }
>
> Since sub->arg_cnt and sub->stack_arg_cnt are set here for the main
> program, patch #4 is not needed any more.
Okay, actually, this patch is still needed.
In function btf_check_func_arg_match(), we have
ret = btf_prepare_func_args(env, subprog);
if (ret)
return ret;
and its caller:
static int btf_check_subprog_call(struct bpf_verifier_env *env, int subprog,
struct bpf_reg_state *regs)
{
struct bpf_prog *prog = env->prog;
struct btf *btf = prog->aux->btf;
u32 btf_id;
int err;
if (!prog->aux->func_info)
return -EINVAL;
btf_id = prog->aux->func_info[subprog].type_id;
if (!btf_id)
return -EFAULT;
if (prog->aux->func_info_aux[subprog].unreliable)
return -EINVAL;
err = btf_check_func_arg_match(env, subprog, btf, regs);
/* Compiler optimizations can remove arguments from static functions
* or mismatched type can be passed into a global function.
* In such cases mark the function as unreliable from BTF point of view.
*/
if (err)
prog->aux->func_info_aux[subprog].unreliable = true;
return err;
}
static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
int insn_idx, int subprog,
set_callee_state_fn set_callee_state_cb)
{
struct bpf_verifier_state *state = env->cur_state, *callback_state;
struct bpf_func_state *caller, *callee;
int err;
caller = state->frame[state->curframe];
err = btf_check_subprog_call(env, subprog, caller->regs);
if (err == -EFAULT)
return err;
/* set_callee_state is used for direct subprog calls, but we are
* interested in validating only BPF helpers that can call subprogs as
* callbacks
*/
env->subprog_info[subprog].is_cb = true;
...
}
So in this case, we still need sub->arg_cnt to be set earlier so
it can be used to check the incoming argument count.
>
>
>> See:
>> https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>>
>> CI run summary:
>> https://github.com/kernel-patches/bpf/actions/runs/25652623893
>
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions
2026-05-11 16:05 ` Alexei Starovoitov
2026-05-11 16:21 ` Yonghong Song
@ 2026-05-12 4:17 ` Yonghong Song
2026-05-12 16:23 ` Alexei Starovoitov
1 sibling, 1 reply; 50+ messages in thread
From: Yonghong Song @ 2026-05-12 4:17 UTC (permalink / raw)
To: Alexei Starovoitov, bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/11/26 6:05 PM, Alexei Starovoitov wrote:
> On Mon May 11, 2026 at 8:46 AM PDT, Yonghong Song wrote:
>>
>> On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>>>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>>>> index 321b9d69cf9c..f9020a4ea005 100644
>>>> --- a/include/linux/bpf_verifier.h
>>>> +++ b/include/linux/bpf_verifier.h
>>>> @@ -402,6 +402,7 @@ struct bpf_func_state {
>>>> bool in_callback_fn;
>>>> bool in_async_callback_fn;
>>>> bool in_exception_callback_fn;
>>>> + bool no_stack_arg_load;
>>> [ ... ]
>>>
>>>> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
>>>> index bd9c22945050..c249eb40c6d6 100644
>>>> --- a/kernel/bpf/states.c
>>>> +++ b/kernel/bpf/states.c
>>>> @@ -833,6 +833,32 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>>>> return true;
>>>> }
>>>>
>>>> +/*
>>>> + * Compare stack arg slots between old and current states.
>>>> + * Outgoing stack args are path-local state and must agree for pruning.
>>>> + */
>>>> +static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>>>> + struct bpf_func_state *cur, struct bpf_idmap *idmap,
>>>> + enum exact_level exact)
>>>> +{
>>>> + int i, nslots;
>>>> +
>>>> + nslots = max(old->out_stack_arg_cnt, cur->out_stack_arg_cnt);
>>>> + for (i = 0; i < nslots; i++) {
>>>> + struct bpf_reg_state *old_arg, *cur_arg;
>>>> + struct bpf_reg_state not_init = { .type = NOT_INIT };
>>>> +
>>>> + old_arg = i < old->out_stack_arg_cnt ?
>>>> + &old->stack_arg_regs[i] : ¬_init;
>>>> + cur_arg = i < cur->out_stack_arg_cnt ?
>>>> + &cur->stack_arg_regs[i] : ¬_init;
>>>> + if (!regsafe(env, old_arg, cur_arg, idmap, exact))
>>>> + return false;
>>>> + }
>>>> +
>>>> + return true;
>>>> +}
>>>> +
>>>> static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur,
>>>> struct bpf_idmap *idmap)
>>>> {
>>>> @@ -924,6 +950,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
>>>> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>>>> return false;
>>>>
>>>> + if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>>>> + return false;
>>>> +
>>>> return true;
>>>> }
>>> In v2 (May 7), bot+bpf-ci@kernel.org raised a concern that
>>> no_stack_arg_load is not compared during state pruning in
>>> func_states_equal():
>>>
>>> "The same issue applies to state pruning: func_states_equal/
>>> stack_arg_safe does not compare no_stack_arg_load either, so a
>>> state with no_stack_arg_load=true can be pruned to an old state
>>> with no_stack_arg_load=false incorrectly. Should no_stack_arg_load
>>> be placed before stack (to be memcpy'd) or should copy_stack_state()
>>> explicitly assign it?"
>>>
>>> The field placement was fixed (moving no_stack_arg_load before 'stack'
>>> so it gets copied by memcpy), but func_states_equal() still does not
>>> explicitly compare this field.
>>>
>>> Could this allow incorrect pruning? If a BPF program writes to stack
>>> args and calls a function on one path (setting no_stack_arg_load=true),
>>> then branches and tries to read from stack args on another path, the
>>> verifier might incorrectly prune to the old state with
>>> no_stack_arg_load=false, missing the enforcement that stack arg reads
>>> must happen before writes/calls.
>>>
>>> Looking at func_states_equal():
>>>
>>> kernel/bpf/states.c:func_states_equal() {
>>> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>>> return false;
>>>
>>> if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>>> return false;
>>>
>>> return true;
>>> }
>>>
>>> Should no_stack_arg_load be compared here, similar to how other boolean
>>> flags in bpf_func_state are handled during state comparison?
>> You are right. The following is an example:
>>
>> /* subprog with incoming stack arg */
>> r1 = *(r11+8); /* read incoming arg, no_stack_arg_load = false */
>>
>> if cond goto path2;
>>
>> path1:
>> call some_helper; /* sets no_stack_arg_load = true */
>> goto join;
>>
>> path2:
>> /* no call, no_stack_arg_load = false */
>> goto join;
>>
>> join:
>> r2 = *(r11+8); /* read incoming arg again */
>>
>> In the above case, at 'join' point, we have no_stack_arg_load = true and false
>> respectively. In this case, we cannot do pruning.
>>
>> Will fix.
> Hold on. Didn't we agree that any call should scratch all arg slots?
> In the above example call some_helper will scratch it and last read shouldn't be allowed.
Looks like we may still need to compare no_stack_arg_load in func_states_equal().
The following is an example:
The selftest:
+__noinline __used __naked
+static int subprog_pruning_call_before_load_6args(int a, int b, int c, int d,
+ int e, int f)
+{
+ asm volatile (
+ "if r1 s> 0 goto l0_%=;"
+ "goto l1_%=;"
+ "l0_%=:"
+ "call %[bpf_get_prandom_u32];"
+ "l1_%=:"
+ "r0 = *(u64 *)(r11 + 8);"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: pruning keeps r11 load ordering")
+__failure
+__flag(BPF_F_TEST_STATE_FREQ)
+__msg("r11 load must be before any r11 store or call insn")
+__btf_func_path("btf__verifier_stack_arg_order.bpf.o")
+__naked void stack_arg_pruning_load_after_call(void)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "r1 = r0;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 6;"
+ "call subprog_pruning_call_before_load_6args;"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
It needs the following
diff --git a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
index 2d5ddb24e241..83692570d5bc 100644
--- a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
+++ b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
@@ -15,6 +15,11 @@ int subprog_call_before_load_6args(int a, int b, int c, int d, int e, int f)
return a + b + c + d + e + f;
}
+int subprog_pruning_call_before_load_6args(int a, int b, int c, int d, int e, int f)
+{
+ return a + b + c + d + e + f;
+}
+
#else
int subprog_bad_order_6args(void)
@@ -27,4 +32,9 @@ int subprog_call_before_load_6args(void)
return 0;
}
+int subprog_pruning_call_before_load_6args(void)
+{
+ return 0;
+}
+
#endif
in order to get proper BTF for subprog_pruning_call_before_load_6args().
With the above, the following is the verifier log:
func#0 @0
func#1 @9
topo_order[0] = subprog_pruning_call_before_load_6args
topo_order[1] = stack_arg_pruning_load_after_call
subprog#0: analyzing (depth 0)...
subprog#0 stack_arg_pruning_load_after_call:
0: (85) call bpf_get_prandom_u32#7
1: (bf) r1 = r0
2: (b7) r2 = 2
3: (b7) r3 = 3
4: (b7) r4 = 4
5: (b7) r5 = 5
6: (7a) *(u64 *)(r11 -8) = 6
7: (85) call pc+1
8: (95) exit
subprog#1: analyzing (depth 0)...
subprog#1 subprog_pruning_call_before_load_6args:
9: (65) if r1 s> 0x0 goto pc+1
10: (05) goto pc+1
11: (85) call bpf_get_prandom_u32#7
12: (79) r0 = *(u64 *)(r11 +8)
13: (95) exit
stack use/def subprog#0 stack_arg_pruning_load_after_call (d0,cs0):
0: (85) call bpf_get_prandom_u32#7
1: (bf) r1 = r0
2: (b7) r2 = 2
3: (b7) r3 = 3
4: (b7) r4 = 4
5: (b7) r5 = 5
6: (7a) *(u64 *)(r11 -8) = 6
7: (85) call pc+1
8: (95) exit
stack use/def subprog#1 subprog_pruning_call_before_load_6args (d0,cs9):
9: (65) if r1 s> 0x0 goto pc+1
10: (05) goto pc+1
11: (85) call bpf_get_prandom_u32#7
12: (79) r0 = *(u64 *)(r11 +8)
13: (95) exit
Live regs before insn:
0: .......... (85) call bpf_get_prandom_u32#7
1: 0......... (bf) r1 = r0
2: .1........ (b7) r2 = 2
3: .12....... (b7) r3 = 3
4: .123...... (b7) r4 = 4
5: .1234..... (b7) r5 = 5
6: .12345.... (7a) *(u64 *)(r11 -8) = 6
7: .12345.... (85) call pc+1
8: 0......... (95) exit
9: .1........ (65) if r1 s> 0x0 goto pc+1
10: .......... (05) goto pc+1
11: .......... (85) call bpf_get_prandom_u32#7
12: .......... (79) r0 = *(u64 *)(r11 +8)
13: 0......... (95) exit
0: R1=ctx() R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:99
0: (85) call bpf_get_prandom_u32#7 ; R0=scalar()
1: (bf) r1 = r0 ; R0=scalar(id=1) R1=scalar(id=1)
2: (b7) r2 = 2 ; R2=2
3: (b7) r3 = 3 ; R3=3
4: (b7) r4 = 4 ; R4=4
5: (b7) r5 = 5 ; R5=5
6: (7a) *(u64 *)(r11 -8) = 6
7: (85) call pc+1
caller:
R10=fp0
callee:
frame1: R1=scalar() R2=2 R3=3 R4=4 R5=5 R10=fp0
9: frame1: R1=scalar() R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:78
9: (65) if r1 s> 0x0 goto pc+1 ; frame1: R1=scalar(smax=0)
10: (05) goto pc+1
12: (79) r0 = *(u64 *)(r11 +8) ; frame1: R0=6
13: (95) exit
returning from callee:
frame1: R0=6 R10=fp0
to caller at 8:
R0=6 R10=fp0
from 13 to 8: R0=6 R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:99
8: (95) exit
from 9 to 11: frame1: R10=fp0
11: frame1: R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:78
11: (85) call bpf_get_prandom_u32#7
12: safe
processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 6 peak_states 6 mark_read 0
=============
Insn 12 (r0 = *(u64 *)(r11 +8)) is considered safe
and verification succeeds.
But this is not correct: on the path through the call at insn 11,
the load at insn 12 should be rejected.
If we add the following in states.c:
diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
index 45d86bfe3b68..877338136009 100644
--- a/kernel/bpf/states.c
+++ b/kernel/bpf/states.c
@@ -941,6 +941,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
if (old->callback_depth > cur->callback_depth)
return false;
+ if (!old->no_stack_arg_load && cur->no_stack_arg_load)
+ return false;
+
for (i = 0; i < MAX_BPF_REG; i++)
if (((1 << i) & live_regs) &&
!regsafe(env, &old->regs[i], &cur->regs[i],
The verification will fail:
0: (85) call bpf_get_prandom_u32#7 ; R0=scalar()
1: (bf) r1 = r0 ; R0=scalar(id=1) R1=scalar(id=1)
2: (b7) r2 = 2 ; R2=2
3: (b7) r3 = 3 ; R3=3
4: (b7) r4 = 4 ; R4=4
5: (b7) r5 = 5 ; R5=5
6: (7a) *(u64 *)(r11 -8) = 6
7: (85) call pc+1
caller:
R10=fp0
callee:
frame1: R1=scalar() R2=2 R3=3 R4=4 R5=5 R10=fp0
9: frame1: R1=scalar() R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:78
9: (65) if r1 s> 0x0 goto pc+1 ; frame1: R1=scalar(smax=0)
10: (05) goto pc+1
12: (79) r0 = *(u64 *)(r11 +8) ; frame1: R0=6
13: (95) exit
returning from callee:
frame1: R0=6 R10=fp0
to caller at 8:
R0=6 R10=fp0
from 13 to 8: R0=6 R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:99
8: (95) exit
from 9 to 11: frame1: R10=fp0
11: frame1: R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:78
11: (85) call bpf_get_prandom_u32#7 ; frame1:
12: (79) r0 = *(u64 *)(r11 +8)
r11 load must be before any r11 store or call insn
processed 15 insns (limit 1000000) max_states_per_insn 1 total_states 7 peak_states 7 mark_read 0
Without the above states.c change, insn 12 never gets a chance to be
checked for load vs. store/call ordering, because the state is
considered equivalent and pruned.
With the above states.c change, old no_stack_arg_load is false while
cur no_stack_arg_load is true, so func_states_equal() returns false and
verification of cur continues into insn 12, where it fails as expected.
What do you think about the above states.c change?
* Re: [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions
2026-05-12 4:17 ` Yonghong Song
@ 2026-05-12 16:23 ` Alexei Starovoitov
0 siblings, 0 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2026-05-12 16:23 UTC (permalink / raw)
To: Yonghong Song, bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On Mon May 11, 2026 at 9:17 PM PDT, Yonghong Song wrote:
>
>
> The insn 12 (r0 = *(u64 *)(r11 +8)) is considered safe
> and the verification succeeded.
>
> But this is not correct. The verification should fail due to insn 11.
makes sense.
> If we add the following in states.c:
>
> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
> index 45d86bfe3b68..877338136009 100644
> --- a/kernel/bpf/states.c
> +++ b/kernel/bpf/states.c
> @@ -941,6 +941,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
> if (old->callback_depth > cur->callback_depth)
> return false;
>
> + if (!old->no_stack_arg_load && cur->no_stack_arg_load)
> + return false;
> +
...
> With the above states.c change, old no_stack_arg_load is false and
> cur no_stack_arg_load is true, so func_states_equal() returns false.
> Verification of the cur state then continues into insn 12, which
> fails as expected.
>
> What do you think about the above states.c change?
also makes sense to me.
end of thread, other threads:[~2026-05-12 16:23 UTC | newest]
Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
2026-05-11 5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 01/24] bpf: Convert bpf_get_spilled_reg macro to static inline function Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 02/24] bpf: Remove copy_register_state wrapper function Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 03/24] bpf: Add helper functions for r11-based stack argument insns Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 04/24] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args() Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 16:29 ` Yonghong Song
2026-05-11 17:18 ` Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 15:46 ` Yonghong Song
2026-05-11 16:05 ` Alexei Starovoitov
2026-05-11 16:21 ` Yonghong Song
2026-05-12 4:17 ` Yonghong Song
2026-05-12 16:23 ` Alexei Starovoitov
2026-05-11 5:33 ` [PATCH bpf-next v3 06/24] bpf: Refactor jmp history to use dedicated spi/frame fields Yonghong Song
2026-05-11 16:17 ` Alexei Starovoitov
2026-05-11 16:33 ` Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 07/24] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 5:33 ` [PATCH bpf-next v3 08/24] bpf: Refactor record_call_access() to extract per-arg logic Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 16:35 ` Yonghong Song
2026-05-11 16:34 ` Alexei Starovoitov
2026-05-11 16:40 ` Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 10/24] bpf: Reject stack arguments in non-JITed programs Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 16:42 ` Yonghong Song
2026-05-11 5:33 ` [PATCH bpf-next v3 11/24] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 12/24] bpf: Enable r11 based insns Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 13/24] bpf: Support stack arguments for kfunc calls Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 14/24] bpf: Reject stack arguments if tail call reachable Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 5:34 ` [PATCH bpf-next v3 15/24] bpf: Pass bpf_subprog_info to bpf_int_jit_compile() Yonghong Song
2026-05-11 16:38 ` Alexei Starovoitov
2026-05-11 16:47 ` Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 16/24] bpf,x86: Implement JIT support for stack arguments Yonghong Song
2026-05-11 16:39 ` Alexei Starovoitov
2026-05-11 16:47 ` Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 17/24] selftests/bpf: Add tests for BPF function " Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 18/24] selftests/bpf: Add tests for stack argument validation Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 19/24] selftests/bpf: Add BTF fixup for __naked subprog parameter names Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 20/24] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
2026-05-11 6:19 ` bot+bpf-ci
2026-05-11 16:49 ` Yonghong Song
2026-05-11 5:34 ` [PATCH bpf-next v3 21/24] selftests/bpf: Add precision backtracking test for stack arguments Yonghong Song
2026-05-11 5:35 ` [PATCH bpf-next v3 22/24] bpf, arm64: Map BPF_REG_0 to x8 instead of x7 Yonghong Song
2026-05-11 5:35 ` [PATCH bpf-next v3 23/24] bpf, arm64: Add JIT support for stack arguments Yonghong Song
2026-05-11 5:35 ` [PATCH bpf-next v3 24/24] selftests/bpf: Enable stack argument tests for arm64 Yonghong Song