* [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs
@ 2026-05-07 21:29 Yonghong Song
2026-05-07 21:29 ` [PATCH bpf-next v2 01/23] bpf: Convert bpf_get_spilled_reg macro to static inline function Yonghong Song
` (23 more replies)
0 siblings, 24 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:29 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Currently, BPF function calls and kfuncs are limited to 5 register-based
parameters. For functions that need more than 5 parameters, developers
can force inlining or pack the extra parameters into a struct and pass
a pointer to that struct, although this is inconvenient. There is no
such workaround for kfuncs when more than 5 parameters are needed.
This patch set lifts the 5-argument limit by introducing stack-based
argument passing for BPF functions and kfuncs, coordinated with
compiler support in LLVM [1]. The compiler emits loads/stores through
a new BPF register r11 (BPF_REG_PARAMS) to pass arguments beyond
the 5th, keeping the stack arg area separate from the r10-based program
stack. The maximum number of arguments is capped at
MAX_BPF_FUNC_ARGS (12), which is sufficient for the vast majority of
use cases.
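For illustration, for a function call with 7 arguments the compiler is
expected to emit code along the following lines (a simplified sketch in
verifier pseudo-asm; the exact registers holding arg6/arg7 are arbitrary):
  r1 = ...; ...; r5 = ...;       // args 1-5 in r1-r5 as before
  *(u64 *)(r11 - 8) = r6;        // arg6 -> outgoing stack arg slot 1
  *(u64 *)(r11 - 16) = r7;       // arg7 -> outgoing stack arg slot 2
  call foo;
  // inside foo, the extra args are read back from positive r11 offsets:
  r3 = *(u64 *)(r11 + 8);        // arg6
  r4 = *(u64 *)(r11 + 16);       // arg7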
All kfunc/bpf-function arguments are caller saved, including stack
arguments. For register arguments (r1-r5), the verifier already marks
them as clobbered after each call. For stack arguments, the verifier
invalidates all outgoing stack arg slots immediately after a call,
requiring the compiler to re-store them before any subsequent call.
This follows the native calling convention where all function
parameters are caller saved.
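A minimal sketch of the caller-saved behavior (in the pseudo-asm notation
used in the examples of patch 5; register choices are arbitrary):
  *(u64 *)(r11 - 8) = r6;
  call bar1;                // the outgoing slot is invalidated here
  *(u64 *)(r11 - 8) = r6;   // must be re-stored before the next call
  call bar2;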
The x86_64 JIT translates r11-relative accesses into RBP-relative
native instructions. Each function's stack allocation is extended
by 'max_outgoing' bytes to hold the outgoing arg area below the
callee-saved registers. This keeps the implementation simple, since the
r10-based frame can be reused for stack argument access. At both
BPF-to-BPF and kfunc call sites, outgoing args are stored directly into
the locations expected by the calling convention, and the callee reads
incoming parameters directly from the caller's outgoing area.
Global subprogs and freplace progs with >5 args are not yet supported.
Only x86_64 and arm64 are supported for now; the same selftests are run
on both architectures. Please see each individual patch for details.
[1] https://github.com/llvm/llvm-project/pull/189060
Note:
- The patch set is on top of the following commit:
2ca6723a5f7b6 selftests/bpf: Test insns processed breakdown
- This patch set requires the latest llvm23 compiler. It is possible that a build
  failure like the following appears:
/home/yhs/work/bpf-next/scripts/mod/modpost.c:59:13: error: variable 'extra_warn' set but not used [-Werror,-
Wunused-but-set-global]
59 | static bool extra_warn;
| ^
1 error generated.
In this case, the following hack can work around the build issue:
--- a/Makefile
+++ b/Makefile
@@ -467,7 +467,7 @@ KERNELDOC = $(srctree)/tools/docs/kernel-doc
export KERNELDOC
KBUILD_USERHOSTCFLAGS := -Wall -Wmissing-prototypes -Wstrict-prototypes \
- -O2 -fomit-frame-pointer -std=gnu11
+ -O2 -fomit-frame-pointer -std=gnu11 -Wno-unused-but-set-global
KBUILD_USERCFLAGS := $(KBUILD_USERHOSTCFLAGS) $(USERCFLAGS)
KBUILD_USERLDFLAGS := $(USERLDFLAGS)
Changelogs:
v1 -> v2:
- v1: https://lore.kernel.org/bpf/20260424171433.2034470-1-yonghong.song@linux.dev/
- Several refactorings (convert bpf_get_spilled_reg macro to static inline func,
  remove copy_register_state(), refactor jmp history, refactor record_call_access(), etc.),
  suggested by Eduard.
- Use incoming_stack_arg_cnt/stack_arg_cnt instead of incoming_stack_arg_depth/stack_arg_depth,
suggested by Eduard.
- Fix a stack arg pruning bug, from Eduard.
- Fix a bug in precision marking and backtracking: the callee needs to get the
  stack arg value from the caller. With help from Eduard.
- Set sub->arg_cnt earlier in btf_prepare_func_args(); this avoids having
  incoming_stack_arg_cnt in bpf_subprog_info.
- Do stack-arg liveness analysis together with r10 based liveness analysis,
suggested by Eduard.
- Fix a few tests to ensure that r11-based loads do not come after r11-based stores,
  and that r11-based loads do not come after kfunc/helper/bpf-function calls.
Puranjay Mohan (3):
bpf, arm64: Map BPF_REG_0 to x8 instead of x7
bpf, arm64: Add JIT support for stack arguments
selftests/bpf: Enable stack argument tests for arm64
Yonghong Song (20):
bpf: Convert bpf_get_spilled_reg macro to static inline function
bpf: Remove copy_register_state wrapper function
bpf: Add helper functions for r11-based stack argument insns
bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
bpf: Support stack arguments for bpf functions
bpf: Refactor jmp history to use dedicated spi/frame fields
bpf: Add precision marking and backtracking for stack argument slots
bpf: Refactor record_call_access() to extract per-arg logic
bpf: Extend liveness analysis to track stack argument slots
bpf: Reject stack arguments in non-JITed programs
bpf: Prepare architecture JIT support for stack arguments
bpf: Enable r11 based insns
bpf: Support stack arguments for kfunc calls
bpf: Reject stack arguments if tail call reachable
bpf,x86: Implement JIT support for stack arguments
selftests/bpf: Add tests for BPF function stack arguments
selftests/bpf: Add tests for stack argument validation
selftests/bpf: Add BTF fixup for __naked subprog parameter names
selftests/bpf: Add verifier tests for stack argument validation
selftests/bpf: Add precision backtracking test for stack arguments
arch/arm64/net/bpf_jit_comp.c | 91 +++-
arch/arm64/net/bpf_timed_may_goto.S | 8 +-
arch/x86/net/bpf_jit_comp.c | 155 +++++-
include/linux/bpf.h | 2 +
include/linux/bpf_verifier.h | 89 +++-
include/linux/filter.h | 22 +
kernel/bpf/backtrack.c | 87 +++-
kernel/bpf/btf.c | 20 +-
kernel/bpf/const_fold.c | 8 +
kernel/bpf/core.c | 7 +-
kernel/bpf/fixups.c | 29 +-
kernel/bpf/liveness.c | 146 ++++--
kernel/bpf/states.c | 31 +-
kernel/bpf/verifier.c | 388 ++++++++++++---
.../selftests/bpf/prog_tests/stack_arg.c | 139 ++++++
.../selftests/bpf/prog_tests/stack_arg_fail.c | 10 +
.../bpf/prog_tests/stack_arg_precision.c | 10 +
.../selftests/bpf/prog_tests/verifier.c | 2 +
tools/testing/selftests/bpf/progs/bpf_misc.h | 1 +
.../bpf/progs/btf__stack_arg_precision.c | 24 +
.../bpf/progs/btf__verifier_stack_arg_order.c | 31 ++
tools/testing/selftests/bpf/progs/stack_arg.c | 253 ++++++++++
.../selftests/bpf/progs/stack_arg_fail.c | 114 +++++
.../selftests/bpf/progs/stack_arg_kfunc.c | 164 +++++++
.../selftests/bpf/progs/stack_arg_precision.c | 138 ++++++
.../selftests/bpf/progs/verifier_jit_inline.c | 2 +-
.../selftests/bpf/progs/verifier_ldsx.c | 6 +-
.../bpf/progs/verifier_private_stack.c | 10 +-
.../selftests/bpf/progs/verifier_stack_arg.c | 445 ++++++++++++++++++
.../bpf/progs/verifier_stack_arg_order.c | 87 ++++
.../selftests/bpf/test_kmods/bpf_testmod.c | 72 +++
.../bpf/test_kmods/bpf_testmod_kfunc.h | 26 +
tools/testing/selftests/bpf/test_loader.c | 136 +++++-
33 files changed, 2579 insertions(+), 174 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_fail.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
--
2.53.0-Meta
* [PATCH bpf-next v2 01/23] bpf: Convert bpf_get_spilled_reg macro to static inline function
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
@ 2026-05-07 21:29 ` Yonghong Song
2026-05-07 21:29 ` [PATCH bpf-next v2 02/23] bpf: Remove copy_register_state wrapper function Yonghong Song
` (22 subsequent siblings)
23 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:29 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Convert the bpf_get_spilled_reg() macro to a static inline function
for better type safety and readability. This also simplifies the
definition in preparation for upcoming stack argument support, which
will introduce additional macros.
No functional change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/bpf_verifier.h | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 976e2b2f40e8..321b9d69cf9c 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -552,10 +552,14 @@ struct bpf_verifier_state {
u32 may_goto_depth;
};
-#define bpf_get_spilled_reg(slot, frame, mask) \
- (((slot < frame->allocated_stack / BPF_REG_SIZE) && \
- ((1 << frame->stack[slot].slot_type[BPF_REG_SIZE - 1]) & (mask))) \
- ? &frame->stack[slot].spilled_ptr : NULL)
+static inline struct bpf_reg_state *
+bpf_get_spilled_reg(int slot, struct bpf_func_state *frame, u32 mask)
+{
+ if (slot < frame->allocated_stack / BPF_REG_SIZE &&
+ (1 << frame->stack[slot].slot_type[BPF_REG_SIZE - 1]) & mask)
+ return &frame->stack[slot].spilled_ptr;
+ return NULL;
+}
/* Iterate over 'frame', setting 'reg' to either NULL or a spilled register. */
#define bpf_for_each_spilled_reg(iter, frame, reg, mask) \
--
2.53.0-Meta
* [PATCH bpf-next v2 02/23] bpf: Remove copy_register_state wrapper function
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-07 21:29 ` [PATCH bpf-next v2 01/23] bpf: Convert bpf_get_spilled_reg macro to static inline function Yonghong Song
@ 2026-05-07 21:29 ` Yonghong Song
2026-05-07 21:29 ` [PATCH bpf-next v2 03/23] bpf: Add helper functions for r11-based stack argument insns Yonghong Song
` (21 subsequent siblings)
23 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:29 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Remove the copy_register_state() helper which was just a plain struct
assignment wrapper and replace all call sites with direct struct
assignment. This simplifies the code in preparation for upcoming stack
argument support.
No functional change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 44 +++++++++++++++++++------------------------
1 file changed, 19 insertions(+), 25 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 11054ad89c14..3bafb7ad2ba7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3410,12 +3410,6 @@ static void assign_scalar_id_before_mov(struct bpf_verifier_env *env,
src_reg->id = ++env->id_gen;
}
-/* Copy src state preserving dst->parent and dst->live fields */
-static void copy_register_state(struct bpf_reg_state *dst, const struct bpf_reg_state *src)
-{
- *dst = *src;
-}
-
static void save_register_state(struct bpf_verifier_env *env,
struct bpf_func_state *state,
int spi, struct bpf_reg_state *reg,
@@ -3423,7 +3417,7 @@ static void save_register_state(struct bpf_verifier_env *env,
{
int i;
- copy_register_state(&state->stack[spi].spilled_ptr, reg);
+ state->stack[spi].spilled_ptr = *reg;
for (i = BPF_REG_SIZE; i > BPF_REG_SIZE - size; i--)
state->stack[spi].slot_type[i - 1] = STACK_SPILL;
@@ -3822,7 +3816,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
* with the destination register on fill.
*/
assign_scalar_id_before_mov(env, reg);
- copy_register_state(&state->regs[dst_regno], reg);
+ state->regs[dst_regno] = *reg;
state->regs[dst_regno].subreg_def = subreg_def;
/* Break the relation on a narrowing fill.
@@ -3877,7 +3871,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
* with the destination register on fill.
*/
assign_scalar_id_before_mov(env, reg);
- copy_register_state(&state->regs[dst_regno], reg);
+ state->regs[dst_regno] = *reg;
/* mark reg as written since spilled pointer state likely
* has its liveness marks cleared by is_state_visited()
* which resets stack/reg liveness for state transitions
@@ -6031,7 +6025,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, struct b
size);
return -EACCES;
}
- copy_register_state(&regs[value_regno], reg);
+ regs[value_regno] = *reg;
add_scalar_to_reg(&regs[value_regno], off);
regs[value_regno].type = PTR_TO_INSN;
} else {
@@ -13248,7 +13242,7 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
*/
if (!ptr_is_dst_reg) {
tmp = *dst_reg;
- copy_register_state(dst_reg, ptr_reg);
+ *dst_reg = *ptr_reg;
}
err = sanitize_speculative_path(env, NULL, env->insn_idx + 1, env->insn_idx);
if (err < 0)
@@ -14698,7 +14692,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
* copy register state to dest reg
*/
assign_scalar_id_before_mov(env, src_reg);
- copy_register_state(dst_reg, src_reg);
+ *dst_reg = *src_reg;
dst_reg->subreg_def = DEF_NOT_SUBREG;
} else {
/* case: R1 = (s8, s16 s32)R2 */
@@ -14713,7 +14707,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
no_sext = reg_umax(src_reg) < (1ULL << (insn->off - 1));
if (no_sext)
assign_scalar_id_before_mov(env, src_reg);
- copy_register_state(dst_reg, src_reg);
+ *dst_reg = *src_reg;
if (!no_sext)
clear_scalar_id(dst_reg);
coerce_reg_to_size_sx(dst_reg, insn->off >> 3);
@@ -14735,7 +14729,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
if (is_src_reg_u32)
assign_scalar_id_before_mov(env, src_reg);
- copy_register_state(dst_reg, src_reg);
+ *dst_reg = *src_reg;
/* Make sure ID is cleared if src_reg is not in u32
* range otherwise dst_reg min/max could be incorrectly
* propagated into src_reg by sync_linked_regs()
@@ -14749,7 +14743,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
if (no_sext)
assign_scalar_id_before_mov(env, src_reg);
- copy_register_state(dst_reg, src_reg);
+ *dst_reg = *src_reg;
if (!no_sext)
clear_scalar_id(dst_reg);
dst_reg->subreg_def = env->insn_idx + 1;
@@ -15629,7 +15623,7 @@ static void sync_linked_regs(struct bpf_verifier_env *env, struct bpf_verifier_s
reg->delta == known_reg->delta) {
s32 saved_subreg_def = reg->subreg_def;
- copy_register_state(reg, known_reg);
+ *reg = *known_reg;
reg->subreg_def = saved_subreg_def;
} else {
s32 saved_subreg_def = reg->subreg_def;
@@ -15640,7 +15634,7 @@ static void sync_linked_regs(struct bpf_verifier_env *env, struct bpf_verifier_s
__mark_reg_known(&fake_reg, (s64)reg->delta - (s64)known_reg->delta);
/* reg = known_reg; reg += delta */
- copy_register_state(reg, known_reg);
+ *reg = *known_reg;
/*
* Must preserve off, id and subreg_def flag,
* otherwise another sync_linked_regs() will be incorrect.
@@ -15743,10 +15737,10 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
}
is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32;
- copy_register_state(&env->false_reg1, dst_reg);
- copy_register_state(&env->false_reg2, src_reg);
- copy_register_state(&env->true_reg1, dst_reg);
- copy_register_state(&env->true_reg2, src_reg);
+ env->false_reg1 = *dst_reg;
+ env->false_reg2 = *src_reg;
+ env->true_reg1 = *dst_reg;
+ env->true_reg2 = *src_reg;
pred = is_branch_taken(env, dst_reg, src_reg, opcode, is_jmp32);
if (pred >= 0) {
/* If we get here with a dst_reg pointer type it is because
@@ -15815,11 +15809,11 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
if (err)
return err;
- copy_register_state(dst_reg, &env->false_reg1);
- copy_register_state(src_reg, &env->false_reg2);
- copy_register_state(&other_branch_regs[insn->dst_reg], &env->true_reg1);
+ *dst_reg = env->false_reg1;
+ *src_reg = env->false_reg2;
+ other_branch_regs[insn->dst_reg] = env->true_reg1;
if (BPF_SRC(insn->code) == BPF_X)
- copy_register_state(&other_branch_regs[insn->src_reg], &env->true_reg2);
+ other_branch_regs[insn->src_reg] = env->true_reg2;
if (BPF_SRC(insn->code) == BPF_X &&
src_reg->type == SCALAR_VALUE && src_reg->id &&
--
2.53.0-Meta
* [PATCH bpf-next v2 03/23] bpf: Add helper functions for r11-based stack argument insns
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-07 21:29 ` [PATCH bpf-next v2 01/23] bpf: Convert bpf_get_spilled_reg macro to static inline function Yonghong Song
2026-05-07 21:29 ` [PATCH bpf-next v2 02/23] bpf: Remove copy_register_state wrapper function Yonghong Song
@ 2026-05-07 21:29 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 04/23] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args() Yonghong Song
` (20 subsequent siblings)
23 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:29 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Add three static inline helper functions — is_stack_arg_ldx(),
is_stack_arg_st(), and is_stack_arg_stx() — that identify r11-based
(BPF_REG_PARAMS) instructions used for stack argument passing. These
helpers encapsulate the detailed encoding requirements (operand size,
register, offset alignment and sign) and hide raw BPF_REG_PARAMS usage
from the verifier, making call sites more readable and explicit.
A later patch ("bpf: Enable r11 based insns") will wire these helpers
into the verifier. Until then, check_and_resolve_insns() rejects any
r11-based instructions.
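For reference, the insns these helpers are meant to match look like the
following (verifier pseudo-asm; offsets chosen for illustration only):
  r3 = *(u64 *)(r11 + 8);     // is_stack_arg_ldx(): 8-byte load, positive 8-aligned offset
  *(u64 *)(r11 - 8) = r6;     // is_stack_arg_stx(): 8-byte register store, negative 8-aligned offset
  *(u64 *)(r11 - 16) = 42;    // is_stack_arg_st(): 8-byte immediate store, negative 8-aligned offset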
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/filter.h | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index b77d0b06db6e..918d9b34eac6 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -749,6 +749,27 @@ static inline u32 bpf_prog_run_pin_on_cpu(const struct bpf_prog *prog,
return ret;
}
+static inline bool is_stack_arg_ldx(const struct bpf_insn *insn)
+{
+ return insn->code == (BPF_LDX | BPF_MEM | BPF_DW) &&
+ insn->src_reg == BPF_REG_PARAMS &&
+ insn->off > 0 && insn->off % 8 == 0;
+}
+
+static inline bool is_stack_arg_st(const struct bpf_insn *insn)
+{
+ return insn->code == (BPF_ST | BPF_MEM | BPF_DW) &&
+ insn->dst_reg == BPF_REG_PARAMS &&
+ insn->off < 0 && insn->off % 8 == 0;
+}
+
+static inline bool is_stack_arg_stx(const struct bpf_insn *insn)
+{
+ return insn->code == (BPF_STX | BPF_MEM | BPF_DW) &&
+ insn->dst_reg == BPF_REG_PARAMS &&
+ insn->off < 0 && insn->off % 8 == 0;
+}
+
#define BPF_SKB_CB_LEN QDISC_CB_PRIV_LEN
struct bpf_skb_data_end {
--
2.53.0-Meta
* [PATCH bpf-next v2 04/23] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (2 preceding siblings ...)
2026-05-07 21:29 ` [PATCH bpf-next v2 03/23] bpf: Add helper functions for r11-based stack argument insns Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-07 21:30 ` [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions Yonghong Song
` (19 subsequent siblings)
23 siblings, 1 reply; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Move the "sub->arg_cnt = nargs" assignment to immediately after
nargs is computed from btf_type_vlen(), instead of at the end of
btf_prepare_func_args().
btf_prepare_func_args() can return -EINVAL early in several cases,
e.g. when a static function has some non-int/enum arguments.
Since -EINVAL from btf_prepare_func_args() does not immediately
reject verification, arg_cnt remains zero after the early return.
This causes later stack argument based load/store insns to
incorrectly assume the function has no arguments.
Setting arg_cnt right after nargs ensures it is available regardless
of which path btf_prepare_func_args() takes.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/btf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 77af44d8a3ad..a33a5b4122f8 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7880,6 +7880,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
}
args = (const struct btf_param *)(t + 1);
nargs = btf_type_vlen(t);
+ sub->arg_cnt = nargs;
if (nargs > MAX_BPF_FUNC_REG_ARGS) {
if (!is_global)
return -EINVAL;
@@ -8067,7 +8068,6 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
return -EINVAL;
}
- sub->arg_cnt = nargs;
sub->args_cached = true;
return 0;
--
2.53.0-Meta
* [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (3 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 04/23] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args() Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 22:26 ` bot+bpf-ci
2026-05-08 18:00 ` Alexei Starovoitov
2026-05-07 21:30 ` [PATCH bpf-next v2 06/23] bpf: Refactor jmp history to use dedicated spi/frame fields Yonghong Song
` (18 subsequent siblings)
23 siblings, 2 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Currently BPF functions (subprogs) are limited to 5 register arguments.
With [1], the compiler can emit code that passes additional arguments
via a dedicated stack area through bpf register BPF_REG_PARAMS (r11),
introduced in an earlier patch ([2]).
The compiler uses positive r11 offsets for incoming (callee-side) args
and negative r11 offsets for outgoing (caller-side) args, following the
x86_64/arm64 calling convention direction. There is an 8-byte gap at
offset 0 separating two regions:
Incoming (callee reads): r11+8 (arg6), r11+16 (arg7), ...
Outgoing (caller writes): r11-8 (arg6), r11-16 (arg7), ...
The following is an example to show how stack arguments are saved
and transferred between caller and callee:
int foo(int a1, int a2, int a3, int a4, int a5, int a6, int a7) {
...
bar(a1, a2, a3, a4, a5, a6, a7, a8);
...
}
Caller (foo) Callee (bar)
============ ============
Incoming (positive offsets): Incoming (positive offsets):
r11+8: [incoming arg 6] r11+8: [incoming arg 6] <-+
r11+16: [incoming arg 7] r11+16: [incoming arg 7] <-|+
r11+24: [incoming arg 8] <-||+
Outgoing (negative offsets): |||
r11-8: [outgoing arg 6 to bar] -------->-------------------------+||
r11-16: [outgoing arg 7 to bar] -------->--------------------------+|
r11-24: [outgoing arg 8 to bar] -------->---------------------------+
If the bpf function has more than one call:
int foo(int a1, int a2, int a3, int a4, int a5, int a6, int a7) {
...
bar1(a1, a2, a3, a4, a5, a6, a7, a8);
...
bar2(a1, a2, a3, a4, a5, a6, a7, a8, a9);
...
}
Caller (foo) Callee (bar2)
============ ==============
Incoming (positive offsets): Incoming (positive offsets):
r11+8: [incoming arg 6] r11+8: [incoming arg 6] <+
r11+16: [incoming arg 7] r11+16: [incoming arg 7] <|+
r11+24: [incoming arg 8] <||+
Outgoing for bar2 (negative offsets): r11+32: [incoming arg 9] <|||+
r11-8: [outgoing arg 6] ---->----------->-------------------------+|||
r11-16: [outgoing arg 7] ---->----------->--------------------------+||
r11-24: [outgoing arg 8] ---->----------->---------------------------+|
r11-32: [outgoing arg 9] ---->----------->----------------------------+
The verifier tracks outgoing stack arguments in stack_arg_regs[] and
out_stack_arg_cnt in bpf_func_state, separately from the regular
r10 stack. The callee does not copy incoming args — it reads them
directly from the caller's outgoing slots at positive r11 offsets.
Similar to stacksafe(), introduce stack_arg_safe() to do pruning
check.
Outgoing stack arg slots are invalidated when the callee returns
(in prepare_func_exit), not at call time. This allows the callee to
read incoming args from the caller's outgoing slots during
verification. The following are a few examples.
Example 1:
*(u64 *)(r11 - 8) = r6;
*(u64 *)(r11 - 16) = r7;
call bar1; // arg6 = r6, arg7 = r7
call bar2; // expects 2 stack arguments, but the slots were invalidated after bar1: rejected
Example 2:
To fix Example 1, re-store the stack args before the second call:
*(u64 *)(r11 - 8) = r6;
*(u64 *)(r11 - 16) = r7;
call bar1; // arg6 = r6, arg7 = r7
*(u64 *)(r11 - 8) = r8;
*(u64 *)(r11 - 16) = r9;
call bar2; // arg6 = r8, arg7 = r9
Example 3:
The compiler can hoist the shared stack arg stores above the branch:
*(u64 *)(r11 - 16) = r7;
if cond goto else;
*(u64 *)(r11 - 8) = r8;
call bar1; // arg6 = r8, arg7 = r7
goto end;
else:
*(u64 *)(r11 - 8) = r9;
call bar2; // arg6 = r9, arg7 = r7
end:
Example 4:
Within a loop:
loop:
*(u64 *)(r11 - 8) = r6; // arg6, re-stored each iteration
call bar; // arg6 = r6
if ... goto loop;
A separate max_out_stack_arg_cnt field in bpf_subprog_info tracks
the deepest outgoing slot actually written. This is used to reject
programs that write to slots beyond what any callee expects. It is
also needed by the JIT.
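As a sketch of what this check rejects, assuming bar takes 7 arguments
(i.e. 2 stack args):
  *(u64 *)(r11 - 8) = r6;
  *(u64 *)(r11 - 16) = r7;
  *(u64 *)(r11 - 24) = r8;  // writes a 3rd outgoing slot
  call bar;                 // no callee requires more than 2 stack args
  // -> rejected: "func#0 writes 3 stack arg slots, but calls only require 2"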
Similar to typical compiler-generated code, enforce the following
orderings:
- all stack arg reads must come before any stack arg write
- all stack arg reads must come before any bpf function, kfunc or helper call
This is needed because the JIT may emit 'mov' insns that use the same
register for a stack arg read and write, and because bpf functions,
kfuncs and helpers invalidate all arguments immediately after the call.
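For example, inside a callee that itself takes incoming stack args, the
following ordering is rejected (a minimal sketch):
  r6 = *(u64 *)(r11 + 8);   // ok: read incoming arg6
  *(u64 *)(r11 - 8) = r6;   // store an outgoing arg
  r7 = *(u64 *)(r11 + 16);  // rejected: "r11 load must be before any r11 store or call insn"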
Callback functions with stack arguments require the kernel to set up
parameter types (including stack parameters) properly, so that the
callback function can retrieve this information for verification
purposes.
Global subprogs and freplace with >5 args are not yet supported.
[1] https://github.com/llvm/llvm-project/pull/189060
[2] https://lore.kernel.org/bpf/20260423033506.2542005-1-yonghong.song@linux.dev/
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/bpf.h | 2 +
include/linux/bpf_verifier.h | 37 ++++++-
kernel/bpf/btf.c | 14 ++-
kernel/bpf/fixups.c | 23 ++++-
kernel/bpf/states.c | 29 ++++++
kernel/bpf/verifier.c | 195 ++++++++++++++++++++++++++++++++++-
6 files changed, 288 insertions(+), 12 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 715b6df9c403..f9e57c49f1cb 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1669,6 +1669,8 @@ struct bpf_prog_aux {
u32 max_pkt_offset;
u32 max_tp_access;
u32 stack_depth;
+ u16 incoming_stack_arg_cnt;
+ u16 stack_arg_cnt; /* both incoming and max outgoing of stack argument slots */
u32 id;
u32 func_cnt; /* used by non-func prog as the number of func progs */
u32 real_func_cnt; /* includes hidden progs, only used for JIT and freeing progs */
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 321b9d69cf9c..1fba16fced28 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -427,6 +427,10 @@ struct bpf_func_state {
* `stack`. allocated_stack is always a multiple of BPF_REG_SIZE.
*/
int allocated_stack;
+
+ bool no_stack_arg_load;
+ u16 out_stack_arg_cnt; /* Number of outgoing on-stack argument slots */
+ struct bpf_reg_state *stack_arg_regs; /* Outgoing on-stack arguments */
};
#define MAX_CALL_FRAMES 8
@@ -561,12 +565,27 @@ bpf_get_spilled_reg(int slot, struct bpf_func_state *frame, u32 mask)
return NULL;
}
+static inline struct bpf_reg_state *
+bpf_get_spilled_stack_arg(int slot, struct bpf_func_state *frame, u32 mask)
+{
+ if (slot < frame->out_stack_arg_cnt &&
+ frame->stack_arg_regs[slot].type != NOT_INIT)
+ return &frame->stack_arg_regs[slot];
+ return NULL;
+}
+
/* Iterate over 'frame', setting 'reg' to either NULL or a spilled register. */
#define bpf_for_each_spilled_reg(iter, frame, reg, mask) \
for (iter = 0, reg = bpf_get_spilled_reg(iter, frame, mask); \
iter < frame->allocated_stack / BPF_REG_SIZE; \
iter++, reg = bpf_get_spilled_reg(iter, frame, mask))
+/* Iterate over 'frame', setting 'reg' to either NULL or a spilled stack arg. */
+#define bpf_for_each_spilled_stack_arg(iter, frame, reg, mask) \
+ for (iter = 0, reg = bpf_get_spilled_stack_arg(iter, frame, mask); \
+ iter < frame->out_stack_arg_cnt; \
+ iter++, reg = bpf_get_spilled_stack_arg(iter, frame, mask))
+
#define bpf_for_each_reg_in_vstate_mask(__vst, __state, __reg, __mask, __expr) \
({ \
struct bpf_verifier_state *___vstate = __vst; \
@@ -584,6 +603,11 @@ bpf_get_spilled_reg(int slot, struct bpf_func_state *frame, u32 mask)
continue; \
(void)(__expr); \
} \
+ bpf_for_each_spilled_stack_arg(___j, __state, __reg, __mask) { \
+ if (!__reg) \
+ continue; \
+ (void)(__expr); \
+ } \
} \
})
@@ -799,12 +823,21 @@ struct bpf_subprog_info {
bool keep_fastcall_stack: 1;
bool changes_pkt_data: 1;
bool might_sleep: 1;
- u8 arg_cnt:3;
+ u8 arg_cnt:4;
enum priv_stack_mode priv_stack_mode;
- struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
+ struct bpf_subprog_arg_info args[MAX_BPF_FUNC_ARGS];
+ u16 stack_arg_cnt; /* incoming + max outgoing */
+ u16 max_out_stack_arg_cnt;
};
+static inline u16 bpf_in_stack_arg_cnt(struct bpf_subprog_info *sub)
+{
+ if (sub->arg_cnt > MAX_BPF_FUNC_REG_ARGS)
+ return sub->arg_cnt - MAX_BPF_FUNC_REG_ARGS;
+ return 0;
+}
+
struct bpf_verifier_env;
struct backtrack_state {
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index a33a5b4122f8..ec3fb8c8f4ee 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7881,10 +7881,16 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
args = (const struct btf_param *)(t + 1);
nargs = btf_type_vlen(t);
sub->arg_cnt = nargs;
- if (nargs > MAX_BPF_FUNC_REG_ARGS) {
- if (!is_global)
- return -EINVAL;
- bpf_log(log, "Global function %s() with %d > %d args. Buggy compiler.\n",
+ if (nargs > MAX_BPF_FUNC_ARGS) {
+ bpf_log(log, "kernel supports at most %d parameters, function %s has %d\n",
+ MAX_BPF_FUNC_ARGS, tname, nargs);
+ return -EFAULT;
+ }
+ if (nargs > MAX_BPF_FUNC_REG_ARGS)
+ sub->stack_arg_cnt = nargs - MAX_BPF_FUNC_REG_ARGS;
+
+ if (is_global && nargs > MAX_BPF_FUNC_REG_ARGS) {
+ bpf_log(log, "global function %s has %d > %d args, stack args not supported\n",
tname, nargs, MAX_BPF_FUNC_REG_ARGS);
return -EINVAL;
}
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index fba9e8c00878..0ff8cdf4394f 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -1123,6 +1123,9 @@ static int jit_subprogs(struct bpf_verifier_env *env)
func[i]->aux->name[0] = 'F';
func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
+ func[i]->aux->incoming_stack_arg_cnt =
+ bpf_in_stack_arg_cnt(&env->subprog_info[i]);
+ func[i]->aux->stack_arg_cnt = env->subprog_info[i].stack_arg_cnt;
if (env->subprog_info[i].priv_stack_mode == PRIV_STACK_ADAPTIVE)
func[i]->aux->jits_use_priv_stack = true;
@@ -1301,8 +1304,10 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env)
struct bpf_insn_aux_data *orig_insn_aux;
u32 *orig_subprog_starts;
- if (env->subprog_cnt <= 1)
+ if (env->subprog_cnt <= 1) {
+ env->prog->aux->stack_arg_cnt = env->subprog_info[0].stack_arg_cnt;
return 0;
+ }
prog = orig_prog = env->prog;
if (bpf_prog_need_blind(prog)) {
@@ -1378,9 +1383,21 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
struct bpf_prog *prog = env->prog;
struct bpf_insn *insn = prog->insnsi;
bool has_kfunc_call = bpf_prog_has_kfunc_call(prog);
- int i, depth;
+ int depth;
#endif
- int err = 0;
+ int i, err = 0;
+
+ for (i = 0; i < env->subprog_cnt; i++) {
+ struct bpf_subprog_info *subprog = &env->subprog_info[i];
+ u16 outgoing = subprog->stack_arg_cnt - bpf_in_stack_arg_cnt(subprog);
+
+ if (subprog->max_out_stack_arg_cnt > outgoing) {
+ verbose(env,
+ "func#%d writes %u stack arg slots, but calls only require %u\n",
+ i, subprog->max_out_stack_arg_cnt, outgoing);
+ return -EINVAL;
+ }
+ }
if (env->prog->jit_requested &&
!bpf_prog_is_offloaded(env->prog->aux)) {
diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
index bd9c22945050..c249eb40c6d6 100644
--- a/kernel/bpf/states.c
+++ b/kernel/bpf/states.c
@@ -833,6 +833,32 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
return true;
}
+/*
+ * Compare stack arg slots between old and current states.
+ * Outgoing stack args are path-local state and must agree for pruning.
+ */
+static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
+ struct bpf_func_state *cur, struct bpf_idmap *idmap,
+ enum exact_level exact)
+{
+ int i, nslots;
+
+ nslots = max(old->out_stack_arg_cnt, cur->out_stack_arg_cnt);
+ for (i = 0; i < nslots; i++) {
+ struct bpf_reg_state *old_arg, *cur_arg;
+ struct bpf_reg_state not_init = { .type = NOT_INIT };
+
+ old_arg = i < old->out_stack_arg_cnt ?
+ &old->stack_arg_regs[i] : &not_init;
+ cur_arg = i < cur->out_stack_arg_cnt ?
+ &cur->stack_arg_regs[i] : &not_init;
+ if (!regsafe(env, old_arg, cur_arg, idmap, exact))
+ return false;
+ }
+
+ return true;
+}
+
static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur,
struct bpf_idmap *idmap)
{
@@ -924,6 +950,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
return false;
+ if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
+ return false;
+
return true;
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3bafb7ad2ba7..61d6663498f7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1362,6 +1362,18 @@ static int copy_stack_state(struct bpf_func_state *dst, const struct bpf_func_st
return -ENOMEM;
dst->allocated_stack = src->allocated_stack;
+
+ /* copy stack args state */
+ n = src->out_stack_arg_cnt;
+ if (n) {
+ dst->stack_arg_regs = copy_array(dst->stack_arg_regs, src->stack_arg_regs, n,
+ sizeof(struct bpf_reg_state),
+ GFP_KERNEL_ACCOUNT);
+ if (!dst->stack_arg_regs)
+ return -ENOMEM;
+ }
+
+ dst->out_stack_arg_cnt = src->out_stack_arg_cnt;
return 0;
}
@@ -1403,6 +1415,23 @@ static int grow_stack_state(struct bpf_verifier_env *env, struct bpf_func_state
return 0;
}
+static int grow_stack_arg_slots(struct bpf_verifier_env *env,
+ struct bpf_func_state *state, int cnt)
+{
+ size_t old_n = state->out_stack_arg_cnt;
+
+ if (old_n >= cnt)
+ return 0;
+
+ state->stack_arg_regs = realloc_array(state->stack_arg_regs, old_n, cnt,
+ sizeof(struct bpf_reg_state));
+ if (!state->stack_arg_regs)
+ return -ENOMEM;
+
+ state->out_stack_arg_cnt = cnt;
+ return 0;
+}
+
/* Acquire a pointer id from the env and update the state->refs to include
* this new pointer reference.
* On success, returns a valid pointer id to associate with the register
@@ -1565,6 +1594,7 @@ static void free_func_state(struct bpf_func_state *state)
{
if (!state)
return;
+ kfree(state->stack_arg_regs);
kfree(state->stack);
kfree(state);
}
@@ -4050,6 +4080,105 @@ static int check_stack_write(struct bpf_verifier_env *env,
return err;
}
+/*
+ * Write a value to the outgoing stack arg area.
+ * off is a negative offset from r11 (e.g. -8 for arg6, -16 for arg7).
+ */
+static int check_stack_arg_write(struct bpf_verifier_env *env, struct bpf_func_state *state,
+ int off, struct bpf_reg_state *value_reg)
+{
+ int max_stack_arg_regs = MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS;
+ struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
+ int spi = -off / BPF_REG_SIZE - 1;
+ struct bpf_func_state *cur;
+ struct bpf_reg_state *arg;
+ int err;
+
+ if (spi >= max_stack_arg_regs) {
+ verbose(env, "stack arg write offset %d exceeds max %d stack args\n",
+ off, max_stack_arg_regs);
+ return -EINVAL;
+ }
+
+ err = grow_stack_arg_slots(env, state, spi + 1);
+ if (err)
+ return err;
+
+ /* Track the max outgoing stack arg slot count. */
+ if (spi + 1 > subprog->max_out_stack_arg_cnt)
+ subprog->max_out_stack_arg_cnt = spi + 1;
+
+ cur = env->cur_state->frame[env->cur_state->curframe];
+ if (value_reg) {
+ state->stack_arg_regs[spi] = *value_reg;
+ } else {
+ /* BPF_ST: store immediate, treat as scalar */
+ arg = &state->stack_arg_regs[spi];
+ arg->type = SCALAR_VALUE;
+ __mark_reg_known(arg, env->prog->insnsi[env->insn_idx].imm);
+ }
+ state->no_stack_arg_load = true;
+ return 0;
+}
+
+/*
+ * Read a value from the incoming stack arg area.
+ * off is a positive offset from r11 (e.g. +8 for arg6, +16 for arg7).
+ */
+static int check_stack_arg_read(struct bpf_verifier_env *env, struct bpf_func_state *state,
+ int off, int dst_regno)
+{
+ struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
+ struct bpf_verifier_state *vstate = env->cur_state;
+ int spi = off / BPF_REG_SIZE - 1;
+ struct bpf_func_state *caller, *cur;
+ struct bpf_reg_state *arg;
+
+ if (state->no_stack_arg_load) {
+ verbose(env, "r11 load must be before any r11 store or call insn\n");
+ return -EINVAL;
+ }
+
+ if (spi + 1 > bpf_in_stack_arg_cnt(subprog)) {
+ verbose(env, "invalid read from stack arg off %d depth %d\n",
+ off, bpf_in_stack_arg_cnt(subprog) * BPF_REG_SIZE);
+ return -EACCES;
+ }
+
+ caller = vstate->frame[vstate->curframe - 1];
+ arg = &caller->stack_arg_regs[spi];
+ cur = vstate->frame[vstate->curframe];
+ cur->regs[dst_regno] = *arg;
+ return 0;
+}
+
+static int check_outgoing_stack_args(struct bpf_verifier_env *env, struct bpf_func_state *caller,
+ int nargs)
+{
+ int i, spi;
+
+ for (i = MAX_BPF_FUNC_REG_ARGS; i < nargs; i++) {
+ spi = i - MAX_BPF_FUNC_REG_ARGS;
+ if (spi >= caller->out_stack_arg_cnt ||
+ caller->stack_arg_regs[spi].type == NOT_INIT) {
+ verbose(env, "caller expects %d args, stack arg%d is not initialized\n",
+ nargs, spi + 1);
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static struct bpf_reg_state *get_func_arg_reg(struct bpf_func_state *caller,
+ struct bpf_reg_state *regs, int arg)
+{
+ if (arg < MAX_BPF_FUNC_REG_ARGS)
+ return &regs[arg + 1];
+
+ return &caller->stack_arg_regs[arg - MAX_BPF_FUNC_REG_ARGS];
+}
+
static int check_map_access_type(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
int off, int size, enum bpf_access_type type)
{
@@ -6217,10 +6346,20 @@ static int check_load_mem(struct bpf_verifier_env *env, struct bpf_insn *insn,
bool strict_alignment_once, bool is_ldsx,
bool allow_trust_mismatch, const char *ctx)
{
+ struct bpf_verifier_state *vstate = env->cur_state;
+ struct bpf_func_state *state = vstate->frame[vstate->curframe];
struct bpf_reg_state *regs = cur_regs(env);
enum bpf_reg_type src_reg_type;
int err;
+ /* Handle stack arg read */
+ if (is_stack_arg_ldx(insn)) {
+ err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK);
+ if (err)
+ return err;
+ return check_stack_arg_read(env, state, insn->off, insn->dst_reg);
+ }
+
/* check src operand */
err = check_reg_arg(env, insn->src_reg, SRC_OP);
if (err)
@@ -6249,10 +6388,20 @@ static int check_load_mem(struct bpf_verifier_env *env, struct bpf_insn *insn,
static int check_store_reg(struct bpf_verifier_env *env, struct bpf_insn *insn,
bool strict_alignment_once)
{
+ struct bpf_verifier_state *vstate = env->cur_state;
+ struct bpf_func_state *state = vstate->frame[vstate->curframe];
struct bpf_reg_state *regs = cur_regs(env);
enum bpf_reg_type dst_reg_type;
int err;
+ /* Handle stack arg write */
+ if (is_stack_arg_stx(insn)) {
+ err = check_reg_arg(env, insn->src_reg, SRC_OP);
+ if (err)
+ return err;
+ return check_stack_arg_write(env, state, insn->off, regs + insn->src_reg);
+ }
+
/* check src1 operand */
err = check_reg_arg(env, insn->src_reg, SRC_OP);
if (err)
@@ -8860,6 +9009,14 @@ static void clear_caller_saved_regs(struct bpf_verifier_env *env,
}
}
+static void invalidate_outgoing_stack_args(struct bpf_func_state *state)
+{
+ int i, nslots = state->out_stack_arg_cnt;
+
+ for (i = 0; i < nslots; i++)
+ state->stack_arg_regs[i].type = NOT_INIT;
+}
+
typedef int (*set_callee_state_fn)(struct bpf_verifier_env *env,
struct bpf_func_state *caller,
struct bpf_func_state *callee,
@@ -8922,6 +9079,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
struct bpf_reg_state *regs)
{
struct bpf_subprog_info *sub = subprog_info(env, subprog);
+ struct bpf_func_state *caller = cur_func(env);
struct bpf_verifier_log *log = &env->log;
u32 i;
int ret;
@@ -8930,13 +9088,16 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
if (ret)
return ret;
+ ret = check_outgoing_stack_args(env, caller, sub->arg_cnt);
+ if (ret)
+ return ret;
+
/* check that BTF function arguments match actual types that the
* verifier sees.
*/
for (i = 0; i < sub->arg_cnt; i++) {
argno_t argno = argno_from_arg(i + 1);
- u32 regno = i + 1;
- struct bpf_reg_state *reg = &regs[regno];
+ struct bpf_reg_state *reg = get_func_arg_reg(caller, regs, i);
struct bpf_subprog_arg_info *arg = &sub->args[i];
if (arg->arg_type == ARG_ANYTHING) {
@@ -9124,6 +9285,8 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
int *insn_idx)
{
struct bpf_verifier_state *state = env->cur_state;
+ struct bpf_subprog_info *caller_info;
+ u16 callee_incoming, stack_arg_cnt;
struct bpf_func_state *caller;
int err, subprog, target_insn;
@@ -9177,6 +9340,16 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
return 0;
}
+ /*
+ * Track caller's total stack arg count (incoming + max outgoing).
+ * This is needed so the JIT knows how much stack arg space to allocate.
+ */
+ caller_info = &env->subprog_info[caller->subprogno];
+ callee_incoming = bpf_in_stack_arg_cnt(&env->subprog_info[subprog]);
+ stack_arg_cnt = bpf_in_stack_arg_cnt(caller_info) + callee_incoming;
+ if (stack_arg_cnt > caller_info->stack_arg_cnt)
+ caller_info->stack_arg_cnt = stack_arg_cnt;
+
/* for regular function entry setup new frame and continue
* from that frame.
*/
@@ -9534,6 +9707,7 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
* bpf_throw, this will be done by copy_verifier_state for extra frames. */
free_func_state(callee);
state->frame[state->curframe--] = NULL;
+ invalidate_outgoing_stack_args(caller);
/* for callbacks widen imprecise scalars to make programs like below verify:
*
@@ -16961,6 +17135,14 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
return check_store_reg(env, insn, false);
case BPF_ST: {
+ /* Handle stack arg write (store immediate) */
+ if (is_stack_arg_st(insn)) {
+ struct bpf_verifier_state *vstate = env->cur_state;
+ struct bpf_func_state *state = vstate->frame[vstate->curframe];
+
+ return check_stack_arg_write(env, state, insn->off, NULL);
+ }
+
enum bpf_reg_type dst_reg_type;
err = check_reg_arg(env, insn->dst_reg, SRC_OP);
@@ -16995,6 +17177,7 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
}
}
mark_reg_scratched(env, BPF_REG_0);
+ cur_func(env)->no_stack_arg_load = true;
if (insn->src_reg == BPF_PSEUDO_CALL)
return check_func_call(env, insn, &env->insn_idx);
if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL)
@@ -18110,7 +18293,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
goto out;
}
}
- for (i = BPF_REG_1; i <= sub->arg_cnt; i++) {
+ for (i = BPF_REG_1; i <= min_t(u32, sub->arg_cnt, MAX_BPF_FUNC_REG_ARGS); i++) {
arg = &sub->args[i - BPF_REG_1];
reg = ®s[i];
@@ -18153,6 +18336,12 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
goto out;
}
}
+ if (env->prog->type == BPF_PROG_TYPE_EXT && sub->arg_cnt > MAX_BPF_FUNC_REG_ARGS) {
+ verbose(env, "freplace programs with >%d args not supported yet\n",
+ MAX_BPF_FUNC_REG_ARGS);
+ ret = -EINVAL;
+ goto out;
+ }
} else {
/* if main BPF program has associated BTF info, validate that
* it's matching expected signature, and otherwise mark BTF
--
2.53.0-Meta
* [PATCH bpf-next v2 06/23] bpf: Refactor jmp history to use dedicated spi/frame fields
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (4 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
` (17 subsequent siblings)
23 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Move stack slot index (spi) and frame number out of the flags field
in bpf_jmp_history_entry into dedicated bitfields. This simplifies
the encoding and makes room for new flags.
Previously, spi and frame were packed into the lower 9 bits of the
12-bit flags field (3 bits frame + 6 bits spi), with INSN_F_STACK_ACCESS
at BIT(9) and INSN_F_DST/SRC_REG_STACK at BIT(10)/BIT(11).
But this has no room for an INSN_F_* flag for stack arguments.
To resolve this issue, bpf_jmp_history_entry field idx is narrowed to
20 bits (sufficient for insn indices up to 1M), and the freed bits hold
spi (6 bits) and frame (3 bits) as dedicated struct fields. The flags
enum is simplified accordingly:
INSN_F_STACK_ACCESS -> BIT(0)
INSN_F_DST_REG_STACK -> BIT(1)
INSN_F_SRC_REG_STACK -> BIT(2)
which allows more room for additional INSN_F_* flags.
bpf_push_jmp_history() now takes explicit spi and frame parameters
instead of encoding them into flags. The insn_stack_access_flags(),
insn_stack_access_spi(), and insn_stack_access_frameno() helpers are
removed.
No functional change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/bpf_verifier.h | 34 ++++++++++++++--------------------
kernel/bpf/backtrack.c | 23 +++++++++--------------
kernel/bpf/states.c | 2 +-
kernel/bpf/verifier.c | 22 +++++++++++-----------
4 files changed, 35 insertions(+), 46 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 1fba16fced28..2d5c8f36f451 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -435,31 +435,22 @@ struct bpf_func_state {
#define MAX_CALL_FRAMES 8
-/* instruction history flags, used in bpf_jmp_history_entry.flags field */
+/* instruction history flags, used in bpf_jmp_history_entry.flags field.
+ * Frame number and SPI are stored in dedicated fields of bpf_jmp_history_entry.
+ */
enum {
- /* instruction references stack slot through PTR_TO_STACK register;
- * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8)
- * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512,
- * 8 bytes per slot, so slot index (spi) is [0, 63])
- */
- INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */
-
- INSN_F_SPI_MASK = 0x3f, /* 6 bits */
- INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */
+ INSN_F_STACK_ACCESS = BIT(0),
- INSN_F_STACK_ACCESS = BIT(9),
-
- INSN_F_DST_REG_STACK = BIT(10), /* dst_reg is PTR_TO_STACK */
- INSN_F_SRC_REG_STACK = BIT(11), /* src_reg is PTR_TO_STACK */
- /* total 12 bits are used now. */
+ INSN_F_DST_REG_STACK = BIT(1), /* dst_reg is PTR_TO_STACK */
+ INSN_F_SRC_REG_STACK = BIT(2), /* src_reg is PTR_TO_STACK */
};
-static_assert(INSN_F_FRAMENO_MASK + 1 >= MAX_CALL_FRAMES);
-static_assert(INSN_F_SPI_MASK + 1 >= MAX_BPF_STACK / 8);
-
struct bpf_jmp_history_entry {
- u32 idx;
/* insn idx can't be bigger than 1 million */
+ u32 idx : 20;
+ u32 frame : 3; /* stack access frame number */
+ u32 spi : 6; /* stack slot index (0..63) */
+ u32 : 3;
u32 prev_idx : 20;
/* special INSN_F_xxx flags */
u32 flags : 12;
@@ -469,6 +460,9 @@ struct bpf_jmp_history_entry {
u64 linked_regs;
};
+static_assert(MAX_CALL_FRAMES <= (1 << 3));
+static_assert(MAX_BPF_STACK / 8 <= (1 << 6));
+
/* Maximum number of register states that can exist at once */
#define BPF_ID_MAP_SIZE ((MAX_BPF_REG + MAX_BPF_STACK / BPF_REG_SIZE) * MAX_CALL_FRAMES)
struct bpf_verifier_state {
@@ -1180,7 +1174,7 @@ struct list_head *bpf_explored_state(struct bpf_verifier_env *env, int idx);
void bpf_free_verifier_state(struct bpf_verifier_state *state, bool free_self);
void bpf_free_backedges(struct bpf_scc_visit *visit);
int bpf_push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state *cur,
- int insn_flags, u64 linked_regs);
+ int insn_flags, int spi, int frame, u64 linked_regs);
void bpf_bt_sync_linked_regs(struct backtrack_state *bt, struct bpf_jmp_history_entry *hist);
void bpf_mark_reg_not_init(const struct bpf_verifier_env *env,
struct bpf_reg_state *reg);
diff --git a/kernel/bpf/backtrack.c b/kernel/bpf/backtrack.c
index 854731dc93fe..93738a076c88 100644
--- a/kernel/bpf/backtrack.c
+++ b/kernel/bpf/backtrack.c
@@ -9,7 +9,7 @@
/* for any branch, call, exit record the history of jmps in the given state */
int bpf_push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state *cur,
- int insn_flags, u64 linked_regs)
+ int insn_flags, int spi, int frame, u64 linked_regs)
{
u32 cnt = cur->jmp_history_cnt;
struct bpf_jmp_history_entry *p;
@@ -25,6 +25,8 @@ int bpf_push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state
env, "insn history: insn_idx %d cur flags %x new flags %x",
env->insn_idx, env->cur_hist_ent->flags, insn_flags);
env->cur_hist_ent->flags |= insn_flags;
+ env->cur_hist_ent->spi = spi;
+ env->cur_hist_ent->frame = frame;
verifier_bug_if(env->cur_hist_ent->linked_regs != 0, env,
"insn history: insn_idx %d linked_regs: %#llx",
env->insn_idx, env->cur_hist_ent->linked_regs);
@@ -43,6 +45,8 @@ int bpf_push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state
p->idx = env->insn_idx;
p->prev_idx = env->prev_insn_idx;
p->flags = insn_flags;
+ p->spi = spi;
+ p->frame = frame;
p->linked_regs = linked_regs;
cur->jmp_history_cnt = cnt;
env->cur_hist_ent = p;
@@ -64,15 +68,6 @@ static bool is_atomic_fetch_insn(const struct bpf_insn *insn)
(insn->imm & BPF_FETCH);
}
-static int insn_stack_access_spi(int insn_flags)
-{
- return (insn_flags >> INSN_F_SPI_SHIFT) & INSN_F_SPI_MASK;
-}
-
-static int insn_stack_access_frameno(int insn_flags)
-{
- return insn_flags & INSN_F_FRAMENO_MASK;
-}
/* Backtrack one insn at a time. If idx is not at the top of recorded
* history then previous instruction came from straight line execution.
@@ -353,8 +348,8 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
* that [fp - off] slot contains scalar that needs to be
* tracked with precision
*/
- spi = insn_stack_access_spi(hist->flags);
- fr = insn_stack_access_frameno(hist->flags);
+ spi = hist->spi;
+ fr = hist->frame;
bpf_bt_set_frame_slot(bt, fr, spi);
} else if (class == BPF_STX || class == BPF_ST) {
if (bt_is_reg_set(bt, dreg))
@@ -366,8 +361,8 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
/* scalars can only be spilled into stack */
if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
return 0;
- spi = insn_stack_access_spi(hist->flags);
- fr = insn_stack_access_frameno(hist->flags);
+ spi = hist->spi;
+ fr = hist->frame;
if (!bt_is_frame_slot_set(bt, fr, spi))
return 0;
bt_clear_frame_slot(bt, fr, spi);
diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
index c249eb40c6d6..45d86bfe3b68 100644
--- a/kernel/bpf/states.c
+++ b/kernel/bpf/states.c
@@ -1400,7 +1400,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
*/
err = 0;
if (bpf_is_jmp_point(env, env->insn_idx))
- err = bpf_push_jmp_history(env, cur, 0, 0);
+ err = bpf_push_jmp_history(env, cur, 0, 0, 0, 0);
err = err ? : propagate_precision(env, &sl->state, cur, NULL);
if (err)
return err;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 61d6663498f7..010efdb766ac 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3198,10 +3198,6 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
return __check_reg_arg(env, state->regs, regno, t);
}
-static int insn_stack_access_flags(int frameno, int spi)
-{
- return INSN_F_STACK_ACCESS | (spi << INSN_F_SPI_SHIFT) | frameno;
-}
static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
{
@@ -3517,7 +3513,8 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
int i, slot = -off - 1, spi = slot / BPF_REG_SIZE, err;
struct bpf_insn *insn = &env->prog->insnsi[insn_idx];
struct bpf_reg_state *reg = NULL;
- int insn_flags = insn_stack_access_flags(state->frameno, spi);
+ int insn_flags = INSN_F_STACK_ACCESS;
+ int hist_spi = spi, hist_frame = state->frameno;
/* caller checked that off % size == 0 and -MAX_BPF_STACK <= off < 0,
* so it's aligned access and [off, off + size) are within stack limits
@@ -3613,7 +3610,8 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
}
if (insn_flags)
- return bpf_push_jmp_history(env, env->cur_state, insn_flags, 0);
+ return bpf_push_jmp_history(env, env->cur_state, insn_flags,
+ hist_spi, hist_frame, 0);
return 0;
}
@@ -3809,7 +3807,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
int i, slot = -off - 1, spi = slot / BPF_REG_SIZE;
struct bpf_reg_state *reg;
u8 *stype, type;
- int insn_flags = insn_stack_access_flags(reg_state->frameno, spi);
+ int insn_flags = INSN_F_STACK_ACCESS;
+ int hist_spi = spi, hist_frame = reg_state->frameno;
stype = reg_state->stack[spi].slot_type;
reg = &reg_state->stack[spi].spilled_ptr;
@@ -3940,7 +3939,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
insn_flags = 0; /* we are not restoring spilled register */
}
if (insn_flags)
- return bpf_push_jmp_history(env, env->cur_state, insn_flags, 0);
+ return bpf_push_jmp_history(env, env->cur_state, insn_flags,
+ hist_spi, hist_frame, 0);
return 0;
}
@@ -15905,7 +15905,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
}
if (insn_flags) {
- err = bpf_push_jmp_history(env, this_branch, insn_flags, 0);
+ err = bpf_push_jmp_history(env, this_branch, insn_flags, 0, 0, 0);
if (err)
return err;
}
@@ -15969,7 +15969,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
if (dst_reg->type == SCALAR_VALUE && dst_reg->id)
collect_linked_regs(env, this_branch, dst_reg->id, &linked_regs);
if (linked_regs.cnt > 1) {
- err = bpf_push_jmp_history(env, this_branch, 0, linked_regs_pack(&linked_regs));
+ err = bpf_push_jmp_history(env, this_branch, 0, 0, 0, linked_regs_pack(&linked_regs));
if (err)
return err;
}
@@ -17275,7 +17275,7 @@ static int do_check(struct bpf_verifier_env *env)
}
if (bpf_is_jmp_point(env, env->insn_idx)) {
- err = bpf_push_jmp_history(env, state, 0, 0);
+ err = bpf_push_jmp_history(env, state, 0, 0, 0, 0);
if (err)
return err;
}
--
2.53.0-Meta
* [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (5 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 06/23] bpf: Refactor jmp history to use dedicated spi/frame fields Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 4:05 ` sashiko-bot
2026-05-07 21:30 ` [PATCH bpf-next v2 08/23] bpf: Refactor record_call_access() to extract per-arg logic Yonghong Song
` (16 subsequent siblings)
23 siblings, 2 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Extend the precision marking and backtracking infrastructure to
support stack argument slots (r11-based accesses). Without this,
precision demands for scalar values passed through stack arguments
are silently dropped, which could allow the verifier to incorrectly
prune states with different constant values in stack arg slots.
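As an illustrative (hypothetical) sketch:
  r6 = 16;
  *(u64 *)(r11 - 8) = r6;   // arg6 passed on the stack
  call bar;                 // bar uses arg6, e.g. as a bounded index
When bar demands precision on its incoming arg6, backtracking must
propagate that demand into the caller's outgoing stack arg slot and
further back to the 'r6 = 16' instruction; otherwise two states with
different constants in that slot could be wrongly treated as
equivalent and pruned.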
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/bpf_verifier.h | 8 +++++
kernel/bpf/backtrack.c | 64 +++++++++++++++++++++++++++++++++++-
kernel/bpf/verifier.c | 32 +++++++++++++++---
3 files changed, 98 insertions(+), 6 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 2d5c8f36f451..24171221f6cf 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -443,6 +443,8 @@ enum {
INSN_F_DST_REG_STACK = BIT(1), /* dst_reg is PTR_TO_STACK */
INSN_F_SRC_REG_STACK = BIT(2), /* src_reg is PTR_TO_STACK */
+
+ INSN_F_STACK_ARG_ACCESS = BIT(3),
};
struct bpf_jmp_history_entry {
@@ -839,6 +841,7 @@ struct backtrack_state {
u32 frame;
u32 reg_masks[MAX_CALL_FRAMES];
u64 stack_masks[MAX_CALL_FRAMES];
+ u8 stack_arg_masks[MAX_CALL_FRAMES];
};
struct bpf_id_pair {
@@ -1237,6 +1240,11 @@ static inline void bpf_bt_set_frame_slot(struct backtrack_state *bt, u32 frame,
bt->stack_masks[frame] |= 1ull << slot;
}
+static inline void bt_set_frame_stack_arg_slot(struct backtrack_state *bt, u32 frame, u32 slot)
+{
+ bt->stack_arg_masks[frame] |= 1 << slot;
+}
+
static inline bool bt_is_frame_reg_set(struct backtrack_state *bt, u32 frame, u32 reg)
{
return bt->reg_masks[frame] & (1 << reg);
diff --git a/kernel/bpf/backtrack.c b/kernel/bpf/backtrack.c
index 93738a076c88..17f254a02831 100644
--- a/kernel/bpf/backtrack.c
+++ b/kernel/bpf/backtrack.c
@@ -130,11 +130,21 @@ static inline u32 bt_empty(struct backtrack_state *bt)
int i;
for (i = 0; i <= bt->frame; i++)
- mask |= bt->reg_masks[i] | bt->stack_masks[i];
+ mask |= bt->reg_masks[i] | bt->stack_masks[i] | bt->stack_arg_masks[i];
return mask == 0;
}
+static inline void bt_clear_frame_stack_arg_slot(struct backtrack_state *bt, u32 frame, u32 slot)
+{
+ bt->stack_arg_masks[frame] &= ~(1 << slot);
+}
+
+static inline bool bt_is_frame_stack_arg_slot_set(struct backtrack_state *bt, u32 frame, u32 slot)
+{
+ return bt->stack_arg_masks[frame] & (1 << slot);
+}
+
static inline int bt_subprog_enter(struct backtrack_state *bt)
{
if (bt->frame == MAX_CALL_FRAMES - 1) {
@@ -195,6 +205,11 @@ static inline u64 bt_stack_mask(struct backtrack_state *bt)
return bt->stack_masks[bt->frame];
}
+static inline u8 bt_stack_arg_mask(struct backtrack_state *bt)
+{
+ return bt->stack_arg_masks[bt->frame];
+}
+
static inline bool bt_is_reg_set(struct backtrack_state *bt, u32 reg)
{
return bt->reg_masks[bt->frame] & (1 << reg);
@@ -336,6 +351,19 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
return 0;
bt_clear_reg(bt, load_reg);
+ if (hist && hist->flags & INSN_F_STACK_ARG_ACCESS) {
+ spi = hist->spi;
+ /*
+ * Stack arg read: callee reads from r11+off, but
+ * the data lives in the caller's stack_arg_regs.
+ * Set the mask in the caller frame so precision
+ * is marked in the caller's slot at the callee
+ * entry checkpoint.
+ */
+ bt_set_frame_stack_arg_slot(bt, bt->frame - 1, spi);
+ return 0;
+ }
+
/* scalars can only be spilled into stack w/o losing precision.
* Load from any other memory can be zero extended.
* The desire to keep that precision is already indicated
@@ -358,6 +386,17 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
* encountered a case of pointer subtraction.
*/
return -ENOTSUPP;
+
+ if (hist && hist->flags & INSN_F_STACK_ARG_ACCESS) {
+ spi = hist->spi;
+ if (!bt_is_frame_stack_arg_slot_set(bt, bt->frame, spi))
+ return 0;
+ bt_clear_frame_stack_arg_slot(bt, bt->frame, spi);
+ if (class == BPF_STX)
+ bt_set_reg(bt, sreg);
+ return 0;
+ }
+
/* scalars can only be spilled into stack */
if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
return 0;
@@ -426,6 +465,18 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
bpf_bt_set_frame_reg(bt, bt->frame - 1, i);
}
}
+ /*
+ * propagate callee's incoming stack arg precision
+ * to caller's outgoing stack arg slots
+ */
+ if (bt_stack_arg_mask(bt)) {
+ for (i = 0; i < MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS; i++) {
+ if (!bt_is_frame_stack_arg_slot_set(bt, bt->frame, i))
+ continue;
+ bt_clear_frame_stack_arg_slot(bt, bt->frame, i);
+ bt_set_frame_stack_arg_slot(bt, bt->frame - 1, i);
+ }
+ }
if (bt_subprog_exit(bt))
return -EFAULT;
return 0;
@@ -896,6 +947,17 @@ int bpf_mark_chain_precision(struct bpf_verifier_env *env,
*changed = true;
}
}
+ for (i = 0; i < func->out_stack_arg_cnt; i++) {
+ if (!bt_is_frame_stack_arg_slot_set(bt, fr, i))
+ continue;
+ reg = &func->stack_arg_regs[i];
+ if (reg->type != SCALAR_VALUE || reg->precise) {
+ bt_clear_frame_stack_arg_slot(bt, fr, i);
+ } else {
+ reg->precise = true;
+ *changed = true;
+ }
+ }
if (env->log.level & BPF_LOG_LEVEL2) {
fmt_reg_mask(env->tmp_str_buf, TMP_STR_BUF_LEN,
bt_frame_reg_mask(bt, fr));
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 010efdb766ac..4827df9f817c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -292,6 +292,11 @@ static int arg_from_argno(argno_t a)
return -1;
}
+static int arg_idx_from_argno(argno_t a)
+{
+ return arg_from_argno(a) - 1;
+}
+
static const char *btf_type_name(const struct btf *btf, u32 id)
{
return btf_name_by_offset(btf, btf_type_by_id(btf, id)->name_off);
@@ -4118,7 +4123,8 @@ static int check_stack_arg_write(struct bpf_verifier_env *env, struct bpf_func_s
__mark_reg_known(arg, env->prog->insnsi[env->insn_idx].imm);
}
state->no_stack_arg_load = true;
- return 0;
+ return bpf_push_jmp_history(env, env->cur_state,
+ INSN_F_STACK_ARG_ACCESS, spi, 0, 0);
}
/*
@@ -4149,7 +4155,17 @@ static int check_stack_arg_read(struct bpf_verifier_env *env, struct bpf_func_st
arg = &caller->stack_arg_regs[spi];
cur = vstate->frame[vstate->curframe];
cur->regs[dst_regno] = *arg;
- return 0;
+ return bpf_push_jmp_history(env, env->cur_state,
+ INSN_F_STACK_ARG_ACCESS, spi, 0, 0);
+}
+
+static int mark_stack_arg_precision(struct bpf_verifier_env *env, int arg_idx)
+{
+ struct bpf_func_state *caller = cur_func(env);
+ int spi = arg_idx - MAX_BPF_FUNC_REG_ARGS;
+
+ bt_set_frame_stack_arg_slot(&env->bt, caller->frameno, spi);
+ return mark_chain_precision_batch(env, env->cur_state);
}
static int check_outgoing_stack_args(struct bpf_verifier_env *env, struct bpf_func_state *caller,
@@ -6878,8 +6894,14 @@ static int check_mem_size_reg(struct bpf_verifier_env *env,
}
err = check_helper_mem_access(env, mem_reg, mem_argno, reg_umax(size_reg),
access_type, zero_size_allowed, meta);
- if (!err)
- err = mark_chain_precision(env, reg_from_argno(size_argno));
+ if (!err) {
+ int regno = reg_from_argno(size_argno);
+
+ if (regno >= 0)
+ err = mark_chain_precision(env, regno);
+ else
+ err = mark_stack_arg_precision(env, arg_idx_from_argno(size_argno));
+ }
return err;
}
@@ -7328,7 +7350,7 @@ static int process_iter_arg(struct bpf_verifier_env *env, struct bpf_reg_state *
struct bpf_kfunc_call_arg_meta *meta)
{
const struct btf_type *t;
- u32 arg_idx = arg_from_argno(argno) - 1;
+ u32 arg_idx = arg_idx_from_argno(argno);
int spi, err, i, nr_slots, btf_id;
if (reg->type != PTR_TO_STACK) {
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 08/23] bpf: Refactor record_call_access() to extract per-arg logic
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (6 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
` (15 subsequent siblings)
23 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Extract the per-argument FP-derived pointer handling from
record_call_access() into a new record_arg_access() helper.
The existing loop body — checking arg_is_fp, querying stack access
bytes, and calling record_stack_access/record_imprecise — will be
reused for stack argument slots in the next patch. Factoring it out
now avoids duplicating the logic.
No functional change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/liveness.c | 65 +++++++++++++++++++++++++------------------
1 file changed, 38 insertions(+), 27 deletions(-)
diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
index 332e6e003f27..de0b4c6263ff 100644
--- a/kernel/bpf/liveness.c
+++ b/kernel/bpf/liveness.c
@@ -1343,6 +1343,42 @@ static int record_load_store_access(struct bpf_verifier_env *env,
return 0;
}
+static int record_arg_access(struct bpf_verifier_env *env,
+ struct func_instance *instance,
+ struct bpf_insn *insn,
+ struct arg_track *at, int arg_idx,
+ int insn_idx)
+{
+ int depth = instance->depth;
+ int frame = at->frame;
+ int err = 0;
+ s64 bytes;
+
+ if (!arg_is_fp(at))
+ return 0;
+
+ if (bpf_helper_call(insn)) {
+ bytes = bpf_helper_stack_access_bytes(env, insn, arg_idx, insn_idx);
+ } else if (bpf_pseudo_kfunc_call(insn)) {
+ bytes = bpf_kfunc_stack_access_bytes(env, insn, arg_idx, insn_idx);
+ } else {
+ for (int f = 0; f <= depth; f++) {
+ err = mark_stack_read(instance, f, insn_idx, SPIS_ALL);
+ if (err)
+ return err;
+ }
+ return 0;
+ }
+ if (bytes == 0)
+ return 0;
+
+ if (frame >= 0 && frame <= depth)
+ err = record_stack_access(instance, at, bytes, frame, insn_idx);
+ else if (frame == ARG_IMPRECISE)
+ err = record_imprecise(instance, at->mask, insn_idx);
+ return err;
+}
+
/* Record stack access for a given 'at' state of helper/kfunc 'insn' */
static int record_call_access(struct bpf_verifier_env *env,
struct func_instance *instance,
@@ -1350,9 +1386,8 @@ static int record_call_access(struct bpf_verifier_env *env,
int insn_idx)
{
struct bpf_insn *insn = &env->prog->insnsi[insn_idx];
- int depth = instance->depth;
struct bpf_call_summary cs;
- int r, err = 0, num_params = 5;
+ int r, err, num_params = 5;
if (bpf_pseudo_call(insn))
return 0;
@@ -1361,31 +1396,7 @@ static int record_call_access(struct bpf_verifier_env *env,
num_params = cs.num_params;
for (r = BPF_REG_1; r < BPF_REG_1 + num_params; r++) {
- int frame = at[r].frame;
- s64 bytes;
-
- if (!arg_is_fp(&at[r]))
- continue;
-
- if (bpf_helper_call(insn)) {
- bytes = bpf_helper_stack_access_bytes(env, insn, r - 1, insn_idx);
- } else if (bpf_pseudo_kfunc_call(insn)) {
- bytes = bpf_kfunc_stack_access_bytes(env, insn, r - 1, insn_idx);
- } else {
- for (int f = 0; f <= depth; f++) {
- err = mark_stack_read(instance, f, insn_idx, SPIS_ALL);
- if (err)
- return err;
- }
- return 0;
- }
- if (bytes == 0)
- continue;
-
- if (frame >= 0 && frame <= depth)
- err = record_stack_access(instance, &at[r], bytes, frame, insn_idx);
- else if (frame == ARG_IMPRECISE)
- err = record_imprecise(instance, at[r].mask, insn_idx);
+ err = record_arg_access(env, instance, insn, &at[r], r - 1, insn_idx);
if (err)
return err;
}
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (7 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 08/23] bpf: Refactor record_call_access() to extract per-arg logic Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 0:59 ` sashiko-bot
2026-05-07 21:30 ` [PATCH bpf-next v2 10/23] bpf: Reject stack arguments in non-JITed programs Yonghong Song
` (14 subsequent siblings)
23 siblings, 2 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
BPF_REG_PARAMS (R11) is at index MAX_BPF_REG, which is beyond the
register tracking arrays in const_fold.c and liveness.c. Handle it
explicitly to avoid out-of-bounds accesses.
Extend the arg tracking dataflow to cover stack arg slots. Otherwise,
pointers passed through stack args are invisible to liveness, causing
the pointed-to stack slots to be incorrectly poisoned.
Extend the at_out tracking array to MAX_AT_TRACK_REGS (registers
plus stack arg slots) so that outgoing stack arg stores are tracked
alongside registers. Add a separate at_stack_arg_entry array, passed
to arg_track_xfer(), so FP-derived values can be restored on incoming
stack arg reads.
Extend record_call_access() to check stack arg slots for FP-derived
pointers at kfunc call sites, reusing the record_arg_access() helper
extracted in the previous patch. Pass stack arg state from caller to
callee in analyze_subprog() so that callees can track pointers received
through stack args and thus avoid poisoning the slots they point to.
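Conceptually (a simplified sketch, not the exact code in this patch),
the combined array is indexed like this:

	/* R0-R10 occupy indices 0..MAX_BPF_REG-1; outgoing stack arg
	 * slots follow at MAX_BPF_REG + slot, where the slot is derived
	 * from the r11 offset: -8 -> slot 0, -16 -> slot 1, ...
	 */
	static int at_index_for_outgoing(s16 r11_off)
	{
		return MAX_BPF_REG + (-r11_off / 8 - 1);
	}

so a store to r11-8 updates at_out[MAX_BPF_REG + 0], while a load from
r11+16 restores at_stack_arg_entry[1]; see stack_arg_off_to_slot()
below for the bounds-checked version.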
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/const_fold.c | 8 ++++
kernel/bpf/liveness.c | 83 ++++++++++++++++++++++++++++++++++-------
2 files changed, 77 insertions(+), 14 deletions(-)
diff --git a/kernel/bpf/const_fold.c b/kernel/bpf/const_fold.c
index db73c4740b1e..b2a19acadb91 100644
--- a/kernel/bpf/const_fold.c
+++ b/kernel/bpf/const_fold.c
@@ -58,6 +58,14 @@ static void const_reg_xfer(struct bpf_verifier_env *env, struct const_arg_info *
u8 opcode = BPF_OP(insn->code) | BPF_SRC(insn->code);
int r;
+ /* Stack arg stores (r11-based) are outside the tracked register set. */
+ if (is_stack_arg_st(insn) || is_stack_arg_stx(insn))
+ return;
+ if (is_stack_arg_ldx(insn)) {
+ ci_out[insn->dst_reg] = unknown;
+ return;
+ }
+
switch (class) {
case BPF_ALU:
case BPF_ALU64:
diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
index de0b4c6263ff..eed7260f2bb0 100644
--- a/kernel/bpf/liveness.c
+++ b/kernel/bpf/liveness.c
@@ -610,6 +610,24 @@ enum arg_track_state {
/* Track callee stack slots fp-8 through fp-512 (64 slots of 8 bytes each) */
#define MAX_ARG_SPILL_SLOTS 64
+/* Track stack arg slots: outgoing starts at -(i+1)*8, incoming at +(i+1)*8 */
+#define MAX_STACK_ARG_SLOTS (MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS)
+
+/*
+ * Combined register + stack arg tracking: R0-R10 at indices 0-10,
+ * outgoing stack arg slots at indices MAX_BPF_REG..MAX_BPF_REG+6.
+ */
+#define MAX_AT_TRACK_REGS (MAX_BPF_REG + MAX_STACK_ARG_SLOTS)
+
+static int stack_arg_off_to_slot(s16 off)
+{
+ int aoff = off < 0 ? -off : off;
+
+ if (aoff / 8 > MAX_STACK_ARG_SLOTS)
+ return -1;
+ return aoff / 8 - 1;
+}
+
static bool arg_is_visited(const struct arg_track *at)
{
return at->frame != ARG_UNVISITED;
@@ -1062,6 +1080,7 @@ static bool can_be_local_fp(int depth, int regno, struct arg_track *at)
static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
int insn_idx,
struct arg_track *at_out, struct arg_track *at_stack_out,
+ const struct arg_track *at_stack_arg_entry,
struct func_instance *instance,
u32 *callsites)
{
@@ -1071,8 +1090,24 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
struct arg_track *dst = &at_out[insn->dst_reg];
struct arg_track *src = &at_out[insn->src_reg];
struct arg_track none = { .frame = ARG_NONE };
- int r;
-
+ int r, slot;
+
+ /* Handle stack arg stores and loads. */
+ if (is_stack_arg_st(insn) || is_stack_arg_stx(insn)) {
+ slot = stack_arg_off_to_slot(insn->off);
+ if (slot >= 0) {
+ if (is_stack_arg_stx(insn))
+ at_out[MAX_BPF_REG + slot] = at_out[insn->src_reg];
+ else
+ at_out[MAX_BPF_REG + slot] = none;
+ }
+ return;
+ }
+ if (is_stack_arg_ldx(insn)) {
+ slot = stack_arg_off_to_slot(insn->off);
+ at_out[insn->dst_reg] = (slot >= 0) ? at_stack_arg_entry[slot] : none;
+ return;
+ }
if (class == BPF_ALU64 && BPF_SRC(insn->code) == BPF_K) {
if (code == BPF_MOV) {
*dst = none;
@@ -1395,11 +1430,18 @@ static int record_call_access(struct bpf_verifier_env *env,
if (bpf_get_call_summary(env, insn, &cs))
num_params = cs.num_params;
- for (r = BPF_REG_1; r < BPF_REG_1 + num_params; r++) {
+ for (r = BPF_REG_1; r < BPF_REG_1 + min(num_params, MAX_BPF_FUNC_REG_ARGS); r++) {
err = record_arg_access(env, instance, insn, &at[r], r - 1, insn_idx);
if (err)
return err;
}
+
+ for (r = 0; r < MAX_STACK_ARG_SLOTS && r < num_params - MAX_BPF_FUNC_REG_ARGS; r++) {
+ err = record_arg_access(env, instance, insn, &at[MAX_BPF_REG + r],
+ r + MAX_BPF_FUNC_REG_ARGS, insn_idx);
+ if (err)
+ return err;
+ }
return 0;
}
@@ -1456,7 +1498,7 @@ static int find_callback_subprog(struct bpf_verifier_env *env,
/* Per-subprog intermediate state kept alive across analysis phases */
struct subprog_at_info {
- struct arg_track (*at_in)[MAX_BPF_REG];
+ struct arg_track (*at_in)[MAX_AT_TRACK_REGS];
int len;
};
@@ -1554,10 +1596,11 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
int end = env->subprog_info[subprog + 1].start;
int po_end = env->subprog_info[subprog + 1].postorder_start;
int len = end - start;
- struct arg_track (*at_in)[MAX_BPF_REG] = NULL;
- struct arg_track at_out[MAX_BPF_REG];
+ struct arg_track (*at_in)[MAX_AT_TRACK_REGS] = NULL;
+ struct arg_track at_out[MAX_AT_TRACK_REGS];
struct arg_track (*at_stack_in)[MAX_ARG_SPILL_SLOTS] = NULL;
struct arg_track *at_stack_out = NULL;
+ struct arg_track at_stack_arg_entry[MAX_STACK_ARG_SLOTS];
struct arg_track unvisited = { .frame = ARG_UNVISITED };
struct arg_track none = { .frame = ARG_NONE };
bool changed;
@@ -1576,19 +1619,19 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
goto err_free;
for (i = 0; i < len; i++) {
- for (r = 0; r < MAX_BPF_REG; r++)
+ for (r = 0; r < MAX_AT_TRACK_REGS; r++)
at_in[i][r] = unvisited;
for (r = 0; r < MAX_ARG_SPILL_SLOTS; r++)
at_stack_in[i][r] = unvisited;
}
- for (r = 0; r < MAX_BPF_REG; r++)
+ for (r = 0; r < MAX_AT_TRACK_REGS; r++)
at_in[0][r] = none;
/* Entry: R10 is always precisely the current frame's FP */
at_in[0][BPF_REG_FP] = arg_single(depth, 0);
- /* R1-R5: from caller or ARG_NONE for main */
+ /* R1-R5 and outgoing stack args: from caller or ARG_NONE for main */
if (callee_entry) {
for (r = BPF_REG_1; r <= BPF_REG_5; r++)
at_in[0][r] = callee_entry[r];
@@ -1598,6 +1641,10 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
for (r = 0; r < MAX_ARG_SPILL_SLOTS; r++)
at_stack_in[0][r] = none;
+ /* Entry: incoming stack args from caller, or ARG_NONE for main */
+ for (r = 0; r < MAX_STACK_ARG_SLOTS; r++)
+ at_stack_arg_entry[r] = callee_entry ? callee_entry[MAX_BPF_REG + r] : none;
+
if (env->log.level & BPF_LOG_LEVEL2)
verbose(env, "subprog#%d: analyzing (depth %d)...\n", subprog, depth);
@@ -1616,7 +1663,8 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
memcpy(at_out, at_in[i], sizeof(at_out));
memcpy(at_stack_out, at_stack_in[i], MAX_ARG_SPILL_SLOTS * sizeof(*at_stack_out));
- arg_track_xfer(env, insn, idx, at_out, at_stack_out, instance, callsites);
+ arg_track_xfer(env, insn, idx, at_out, at_stack_out,
+ at_stack_arg_entry, instance, callsites);
arg_track_log(env, insn, idx, at_in[i], at_stack_in[i], at_out, at_stack_out);
/* Propagate to successors within this subprogram */
@@ -1630,7 +1678,7 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
continue;
ti = target - start;
- for (r = 0; r < MAX_BPF_REG; r++)
+ for (r = 0; r < MAX_AT_TRACK_REGS; r++)
changed |= arg_track_join(env, idx, target, r,
&at_in[ti][r], at_out[r]);
@@ -1685,12 +1733,15 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
return err;
}
-/* Return true if any of R1-R5 is derived from a frame pointer. */
+/* Return true if any of R1-R5 or stack args is derived from a frame pointer. */
static bool has_fp_args(struct arg_track *args)
{
for (int r = BPF_REG_1; r <= BPF_REG_5; r++)
if (args[r].frame != ARG_NONE)
return true;
+ for (int r = 0; r < MAX_STACK_ARG_SLOTS; r++)
+ if (arg_is_fp(&args[MAX_BPF_REG + r]))
+ return true;
return false;
}
@@ -1814,7 +1865,7 @@ static int analyze_subprog(struct bpf_verifier_env *env,
/* For each reachable call site in the subprog, recurse into callees */
for (int p = po_start; p < po_end; p++) {
int idx = env->cfg.insn_postorder[p];
- struct arg_track callee_args[BPF_REG_5 + 1];
+ struct arg_track callee_args[MAX_AT_TRACK_REGS] = {};
struct arg_track none = { .frame = ARG_NONE };
struct bpf_insn *insn = &insns[idx];
struct func_instance *callee_instance;
@@ -1829,9 +1880,11 @@ static int analyze_subprog(struct bpf_verifier_env *env,
if (callee < 0)
continue;
- /* Build entry args: R1-R5 from at_in at call site */
+ /* Build entry args: R1-R5 and stack args from at_in at call site */
for (int r = BPF_REG_1; r <= BPF_REG_5; r++)
callee_args[r] = info[subprog].at_in[j][r];
+ for (int r = 0; r < MAX_STACK_ARG_SLOTS; r++)
+ callee_args[MAX_BPF_REG + r] = info[subprog].at_in[j][MAX_BPF_REG + r];
} else if (bpf_calls_callback(env, idx)) {
callee = find_callback_subprog(env, insn, idx, &caller_reg, &cb_callee_reg);
if (callee == -2) {
@@ -1853,6 +1906,8 @@ static int analyze_subprog(struct bpf_verifier_env *env,
for (int r = BPF_REG_1; r <= BPF_REG_5; r++)
callee_args[r] = none;
+ for (int r = 0; r < MAX_STACK_ARG_SLOTS; r++)
+ callee_args[MAX_BPF_REG + r] = none;
callee_args[cb_callee_reg] = info[subprog].at_in[j][caller_reg];
} else {
continue;
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 10/23] bpf: Reject stack arguments in non-JITed programs
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (8 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 2:10 ` sashiko-bot
2026-05-07 21:30 ` [PATCH bpf-next v2 11/23] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
` (13 subsequent siblings)
23 siblings, 2 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
The interpreter does not understand the BPF register r11
(BPF_REG_PARAMS) used for stack arguments, so reject interpreter
execution if stack arguments are used in either the main program or
any subprogram.
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/core.c | 2 +-
kernel/bpf/fixups.c | 6 ++++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index ae10b9ca018d..6f94b0ef3659 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2599,7 +2599,7 @@ struct bpf_prog *__bpf_prog_select_runtime(struct bpf_verifier_env *env, struct
goto finalize;
if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) ||
- bpf_prog_has_kfunc_call(fp))
+ bpf_prog_has_kfunc_call(fp) || fp->aux->stack_arg_cnt)
jit_needed = true;
if (!bpf_prog_select_interpreter(fp))
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index 0ff8cdf4394f..4c630d97eaad 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -1412,6 +1412,12 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
verbose(env, "calling kernel functions are not allowed in non-JITed programs\n");
return -EINVAL;
}
+ for (i = 1; i < env->subprog_cnt; i++) {
+ if (bpf_in_stack_arg_cnt(&env->subprog_info[i])) {
+ verbose(env, "stack args are not supported in non-JITed programs\n");
+ return -EINVAL;
+ }
+ }
if (env->subprog_cnt > 1 && env->prog->aux->tail_call_reachable) {
/* When JIT fails the progs with bpf2bpf calls and tail_calls
* have to be rejected, since interpreter doesn't support them yet.
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 11/23] bpf: Prepare architecture JIT support for stack arguments
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (9 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 10/23] bpf: Reject stack arguments in non-JITed programs Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-09 2:19 ` sashiko-bot
2026-05-07 21:30 ` [PATCH bpf-next v2 12/23] bpf: Enable r11 based insns Yonghong Song
` (12 subsequent siblings)
23 siblings, 1 reply; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
Add bpf_jit_supports_stack_args() as a weak function defaulting to
false. Architectures that implement JIT support for stack arguments
override it to return true.
Reject BPF functions with more than 5 parameters at verification
time if the architecture does not support stack arguments.
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
include/linux/filter.h | 1 +
kernel/bpf/btf.c | 8 +++++++-
kernel/bpf/core.c | 5 +++++
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 918d9b34eac6..a515a9769078 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1184,6 +1184,7 @@ bool bpf_jit_inlines_helper_call(s32 imm);
bool bpf_jit_supports_subprog_tailcalls(void);
bool bpf_jit_supports_percpu_insn(void);
bool bpf_jit_supports_kfunc_call(void);
+bool bpf_jit_supports_stack_args(void);
bool bpf_jit_supports_far_kfunc_call(void);
bool bpf_jit_supports_exceptions(void);
bool bpf_jit_supports_ptr_xchg(void);
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index ec3fb8c8f4ee..bf83b737d55c 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7886,8 +7886,14 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
MAX_BPF_FUNC_ARGS, tname, nargs);
return -EFAULT;
}
- if (nargs > MAX_BPF_FUNC_REG_ARGS)
+ if (nargs > MAX_BPF_FUNC_REG_ARGS) {
+ if (!bpf_jit_supports_stack_args()) {
+ bpf_log(log, "JIT does not support function %s() with %d args\n",
+ tname, nargs);
+ return -ENOTSUPP;
+ }
sub->stack_arg_cnt = nargs - MAX_BPF_FUNC_REG_ARGS;
+ }
if (is_global && nargs > MAX_BPF_FUNC_REG_ARGS) {
bpf_log(log, "global function %s has %d > %d args, stack args not supported\n",
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 6f94b0ef3659..5d8932f5fb38 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -3217,6 +3217,11 @@ bool __weak bpf_jit_supports_kfunc_call(void)
return false;
}
+bool __weak bpf_jit_supports_stack_args(void)
+{
+ return false;
+}
+
bool __weak bpf_jit_supports_far_kfunc_call(void)
{
return false;
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 12/23] bpf: Enable r11 based insns
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (10 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 11/23] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-09 2:59 ` sashiko-bot
2026-05-07 21:30 ` [PATCH bpf-next v2 13/23] bpf: Support stack arguments for kfunc calls Yonghong Song
` (11 subsequent siblings)
23 siblings, 1 reply; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
BPF_REG_PARAMS (r11) is used for stack argument accesses, and
the following are the only insns in which r11 may appear:
- load of an incoming stack arg
- store of a register to an outgoing stack arg
- store of an immediate to an outgoing stack arg
The detailed insn formats can be found in the is_stack_arg_ldx/st/stx()
helpers. After this patch, stack arg ldx/st/stx insns become valid for
the kernel and can be properly checked by the verifier.
The LLVM compiler [1] implements the above BPF_REG_PARAMS insns.
[1] https://github.com/llvm/llvm-project/pull/189060
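In BPF assembly terms, the three forms look roughly like this
(illustrative pseudo-assembly; the precise encodings are what the
is_stack_arg_*() helpers accept and what the LLVM change above emits):

	r3 = *(u64 *)(r11 + 16)		/* load incoming stack arg 7 */
	*(u64 *)(r11 - 8) = r2		/* store register to outgoing arg 6 */
	*(u64 *)(r11 - 16) = 42		/* store immediate to outgoing arg 7 */

Positive r11 offsets address incoming stack args and negative offsets
address outgoing ones, as detailed in the kfunc support patch later in
the series.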
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4827df9f817c..e208807e7995 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -18004,11 +18004,12 @@ static int check_and_resolve_insns(struct bpf_verifier_env *env)
return err;
for (i = 0; i < insn_cnt; i++, insn++) {
- if (insn->dst_reg >= MAX_BPF_REG) {
+ if (insn->dst_reg >= MAX_BPF_REG &&
+ !is_stack_arg_st(insn) && !is_stack_arg_stx(insn)) {
verbose(env, "R%d is invalid\n", insn->dst_reg);
return -EINVAL;
}
- if (insn->src_reg >= MAX_BPF_REG) {
+ if (insn->src_reg >= MAX_BPF_REG && !is_stack_arg_ldx(insn)) {
verbose(env, "R%d is invalid\n", insn->src_reg);
return -EINVAL;
}
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 13/23] bpf: Support stack arguments for kfunc calls
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (11 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 12/23] bpf: Enable r11 based insns Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 14/23] bpf: Reject stack arguments if tail call reachable Yonghong Song
` (10 subsequent siblings)
23 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Extend the stack argument mechanism to kfunc calls, allowing kfuncs
with more than 5 parameters to receive additional arguments via the
r11-based stack arg area.
For kfuncs, the caller is a BPF program and the callee is a kernel
function. The BPF program writes outgoing args at negative r11
offsets, following the same convention as BPF-to-BPF calls:
Outgoing: r11 - 8 (arg 6), r11 - 16 (arg 7), ..., r11 - (N - 5)*8 (arg N, the last arg)
The following is an example:
int foo(int a1, int a2, int a3, int a4, int a5, int a6, int a7) {
...
kfunc1(a1, a2, a3, a4, a5, a6, a7, a8);
...
kfunc2(a1, a2, a3, a4, a5, a6, a7, a8, a9);
...
}
Caller (foo), generated by llvm
===============================
Incoming (positive offsets):
r11+8: [incoming arg 6]
r11+16: [incoming arg 7]
Outgoing for kfunc1 (negative offsets):
r11-8: [outgoing arg 6]
r11-16: [outgoing arg 7]
r11-24: [outgoing arg 8]
Outgoing for kfunc2 (negative offsets):
r11-8: [outgoing arg 6]
r11-16: [outgoing arg 7]
r11-24: [outgoing arg 8]
r11-32: [outgoing arg 9]
Later JIT will marshal outgoing arguments to the native calling
convention for kfunc1() and kfunc2().
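As a rough sketch (register choices here are illustrative, not what
LLVM is guaranteed to pick), the stores emitted right before the
kfunc1() call would look like:

	*(u64 *)(r11 - 8) = r6		/* outgoing arg 6 */
	*(u64 *)(r11 - 16) = r7		/* outgoing arg 7 */
	*(u64 *)(r11 - 24) = r8		/* outgoing arg 8 */
	call kfunc1

The verifier invalidates these outgoing slots after the call (see
invalidate_outgoing_stack_args() in this patch), so the compiler
stores them again before the later kfunc2() call.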
For kfunc calls where stack args are used as constant or size
parameters, a mark_stack_arg_precision() helper is used to propagate
precision and do proper backtracking.
There are two places where meta->release_regno needs to keep regno
for releasing the reference later. In addition, 'cur_aux(env)->arg_prog = regno'
keeps regno for a later fixup. Since stack arguments don't have a valid
register number (regno is negative), these three cases are rejected for now
if the argument is on the stack.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 78 +++++++++++++++++++++++++++++++++----------
1 file changed, 61 insertions(+), 17 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e208807e7995..b6d07f7a52cd 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11160,14 +11160,12 @@ bool bpf_is_kfunc_pkt_changing(struct bpf_kfunc_call_arg_meta *meta)
}
static enum kfunc_ptr_arg_type
-get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
- struct bpf_kfunc_call_arg_meta *meta,
+get_kfunc_ptr_arg_type(struct bpf_verifier_env *env, struct bpf_func_state *caller,
+ struct bpf_reg_state *regs, struct bpf_kfunc_call_arg_meta *meta,
const struct btf_type *t, const struct btf_type *ref_t,
const char *ref_tname, const struct btf_param *args,
int arg, int nargs, argno_t argno, struct bpf_reg_state *reg)
{
- u32 regno = arg + 1;
- struct bpf_reg_state *regs = cur_regs(env);
bool arg_mem_size = false;
if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
@@ -11176,8 +11174,8 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
return KF_ARG_PTR_TO_CTX;
if (arg + 1 < nargs &&
- (is_kfunc_arg_mem_size(meta->btf, &args[arg + 1], &regs[regno + 1]) ||
- is_kfunc_arg_const_mem_size(meta->btf, &args[arg + 1], &regs[regno + 1])))
+ (is_kfunc_arg_mem_size(meta->btf, &args[arg + 1], get_func_arg_reg(caller, regs, arg + 1)) ||
+ is_kfunc_arg_const_mem_size(meta->btf, &args[arg + 1], get_func_arg_reg(caller, regs, arg + 1))))
arg_mem_size = true;
/* In this function, we verify the kfunc's BTF as per the argument type,
@@ -11842,6 +11840,8 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
int insn_idx)
{
const char *func_name = meta->func_name, *ref_tname;
+ struct bpf_func_state *caller = cur_func(env);
+ struct bpf_reg_state *regs = cur_regs(env);
const struct btf *btf = meta->btf;
const struct btf_param *args;
struct btf_record *rec;
@@ -11850,21 +11850,31 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
args = (const struct btf_param *)(meta->func_proto + 1);
nargs = btf_type_vlen(meta->func_proto);
- if (nargs > MAX_BPF_FUNC_REG_ARGS) {
+ if (nargs > MAX_BPF_FUNC_ARGS) {
verbose(env, "Function %s has %d > %d args\n", func_name, nargs,
- MAX_BPF_FUNC_REG_ARGS);
+ MAX_BPF_FUNC_ARGS);
return -EINVAL;
}
+ if (nargs > MAX_BPF_FUNC_REG_ARGS && !bpf_jit_supports_stack_args()) {
+ verbose(env, "JIT does not support kfunc %s() with %d args\n",
+ func_name, nargs);
+ return -ENOTSUPP;
+ }
+
+ ret = check_outgoing_stack_args(env, caller, nargs);
+ if (ret)
+ return ret;
/* Check that BTF function arguments match actual types that the
* verifier sees.
*/
for (i = 0; i < nargs; i++) {
- struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[i + 1];
+ struct bpf_reg_state *reg = get_func_arg_reg(caller, regs, i);
const struct btf_type *t, *ref_t, *resolve_ret;
enum bpf_arg_type arg_type = ARG_DONTCARE;
argno_t argno = argno_from_arg(i + 1);
- u32 regno = i + 1, ref_id, type_size;
+ int regno = reg_from_argno(argno);
+ u32 ref_id, type_size;
bool is_ret_buf_sz = false;
int kf_arg_type;
@@ -11874,6 +11884,11 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
verifier_bug(env, "Only 1 prog->aux argument supported per-kfunc");
return -EFAULT;
}
+ if (regno < 0) {
+ verbose(env, "%s prog->aux cannot be a stack argument\n",
+ reg_arg_name(env, argno));
+ return -EINVAL;
+ }
meta->arg_prog = true;
cur_aux(env)->arg_prog = regno;
continue;
@@ -11900,7 +11915,10 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
reg_arg_name(env, argno));
return -EINVAL;
}
- ret = mark_chain_precision(env, regno);
+ if (regno >= 0)
+ ret = mark_chain_precision(env, regno);
+ else
+ ret = mark_stack_arg_precision(env, i);
if (ret < 0)
return ret;
meta->arg_constant.found = true;
@@ -11925,7 +11943,10 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
}
meta->r0_size = reg->var_off.value;
- ret = mark_chain_precision(env, regno);
+ if (regno >= 0)
+ ret = mark_chain_precision(env, regno);
+ else
+ ret = mark_stack_arg_precision(env, i);
if (ret)
return ret;
}
@@ -11953,14 +11974,21 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
return -EFAULT;
}
meta->ref_obj_id = reg->ref_obj_id;
- if (is_kfunc_release(meta))
+ if (is_kfunc_release(meta)) {
+ if (regno < 0) {
+ verbose(env, "%s release arg cannot be a stack argument\n",
+ reg_arg_name(env, argno));
+ return -EINVAL;
+ }
meta->release_regno = regno;
+ }
}
ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id);
ref_tname = btf_name_by_offset(btf, ref_t->name_off);
- kf_arg_type = get_kfunc_ptr_arg_type(env, meta, t, ref_t, ref_tname, args, i, nargs, argno, reg);
+ kf_arg_type = get_kfunc_ptr_arg_type(env, caller, regs, meta, t, ref_t, ref_tname,
+ args, i, nargs, argno, reg);
if (kf_arg_type < 0)
return kf_arg_type;
@@ -12110,6 +12138,11 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
dynptr_arg_type |= DYNPTR_TYPE_FILE;
} else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_file_discard]) {
dynptr_arg_type |= DYNPTR_TYPE_FILE | OBJ_RELEASE;
+ if (regno < 0) {
+ verbose(env, "%s release arg cannot be a stack argument\n",
+ reg_arg_name(env, argno));
+ return -EINVAL;
+ }
meta->release_regno = regno;
} else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_clone] &&
(dynptr_arg_type & MEM_UNINIT)) {
@@ -12264,9 +12297,9 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
break;
case KF_ARG_PTR_TO_MEM_SIZE:
{
- struct bpf_reg_state *buff_reg = &regs[regno];
+ struct bpf_reg_state *buff_reg = reg;
const struct btf_param *buff_arg = &args[i];
- struct bpf_reg_state *size_reg = &regs[regno + 1];
+ struct bpf_reg_state *size_reg = get_func_arg_reg(caller, regs, i + 1);
const struct btf_param *size_arg = &args[i + 1];
argno_t next_argno = argno_from_arg(i + 2);
@@ -13170,8 +13203,19 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
clear_all_pkt_pointers(env);
nargs = btf_type_vlen(meta.func_proto);
+ if (nargs > MAX_BPF_FUNC_REG_ARGS) {
+ struct bpf_func_state *caller = cur_func(env);
+ struct bpf_subprog_info *caller_info = &env->subprog_info[caller->subprogno];
+ u16 out_stack_arg_cnt = nargs - MAX_BPF_FUNC_REG_ARGS;
+ u16 stack_arg_cnt = bpf_in_stack_arg_cnt(caller_info) + out_stack_arg_cnt;
+
+ if (stack_arg_cnt > caller_info->stack_arg_cnt)
+ caller_info->stack_arg_cnt = stack_arg_cnt;
+ invalidate_outgoing_stack_args(caller);
+ }
+
args = (const struct btf_param *)(meta.func_proto + 1);
- for (i = 0; i < nargs; i++) {
+ for (i = 0; i < min_t(int, nargs, MAX_BPF_FUNC_REG_ARGS); i++) {
u32 regno = i + 1;
t = btf_type_skip_modifiers(desc_btf, args[i].type, NULL);
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 14/23] bpf: Reject stack arguments if tail call reachable
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (12 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 13/23] bpf: Support stack arguments for kfunc calls Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 1:42 ` sashiko-bot
2026-05-07 21:30 ` [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments Yonghong Song
` (9 subsequent siblings)
23 siblings, 2 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Tail calls are deprecated and will be replaced by indirect calls
in the future. Reject programs that combine tail calls with stack
arguments rather than adding complexity for a deprecated feature.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b6d07f7a52cd..95fe21d05c10 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5117,6 +5117,11 @@ struct bpf_subprog_call_depth_info {
int frame; /* # of consecutive static call stack frames on top of stack */
};
+static bool subprog_has_stack_args(const struct bpf_subprog_info *si)
+{
+ return si->stack_arg_cnt;
+}
+
/* starting from main bpf function walk all instructions of the function
* and recursively walk all callees that given function can call.
* Ignore jump and exit insns.
@@ -5270,14 +5275,23 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
* this info will be utilized by JIT so that we will be preserving the
* tail call counter throughout bpf2bpf calls combined with tailcalls
*/
- if (tail_call_reachable)
+ if (tail_call_reachable) {
for (tmp = idx; tmp >= 0; tmp = dinfo[tmp].caller) {
if (subprog[tmp].is_exception_cb) {
verbose(env, "cannot tail call within exception cb\n");
return -EINVAL;
}
+ if (subprog_has_stack_args(&subprog[tmp])) {
+ verbose(env, "tail_calls are not allowed in programs with stack args\n");
+ return -EINVAL;
+ }
subprog[tmp].tail_call_reachable = true;
}
+ } else if (!idx && subprog[0].has_tail_call && subprog_has_stack_args(&subprog[0])) {
+ verbose(env, "tail_calls are not allowed in programs with stack args\n");
+ return -EINVAL;
+ }
+
if (subprog[0].tail_call_reachable)
env->prog->aux->tail_call_reachable = true;
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (13 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 14/23] bpf: Reject stack arguments if tail call reachable Yonghong Song
@ 2026-05-07 21:30 ` Yonghong Song
2026-05-07 22:26 ` bot+bpf-ci
2026-05-09 2:21 ` sashiko-bot
2026-05-07 21:31 ` [PATCH bpf-next v2 16/23] selftests/bpf: Add tests for BPF function " Yonghong Song
` (8 subsequent siblings)
23 siblings, 2 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:30 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
Add x86_64 JIT support for BPF functions and kfuncs with more than
5 arguments. The extra arguments are passed through a stack area
addressed by register r11 (BPF_REG_PARAMS) in BPF bytecode,
which the JIT translates to native code.
The JIT follows the x86-64 calling convention for both BPF-to-BPF
and kfunc calls:
- Arg 6 is passed in the R9 register
- Args 7+ are passed on the stack
Incoming arg 6 (BPF r11+8) is translated to a MOV from R9 rather
than a memory load. Incoming args 7+ (BPF r11+16, r11+24, ...) map
directly to [rbp + 16], [rbp + 24], ..., matching the x86-64 stack
layout after CALL + PUSH RBP, so no offset adjustment is needed.
tail_call_reachable is rejected by the verifier and priv_stack is
disabled by the JIT when stack args exist, so R9 is always
available. When BPF bytecode writes to the arg-6 stack slot
(offset -8), the JIT emits a MOV into R9 instead of a memory store.
Outgoing args 7+ are placed at [rsp] in a pre-allocated area below
callee-saved registers, using:
native_off = outgoing_arg_base - outgoing_rsp - bpf_off - 16
The native x86_64 stack layout with stack arguments:
high address
+-------------------------+
| incoming stack arg N | [rbp + 16 + (N-7)*8] (from caller)
| ... |
| incoming stack arg 7 | [rbp + 16]
+-------------------------+
| return address | [rbp + 8]
| saved rbp | [rbp]
+-------------------------+
| BPF program stack | (round_up(stack_depth, 8) bytes)
+-------------------------+
| callee-saved regs | (r12, rbx, r13, r14, r15 as needed)
+-------------------------+
| outgoing arg M | [rsp + (M-7)*8]
| ... |
| outgoing arg 7 | [rsp]
+-------------------------+ rsp
low address
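As a worked example of the translation (all sizes assumed purely for
illustration): suppose round_up(stack_depth, 8) = 16, one callee-saved
register is pushed (callee_saved_size = 8), and the program has three
outgoing stack args and no incoming ones (arg 6 in R9, args 7 and 8 in
memory, so outgoing_stack_arg_depth = 24 and outgoing_rsp = 16). Then
outgoing_arg_base = -(16 + 8) = -24, and for outgoing arg 7
(bpf_off = -16):

	native_off = -24 - 16 - (-16) - 16 = -40

i.e. [rbp - 40], which is exactly [rsp + 0] after the prologue
(rsp = rbp - 16 - 8 - 16). Arg 8 (bpf_off = -24) similarly lands at
[rbp - 32] = [rsp + 8], matching the layout above.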
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
arch/x86/net/bpf_jit_comp.c | 155 ++++++++++++++++++++++++++++++++++--
1 file changed, 149 insertions(+), 6 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index ea9e707e8abf..67c2f4a3b9cc 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -390,6 +390,34 @@ static void pop_callee_regs(u8 **pprog, bool *callee_regs_used)
*pprog = prog;
}
+/* add rsp, depth */
+static void emit_add_rsp(u8 **pprog, u16 depth)
+{
+ u8 *prog = *pprog;
+
+ if (!depth)
+ return;
+ if (is_imm8(depth))
+ EMIT4(0x48, 0x83, 0xC4, depth); /* add rsp, imm8 */
+ else
+ EMIT3_off32(0x48, 0x81, 0xC4, depth); /* add rsp, imm32 */
+ *pprog = prog;
+}
+
+/* sub rsp, depth */
+static void emit_sub_rsp(u8 **pprog, u16 depth)
+{
+ u8 *prog = *pprog;
+
+ if (!depth)
+ return;
+ if (is_imm8(depth))
+ EMIT4(0x48, 0x83, 0xEC, depth); /* sub rsp, imm8 */
+ else
+ EMIT3_off32(0x48, 0x81, 0xEC, depth); /* sub rsp, imm32 */
+ *pprog = prog;
+}
+
static void emit_nops(u8 **pprog, int len)
{
u8 *prog = *pprog;
@@ -1664,16 +1692,45 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
int i, excnt = 0;
int ilen, proglen = 0;
u8 *ip, *prog = temp;
+ u16 stack_arg_depth, incoming_stack_arg_depth, outgoing_stack_arg_depth; /* in bytes */
+ u16 outgoing_rsp;
u32 stack_depth;
+ int callee_saved_size;
+ s32 outgoing_arg_base;
int err;
stack_depth = bpf_prog->aux->stack_depth;
+ stack_arg_depth = bpf_prog->aux->stack_arg_cnt * 8;
+ incoming_stack_arg_depth = bpf_prog->aux->incoming_stack_arg_cnt * 8;
+ outgoing_stack_arg_depth = stack_arg_depth - incoming_stack_arg_depth;
priv_stack_ptr = bpf_prog->aux->priv_stack_ptr;
if (priv_stack_ptr) {
priv_frame_ptr = priv_stack_ptr + PRIV_STACK_GUARD_SZ + round_up(stack_depth, 8);
stack_depth = 0;
}
+ /*
+ * Follow x86-64 calling convention for both BPF-to-BPF and
+ * kfunc calls:
+ * - Arg 6 is passed in R9 register
+ * - Args 7+ are passed on the stack at [rsp]
+ *
+ * Incoming arg 6 is read from R9 (BPF r11+8 → MOV from R9).
+ * Incoming args 7+ are read from [rbp + 16], [rbp + 24], ...
+ * (BPF r11+16, r11+24, ... map directly with no offset change).
+ *
+ * tail_call_reachable is rejected by the verifier and priv_stack
+ * is disabled by the JIT when stack args exist, so R9 is always
+ * available.
+ *
+ * Stack layout (high to low):
+ * [rbp + 16 + ...] incoming stack args 7+ (from caller)
+ * [rbp + 8] return address
+ * [rbp] saved rbp
+ * [rbp - prog_stack] program stack
+ * [below] callee-saved regs
+ * [below] outgoing args 7+ (= rsp)
+ */
arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);
@@ -1700,6 +1757,42 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
push_r12(&prog);
push_callee_regs(&prog, callee_regs_used);
}
+
+ /* Compute callee-saved register area size. */
+ callee_saved_size = 0;
+ if (bpf_prog->aux->exception_boundary || arena_vm_start)
+ callee_saved_size += 8; /* r12 */
+ if (bpf_prog->aux->exception_boundary) {
+ callee_saved_size += 4 * 8; /* rbx, r13, r14, r15 */
+ } else {
+ int j;
+
+ for (j = 0; j < 4; j++)
+ if (callee_regs_used[j])
+ callee_saved_size += 8;
+ }
+ /*
+ * Base offset from rbp for translating BPF outgoing args 7+
+ * to native offsets. BPF uses negative offsets from r11
+ * (r11-8 for arg6, r11-16 for arg7, ...) while x86 uses
+ * positive offsets from rsp ([rsp+0] for arg7, [rsp+8] for
+ * arg8, ...). Arg 6 goes to R9 directly.
+ *
+ * The translation reverses direction:
+ * native_off = outgoing_arg_base - outgoing_rsp - bpf_off - 16
+ *
+ * Note that tail_call_reachable is guaranteed to be false when
+ * stack args exist, so tcc pushes need not be accounted for.
+ */
+ outgoing_arg_base = -(round_up(stack_depth, 8) + callee_saved_size);
+
+ /*
+ * Allocate outgoing stack arg area for args 7+ only.
+ * Arg 6 goes into r9 register, not on stack.
+ */
+ outgoing_rsp = outgoing_stack_arg_depth > 8 ? outgoing_stack_arg_depth - 8 : 0;
+ emit_sub_rsp(&prog, outgoing_rsp);
+
if (arena_vm_start)
emit_mov_imm64(&prog, X86_REG_R12,
arena_vm_start >> 32, (u32) arena_vm_start);
@@ -1721,7 +1814,7 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
u8 b2 = 0, b3 = 0;
u8 *start_of_ldx;
s64 jmp_offset;
- s16 insn_off;
+ s32 insn_off;
u8 jmp_cond;
u8 *func;
int nops;
@@ -2134,12 +2227,27 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
EMIT1(0xC7);
goto st;
case BPF_ST | BPF_MEM | BPF_DW:
+ if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
+ /* Arg 6: store immediate in r9 register */
+ emit_mov_imm64(&prog, X86_REG_R9, imm32 >> 31, (u32)imm32);
+ break;
+ }
EMIT2(add_1mod(0x48, dst_reg), 0xC7);
-st: if (is_imm8(insn->off))
- EMIT2(add_1reg(0x40, dst_reg), insn->off);
+st: insn_off = insn->off;
+ if (dst_reg == BPF_REG_PARAMS) {
+ /*
+ * Args 7+: reverse BPF negative offsets to
+ * x86 positive rsp offsets.
+ * BPF off=-16 → [rsp+0], off=-24 → [rsp+8], ...
+ */
+ insn_off = outgoing_arg_base - outgoing_rsp - insn_off - 16;
+ dst_reg = BPF_REG_FP;
+ }
+ if (is_imm8(insn_off))
+ EMIT2(add_1reg(0x40, dst_reg), insn_off);
else
- EMIT1_off32(add_1reg(0x80, dst_reg), insn->off);
+ EMIT1_off32(add_1reg(0x80, dst_reg), insn_off);
EMIT(imm32, bpf_size_to_x86_bytes(BPF_SIZE(insn->code)));
break;
@@ -2149,7 +2257,17 @@ st: if (is_imm8(insn->off))
case BPF_STX | BPF_MEM | BPF_H:
case BPF_STX | BPF_MEM | BPF_W:
case BPF_STX | BPF_MEM | BPF_DW:
- emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
+ if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
+ /* Arg 6: store register value in r9 */
+ EMIT_mov(X86_REG_R9, src_reg);
+ break;
+ }
+ insn_off = insn->off;
+ if (dst_reg == BPF_REG_PARAMS) {
+ insn_off = outgoing_arg_base - outgoing_rsp - insn_off - 16;
+ dst_reg = BPF_REG_FP;
+ }
+ emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn_off);
break;
case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
@@ -2248,6 +2366,19 @@ st: if (is_imm8(insn->off))
case BPF_LDX | BPF_PROBE_MEMSX | BPF_H:
case BPF_LDX | BPF_PROBE_MEMSX | BPF_W:
insn_off = insn->off;
+ if (src_reg == BPF_REG_PARAMS) {
+ if (insn_off == 8) {
+ /* Incoming arg 6: read from r9 */
+ EMIT_mov(dst_reg, X86_REG_R9);
+ break;
+ }
+ src_reg = BPF_REG_FP;
+ /*
+ * Incoming args 7+: native_off == bpf_off
+ * (r11+16 → [rbp+16], r11+24 → [rbp+24], ...)
+ * No offset adjustment needed.
+ */
+ }
if (BPF_MODE(insn->code) == BPF_PROBE_MEM ||
BPF_MODE(insn->code) == BPF_PROBE_MEMSX) {
@@ -2736,6 +2867,8 @@ st: if (is_imm8(insn->off))
if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog))
return -EINVAL;
}
+ /* Deallocate outgoing args 7+ area. */
+ emit_add_rsp(&prog, outgoing_rsp);
if (bpf_prog->aux->exception_boundary) {
pop_callee_regs(&prog, all_callee_regs_used);
pop_r12(&prog);
@@ -3743,7 +3876,12 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
prog->aux->jit_data = jit_data;
}
priv_stack_ptr = prog->aux->priv_stack_ptr;
- if (!priv_stack_ptr && prog->aux->jits_use_priv_stack) {
+ /*
+ * x86-64 uses R9 for both private stack frame pointer and arg 6,
+ * so disable private stack when stack args are present.
+ */
+ if (!priv_stack_ptr && prog->aux->jits_use_priv_stack &&
+ prog->aux->stack_arg_cnt == 0) {
/* Allocate actual private stack size with verifier-calculated
* stack size plus two memory guards to protect overflow and
* underflow.
@@ -3910,6 +4048,11 @@ bool bpf_jit_supports_kfunc_call(void)
return true;
}
+bool bpf_jit_supports_stack_args(void)
+{
+ return true;
+}
+
void *bpf_arch_text_copy(void *dst, void *src, size_t len)
{
if (text_poke_copy(dst, src, len) == NULL)
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 16/23] selftests/bpf: Add tests for BPF function stack arguments
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (14 preceding siblings ...)
2026-05-07 21:30 ` [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments Yonghong Song
@ 2026-05-07 21:31 ` Yonghong Song
2026-05-07 21:31 ` [PATCH bpf-next v2 17/23] selftests/bpf: Add tests for stack argument validation Yonghong Song
` (7 subsequent siblings)
23 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:31 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
Add selftests covering stack argument passing for both BPF-to-BPF
subprog calls and kfunc calls with more than 5 arguments. All tests
are guarded by __BPF_FEATURE_STACK_ARGUMENT and __TARGET_ARCH_x86.
BPF-to-BPF subprog call tests (stack_arg.c):
- Scalar stack args
- Pointer stack args
- Mixed pointer/scalar stack args
- Nested calls
- Dynptr stack arg
- Two callees with different stack arg counts
- Async callback
Kfunc call tests (stack_arg_kfunc.c, with bpf_testmod kfuncs):
- Scalar stack args
- Pointer stack args
- Mixed pointer/scalar stack args
- Dynptr stack arg
- Memory buffer + size pair
- Iterator
- Const string pointer
- Timer pointer
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../selftests/bpf/prog_tests/stack_arg.c | 139 ++++++++++
tools/testing/selftests/bpf/progs/stack_arg.c | 252 ++++++++++++++++++
.../selftests/bpf/progs/stack_arg_kfunc.c | 163 +++++++++++
.../selftests/bpf/test_kmods/bpf_testmod.c | 65 +++++
.../bpf/test_kmods/bpf_testmod_kfunc.h | 20 +-
5 files changed, 638 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
diff --git a/tools/testing/selftests/bpf/prog_tests/stack_arg.c b/tools/testing/selftests/bpf/prog_tests/stack_arg.c
new file mode 100644
index 000000000000..d61bac33f809
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/stack_arg.c
@@ -0,0 +1,139 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include <network_helpers.h>
+#include "stack_arg.skel.h"
+#include "stack_arg_kfunc.skel.h"
+
+static void run_subtest(struct bpf_program *prog, int expected)
+{
+ int err, prog_fd;
+ LIBBPF_OPTS(bpf_test_run_opts, topts,
+ .data_in = &pkt_v4,
+ .data_size_in = sizeof(pkt_v4),
+ .repeat = 1,
+ );
+
+ prog_fd = bpf_program__fd(prog);
+ err = bpf_prog_test_run_opts(prog_fd, &topts);
+ ASSERT_OK(err, "test_run");
+ ASSERT_EQ(topts.retval, expected, "retval");
+}
+
+static void test_global_many(void)
+{
+ struct stack_arg *skel;
+
+ skel = stack_arg__open();
+ if (!ASSERT_OK_PTR(skel, "open"))
+ return;
+
+ if (!skel->rodata->has_stack_arg) {
+ test__skip();
+ goto out;
+ }
+
+ if (!ASSERT_OK(stack_arg__load(skel), "load"))
+ goto out;
+
+ run_subtest(skel->progs.test_global_many_args, 36);
+
+out:
+ stack_arg__destroy(skel);
+}
+
+static void test_async_cb_many(void)
+{
+ struct stack_arg *skel;
+
+ skel = stack_arg__open();
+ if (!ASSERT_OK_PTR(skel, "open"))
+ return;
+
+ if (!skel->rodata->has_stack_arg) {
+ test__skip();
+ goto out;
+ }
+
+ if (!ASSERT_OK(stack_arg__load(skel), "load"))
+ goto out;
+
+ run_subtest(skel->progs.test_async_cb_many_args, 0);
+
+ /* Wait for the timer callback to fire and verify the result.
+ * 10+20+30+40+50+60+70+80 = 360
+ */
+ usleep(50);
+ ASSERT_EQ(skel->bss->timer_result, 360, "timer_result");
+
+out:
+ stack_arg__destroy(skel);
+}
+
+static void test_bpf2bpf(void)
+{
+ struct stack_arg *skel;
+
+ skel = stack_arg__open();
+ if (!ASSERT_OK_PTR(skel, "open"))
+ return;
+
+ if (!skel->rodata->has_stack_arg) {
+ test__skip();
+ goto out;
+ }
+
+ if (!ASSERT_OK(stack_arg__load(skel), "load"))
+ goto out;
+
+ run_subtest(skel->progs.test_bpf2bpf_ptr_stack_arg, 45);
+ run_subtest(skel->progs.test_bpf2bpf_mix_stack_args, 51);
+ run_subtest(skel->progs.test_bpf2bpf_nesting_stack_arg, 50);
+ run_subtest(skel->progs.test_bpf2bpf_dynptr_stack_arg, 69);
+ run_subtest(skel->progs.test_two_callees, 91);
+
+out:
+ stack_arg__destroy(skel);
+}
+
+static void test_kfunc(void)
+{
+ struct stack_arg_kfunc *skel;
+
+ skel = stack_arg_kfunc__open();
+ if (!ASSERT_OK_PTR(skel, "open"))
+ return;
+
+ if (!skel->rodata->has_stack_arg) {
+ test__skip();
+ goto out;
+ }
+
+ if (!ASSERT_OK(stack_arg_kfunc__load(skel), "load"))
+ goto out;
+
+ run_subtest(skel->progs.test_stack_arg_scalar, 36);
+ run_subtest(skel->progs.test_stack_arg_ptr, 45);
+ run_subtest(skel->progs.test_stack_arg_mix, 51);
+ run_subtest(skel->progs.test_stack_arg_dynptr, 69);
+ run_subtest(skel->progs.test_stack_arg_mem, 151);
+ run_subtest(skel->progs.test_stack_arg_iter, 115);
+ run_subtest(skel->progs.test_stack_arg_const_str, 15);
+ run_subtest(skel->progs.test_stack_arg_timer, 15);
+
+out:
+ stack_arg_kfunc__destroy(skel);
+}
+
+void test_stack_arg(void)
+{
+ if (test__start_subtest("global_many_args"))
+ test_global_many();
+ if (test__start_subtest("async_cb_many_args"))
+ test_async_cb_many();
+ if (test__start_subtest("bpf2bpf"))
+ test_bpf2bpf();
+ if (test__start_subtest("kfunc"))
+ test_kfunc();
+}
diff --git a/tools/testing/selftests/bpf/progs/stack_arg.c b/tools/testing/selftests/bpf/progs/stack_arg.c
new file mode 100644
index 000000000000..ab6240b997c5
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/stack_arg.c
@@ -0,0 +1,252 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <stdbool.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_kfuncs.h"
+
+#define CLOCK_MONOTONIC 1
+
+struct timer_elem {
+ struct bpf_timer timer;
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(max_entries, 1);
+ __type(key, int);
+ __type(value, struct timer_elem);
+} timer_map SEC(".maps");
+
+int timer_result;
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+const volatile bool has_stack_arg = true;
+
+__noinline static int static_func_many_args(int a, int b, int c, int d,
+ int e, int f, int g, int h)
+{
+ return a + b + c + d + e + f + g + h;
+}
+
+__noinline int global_calls_many_args(int a, int b, int c)
+{
+ return static_func_many_args(a, b, c, 4, 5, 6, 7, 8);
+}
+
+SEC("tc")
+int test_global_many_args(void)
+{
+ return global_calls_many_args(1, 2, 3);
+}
+
+struct test_data {
+ long x;
+ long y;
+};
+
+/* 1 + 2 + 3 + 4 + 5 + 10 + 20 = 45 */
+__noinline static long func_with_ptr_stack_arg(long a, long b, long c, long d,
+ long e, struct test_data *p)
+{
+ return a + b + c + d + e + p->x + p->y;
+}
+
+__noinline long global_ptr_stack_arg(long a, long b, long c, long d, long e)
+{
+ struct test_data data = { .x = 10, .y = 20 };
+
+ return func_with_ptr_stack_arg(a, b, c, d, e, &data);
+}
+
+SEC("tc")
+int test_bpf2bpf_ptr_stack_arg(void)
+{
+ return global_ptr_stack_arg(1, 2, 3, 4, 5);
+}
+
+/* 1 + 2 + 3 + 4 + 5 + 10 + 6 + 20 = 51 */
+__noinline static long func_with_mix_stack_args(long a, long b, long c, long d,
+ long e, struct test_data *p,
+ long f, struct test_data *q)
+{
+ return a + b + c + d + e + p->x + f + q->y;
+}
+
+__noinline long global_mix_stack_args(long a, long b, long c, long d, long e)
+{
+ struct test_data p = { .x = 10 };
+ struct test_data q = { .y = 20 };
+
+ return func_with_mix_stack_args(a, b, c, d, e, &p, e + 1, &q);
+}
+
+SEC("tc")
+int test_bpf2bpf_mix_stack_args(void)
+{
+ return global_mix_stack_args(1, 2, 3, 4, 5);
+}
+
+/*
+ * Nesting test: func_outer calls func_inner, both with struct pointer
+ * as stack arg.
+ *
+ * func_inner: (a+1) + (b+1) + (c+1) + (d+1) + (e+1) + p->x + p->y
+ * = 2 + 3 + 4 + 5 + 6 + 10 + 20 = 50
+ */
+__noinline static long func_inner_ptr(long a, long b, long c, long d,
+ long e, struct test_data *p)
+{
+ return a + b + c + d + e + p->x + p->y;
+}
+
+__noinline static long func_outer_ptr(long a, long b, long c, long d,
+ long e, struct test_data *p)
+{
+ return func_inner_ptr(a + 1, b + 1, c + 1, d + 1, e + 1, p);
+}
+
+__noinline long global_nesting_ptr(long a, long b, long c, long d, long e)
+{
+ struct test_data data = { .x = 10, .y = 20 };
+
+ return func_outer_ptr(a, b, c, d, e, &data);
+}
+
+SEC("tc")
+int test_bpf2bpf_nesting_stack_arg(void)
+{
+ return global_nesting_ptr(1, 2, 3, 4, 5);
+}
+
+/* 1 + 2 + 3 + 4 + 5 + sizeof(pkt_v4) = 15 + 54 = 69 */
+__noinline static long func_with_dynptr(long a, long b, long c, long d,
+ long e, struct bpf_dynptr *ptr)
+{
+ return a + b + c + d + e + bpf_dynptr_size(ptr);
+}
+
+__noinline long global_dynptr_stack_arg(void *ctx __arg_ctx, long a, long b,
+ long c, long d)
+{
+ struct bpf_dynptr ptr;
+
+ bpf_dynptr_from_skb(ctx, 0, &ptr);
+ return func_with_dynptr(a, b, c, d, d + 1, &ptr);
+}
+
+SEC("tc")
+int test_bpf2bpf_dynptr_stack_arg(struct __sk_buff *skb)
+{
+ return global_dynptr_stack_arg(skb, 1, 2, 3, 4);
+}
+
+/* foo1: a+b+c+d+e+f+g+h */
+__noinline static int foo1(int a, int b, int c, int d,
+ int e, int f, int g, int h)
+{
+ return a + b + c + d + e + f + g + h;
+}
+
+/* foo2: a+b+c+d+e+f+g+h+i+j */
+__noinline static int foo2(int a, int b, int c, int d, int e,
+ int f, int g, int h, int i, int j)
+{
+ return a + b + c + d + e + f + g + h + i + j;
+}
+
+/* global_two_callees calls foo1 (3 stack args) and foo2 (5 stack args).
+ * The outgoing stack arg area is sized for foo2 (the larger callee).
+ * Stores for foo1 are a subset of the area used by foo2.
+ * Result: foo1(1,2,3,4,5,6,7,8) + foo2(1,2,3,4,5,6,7,8,9,10) = 36 + 55 = 91
+ *
+ * Pass a-e through so the compiler can't constant-fold the stack args away.
+ */
+__noinline int global_two_callees(int a, int b, int c, int d, int e)
+{
+ int ret;
+
+ ret = foo1(a, b, c, d, e, a + 5, a + 6, a + 7);
+ ret += foo2(a, b, c, d, e, a + 5, a + 6, a + 7, a + 8, a + 9);
+ return ret;
+}
+
+SEC("tc")
+int test_two_callees(void)
+{
+ return global_two_callees(1, 2, 3, 4, 5);
+}
+
+static int timer_cb_many_args(void *map, int *key, struct bpf_timer *timer)
+{
+ timer_result = static_func_many_args(10, 20, 30, 40, 50, 60, 70, 80);
+ return 0;
+}
+
+SEC("tc")
+int test_async_cb_many_args(void)
+{
+ struct timer_elem *elem;
+ int key = 0;
+
+ elem = bpf_map_lookup_elem(&timer_map, &key);
+ if (!elem)
+ return -1;
+
+ bpf_timer_init(&elem->timer, &timer_map, CLOCK_MONOTONIC);
+ bpf_timer_set_callback(&elem->timer, timer_cb_many_args);
+ bpf_timer_start(&elem->timer, 1, 0);
+ return 0;
+}
+
+#else
+
+const volatile bool has_stack_arg = false;
+
+SEC("tc")
+int test_global_many_args(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_bpf2bpf_ptr_stack_arg(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_bpf2bpf_mix_stack_args(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_bpf2bpf_nesting_stack_arg(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_bpf2bpf_dynptr_stack_arg(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_two_callees(void)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_async_cb_many_args(void)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c b/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
new file mode 100644
index 000000000000..fa9def876ea5
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_kfuncs.h"
+#include "../test_kmods/bpf_testmod_kfunc.h"
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+const volatile bool has_stack_arg = true;
+
+struct bpf_iter_testmod_seq {
+ u64 :64;
+ u64 :64;
+};
+
+extern int bpf_iter_testmod_seq_new(struct bpf_iter_testmod_seq *it, s64 value, int cnt) __ksym;
+extern void bpf_iter_testmod_seq_destroy(struct bpf_iter_testmod_seq *it) __ksym;
+
+struct timer_map_value {
+ struct bpf_timer timer;
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(max_entries, 1);
+ __type(key, int);
+ __type(value, struct timer_map_value);
+} kfunc_timer_map SEC(".maps");
+
+SEC("tc")
+int test_stack_arg_scalar(struct __sk_buff *skb)
+{
+ return bpf_kfunc_call_stack_arg(1, 2, 3, 4, 5, 6, 7, 8);
+}
+
+SEC("tc")
+int test_stack_arg_ptr(struct __sk_buff *skb)
+{
+ struct prog_test_pass1 p = { .x0 = 10, .x1 = 20 };
+
+ return bpf_kfunc_call_stack_arg_ptr(1, 2, 3, 4, 5, &p);
+}
+
+SEC("tc")
+int test_stack_arg_mix(struct __sk_buff *skb)
+{
+ struct prog_test_pass1 p = { .x0 = 10 };
+ struct prog_test_pass1 q = { .x1 = 20 };
+
+ return bpf_kfunc_call_stack_arg_mix(1, 2, 3, 4, 5, &p, 6, &q);
+}
+
+/* 1 + 2 + 3 + 4 + 5 + sizeof(pkt_v4) = 15 + 54 = 69 */
+SEC("tc")
+int test_stack_arg_dynptr(struct __sk_buff *skb)
+{
+ struct bpf_dynptr ptr;
+
+ bpf_dynptr_from_skb(skb, 0, &ptr);
+ return bpf_kfunc_call_stack_arg_dynptr(1, 2, 3, 4, 5, &ptr);
+}
+
+/* 1 + 2 + 3 + 4 + 5 + (1 + 2 + ... + 16) = 15 + 136 = 151 */
+SEC("tc")
+int test_stack_arg_mem(struct __sk_buff *skb)
+{
+ char buf[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
+
+ return bpf_kfunc_call_stack_arg_mem(1, 2, 3, 4, 5, buf, sizeof(buf));
+}
+
+/* 1 + 2 + 3 + 4 + 5 + 100 = 115 */
+SEC("tc")
+int test_stack_arg_iter(struct __sk_buff *skb)
+{
+ struct bpf_iter_testmod_seq it;
+ u64 ret;
+
+ bpf_iter_testmod_seq_new(&it, 100, 10);
+ ret = bpf_kfunc_call_stack_arg_iter(1, 2, 3, 4, 5, &it);
+ bpf_iter_testmod_seq_destroy(&it);
+ return ret;
+}
+
+const char cstr[] = "hello";
+
+/* 1 + 2 + 3 + 4 + 5 = 15 */
+SEC("tc")
+int test_stack_arg_const_str(struct __sk_buff *skb)
+{
+ return bpf_kfunc_call_stack_arg_const_str(1, 2, 3, 4, 5, cstr);
+}
+
+/* 1 + 2 + 3 + 4 + 5 = 15 */
+SEC("tc")
+int test_stack_arg_timer(struct __sk_buff *skb)
+{
+ struct timer_map_value *val;
+ int key = 0;
+
+ val = bpf_map_lookup_elem(&kfunc_timer_map, &key);
+ if (!val)
+ return 0;
+ return bpf_kfunc_call_stack_arg_timer(1, 2, 3, 4, 5, &val->timer);
+}
+
+#else
+
+const volatile bool has_stack_arg = false;
+
+SEC("tc")
+int test_stack_arg_scalar(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_ptr(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_mix(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_dynptr(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_mem(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_iter(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_const_str(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+SEC("tc")
+int test_stack_arg_timer(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index d876314a4d67..aef2f68b7e83 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -825,6 +825,63 @@ __bpf_kfunc int bpf_kfunc_call_test5(u8 a, u16 b, u32 c)
return 0;
}
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg(u64 a, u64 b, u64 c, u64 d,
+ u64 e, u64 f, u64 g, u64 h)
+{
+ return a + b + c + d + e + f + g + h;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_ptr(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct prog_test_pass1 *p)
+{
+ return a + b + c + d + e + p->x0 + p->x1;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_mix(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct prog_test_pass1 *p, u64 f,
+ struct prog_test_pass1 *q)
+{
+ return a + b + c + d + e + p->x0 + f + q->x1;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_dynptr(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct bpf_dynptr *ptr)
+{
+ const struct bpf_dynptr_kern *kern_ptr = (void *)ptr;
+
+ return a + b + c + d + e + (kern_ptr->size & 0xFFFFFF);
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_mem(u64 a, u64 b, u64 c, u64 d, u64 e,
+ void *mem, int mem__sz)
+{
+ const unsigned char *p = mem;
+ u64 sum = a + b + c + d + e;
+ int i;
+
+ for (i = 0; i < mem__sz; i++)
+ sum += p[i];
+ return sum;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_iter(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct bpf_iter_testmod_seq *it__iter)
+{
+ return a + b + c + d + e + it__iter->value;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_const_str(u64 a, u64 b, u64 c, u64 d, u64 e,
+ const char *str__str)
+{
+ return a + b + c + d + e;
+}
+
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_timer(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct bpf_timer *timer)
+{
+ return a + b + c + d + e;
+}
+
static struct prog_test_ref_kfunc prog_test_struct = {
.a = 42,
.b = 108,
@@ -1288,6 +1345,14 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_test2)
BTF_ID_FLAGS(func, bpf_kfunc_call_test3)
BTF_ID_FLAGS(func, bpf_kfunc_call_test4)
BTF_ID_FLAGS(func, bpf_kfunc_call_test5)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_ptr)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_mix)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_dynptr)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_mem)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_iter)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_const_str)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_timer)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail1)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail2)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_acquire, KF_ACQUIRE | KF_RET_NULL)
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
index aa0b8d41e71b..2c1cb118f886 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
@@ -26,6 +26,8 @@ struct prog_test_ref_kfunc {
};
#endif
+struct bpf_iter_testmod_seq;
+
struct prog_test_pass1 {
int x0;
struct {
@@ -111,7 +113,23 @@ int bpf_kfunc_call_test2(struct sock *sk, __u32 a, __u32 b) __ksym;
struct sock *bpf_kfunc_call_test3(struct sock *sk) __ksym;
long bpf_kfunc_call_test4(signed char a, short b, int c, long d) __ksym;
int bpf_kfunc_call_test5(__u8 a, __u16 b, __u32 c) __ksym;
-
+__u64 bpf_kfunc_call_stack_arg(__u64 a, __u64 b, __u64 c, __u64 d,
+ __u64 e, __u64 f, __u64 g, __u64 h) __ksym;
+__u64 bpf_kfunc_call_stack_arg_ptr(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct prog_test_pass1 *p) __ksym;
+__u64 bpf_kfunc_call_stack_arg_mix(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct prog_test_pass1 *p, __u64 f,
+ struct prog_test_pass1 *q) __ksym;
+__u64 bpf_kfunc_call_stack_arg_dynptr(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct bpf_dynptr *ptr) __ksym;
+__u64 bpf_kfunc_call_stack_arg_mem(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ void *mem, int mem__sz) __ksym;
+__u64 bpf_kfunc_call_stack_arg_iter(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct bpf_iter_testmod_seq *it__iter) __ksym;
+__u64 bpf_kfunc_call_stack_arg_const_str(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ const char *str__str) __ksym;
+__u64 bpf_kfunc_call_stack_arg_timer(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct bpf_timer *timer) __ksym;
void bpf_kfunc_call_test_pass_ctx(struct __sk_buff *skb) __ksym;
void bpf_kfunc_call_test_pass1(struct prog_test_pass1 *p) __ksym;
void bpf_kfunc_call_test_pass2(struct prog_test_pass2 *p) __ksym;
--
2.53.0-Meta
* [PATCH bpf-next v2 17/23] selftests/bpf: Add tests for stack argument validation
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (15 preceding siblings ...)
2026-05-07 21:31 ` [PATCH bpf-next v2 16/23] selftests/bpf: Add tests for BPF function " Yonghong Song
@ 2026-05-07 21:31 ` Yonghong Song
2026-05-09 1:30 ` sashiko-bot
2026-05-07 21:31 ` [PATCH bpf-next v2 18/23] selftests/bpf: Add BTF fixup for __naked subprog parameter names Yonghong Song
` (6 subsequent siblings)
23 siblings, 1 reply; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:31 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Add negative tests that verify kfunc argument checking (a kfunc call
with a >8 byte struct passed by value as a stack argument is rejected)
and verifier checks (invalid uses of r11 for stack arguments are
rejected).
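For reference, a rough sketch of the rejected vs. accepted shapes (the
struct types are those in bpf_testmod_kfunc.h; the pointer variant comes
from the previous patch):
	struct prog_test_big_arg s = { .a = 1, .b = 2 };
	struct prog_test_pass1 p = { .x0 = 10, .x1 = 20 };
	/* rejected: 16-byte struct passed by value as the 6th (stack) argument */
	bpf_kfunc_call_stack_arg_big(1, 2, 3, 4, 5, s);
	/* accepted: pass a pointer to the struct instead */
	bpf_kfunc_call_stack_arg_ptr(1, 2, 3, 4, 5, &p);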
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../selftests/bpf/prog_tests/stack_arg_fail.c | 10 ++
.../selftests/bpf/progs/stack_arg_fail.c | 114 ++++++++++++++++++
.../selftests/bpf/test_kmods/bpf_testmod.c | 7 ++
.../bpf/test_kmods/bpf_testmod_kfunc.h | 8 ++
4 files changed, 139 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_fail.c
diff --git a/tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c b/tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c
new file mode 100644
index 000000000000..090af1330953
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/stack_arg_fail.c
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "stack_arg_fail.skel.h"
+
+void test_stack_arg_fail(void)
+{
+ RUN_TESTS(stack_arg_fail);
+}
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_fail.c b/tools/testing/selftests/bpf/progs/stack_arg_fail.c
new file mode 100644
index 000000000000..c6db302549cb
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/stack_arg_fail.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "../test_kmods/bpf_testmod_kfunc.h"
+#include "bpf_misc.h"
+
+#if defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+SEC("tc")
+__failure __msg("Unrecognized *(R11-8) type STRUCT")
+int test_stack_arg_big(struct __sk_buff *skb)
+{
+ struct prog_test_big_arg s = { .a = 1, .b = 2 };
+
+ return bpf_kfunc_call_stack_arg_big(1, 2, 3, 4, 5, s);
+}
+
+#else
+
+SEC("tc")
+__description("stack_arg_fail: not supported, dummy test")
+__success
+int test_stack_arg_big(struct __sk_buff *skb)
+{
+ return 0;
+}
+
+#endif
+
+SEC("socket")
+__description("r11 in ALU instruction")
+__failure __msg("R11 is invalid")
+__naked void r11_alu_reject(void)
+{
+ asm volatile (
+ "r11 += 1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 store with non-DW size")
+__failure __msg("R11 is invalid")
+__naked void r11_store_non_dw(void)
+{
+ asm volatile (
+ "*(u32 *)(r11 - 8) = r1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 store with unaligned offset")
+__failure __msg("R11 is invalid")
+__naked void r11_store_unaligned(void)
+{
+ asm volatile (
+ "*(u64 *)(r11 - 4) = r1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 store with positive offset")
+__failure __msg("R11 is invalid")
+__naked void r11_store_positive_off(void)
+{
+ asm volatile (
+ "*(u64 *)(r11 + 8) = r1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 load with negative offset")
+__failure __msg("R11 is invalid")
+__naked void r11_load_negative_off(void)
+{
+ asm volatile (
+ "r0 = *(u64 *)(r11 - 8);"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 load with non-DW size")
+__failure __msg("R11 is invalid")
+__naked void r11_load_non_dw(void)
+{
+ asm volatile (
+ "r0 = *(u32 *)(r11 + 8);"
+ "exit;"
+ ::: __clobber_all);
+}
+
+SEC("socket")
+__description("r11 store with zero offset")
+__failure __msg("R11 is invalid")
+__naked void r11_store_zero_off(void)
+{
+ asm volatile (
+ "*(u64 *)(r11 + 0) = r1;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all);
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index aef2f68b7e83..0be918fe3021 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -882,6 +882,12 @@ __bpf_kfunc u64 bpf_kfunc_call_stack_arg_timer(u64 a, u64 b, u64 c, u64 d, u64 e
return a + b + c + d + e;
}
+__bpf_kfunc u64 bpf_kfunc_call_stack_arg_big(u64 a, u64 b, u64 c, u64 d, u64 e,
+ struct prog_test_big_arg s)
+{
+ return a + b + c + d + e + s.a + s.b;
+}
+
static struct prog_test_ref_kfunc prog_test_struct = {
.a = 42,
.b = 108,
@@ -1353,6 +1359,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_mem)
BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_iter)
BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_const_str)
BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_timer)
+BTF_ID_FLAGS(func, bpf_kfunc_call_stack_arg_big)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail1)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail2)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_acquire, KF_ACQUIRE | KF_RET_NULL)
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
index 2c1cb118f886..2edc36b66de9 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
@@ -50,6 +50,11 @@ struct prog_test_pass2 {
} x;
};
+struct prog_test_big_arg {
+ __u64 a;
+ __u64 b;
+};
+
struct prog_test_fail1 {
void *p;
int x;
@@ -130,6 +135,9 @@ __u64 bpf_kfunc_call_stack_arg_const_str(__u64 a, __u64 b, __u64 c, __u64 d, __u
const char *str__str) __ksym;
__u64 bpf_kfunc_call_stack_arg_timer(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
struct bpf_timer *timer) __ksym;
+__u64 bpf_kfunc_call_stack_arg_big(__u64 a, __u64 b, __u64 c, __u64 d, __u64 e,
+ struct prog_test_big_arg s) __ksym;
+
void bpf_kfunc_call_test_pass_ctx(struct __sk_buff *skb) __ksym;
void bpf_kfunc_call_test_pass1(struct prog_test_pass1 *p) __ksym;
void bpf_kfunc_call_test_pass2(struct prog_test_pass2 *p) __ksym;
--
2.53.0-Meta
* [PATCH bpf-next v2 18/23] selftests/bpf: Add BTF fixup for __naked subprog parameter names
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (16 preceding siblings ...)
2026-05-07 21:31 ` [PATCH bpf-next v2 17/23] selftests/bpf: Add tests for stack argument validation Yonghong Song
@ 2026-05-07 21:31 ` Yonghong Song
2026-05-09 1:40 ` sashiko-bot
2026-05-07 21:31 ` [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
` (5 subsequent siblings)
23 siblings, 1 reply; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:31 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
When __naked subprogs are used in verifier tests, clang drops
parameter names from their BTF FUNC_PROTO entries. This prevents
the verifier from resolving stack argument slots by name.
Add a __btf_func_path(path) annotation that points to a separate
BTF file containing properly-named FUNC entries. The test_loader
matches FUNC entries by name, detects anonymous parameters, and
replaces the FUNC_PROTO with a new one that carries parameter
names from the custom file while preserving the original type IDs.
The custom BTF file also serves as btf_custom_path for kfunc
resolution when no separate btf_custom_path is specified.
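As an abridged usage sketch (mirroring the verifier_stack_arg_order.c
test added later in this series), the companion file defines the subprog
in plain C so its BTF keeps parameter names, and the test prog carries a
__btf_func_path annotation pointing at the companion object:
	/* progs/btf__verifier_stack_arg_order.c, built as
	 * btf__verifier_stack_arg_order.bpf.o
	 */
	int subprog_bad_order_6args(int a, int b, int c, int d, int e, int f)
	{
		return a + b + c + d + e + f;
	}
	/* progs/verifier_stack_arg_order.c */
	SEC("tc")
	__failure __msg("r11 load must be before any r11 store or call insn")
	__btf_func_path("btf__verifier_stack_arg_order.bpf.o")
	__naked void stack_arg_load_after_store(void)
	{ /* inline asm body elided; see the verifier tests patch */ }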
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
tools/testing/selftests/bpf/progs/bpf_misc.h | 1 +
tools/testing/selftests/bpf/test_loader.c | 136 ++++++++++++++++++-
2 files changed, 136 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index dcd78a3a9052..a9f58900a501 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -152,6 +152,7 @@
#define __auxiliary __test_tag("test_auxiliary")
#define __auxiliary_unpriv __test_tag("test_auxiliary_unpriv")
#define __btf_path(path) __test_tag("test_btf_path=" path)
+#define __btf_func_path(path) __test_tag("test_btf_func_path=" path)
#define __arch(arch) __test_tag("test_arch=" arch)
#define __arch_x86_64 __arch("X86_64")
#define __arch_arm64 __arch("ARM64")
diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index ee637809a1d4..518163344082 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -63,6 +63,7 @@ struct test_spec {
struct test_subspec priv;
struct test_subspec unpriv;
const char *btf_custom_path;
+ const char *btf_custom_func_path;
int log_level;
int prog_flags;
int mode_mask;
@@ -590,6 +591,8 @@ static int parse_test_spec(struct test_loader *tester,
jit_on_next_line = true;
} else if ((val = str_has_pfx(s, "test_btf_path="))) {
spec->btf_custom_path = val;
+ } else if ((val = str_has_pfx(s, "test_btf_func_path="))) {
+ spec->btf_custom_func_path = val;
} else if ((val = str_has_pfx(s, "test_caps_unpriv="))) {
err = parse_caps(val, &spec->unpriv.caps, "test caps");
if (err)
@@ -1175,6 +1178,123 @@ static int get_stream(int stream_id, int prog_fd, char *text, size_t text_sz)
return ret;
}
+/*
+ * Fix up the program's BTF using BTF from a separate file.
+ *
+ * For __naked subprogs, clang drops parameter names from BTF. Find FUNC
+ * entries with anonymous parameters and replace their FUNC_PROTO with the
+ * properly-named version from the custom file.
+ */
+static int fixup_btf_from_path(struct bpf_object *obj, const char *path)
+{
+ struct btf *prog_btf, *custom_btf;
+ __u32 i, j, cnt, custom_cnt;
+ int err = 0;
+
+ prog_btf = bpf_object__btf(obj);
+ if (!prog_btf)
+ return 0;
+
+ custom_btf = btf__parse(path, NULL);
+ if (!ASSERT_OK_PTR(custom_btf, "parse_custom_btf"))
+ return -EINVAL;
+
+ cnt = btf__type_cnt(prog_btf);
+ custom_cnt = btf__type_cnt(custom_btf);
+
+ /* Fix up FUNC entries with anonymous params.
+ * Save all data from prog_btf BEFORE calling btf__add_*,
+ * since those calls may reallocate the BTF data buffer
+ * and invalidate any pointers obtained from btf__type_by_id.
+ */
+ for (i = 1; i < cnt; i++) {
+ const struct btf_type *t = btf__type_by_id(prog_btf, i);
+ const struct btf_type *fp, *custom_t, *custom_fp;
+ const struct btf_param *params, *custom_params;
+ __u32 ret_type_id, vlen;
+ __u32 *prog_param_types = NULL;
+ const char *name;
+ int new_proto_id;
+
+ if (!btf_is_func(t))
+ continue;
+
+ fp = btf__type_by_id(prog_btf, t->type);
+ if (!fp || !btf_is_func_proto(fp) || btf_vlen(fp) == 0)
+ continue;
+
+ /* Check if any param is anonymous */
+ params = btf_params(fp);
+ if (params[0].name_off != 0)
+ continue;
+
+ /* Find matching FUNC by name in custom BTF */
+ name = btf__name_by_offset(prog_btf, t->name_off);
+ if (!name)
+ continue;
+
+ for (j = 1; j < custom_cnt; j++) {
+ const char *cname;
+
+ custom_t = btf__type_by_id(custom_btf, j);
+ if (!btf_is_func(custom_t))
+ continue;
+ cname = btf__name_by_offset(custom_btf, custom_t->name_off);
+ if (cname && strcmp(name, cname) == 0)
+ break;
+ }
+ if (j >= custom_cnt)
+ continue;
+
+ custom_fp = btf__type_by_id(custom_btf, custom_t->type);
+ if (!custom_fp || !btf_is_func_proto(custom_fp))
+ continue;
+
+ vlen = btf_vlen(fp);
+ if (vlen != btf_vlen(custom_fp))
+ continue;
+
+ /* Save data before btf__add_* calls invalidate pointers */
+ ret_type_id = fp->type;
+ prog_param_types = malloc(vlen * sizeof(*prog_param_types));
+ if (!prog_param_types) {
+ err = -ENOMEM;
+ break;
+ }
+ for (j = 0; j < vlen; j++)
+ prog_param_types[j] = params[j].type;
+
+ /* Add a new FUNC_PROTO: param names from custom, types from prog */
+ new_proto_id = btf__add_func_proto(prog_btf, ret_type_id);
+ if (new_proto_id < 0) {
+ err = new_proto_id;
+ free(prog_param_types);
+ break;
+ }
+
+ custom_params = btf_params(custom_fp);
+ for (j = 0; j < vlen; j++) {
+ const char *pname;
+
+ pname = btf__name_by_offset(custom_btf, custom_params[j].name_off);
+ err = btf__add_func_param(prog_btf, pname ?: "", prog_param_types[j]);
+ if (err)
+ break;
+ }
+ free(prog_param_types);
+ if (err)
+ break;
+
+ /* Update the FUNC to point to the new FUNC_PROTO (re-fetch
+ * since btf__add_* may have reallocated the data buffer).
+ */
+ ((struct btf_type *)btf__type_by_id(prog_btf, i))->type = new_proto_id;
+ }
+
+ btf__free(custom_btf);
+ return err;
+}
+
/* this function is forced noinline and has short generic name to look better
* in test_progs output (in case of a failure)
*/
@@ -1231,13 +1351,27 @@ void run_subtest(struct test_loader *tester,
}
}
- /* Implicitly reset to NULL if next test case doesn't specify */
+ /* Implicitly reset to NULL if next test case doesn't specify.
+ * btf_custom_func_path also serves as btf_custom_path for kfunc resolution.
+ */
open_opts->btf_custom_path = spec->btf_custom_path;
+ if (!open_opts->btf_custom_path)
+ open_opts->btf_custom_path = spec->btf_custom_func_path;
tobj = bpf_object__open_mem(obj_bytes, obj_byte_cnt, open_opts);
if (!ASSERT_OK_PTR(tobj, "obj_open_mem")) /* shouldn't happen */
goto subtest_cleanup;
+ /* Fix up __naked subprog BTF using a separate file with named params */
+ if (spec->btf_custom_func_path) {
+ err = fixup_btf_from_path(tobj, spec->btf_custom_func_path);
+ if (err) {
+ PRINT_FAIL("failed to fixup BTF from %s: %d\n",
+ spec->btf_custom_func_path, err);
+ goto subtest_cleanup;
+ }
+ }
+
i = 0;
bpf_object__for_each_program(tprog_iter, tobj) {
spec_iter = &specs[i++];
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (17 preceding siblings ...)
2026-05-07 21:31 ` [PATCH bpf-next v2 18/23] selftests/bpf: Add BTF fixup for __naked subprog parameter names Yonghong Song
@ 2026-05-07 21:31 ` Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 1:38 ` sashiko-bot
2026-05-07 21:31 ` [PATCH bpf-next v2 20/23] selftests/bpf: Add precision backtracking test for stack arguments Yonghong Song
` (4 subsequent siblings)
23 siblings, 2 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:31 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Add inline-asm based verifier tests that exercise stack argument
validation logic directly; the r11 slot convention the tests rely on is
sketched after the test list below.
Positive tests:
- Subprog call with 6 args
- Two sequential calls to different subprogs (6-arg and 7-arg)
- An r11 store placed before a branch and shared by calls on both paths
Negative tests — verifier rejection:
- Read from uninitialized incoming stack arg slot
- Gap in outgoing slots: only r11-16 written, r11-8 missing
- Write at r11-80, exceeding max 7 stack args
- Missing store on one branch with a shared store
- First call passes proper stack arguments; the second call tries to
reuse them without re-storing, which the verifier rejects
- r11 load appearing after an r11 store or after a call insn
Negative tests — pointer/ref tracking:
- Pruning type mismatch: one branch stores PTR_TO_STACK, the
other stores a scalar, callee dereferences — must not prune
- Release invalidation: bpf_sk_release invalidates a socket
pointer stored in a stack arg slot
- Packet pointer invalidation: bpf_skb_pull_data invalidates
a packet pointer stored in a stack arg slot
- Null propagation: PTR_TO_MAP_VALUE_OR_NULL stored in stack
arg slot, null branch attempts dereference via callee
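For reference, the r11 slot convention these tests exercise: the caller
stores outgoing stack args at negative offsets before the call (arg6 at
r11-8, arg7 at r11-16, and so on), while the callee reads its incoming
stack args at positive offsets (arg6 at r11+8). A minimal positive case,
matching the first test below:
	"r1 = 1;"
	"r2 = 2;"
	"r3 = 3;"
	"r4 = 4;"
	"r5 = 5;"
	"*(u64 *)(r11 - 8) = 6;"	/* arg6 via the outgoing stack slot */
	"call subprog_6args;"		/* returns 1+2+3+4+5+6 = 21 */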
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../selftests/bpf/prog_tests/verifier.c | 2 +
.../bpf/progs/btf__verifier_stack_arg_order.c | 30 ++
.../selftests/bpf/progs/verifier_stack_arg.c | 444 ++++++++++++++++++
.../bpf/progs/verifier_stack_arg_order.c | 86 ++++
4 files changed, 562 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_stack_arg.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index a96b25ebff23..aef21cf2987b 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -91,6 +91,7 @@
#include "verifier_sockmap_mutate.skel.h"
#include "verifier_spill_fill.skel.h"
#include "verifier_spin_lock.skel.h"
+#include "verifier_stack_arg.skel.h"
#include "verifier_stack_ptr.skel.h"
#include "verifier_store_release.skel.h"
#include "verifier_subprog_precision.skel.h"
@@ -238,6 +239,7 @@ void test_verifier_sock_addr(void) { RUN(verifier_sock_addr); }
void test_verifier_sockmap_mutate(void) { RUN(verifier_sockmap_mutate); }
void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); }
void test_verifier_spin_lock(void) { RUN(verifier_spin_lock); }
+void test_verifier_stack_arg(void) { RUN(verifier_stack_arg); }
void test_verifier_stack_ptr(void) { RUN(verifier_stack_ptr); }
void test_verifier_store_release(void) { RUN(verifier_store_release); }
void test_verifier_subprog_precision(void) { RUN(verifier_subprog_precision); }
diff --git a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
new file mode 100644
index 000000000000..2d5ddb24e241
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+int subprog_bad_order_6args(int a, int b, int c, int d, int e, int f)
+{
+ return a + b + c + d + e + f;
+}
+
+int subprog_call_before_load_6args(int a, int b, int c, int d, int e, int f)
+{
+ return a + b + c + d + e + f;
+}
+
+#else
+
+int subprog_bad_order_6args(void)
+{
+ return 0;
+}
+
+int subprog_call_before_load_6args(void)
+{
+ return 0;
+}
+
+#endif
diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
new file mode 100644
index 000000000000..b4f680e3b6a1
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
@@ -0,0 +1,444 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __uint(max_entries, 1);
+ __type(key, long long);
+ __type(value, long long);
+} map_hash_8b SEC(".maps");
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+__noinline __used
+static int subprog_6args(int a, int b, int c, int d, int e, int f)
+{
+ return a + b + c + d + e + f;
+}
+
+__noinline __used
+static int subprog_7args(int a, int b, int c, int d, int e, int f, int g)
+{
+ return a + b + c + d + e + f + g;
+}
+
+__noinline __used
+static long subprog_deref_arg6(long a, long b, long c, long d, long e, long *f)
+{
+ return *f;
+}
+
+SEC("tc")
+__description("stack_arg: subprog with 6 args")
+__success __retval(21)
+__naked void stack_arg_6args(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 6;"
+ "call subprog_6args;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: two subprogs with >5 args")
+__success __retval(90)
+__naked void stack_arg_two_subprogs(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 10;"
+ "call subprog_6args;"
+ "r6 = r0;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 16) = 30;"
+ "*(u64 *)(r11 - 8) = 20;"
+ "call subprog_7args;"
+ "r0 += r6;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: read from uninitialized stack arg slot")
+__failure
+__msg("invalid read from stack arg off 8 depth 0")
+__naked void stack_arg_read_uninitialized(void)
+{
+ asm volatile (
+ "r0 = *(u64 *)(r11 + 8);"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: gap at offset -8, only wrote -16")
+__failure
+__msg("caller expects 7 args, stack arg1 is not initialized")
+__naked void stack_arg_gap_at_minus8(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 16) = 30;"
+ "call subprog_7args;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: pruning with different stack arg types")
+__failure
+__flag(BPF_F_TEST_STATE_FREQ)
+__msg("invalid mem access 'scalar'")
+__naked void stack_arg_pruning_type_mismatch(void)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "r6 = r0;"
+ /* local = 0 on program stack */
+ "r7 = 0;"
+ "*(u64 *)(r10 - 8) = r7;"
+ /* Branch based on random value */
+ "if r6 s> 3 goto l0_%=;"
+ /* Path 1: store stack pointer to outgoing arg6 */
+ "r1 = r10;"
+ "r1 += -8;"
+ "*(u64 *)(r11 - 8) = r1;"
+ "goto l1_%=;"
+ "l0_%=:"
+ /* Path 2: store scalar to outgoing arg6 */
+ "*(u64 *)(r11 - 8) = 42;"
+ "l1_%=:"
+ /* Call subprog that dereferences arg6 */
+ "r1 = r6;"
+ "r2 = 0;"
+ "r3 = 0;"
+ "r4 = 0;"
+ "r5 = 0;"
+ "call subprog_deref_arg6;"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: release_reference invalidates stack arg slot")
+__failure
+__msg("invalid mem access 'scalar'")
+__naked void stack_arg_release_ref(void)
+{
+ asm volatile (
+ "r6 = r1;"
+ /* struct bpf_sock_tuple tuple = {} */
+ "r2 = 0;"
+ "*(u32 *)(r10 - 8) = r2;"
+ "*(u64 *)(r10 - 16) = r2;"
+ "*(u64 *)(r10 - 24) = r2;"
+ "*(u64 *)(r10 - 32) = r2;"
+ "*(u64 *)(r10 - 40) = r2;"
+ "*(u64 *)(r10 - 48) = r2;"
+ /* sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof(tuple), 0, 0) */
+ "r1 = r6;"
+ "r2 = r10;"
+ "r2 += -48;"
+ "r3 = %[sizeof_bpf_sock_tuple];"
+ "r4 = 0;"
+ "r5 = 0;"
+ "call %[bpf_sk_lookup_tcp];"
+ /* r0 = sk (PTR_TO_SOCK_OR_NULL) */
+ "if r0 == 0 goto l0_%=;"
+ /* Store sock ref to outgoing arg6 slot */
+ "*(u64 *)(r11 - 8) = r0;"
+ /* Release the reference — invalidates the stack arg slot */
+ "r1 = r0;"
+ "call %[bpf_sk_release];"
+ /* Call subprog that dereferences arg6 — should fail */
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_deref_arg6;"
+ "l0_%=:"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_sk_lookup_tcp),
+ __imm(bpf_sk_release),
+ __imm_const(sizeof_bpf_sock_tuple, sizeof(struct bpf_sock_tuple))
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: pkt pointer in stack arg slot invalidated after pull_data")
+__failure
+__msg("invalid mem access 'scalar'")
+__naked void stack_arg_stale_pkt_ptr(void)
+{
+ asm volatile (
+ "r6 = r1;"
+ "r7 = *(u32 *)(r6 + %[__sk_buff_data]);"
+ "r8 = *(u32 *)(r6 + %[__sk_buff_data_end]);"
+ /* check pkt has at least 1 byte */
+ "r0 = r7;"
+ "r0 += 8;"
+ "if r0 > r8 goto l0_%=;"
+ /* Store valid pkt pointer to outgoing arg6 slot */
+ "*(u64 *)(r11 - 8) = r7;"
+ /* bpf_skb_pull_data invalidates all pkt pointers */
+ "r1 = r6;"
+ "r2 = 0;"
+ "call %[bpf_skb_pull_data];"
+ /* Call subprog that dereferences arg6 — should fail */
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_deref_arg6;"
+ "l0_%=:"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_skb_pull_data),
+ __imm_const(__sk_buff_data, offsetof(struct __sk_buff, data)),
+ __imm_const(__sk_buff_data_end, offsetof(struct __sk_buff, data_end))
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: null propagation rejects deref on null branch")
+__failure
+__msg("invalid mem access 'scalar'")
+__naked void stack_arg_null_propagation_fail(void)
+{
+ asm volatile (
+ "r1 = 0;"
+ "*(u64 *)(r10 - 8) = r1;"
+ /* r0 = bpf_map_lookup_elem(&map_hash_8b, &key) */
+ "r2 = r10;"
+ "r2 += -8;"
+ "r1 = %[map_hash_8b] ll;"
+ "call %[bpf_map_lookup_elem];"
+ /* Store PTR_TO_MAP_VALUE_OR_NULL to outgoing arg6 slot */
+ "*(u64 *)(r11 - 8) = r0;"
+ /* null check on r0 */
+ "if r0 != 0 goto l0_%=;"
+ /*
+ * On null branch, outgoing slot is SCALAR(0).
+ * Call subprog that dereferences arg6 — should fail.
+ */
+ "r1 = 0;"
+ "r2 = 0;"
+ "r3 = 0;"
+ "r4 = 0;"
+ "r5 = 0;"
+ "call subprog_deref_arg6;"
+ "l0_%=:"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_map_lookup_elem),
+ __imm_addr(map_hash_8b)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: missing store on one branch")
+__failure
+__msg("caller expects 7 args, stack arg1 is not initialized")
+__naked void stack_arg_missing_store_one_branch(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ /* Write arg7 (r11-16) before branch */
+ "*(u64 *)(r11 - 16) = 20;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 > 0 goto l0_%=;"
+ /* Path 1: write arg6 and call */
+ "*(u64 *)(r11 - 8) = 10;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "goto l1_%=;"
+ "l0_%=:"
+ /* Path 2: missing arg6 store, call should fail */
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "l1_%=:"
+ "r0 = 0;"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: share a store for both branches")
+__success __retval(0)
+__naked void stack_arg_shared_store(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ /* Write arg7 (r11-16) before branch */
+ "*(u64 *)(r11 - 16) = 20;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 > 0 goto l0_%=;"
+ /* Path 1: write arg6 and call */
+ "*(u64 *)(r11 - 8) = 10;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "goto l1_%=;"
+ "l0_%=:"
+ /* Path 2: also write arg6 and call */
+ "*(u64 *)(r11 - 8) = 30;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "l1_%=:"
+ "r0 = 0;"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: write beyond max outgoing depth")
+__failure
+__msg("stack arg write offset -80 exceeds max 7 stack args")
+__naked void stack_arg_write_beyond_max(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ /* Write to offset -80, way beyond any callee's needs */
+ "*(u64 *)(r11 - 80) = 99;"
+ "*(u64 *)(r11 - 16) = 20;"
+ "*(u64 *)(r11 - 8) = 10;"
+ "call subprog_7args;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: write unused stack arg slot")
+__failure
+__msg("func#0 writes 5 stack arg slots, but calls only require 2")
+__naked void stack_arg_write_unused_slot(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ /* Write to offset -40, unused for the callee */
+ "*(u64 *)(r11 - 40) = 99;"
+ "*(u64 *)(r11 - 16) = 20;"
+ "*(u64 *)(r11 - 8) = 10;"
+ "call subprog_7args;"
+ "r0 = 0;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: sequential calls reuse slots")
+__failure
+__msg("caller expects 7 args, stack arg1 is not initialized")
+__naked void stack_arg_sequential_calls(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 6;"
+ "*(u64 *)(r11 - 16) = 7;"
+ "call subprog_7args;"
+ "r6 = r0;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call subprog_7args;"
+ "r0 += r6;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+#else
+
+SEC("socket")
+__description("stack_arg is not supported by compiler or jit, use a dummy test")
+__success
+int dummy_test(void)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
new file mode 100644
index 000000000000..671c79969c6c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+__noinline __used __naked
+static int subprog_bad_order_6args(int a, int b, int c, int d, int e, int f)
+{
+ asm volatile (
+ "*(u64 *)(r11 - 8) = r1;"
+ "r0 = *(u64 *)(r11 + 8);"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: r11 load after r11 store")
+__failure
+__msg("r11 load must be before any r11 store or call insn")
+__btf_func_path("btf__verifier_stack_arg_order.bpf.o")
+__naked void stack_arg_load_after_store(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 6;"
+ "call subprog_bad_order_6args;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+__noinline __used __naked
+static int subprog_call_before_load_6args(int a, int b, int c, int d, int e,
+ int f)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "r0 = *(u64 *)(r11 + 8);"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: r11 load after a call")
+__failure
+__msg("r11 load must be before any r11 store or call insn")
+__btf_func_path("btf__verifier_stack_arg_order.bpf.o")
+__naked void stack_arg_load_after_call(void)
+{
+ asm volatile (
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "*(u64 *)(r11 - 8) = 6;"
+ "call subprog_call_before_load_6args;"
+ "exit;"
+ ::: __clobber_all
+ );
+}
+
+#else
+
+SEC("socket")
+__description("stack_arg order is not supported by compiler or jit, use a dummy test")
+__success
+int dummy_test(void)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
--
2.53.0-Meta
* [PATCH bpf-next v2 20/23] selftests/bpf: Add precision backtracking test for stack arguments
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (18 preceding siblings ...)
2026-05-07 21:31 ` [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
@ 2026-05-07 21:31 ` Yonghong Song
2026-05-09 1:52 ` sashiko-bot
2026-05-07 21:31 ` [PATCH bpf-next v2 21/23] bpf, arm64: Map BPF_REG_0 to x8 instead of x7 Yonghong Song
` (3 subsequent siblings)
23 siblings, 1 reply; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:31 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
Add a test that verifies precision backtracking works correctly
across BPF-to-BPF calls when stack arguments are involved.
The test passes a size value as incoming stack arg (arg6) to a
subprog, which forwards it as the mem__sz parameter (outgoing arg7)
to bpf_kfunc_call_stack_arg_mem. The
expected __msg annotations verify that precision propagates from
the kfunc's mem__sz argument back through the subprog frame to the
caller's outgoing stack arg store.
A companion BTF file (btf__stack_arg_precision.c) provides named
parameter BTF for the __naked subprog via __btf_func_path.
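In C terms the flow under test matches the companion file's definition
of the subprog (the actual test body is inline asm so the r11 slot usage
stays explicit):
	long subprog_call_mem_kfunc(long a, long b, long c, long d, long e, long size)
	{
		char buf[8] = {};
		/* 'size' arrives through the incoming stack arg slot and is
		 * forwarded as the kfunc's mem__sz through an outgoing slot;
		 * precision backtracking must track it across both frames.
		 */
		return bpf_kfunc_call_stack_arg_mem(a, b, c, d, e, buf, size);
	}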
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../bpf/prog_tests/stack_arg_precision.c | 10 ++
.../bpf/progs/btf__stack_arg_precision.c | 23 +++
.../selftests/bpf/progs/stack_arg_precision.c | 137 ++++++++++++++++++
3 files changed, 170 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
create mode 100644 tools/testing/selftests/bpf/progs/stack_arg_precision.c
diff --git a/tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c b/tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c
new file mode 100644
index 000000000000..1ab041d66de3
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/stack_arg_precision.c
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "stack_arg_precision.skel.h"
+
+void test_stack_arg_precision(void)
+{
+ RUN_TESTS(stack_arg_precision);
+}
diff --git a/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c b/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
new file mode 100644
index 000000000000..296fddfe6804
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "../test_kmods/bpf_testmod_kfunc.h"
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+long subprog_call_mem_kfunc(long a, long b, long c, long d, long e, long size)
+{
+ char buf[8] = {};
+
+ return bpf_kfunc_call_stack_arg_mem(a, b, c, d, e, buf, size);
+}
+
+#else
+
+long subprog_call_mem_kfunc(void)
+{
+ return 0;
+}
+
+#endif
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_precision.c b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
new file mode 100644
index 000000000000..c94905f07dcc
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
@@ -0,0 +1,137 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "../test_kmods/bpf_testmod_kfunc.h"
+#include "bpf_misc.h"
+
+#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+
+/* Force kfunc extern BTF generation for inline asm call below.
+ * Uses its own SEC so it's not included as a .text subprog.
+ * The '?' prefix sets autoload=false so libbpf won't load it.
+ */
+SEC("?tc")
+int __btf_kfunc_gen(struct __sk_buff *ctx)
+{
+ char buf[8] = {};
+
+ return bpf_kfunc_call_stack_arg_mem(0, 0, 0, 0, 0, buf, sizeof(buf));
+}
+
+/*
+ * Test precision backtracking across bpf-to-bpf call for kfunc stack arg.
+ * subprog_call_mem_kfunc receives a size as incoming stack arg (arg6)
+ * and passes it on as mem__sz (arg7) to bpf_kfunc_call_stack_arg_mem.
+ *
+ * 1+2+3+4+5+(1+2+3+4) = 25
+ */
+__naked __noinline __used
+static long subprog_call_mem_kfunc(long a, long b, long c, long d, long e, long size)
+{
+ asm volatile (
+ "r1 = *(u64 *)(r11 + 8);" /* r1 = incoming arg6 (size) */
+ "r2 = 0x0807060504030201 ll;" /* r2 = buf contents */
+ "*(u64 *)(r10 - 8) = r2;" /* store buf to stack */
+ "r2 = r10;"
+ "r2 += -8;" /* r2 = &buf */
+ "*(u64 *)(r11 - 8) = r2;" /* outgoing arg6 = buf */
+ "*(u64 *)(r11 - 16) = r1;" /* outgoing arg7 = size */
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "call %[bpf_kfunc_call_stack_arg_mem];"
+ "exit;"
+ :
+ : __imm(bpf_kfunc_call_stack_arg_mem)
+ : __clobber_all
+ );
+}
+
+SEC("tc")
+__description("stack_arg: precision backtracking across bpf2bpf call for kfunc")
+__success __retval(25)
+__log_level(2)
+__flag(BPF_F_TEST_STATE_FREQ)
+__btf_func_path("btf__stack_arg_precision.bpf.o")
+__msg("mark_precise: frame1: last_idx 26 first_idx 13 subseq_idx -1")
+__msg("mark_precise: frame1: regs= stack= before 25: (b7) r5 = 5")
+__msg("mark_precise: frame1: regs= stack= before 24: (b7) r4 = 4")
+__msg("mark_precise: frame1: regs= stack= before 23: (b7) r3 = 3")
+__msg("mark_precise: frame1: regs= stack= before 22: (b7) r2 = 2")
+__msg("mark_precise: frame1: regs= stack= before 21: (b7) r1 = 1")
+__msg("mark_precise: frame1: regs= stack= before 20: (7b) *(u64 *)(r11 -16) = r1")
+__msg("mark_precise: frame1: regs=r1 stack= before 19: (7b) *(u64 *)(r11 -8) = r2")
+__msg("mark_precise: frame1: regs=r1 stack= before 18: (07) r2 += -8")
+__msg("mark_precise: frame1: regs=r1 stack= before 17: (bf) r2 = r10")
+__msg("mark_precise: frame1: regs=r1 stack= before 16: (7b) *(u64 *)(r10 -8) = r2")
+__msg("mark_precise: frame1: regs=r1 stack= before 14: (18) r2 = 0x807060504030201")
+__msg("mark_precise: frame1: regs=r1 stack= before 13: (79) r1 = *(u64 *)(r11 +8)")
+__msg("mark_precise: frame1: parent state regs= stack=: frame1: R10=fp0")
+__msg("mark_precise: frame0: parent state regs= stack=: R10=fp0")
+__msg("mark_precise: frame1: last_idx 11 first_idx 11 subseq_idx 13")
+__msg("mark_precise: frame1: regs= stack= before 11: (85) call pc+1")
+__msg("mark_precise: frame0: parent state regs= stack=: R1=1 R2=2 R3=3 R4=4 R5=5 R10=fp0")
+__msg("mark_precise: frame0: last_idx 9 first_idx 7 subseq_idx 11")
+__msg("mark_precise: frame0: regs= stack= before 9: (05) goto pc+1")
+__msg("mark_precise: frame0: regs= stack= before 8: (7a) *(u64 *)(r11 -8) = 4")
+__msg("mark_precise: frame1: last_idx 26 first_idx 13 subseq_idx -1 ")
+__msg("mark_precise: frame1: regs= stack= before 25: (b7) r5 = 5")
+__msg("mark_precise: frame1: regs= stack= before 24: (b7) r4 = 4")
+__msg("mark_precise: frame1: regs= stack= before 23: (b7) r3 = 3")
+__msg("mark_precise: frame1: regs= stack= before 22: (b7) r2 = 2")
+__msg("mark_precise: frame1: regs= stack= before 21: (b7) r1 = 1")
+__msg("mark_precise: frame1: regs= stack= before 20: (7b) *(u64 *)(r11 -16) = r1")
+__msg("mark_precise: frame1: regs=r1 stack= before 19: (7b) *(u64 *)(r11 -8) = r2")
+__msg("mark_precise: frame1: regs=r1 stack= before 18: (07) r2 += -8")
+__msg("mark_precise: frame1: regs=r1 stack= before 17: (bf) r2 = r10")
+__msg("mark_precise: frame1: regs=r1 stack= before 16: (7b) *(u64 *)(r10 -8) = r2")
+__msg("mark_precise: frame1: regs=r1 stack= before 14: (18) r2 = 0x807060504030201")
+__msg("mark_precise: frame1: regs=r1 stack= before 13: (79) r1 = *(u64 *)(r11 +8)")
+__msg("mark_precise: frame1: parent state regs= stack=: frame1: R10=fp0")
+__msg("mark_precise: frame0: parent state regs= stack=: R10=fp0")
+__msg("mark_precise: frame1: last_idx 11 first_idx 11 subseq_idx 13 ")
+__msg("mark_precise: frame1: regs= stack= before 11: (85) call pc+1")
+__msg("mark_precise: frame0: parent state regs= stack=: R1=1 R2=2 R3=3 R4=4 R5=5 R10=fp0")
+__msg("mark_precise: frame0: last_idx 10 first_idx 10 subseq_idx 11 ")
+__msg("mark_precise: frame0: regs= stack= before 10: (7a) *(u64 *)(r11 -8) = 6")
+__naked void stack_arg_precision_bpf2bpf(void)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "r6 = r0;"
+ "r1 = 1;"
+ "r2 = 2;"
+ "r3 = 3;"
+ "r4 = 4;"
+ "r5 = 5;"
+ "if r6 < 2 goto l0_%=;"
+ "*(u64 *)(r11 - 8) = 4;"
+ "goto l1_%=;"
+ "l0_%=:"
+ "*(u64 *)(r11 - 8) = 6;"
+ "l1_%=:"
+ "call subprog_call_mem_kfunc;"
+ "exit;"
+ :: __imm(bpf_get_prandom_u32)
+ : __clobber_all
+ );
+}
+
+#else
+
+SEC("socket")
+__description("stack_arg_precision: not supported, dummy test")
+__success
+int dummy_test(void)
+{
+ return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
--
2.53.0-Meta
* [PATCH bpf-next v2 21/23] bpf, arm64: Map BPF_REG_0 to x8 instead of x7
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (19 preceding siblings ...)
2026-05-07 21:31 ` [PATCH bpf-next v2 20/23] selftests/bpf: Add precision backtracking test for stack arguments Yonghong Song
@ 2026-05-07 21:31 ` Yonghong Song
2026-05-08 18:01 ` Alexei Starovoitov
2026-05-07 21:32 ` [PATCH bpf-next v2 22/23] bpf, arm64: Add JIT support for stack arguments Yonghong Song
` (2 subsequent siblings)
23 siblings, 1 reply; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:31 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
From: Puranjay Mohan <puranjay@kernel.org>
Move the BPF return value register from x7 to x8, freeing x7 for use
as an argument register. AAPCS64 designates x8 as the indirect result
location register; it is caller-saved and not used for argument
passing, making it a suitable home for BPF_REG_0.
This is a prerequisite for stack argument support, which needs x5-x7
to pass arguments 6-8 to native kfuncs following the AAPCS64 calling
convention.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
arch/arm64/net/bpf_jit_comp.c | 4 ++--
arch/arm64/net/bpf_timed_may_goto.S | 8 ++++----
.../testing/selftests/bpf/progs/verifier_jit_inline.c | 2 +-
tools/testing/selftests/bpf/progs/verifier_ldsx.c | 6 +++---
.../selftests/bpf/progs/verifier_private_stack.c | 10 +++++-----
5 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 0816c40fc7af..085e650662e3 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -47,7 +47,7 @@
/* Map BPF registers to A64 registers */
static const int bpf2a64[] = {
/* return value from in-kernel function, and exit value from eBPF */
- [BPF_REG_0] = A64_R(7),
+ [BPF_REG_0] = A64_R(8),
/* arguments from eBPF program to in-kernel function */
[BPF_REG_1] = A64_R(0),
[BPF_REG_2] = A64_R(1),
@@ -1048,7 +1048,7 @@ static void build_epilogue(struct jit_ctx *ctx, bool was_classic)
/* Restore FP/LR registers */
emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
- /* Move the return value from bpf:r0 (aka x7) to x0 */
+ /* Move the return value from bpf:r0 (aka x8) to x0 */
emit(A64_MOV(1, A64_R(0), r0), ctx);
/* Authenticate lr */
diff --git a/arch/arm64/net/bpf_timed_may_goto.S b/arch/arm64/net/bpf_timed_may_goto.S
index 894cfcd7b241..a9a802711a7f 100644
--- a/arch/arm64/net/bpf_timed_may_goto.S
+++ b/arch/arm64/net/bpf_timed_may_goto.S
@@ -8,8 +8,8 @@ SYM_FUNC_START(arch_bpf_timed_may_goto)
stp x29, x30, [sp, #-64]!
mov x29, sp
- /* Save BPF registers R0 - R5 (x7, x0-x4)*/
- stp x7, x0, [sp, #16]
+ /* Save BPF registers R0 - R5 (x8, x0-x4)*/
+ stp x8, x0, [sp, #16]
stp x1, x2, [sp, #32]
stp x3, x4, [sp, #48]
@@ -28,8 +28,8 @@ SYM_FUNC_START(arch_bpf_timed_may_goto)
/* BPF_REG_AX(x9) will be stored into count, so move return value to it. */
mov x9, x0
- /* Restore BPF registers R0 - R5 (x7, x0-x4) */
- ldp x7, x0, [sp, #16]
+ /* Restore BPF registers R0 - R5 (x8, x0-x4) */
+ ldp x8, x0, [sp, #16]
ldp x1, x2, [sp, #32]
ldp x3, x4, [sp, #48]
diff --git a/tools/testing/selftests/bpf/progs/verifier_jit_inline.c b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c
index 4ea254063646..885ff69a3a62 100644
--- a/tools/testing/selftests/bpf/progs/verifier_jit_inline.c
+++ b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c
@@ -9,7 +9,7 @@ __success __retval(0)
__arch_x86_64
__jited(" addq %gs:{{.*}}, %rax")
__arch_arm64
-__jited(" mrs x7, SP_EL0")
+__jited(" mrs x8, SP_EL0")
int inline_bpf_get_current_task(void)
{
bpf_get_current_task();
diff --git a/tools/testing/selftests/bpf/progs/verifier_ldsx.c b/tools/testing/selftests/bpf/progs/verifier_ldsx.c
index 1026524a1983..41340877dc9d 100644
--- a/tools/testing/selftests/bpf/progs/verifier_ldsx.c
+++ b/tools/testing/selftests/bpf/progs/verifier_ldsx.c
@@ -274,11 +274,11 @@ __jited("movslq 0x10(%rdi,%r12), %r15")
__jited("movswq 0x18(%rdi,%r12), %r15")
__jited("movsbq 0x20(%rdi,%r12), %r15")
__arch_arm64
-__jited("add x11, x7, x28")
+__jited("add x11, x8, x28")
__jited("ldrsw x21, [x11, #0x10]")
-__jited("add x11, x7, x28")
+__jited("add x11, x8, x28")
__jited("ldrsh x21, [x11, #0x18]")
-__jited("add x11, x7, x28")
+__jited("add x11, x8, x28")
__jited("ldrsb x21, [x11, #0x20]")
__jited("add x11, x0, x28")
__jited("ldrsw x22, [x11, #0x10]")
diff --git a/tools/testing/selftests/bpf/progs/verifier_private_stack.c b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
index 646e8ef82051..c5078face38d 100644
--- a/tools/testing/selftests/bpf/progs/verifier_private_stack.c
+++ b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
@@ -170,12 +170,12 @@ __jited(" mrs x10, TPIDR_EL{{[0-1]}}")
__jited(" add x27, x27, x10")
__jited(" add x25, x27, {{.*}}")
__jited(" bl 0x{{.*}}")
-__jited(" mov x7, x0")
+__jited(" mov x8, x0")
__jited(" mov x0, #0x2a")
__jited(" str x0, [x27]")
__jited(" bl 0x{{.*}}")
-__jited(" mov x7, x0")
-__jited(" mov x7, #0x0")
+__jited(" mov x8, x0")
+__jited(" mov x8, #0x0")
__jited(" ldp x25, x27, [sp], {{.*}}")
__naked void private_stack_callback(void)
{
@@ -220,7 +220,7 @@ __jited(" mov x0, #0x2a")
__jited(" str x0, [x27]")
__jited(" mov x0, #0x0")
__jited(" bl 0x{{.*}}")
-__jited(" mov x7, x0")
+__jited(" mov x8, x0")
__jited(" ldp x27, x28, [sp], #0x10")
int private_stack_exception_main_prog(void)
{
@@ -258,7 +258,7 @@ __jited(" add x25, x27, {{.*}}")
__jited(" mov x0, #0x2a")
__jited(" str x0, [x27]")
__jited(" bl 0x{{.*}}")
-__jited(" mov x7, x0")
+__jited(" mov x8, x0")
__jited(" ldp x27, x28, [sp], #0x10")
int private_stack_exception_sub_prog(void)
{
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 22/23] bpf, arm64: Add JIT support for stack arguments
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (20 preceding siblings ...)
2026-05-07 21:31 ` [PATCH bpf-next v2 21/23] bpf, arm64: Map BPF_REG_0 to x8 instead of x7 Yonghong Song
@ 2026-05-07 21:32 ` Yonghong Song
2026-05-09 2:15 ` sashiko-bot
2026-05-07 21:32 ` [PATCH bpf-next v2 23/23] selftests/bpf: Enable stack argument tests for arm64 Yonghong Song
2026-05-08 18:06 ` [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Alexei Starovoitov
23 siblings, 1 reply; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:32 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
From: Puranjay Mohan <puranjay@kernel.org>
Implement stack argument passing for BPF-to-BPF and kfunc calls with
more than 5 parameters on arm64, following the AAPCS64 calling
convention.
BPF R1-R5 already map to x0-x4. With BPF_REG_0 moved to x8 by the
previous commit, x5-x7 are free for arguments 6-8. Arguments 9-12
spill onto the stack at [SP+0], [SP+8], ... and the callee reads
them from [FP+16], [FP+24], ... (above the saved FP/LR pair).
BPF convention uses fixed offsets from BPF_REG_PARAMS (r11): off=-8 is
always arg 6, off=-16 arg 7, etc. The verifier invalidates all outgoing
stack arg slots after each call, so the compiler must re-store them before
every call. This means x5-x7 don't need to be saved on the stack.
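For illustration only, here is a small user-space sketch (not part of the
patch) of where each r11-relative offset lands under this scheme; the idx
arithmetic mirrors the emit_stack_arg_*() helpers added below:

#include <stdio.h>

int main(void)
{
	/* outgoing args 6-12 are stored at r11-8, r11-16, ..., r11-56 */
	for (int off = -8; off >= -56; off -= 8) {
		int idx = -off / 8 - 1;

		if (idx < 3)		/* args 6-8 go in x5-x7 */
			printf("arg %d: r11%d -> x%d\n", idx + 6, off, 5 + idx);
		else			/* args 9+ go on the stack */
			printf("arg %d: r11%d -> caller [SP+%d], callee [FP+%d]\n",
			       idx + 6, off, (idx - 3) * 8, (idx - 3) * 8 + 16);
	}
	return 0;
}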
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
arch/arm64/net/bpf_jit_comp.c | 87 ++++++++++++++++++++++++++++++++++-
1 file changed, 86 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 085e650662e3..9b9d2501d2d3 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -86,6 +86,7 @@ struct jit_ctx {
__le32 *image;
__le32 *ro_image;
u32 stack_size;
+ u16 stack_arg_size;
u64 user_vm_start;
u64 arena_vm_start;
bool fp_used;
@@ -533,13 +534,19 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
* | |
* +-----+ <= (BPF_FP - prog->aux->stack_depth)
* |RSVD | padding
- * current A64_SP => +-----+ <= (BPF_FP - ctx->stack_size)
+ * +-----+ <= (BPF_FP - ctx->stack_size)
+ * | |
+ * | ... | outgoing stack args (9+, if any)
+ * | |
+ * current A64_SP => +-----+
* | |
* | ... | Function call stack
* | |
* +-----+
* low
*
+ * Stack args 6-8 are passed in x5-x7, args 9+ at [SP].
+ * Incoming args 9+ are at [FP + 16], [FP + 24], ...
*/
emit_kcfi(is_main_prog ? cfi_bpf_hash : cfi_bpf_subprog_hash, ctx);
@@ -613,6 +620,9 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
if (ctx->stack_size && !ctx->priv_sp_used)
emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
+ if (ctx->stack_arg_size)
+ emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_arg_size), ctx);
+
if (ctx->arena_vm_start)
emit_a64_mov_i64(arena_vm_base, ctx->arena_vm_start, ctx);
@@ -673,6 +683,9 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
/* Update tail_call_cnt if the slot is populated. */
emit(A64_STR64I(tcc, ptr, 0), ctx);
+ if (ctx->stack_arg_size)
+ emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_arg_size), ctx);
+
/* restore SP */
if (ctx->stack_size && !ctx->priv_sp_used)
emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
@@ -1034,6 +1047,9 @@ static void build_epilogue(struct jit_ctx *ctx, bool was_classic)
const u8 r0 = bpf2a64[BPF_REG_0];
const u8 ptr = bpf2a64[TCCNT_PTR];
+ if (ctx->stack_arg_size)
+ emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_arg_size), ctx);
+
/* We're done with BPF stack */
if (ctx->stack_size && !ctx->priv_sp_used)
emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
@@ -1191,6 +1207,41 @@ static int add_exception_handler(const struct bpf_insn *insn,
return 0;
}
+static const u8 stack_arg_reg[] = { A64_R(5), A64_R(6), A64_R(7) };
+
+#define NR_STACK_ARG_REGS ARRAY_SIZE(stack_arg_reg)
+
+static void emit_stack_arg_load(u8 dst, s16 bpf_off, struct jit_ctx *ctx)
+{
+ int idx = bpf_off / sizeof(u64) - 1;
+
+ if (idx < NR_STACK_ARG_REGS)
+ emit(A64_MOV(1, dst, stack_arg_reg[idx]), ctx);
+ else
+ emit(A64_LDR64I(dst, A64_FP, (idx - NR_STACK_ARG_REGS) * sizeof(u64) + 16), ctx);
+}
+
+static void emit_stack_arg_store(u8 src_a64, s16 bpf_off, struct jit_ctx *ctx)
+{
+ int idx = -bpf_off / sizeof(u64) - 1;
+
+ if (idx < NR_STACK_ARG_REGS)
+ emit(A64_MOV(1, stack_arg_reg[idx], src_a64), ctx);
+ else
+ emit(A64_STR64I(src_a64, A64_SP, (idx - NR_STACK_ARG_REGS) * sizeof(u64)), ctx);
+}
+
+static void emit_stack_arg_store_imm(s32 imm, s16 bpf_off, const u8 tmp, struct jit_ctx *ctx)
+{
+ int idx = -bpf_off / sizeof(u64) - 1;
+
+ emit_a64_mov_i(1, tmp, imm, ctx);
+ if (idx < NR_STACK_ARG_REGS)
+ emit(A64_MOV(1, stack_arg_reg[idx], tmp), ctx);
+ else
+ emit(A64_STR64I(tmp, A64_SP, (idx - NR_STACK_ARG_REGS) * sizeof(u64)), ctx);
+}
+
/* JITs an eBPF instruction.
* Returns:
* 0 - successfully JITed an 8-byte eBPF instruction.
@@ -1646,6 +1697,11 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
case BPF_LDX | BPF_MEM | BPF_H:
case BPF_LDX | BPF_MEM | BPF_B:
case BPF_LDX | BPF_MEM | BPF_DW:
+ if (insn->src_reg == BPF_REG_PARAMS) {
+ emit_stack_arg_load(dst, off, ctx);
+ break;
+ }
+ fallthrough;
case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
case BPF_LDX | BPF_PROBE_MEM | BPF_W:
case BPF_LDX | BPF_PROBE_MEM | BPF_H:
@@ -1672,6 +1728,8 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
if (src == fp) {
src_adj = ctx->priv_sp_used ? priv_sp : A64_SP;
off_adj = off + ctx->stack_size;
+ if (!ctx->priv_sp_used)
+ off_adj += ctx->stack_arg_size;
} else {
src_adj = src;
off_adj = off;
@@ -1752,6 +1810,11 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
case BPF_ST | BPF_MEM | BPF_H:
case BPF_ST | BPF_MEM | BPF_B:
case BPF_ST | BPF_MEM | BPF_DW:
+ if (insn->dst_reg == BPF_REG_PARAMS) {
+ emit_stack_arg_store_imm(imm, off, tmp, ctx);
+ break;
+ }
+ fallthrough;
case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
case BPF_ST | BPF_PROBE_MEM32 | BPF_H:
case BPF_ST | BPF_PROBE_MEM32 | BPF_W:
@@ -1763,6 +1826,8 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
if (dst == fp) {
dst_adj = ctx->priv_sp_used ? priv_sp : A64_SP;
off_adj = off + ctx->stack_size;
+ if (!ctx->priv_sp_used)
+ off_adj += ctx->stack_arg_size;
} else {
dst_adj = dst;
off_adj = off;
@@ -1814,6 +1879,11 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
case BPF_STX | BPF_MEM | BPF_H:
case BPF_STX | BPF_MEM | BPF_B:
case BPF_STX | BPF_MEM | BPF_DW:
+ if (insn->dst_reg == BPF_REG_PARAMS) {
+ emit_stack_arg_store(src, off, ctx);
+ break;
+ }
+ fallthrough;
case BPF_STX | BPF_PROBE_MEM32 | BPF_B:
case BPF_STX | BPF_PROBE_MEM32 | BPF_H:
case BPF_STX | BPF_PROBE_MEM32 | BPF_W:
@@ -1825,6 +1895,8 @@ static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn
if (dst == fp) {
dst_adj = ctx->priv_sp_used ? priv_sp : A64_SP;
off_adj = off + ctx->stack_size;
+ if (!ctx->priv_sp_used)
+ off_adj += ctx->stack_arg_size;
} else {
dst_adj = dst;
off_adj = off;
@@ -2065,6 +2137,14 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
ctx.user_vm_start = bpf_arena_get_user_vm_start(prog->aux->arena);
ctx.arena_vm_start = bpf_arena_get_kern_vm_start(prog->aux->arena);
+ if (prog->aux->stack_arg_cnt > prog->aux->incoming_stack_arg_cnt) {
+ u16 outgoing_cnt = prog->aux->stack_arg_cnt - prog->aux->incoming_stack_arg_cnt;
+ int nr_on_stack = outgoing_cnt - NR_STACK_ARG_REGS;
+
+ if (nr_on_stack > 0)
+ ctx.stack_arg_size = round_up(nr_on_stack * sizeof(u64), 16);
+ }
+
if (priv_stack_ptr)
ctx.priv_sp_used = true;
@@ -2229,6 +2309,11 @@ bool bpf_jit_supports_kfunc_call(void)
return true;
}
+bool bpf_jit_supports_stack_args(void)
+{
+ return true;
+}
+
void *bpf_arch_text_copy(void *dst, void *src, size_t len)
{
if (!aarch64_insn_copy(dst, src, len))
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH bpf-next v2 23/23] selftests/bpf: Enable stack argument tests for arm64
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (21 preceding siblings ...)
2026-05-07 21:32 ` [PATCH bpf-next v2 22/23] bpf, arm64: Add JIT support for stack arguments Yonghong Song
@ 2026-05-07 21:32 ` Yonghong Song
2026-05-08 18:06 ` [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Alexei Starovoitov
23 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-07 21:32 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
From: Puranjay Mohan <puranjay@kernel.org>
Now that arm64 supports stack arguments, enable the existing stack_arg,
stack_arg_kfunc and verifier_stack_arg tests for __TARGET_ARCH_arm64.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c | 3 ++-
.../selftests/bpf/progs/btf__verifier_stack_arg_order.c | 3 ++-
tools/testing/selftests/bpf/progs/stack_arg.c | 3 ++-
tools/testing/selftests/bpf/progs/stack_arg_kfunc.c | 3 ++-
tools/testing/selftests/bpf/progs/stack_arg_precision.c | 3 ++-
tools/testing/selftests/bpf/progs/verifier_stack_arg.c | 3 ++-
tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c | 3 ++-
7 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c b/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
index 296fddfe6804..8d38aafe66a2 100644
--- a/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
+++ b/tools/testing/selftests/bpf/progs/btf__stack_arg_precision.c
@@ -4,7 +4,8 @@
#include <bpf/bpf_helpers.h>
#include "../test_kmods/bpf_testmod_kfunc.h"
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
long subprog_call_mem_kfunc(long a, long b, long c, long d, long e, long size)
{
diff --git a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
index 2d5ddb24e241..9a05bbecd170 100644
--- a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
+++ b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
@@ -3,7 +3,8 @@
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
int subprog_bad_order_6args(int a, int b, int c, int d, int e, int f)
{
diff --git a/tools/testing/selftests/bpf/progs/stack_arg.c b/tools/testing/selftests/bpf/progs/stack_arg.c
index ab6240b997c5..b5e9929a4d63 100644
--- a/tools/testing/selftests/bpf/progs/stack_arg.c
+++ b/tools/testing/selftests/bpf/progs/stack_arg.c
@@ -21,7 +21,8 @@ struct {
int timer_result;
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
const volatile bool has_stack_arg = true;
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c b/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
index fa9def876ea5..da0d4f91d273 100644
--- a/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
+++ b/tools/testing/selftests/bpf/progs/stack_arg_kfunc.c
@@ -6,7 +6,8 @@
#include "bpf_kfuncs.h"
#include "../test_kmods/bpf_testmod_kfunc.h"
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
const volatile bool has_stack_arg = true;
diff --git a/tools/testing/selftests/bpf/progs/stack_arg_precision.c b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
index c94905f07dcc..235b3f660904 100644
--- a/tools/testing/selftests/bpf/progs/stack_arg_precision.c
+++ b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
@@ -6,7 +6,8 @@
#include "../test_kmods/bpf_testmod_kfunc.h"
#include "bpf_misc.h"
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
/* Force kfunc extern BTF generation for inline asm call below.
* Uses its own SEC so it's not included as a .text subprog.
diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
index b4f680e3b6a1..81edafb426e6 100644
--- a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
+++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
@@ -12,7 +12,8 @@ struct {
__type(value, long long);
} map_hash_8b SEC(".maps");
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
__noinline __used
static int subprog_6args(int a, int b, int c, int d, int e, int f)
diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
index 671c79969c6c..16b3eb5f51a2 100644
--- a/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
+++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
@@ -5,7 +5,8 @@
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
-#if defined(__TARGET_ARCH_x86) && defined(__BPF_FEATURE_STACK_ARGUMENT)
+#if (defined(__TARGET_ARCH_x86) || defined(__TARGET_ARCH_arm64)) && \
+ defined(__BPF_FEATURE_STACK_ARGUMENT)
__noinline __used __naked
static int subprog_bad_order_6args(int a, int b, int c, int d, int e, int f)
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 10/23] bpf: Reject stack arguments in non-JITed programs
2026-05-07 21:30 ` [PATCH bpf-next v2 10/23] bpf: Reject stack arguments in non-JITed programs Yonghong Song
@ 2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 2:10 ` sashiko-bot
1 sibling, 0 replies; 68+ messages in thread
From: bot+bpf-ci @ 2026-05-07 22:11 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
puranjay, ast, andrii, daniel, martin.lau, eddyz87, yonghong.song,
clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1501 bytes --]
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index ae10b9ca018d..6f94b0ef3659 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -2599,7 +2599,7 @@ struct bpf_prog *__bpf_prog_select_runtime(struct bpf_verifier_env *env, struct
> goto finalize;
>
> if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) ||
> - bpf_prog_has_kfunc_call(fp))
> + bpf_prog_has_kfunc_call(fp) || fp->aux->stack_arg_cnt)
> jit_needed = true;
>
> if (!bpf_prog_select_interpreter(fp))
> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
> index 0ff8cdf4394f..4c630d97eaad 100644
> --- a/kernel/bpf/fixups.c
> +++ b/kernel/bpf/fixups.c
> @@ -1412,6 +1412,12 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
> verbose(env, "calling kernel functions are not allowed in non-JITed programs\n");
> return -EINVAL;
> }
> + for (i = 1; i < env->subprog_cnt; i++) {
> + if (bpf_in_stack_arg_cnt(&env->subprog_info[i])) {
> + verbose(env, "stack args are not supported in non-JITed programs\n");
> + return -EINVAL;
> + }
> + }
> if (env->subprog_cnt > 1 && env->prog->aux->tail_call_reachable) {
> /* When JIT fails the progs with bpf2bpf calls and tail_calls
> * have to be rejected, since interpreter doesn't support them yet.
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 04/23] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
2026-05-07 21:30 ` [PATCH bpf-next v2 04/23] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args() Yonghong Song
@ 2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 13:05 ` Yonghong Song
0 siblings, 1 reply; 68+ messages in thread
From: bot+bpf-ci @ 2026-05-07 22:11 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 862 bytes --]
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 77af44d8a3ad..a33a5b4122f8 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -7880,6 +7880,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
> }
> args = (const struct btf_param *)(t + 1);
> nargs = btf_type_vlen(t);
> + sub->arg_cnt = nargs;
> if (nargs > MAX_BPF_FUNC_REG_ARGS) {
> if (!is_global)
> return -EINVAL;
> @@ -8067,7 +8068,6 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
> return -EINVAL;
> }
>
> - sub->arg_cnt = nargs;
> sub->args_cached = true;
>
> return 0;
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 14/23] bpf: Reject stack arguments if tail call reachable
2026-05-07 21:30 ` [PATCH bpf-next v2 14/23] bpf: Reject stack arguments if tail call reachable Yonghong Song
@ 2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 1:42 ` sashiko-bot
1 sibling, 0 replies; 68+ messages in thread
From: bot+bpf-ci @ 2026-05-07 22:11 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1584 bytes --]
> bpf: Reject stack arguments if tail call reachable
>
> Tail calls are deprecated and will be replaced by indirect calls
> in the future. Reject programs that combine tail calls with stack
> arguments rather than adding complexity for a deprecated feature.
Is the claim that tail calls are deprecated accurate?
Multiple reviewers across v4, v5, v6, and regular patch versions raised
concerns that the commit message inaccurately states tail calls are deprecated
when the actual technical reason is that tail calls and stack arguments
require incompatible stack frame handling.
Looking at the mailing list discussion on the v1 series, Alexei's guidance
was to disallow mixing 6+ args and tailcalls due to stack frame layout
incompatibility, not because tail calls are being deprecated.
Could the commit message more precisely describe the actual reason for the
restriction, such as "tail calls and stack arguments require incompatible
stack frame handling" to better explain the technical limitation to future
readers?
Reference: https://lore.kernel.org/bpf/f163dcbf46e9511eba14b22ed10467984bda3bf07a3d1bd6cd27ae14eadc2056@mail.kernel.org/
Should this commit include a Fixes tag? It appears to fix a bug introduced
when stack arguments support was added. Suggested: Fixes: c364b7a0f7ae
("bpf: Support stack arguments for bpf functions")
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation
2026-05-07 21:31 ` [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
@ 2026-05-07 22:11 ` bot+bpf-ci
2026-05-10 17:27 ` Yonghong Song
2026-05-09 1:38 ` sashiko-bot
1 sibling, 1 reply; 68+ messages in thread
From: bot+bpf-ci @ 2026-05-07 22:11 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 3049 bytes --]
> diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
> new file mode 100644
> index 000000000000..b4f680e3b6a1
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
[ ... ]
> +__noinline __used
> +static long subprog_deref_arg6(long a, long b, long c, long d, long e, long *f)
> +{
> + return *f;
> +}
[ ... ]
> +SEC("tc")
> +__description("stack_arg: pruning with different stack arg types")
> +__failure
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__msg("invalid mem access 'scalar'")
> +__naked void stack_arg_pruning_type_mismatch(void)
> +{
> + asm volatile (
> + "call %[bpf_get_prandom_u32];"
> + "r6 = r0;"
> + /* local = 0 on program stack */
> + "r7 = 0;"
> + "*(u64 *)(r10 - 8) = r7;"
> + /* Branch based on random value */
> + "if r6 s> 3 goto l0_%=;"
> + /* Path 1: store stack pointer to outgoing arg6 */
> + "r1 = r10;"
> + "r1 += -8;"
> + "*(u64 *)(r11 - 8) = r1;"
> + "goto l1_%=;"
> + "l0_%=:"
> + /* Path 2: store scalar to outgoing arg6 */
> + "*(u64 *)(r11 - 8) = 42;"
> + "l1_%=:"
> + /* Call subprog that dereferences arg6 */
> + "r1 = r6;"
> + "r2 = 0;"
> + "r3 = 0;"
> + "r4 = 0;"
> + "r5 = 0;"
> + "call subprog_deref_arg6;"
> + "exit;"
> + :: __imm(bpf_get_prandom_u32)
> + : __clobber_all
> + );
> +}
The Sashiko AI bot and bpf-ci raised concerns in v6 and April 24 versions
about test fragility in this test and similar ones. Since
subprog_deref_arg6 is standard C (not __naked), LLVM determines which
register is used for dereferencing. If a future LLVM version chooses a
different register, the tests will fail because the verifier error will
mention the new register instead of R1.
In v6, you mentioned using a regex like __msg("R{{[0-9]}} invalid mem
access 'scalar'"), but the current code removed the register prefix
entirely. While this may be intentional to avoid specifying a register,
could the tests benefit from either using regex to match the register
dynamically, or making subprog_deref_arg6 a __naked function to guarantee
the register choice?
Reference: https://lore.kernel.org/bpf/20260419172100.A89FEC2BCAF@smtp.kernel.org/
> +SEC("tc")
> +__description("stack_arg: release_reference invalidates stack arg slot")
> +__failure
> +__msg("invalid mem access 'scalar'")
> +__naked void stack_arg_release_ref(void)
[ ... ]
> +SEC("tc")
> +__description("stack_arg: pkt pointer in stack arg slot invalidated after pull_data")
> +__failure
> +__msg("invalid mem access 'scalar'")
> +__naked void stack_arg_stale_pkt_ptr(void)
[ ... ]
> +SEC("tc")
> +__description("stack_arg: null propagation rejects deref on null branch")
> +__failure
> +__msg("invalid mem access 'scalar'")
> +__naked void stack_arg_null_propagation_fail(void)
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots
2026-05-07 21:30 ` [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
@ 2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 13:29 ` Yonghong Song
2026-05-09 0:59 ` sashiko-bot
1 sibling, 1 reply; 68+ messages in thread
From: bot+bpf-ci @ 2026-05-07 22:11 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1369 bytes --]
commit 66a19f95afbdbdfe3c3c887c8656037ebcca1be1
Author: Yonghong Song <yonghong.song@linux.dev>
bpf: Extend liveness analysis to track stack argument slots
BPF_REG_PARAMS (R11) is at index MAX_BPF_REG, which is beyond the
register tracking arrays in const_fold.c and liveness.c. Handle it
explicitly to avoid out-of-bounds accesses.
Extend the arg tracking dataflow to cover stack arg slots. Otherwise,
pointers passed through stack args are invisible to liveness, causing
the pointed-to stack slots to be incorrectly poisoned.
This commit fixes a bug introduced in c364b7a0f7ae ("bpf: Support stack
arguments for bpf functions"). That earlier commit added support for
stack arguments but did not extend the liveness analysis to track them,
resulting in incorrect poisoning of stack slots when pointers are passed
through stack arguments.
Should this commit include a Fixes tag referencing c364b7a0f7ae? The
commit message describes fixing out-of-bounds accesses and incorrect
poisoning that were introduced by the stack argument support.
Suggested: Fixes: c364b7a0f7ae ("bpf: Support stack arguments for bpf functions")
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots
2026-05-07 21:30 ` [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
@ 2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 13:08 ` Yonghong Song
2026-05-09 4:05 ` sashiko-bot
1 sibling, 1 reply; 68+ messages in thread
From: bot+bpf-ci @ 2026-05-07 22:11 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1083 bytes --]
commit 4c0111fe676cd1905f81c3bd7d7f87d41cc028f0
Author: Yonghong Song <yonghong.song@linux.dev>
bpf: Add precision marking and backtracking for stack argument slots
Extend the precision marking and backtracking infrastructure to
support stack argument slots (r11-based accesses). Without this,
precision demands for scalar values passed through stack arguments
are silently dropped, which could allow the verifier to incorrectly
prune states with different constant values in stack arg slots.
This commit fixes a bug but lacks a Fixes: tag. Looking at the commit
message, it addresses a verifier state pruning issue introduced when
stack arguments were first supported in commit c364b7a0f7ae ("bpf:
Support stack arguments for bpf functions").
Should this include: Fixes: c364b7a0f7ae ("bpf: Support stack arguments
for bpf functions")?
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments
2026-05-07 21:30 ` [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments Yonghong Song
@ 2026-05-07 22:26 ` bot+bpf-ci
2026-05-10 17:21 ` Yonghong Song
2026-05-09 2:21 ` sashiko-bot
1 sibling, 1 reply; 68+ messages in thread
From: bot+bpf-ci @ 2026-05-07 22:26 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
puranjay, ast, andrii, daniel, martin.lau, eddyz87, yonghong.song,
clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 2650 bytes --]
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index ea9e707e8abf..67c2f4a3b9cc 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
[ ... ]
> @@ -1664,16 +1692,45 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
> int i, excnt = 0;
> int ilen, proglen = 0;
> u8 *ip, *prog = temp;
> + u16 stack_arg_depth, incoming_stack_arg_depth, outgoing_stack_arg_depth; /* in bytes */
> + u16 outgoing_rsp;
> u32 stack_depth;
> + int callee_saved_size;
> + s32 outgoing_arg_base;
> int err;
>
> stack_depth = bpf_prog->aux->stack_depth;
> + stack_arg_depth = bpf_prog->aux->stack_arg_cnt * 8;
> + incoming_stack_arg_depth = bpf_prog->aux->incoming_stack_arg_cnt * 8;
> + outgoing_stack_arg_depth = stack_arg_depth - incoming_stack_arg_depth;
Can this u16 subtraction underflow for leaf subprograms? When a leaf
function receives stack arguments but makes no calls with stack arguments,
incoming_stack_arg_depth would be non-zero (e.g., 16 bytes for 2 args),
but stack_arg_depth would be zero since the verifier never updates it
for functions that don't make calls.
The subtraction would wrap around modulo 65536:
outgoing_stack_arg_depth = 0 - 16 = 0xFFF0 (65520 bytes)
Later in do_jit():
> @@ -1700,6 +1757,42 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
> push_r12(&prog);
> push_callee_regs(&prog, callee_regs_used);
> }
> +
> + /* Compute callee-saved register area size. */
[ ... ]
> + /*
> + * Allocate outgoing stack arg area for args 7+ only.
> + * Arg 6 goes into r9 register, not on stack.
> + */
> + outgoing_rsp = outgoing_stack_arg_depth > 8 ? outgoing_stack_arg_depth - 8 : 0;
> + emit_sub_rsp(&prog, outgoing_rsp);
This would compute outgoing_rsp = 0xFFF0 - 8 = 0xFFE8 (65512 bytes),
and emit_sub_rsp() would generate:
sub rsp, 0xFFE8
allocating a 65 KB stack frame that overflows the kernel stack
(THREAD_SIZE is only 16 KB on x86_64).
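For illustration, the wraparound is easy to reproduce in a standalone
program (assuming the fields stay u16 as in the quoted diff):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint16_t stack_arg_depth = 0;			/* leaf subprog: no outgoing stack args */
	uint16_t incoming_stack_arg_depth = 2 * 8;	/* but 2 incoming stack args */
	uint16_t outgoing = stack_arg_depth - incoming_stack_arg_depth;
	uint16_t outgoing_rsp = outgoing > 8 ? outgoing - 8 : 0;

	/* prints: outgoing = 0xfff0, sub rsp, 0xffe8 */
	printf("outgoing = 0x%x, sub rsp, 0x%x\n", outgoing, outgoing_rsp);
	return 0;
}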
This issue was identified by bot+bpf-ci@kernel.org in v1 (Fri, 24 Apr
2026 18:00:42 +0000) at
https://lore.kernel.org/bpf/e1b7281b2a52b18ae2a9f7d41091994eec0bba81e96e64adbe4c072d915acfc2@mail.kernel.org/
The author acknowledged this bug and promised to fix it with a selftest
in the next revision, but the current code still contains the vulnerable
subtraction without underflow protection.
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions
2026-05-07 21:30 ` [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions Yonghong Song
@ 2026-05-07 22:26 ` bot+bpf-ci
2026-05-09 12:52 ` Yonghong Song
2026-05-08 18:00 ` Alexei Starovoitov
1 sibling, 1 reply; 68+ messages in thread
From: bot+bpf-ci @ 2026-05-07 22:26 UTC (permalink / raw)
To: yonghong.song, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau, ast,
andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 7457 bytes --]
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 321b9d69cf9c..1fba16fced28 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -427,6 +427,10 @@ struct bpf_func_state {
> * `stack`. allocated_stack is always a multiple of BPF_REG_SIZE.
> */
> int allocated_stack;
> +
> + bool no_stack_arg_load;
> + u16 out_stack_arg_cnt; /* Number of outgoing on-stack argument slots */
> + struct bpf_reg_state *stack_arg_regs; /* Outgoing on-stack arguments */
> };
The new field no_stack_arg_load is placed after allocated_stack, but
copy_stack_state() never assigns it.
The comment at line 417 says 'The following fields should be last. See
copy_func_state()', and copy_func_state() does this:
memcpy(dst, src, offsetof(struct bpf_func_state, stack));
so any field after stack is not memcpy'd. copy_stack_state() copies
stack, allocated_stack, out_stack_arg_cnt, and stack_arg_regs, but
not no_stack_arg_load.
Consequence: whenever a state is cloned (e.g., via push_stack() at
conditional jumps), the pushed state inherits no_stack_arg_load=0
regardless of the source state's value.
This breaks the invariant the commit message describes: 'all stack arg
reads must be ahead of any stack arg write' and 'all stack arg reads must
be before any bpf func, kfunc and helpers'.
A BPF program can write to an outgoing stack arg (which sets
cur->no_stack_arg_load=true in check_stack_arg_write), then execute a
conditional jump which calls push_stack(). The pushed state's
no_stack_arg_load is reset to 0, so a subsequent stack arg read would be
accepted by check_stack_arg_read() when it should be rejected.
The same issue applies to state pruning: func_states_equal/stack_arg_safe
does not compare no_stack_arg_load either, so a state with
no_stack_arg_load=true can be pruned to an old state with
no_stack_arg_load=false incorrectly.
Should no_stack_arg_load be placed before stack (to be memcpy'd) or should
copy_stack_state() explicitly assign it?
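To make the cutoff concrete, here is a self-contained user-space sketch
using a simplified stand-in struct (the field order is illustrative, not
the real bpf_func_state layout):

#include <stdio.h>
#include <stddef.h>
#include <string.h>

struct st {
	int id;			/* covered by the memcpy below */
	void *stack;		/* cutoff: the memcpy stops at this field */
	int allocated_stack;	/* copied explicitly afterwards */
	int no_stack_arg_load;	/* nothing copies this one */
};

static void copy_st(struct st *dst, const struct st *src)
{
	memcpy(dst, src, offsetof(struct st, stack));	/* like copy_func_state() */
	dst->allocated_stack = src->allocated_stack;	/* like copy_stack_state() */
	/* no_stack_arg_load keeps whatever dst already held */
}

int main(void)
{
	struct st src = { .id = 1, .allocated_stack = 64, .no_stack_arg_load = 1 };
	struct st dst = { 0 };

	copy_st(&dst, &src);
	printf("no_stack_arg_load after copy: %d\n", dst.no_stack_arg_load);	/* prints 0 */
	return 0;
}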
[ ... ]
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 3bafb7ad2ba7..61d6663498f7 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1362,6 +1362,18 @@ static int copy_stack_state(struct bpf_func_state *dst, const struct bpf_func_st
> return -ENOMEM;
>
> dst->allocated_stack = src->allocated_stack;
> +
> + /* copy stack args state */
> + n = src->out_stack_arg_cnt;
> + if (n) {
> + dst->stack_arg_regs = copy_array(dst->stack_arg_regs, src->stack_arg_regs, n,
> + sizeof(struct bpf_reg_state),
> + GFP_KERNEL_ACCOUNT);
> + if (!dst->stack_arg_regs)
> + return -ENOMEM;
> + }
> +
> + dst->out_stack_arg_cnt = src->out_stack_arg_cnt;
> return 0;
> }
copy_stack_state() copies stack, allocated_stack, stack_arg_regs, and
out_stack_arg_cnt, but no_stack_arg_load is missing.
The field is declared in struct bpf_func_state after allocated_stack, so
it lives in the region that copy_func_state's memcpy excludes:
static int copy_func_state(struct bpf_func_state *dst,
const struct bpf_func_state *src)
{
memcpy(dst, src, offsetof(struct bpf_func_state, stack));
return copy_stack_state(dst, src);
}
The memcpy stops at stack, so no_stack_arg_load is never copied. Every
state copy path through bpf_copy_verifier_state(), push_stack(), and
pop_stack() silently resets the flag.
The commit enforces the invariant 'all stack arg reads must precede any
stack arg write or any call within a frame' by setting
state->no_stack_arg_load = true in check_stack_arg_write() and in
do_check_insn()'s call branch, and checking the flag in
check_stack_arg_read().
When the verifier encounters a branch, push_stack() calls
bpf_copy_verifier_state() which loses no_stack_arg_load. A BPF program can
perform a write or call, conditionally branch, and after the branch perform
a stack arg read that should be rejected. The read passes verification
because the flag was reset to false on the restored state.
The commit message says this invariant is required because 'JIT may emit
mov insns for read/write with the same register and bpf function, kfunc and
helper will invalidate all arguments immediately after the call'. Bypassing
the flag allows JIT-generated code that reads stale or clobbered stack
argument slots at runtime.
Should copy_stack_state() add 'dst->no_stack_arg_load =
src->no_stack_arg_load;'?
[ ... ]
> @@ -4050,6 +4080,105 @@ static int check_stack_write(struct bpf_verifier_env *env,
> return err;
> }
>
> +/*
> + * Write a value to the outgoing stack arg area.
> + * off is a negative offset from r11 (e.g. -8 for arg6, -16 for arg7).
> + */
> +static int check_stack_arg_write(struct bpf_verifier_env *env, struct bpf_func_state *state,
> + int off, struct bpf_reg_state *value_reg)
> +{
> + int max_stack_arg_regs = MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS;
> + struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
> + int spi = -off / BPF_REG_SIZE - 1;
> + struct bpf_func_state *cur;
> + struct bpf_reg_state *arg;
> + int err;
> +
> + if (spi >= max_stack_arg_regs) {
> + verbose(env, "stack arg write offset %d exceeds max %d stack args\n",
> + off, max_stack_arg_regs);
> + return -EINVAL;
> + }
> +
> + err = grow_stack_arg_slots(env, state, spi + 1);
> + if (err)
> + return err;
> +
> + /* Track the max outgoing stack arg slot count. */
> + if (spi + 1 > subprog->max_out_stack_arg_cnt)
> + subprog->max_out_stack_arg_cnt = spi + 1;
> +
> + cur = env->cur_state->frame[env->cur_state->curframe];
This isn't a bug, but the local variable cur is assigned but never read.
Both call sites already pass state = env->cur_state->frame[curframe], so
cur is always identical to the state parameter. Should this line be
removed?
[ ... ]
> @@ -16961,6 +17135,14 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
The register validation in check_and_resolve_insns() allows r11/BPF_REG_PARAMS
through for all instruction classes, but only BPF_LDX, BPF_STX, and BPF_ST
have explicit handling.
In v4 of this series (April 11), bot+bpf-ci@kernel.org raised a concern:
If a BPF program uses r11 as src_reg or dst_reg in an ALU or JMP
instruction, it would cause an out-of-bounds access to regs[11] (array
is only regs[0-10]).
Would it make sense to restrict the r11 exemption here to only the
instruction classes that actually handle it (BPF_LDX, BPF_STX, BPF_ST),
rather than allowing it globally?
The author acknowledged this was a bug and agreed to restrict the r11
exemption to only BPF_LDX, BPF_STX, BPF_ST. However, the current code does
not have any exemption for r11.
Looking at check_and_resolve_insns(), it checks 'insn->dst_reg >=
MAX_BPF_REG' without exception. Since BPF_REG_PARAMS == MAX_BPF_REG == 11,
this will reject all uses of r11 including legitimate LDX/STX/ST operations
on stack arguments.
Should check_and_resolve_insns() allow r11 for the specific instruction
classes that handle stack arguments (BPF_LDX, BPF_STX, BPF_ST)?
Reference:
https://lore.kernel.org/bpf/bcd1b7033de7f597d73b3991f8fc4d8b502fce451fc288d909ae72a717578a0d@mail.kernel.org/
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions
2026-05-07 21:30 ` [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions Yonghong Song
2026-05-07 22:26 ` bot+bpf-ci
@ 2026-05-08 18:00 ` Alexei Starovoitov
2026-05-09 12:55 ` Yonghong Song
1 sibling, 1 reply; 68+ messages in thread
From: Alexei Starovoitov @ 2026-05-08 18:00 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On Thu May 7, 2026 at 2:30 PM PDT, Yonghong Song wrote:
> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
> index fba9e8c00878..0ff8cdf4394f 100644
> --- a/kernel/bpf/fixups.c
> +++ b/kernel/bpf/fixups.c
> @@ -1123,6 +1123,9 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>
> func[i]->aux->name[0] = 'F';
> func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
> + func[i]->aux->incoming_stack_arg_cnt =
> + bpf_in_stack_arg_cnt(&env->subprog_info[i]);
> + func[i]->aux->stack_arg_cnt = env->subprog_info[i].stack_arg_cnt;
replied earlier. Should be no need for this copy.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 21/23] bpf, arm64: Map BPF_REG_0 to x8 instead of x7
2026-05-07 21:31 ` [PATCH bpf-next v2 21/23] bpf, arm64: Map BPF_REG_0 to x8 instead of x7 Yonghong Song
@ 2026-05-08 18:01 ` Alexei Starovoitov
2026-05-09 13:44 ` Yonghong Song
0 siblings, 1 reply; 68+ messages in thread
From: Alexei Starovoitov @ 2026-05-08 18:01 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
On Thu May 7, 2026 at 2:31 PM PDT, Yonghong Song wrote:
> From: Puranjay Mohan <puranjay@kernel.org>
>
> Move the BPF return value register from x7 to x8, freeing x7 for use
> as an argument register. AAPCS64 designates x8 as the indirect result
> location register; it is caller-saved and not used for argument
> passing, making it a suitable home for BPF_REG_0.
>
> This is a prerequisite for stack argument support, which needs x5-x7
> to pass arguments 6-8 to native kfuncs following the AAPCS64 calling
> convention.
>
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
That's not right.
It should probably be:
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
` (22 preceding siblings ...)
2026-05-07 21:32 ` [PATCH bpf-next v2 23/23] selftests/bpf: Enable stack argument tests for arm64 Yonghong Song
@ 2026-05-08 18:06 ` Alexei Starovoitov
2026-05-09 13:43 ` Yonghong Song
23 siblings, 1 reply; 68+ messages in thread
From: Alexei Starovoitov @ 2026-05-08 18:06 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On Thu May 7, 2026 at 2:29 PM PDT, Yonghong Song wrote:
> This patch set lifts the 5-argument limit by introducing stack-based
> argument passing for BPF functions and kfunc's, coordinated with
> compiler support in LLVM [1]. The compiler emits loads/stores through
...
>
> [1] https://github.com/llvm/llvm-project/pull/189060
Please land llvm bits.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots
2026-05-07 21:30 ` [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
@ 2026-05-09 0:59 ` sashiko-bot
2026-05-10 16:47 ` Yonghong Song
1 sibling, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 0:59 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] record_load_store_access() mistakenly uses the tracking state of stack argument slot 0 as the base pointer for R11-based memory accesses, leading to spurious liveness marks.
- [Low] Missing verifier logging for stack argument liveness state transitions.
--
commit 00a5fce117daa13f26c61ec5ffa3622241608fc1
Author: Yonghong Song <yonghong.song@linux.dev>
bpf: Extend liveness analysis to track stack argument slots
This commit extends liveness analysis to handle stack argument slots, preventing
pointers passed through stack args from incorrectly poisoning the pointed-to
stack slots. It explicitly handles BPF_REG_PARAMS (R11) to avoid out-of-bounds
accesses and extends tracking arrays.
> diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
> index de0b4c6263ff2..eed7260f2bb0a 100644
> --- a/kernel/bpf/liveness.c
> +++ b/kernel/bpf/liveness.c
> @@ -610,6 +610,24 @@ enum arg_track_state {
> /* Track callee stack slots fp-8 through fp-512 (64 slots of 8 bytes each) */
> #define MAX_ARG_SPILL_SLOTS 64
>
> +/* Track stack arg slots: outgoing starts at -(i+1)*8, incoming at +(i+1)*8 */
> +#define MAX_STACK_ARG_SLOTS (MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS)
> +
> +/*
> + * Combined register + stack arg tracking: R0-R10 at indices 0-10,
> + * outgoing stack arg slots at indices MAX_BPF_REG..MAX_BPF_REG+6.
> + */
> +#define MAX_AT_TRACK_REGS (MAX_BPF_REG + MAX_STACK_ARG_SLOTS)
Could this repurposed index cause a spurious liveness mark in
record_load_store_access()?
Because at[MAX_BPF_REG] is repurposed to track the value stored in outgoing
stack argument slot 0, what happens when record_load_store_access() processes
stack argument instructions like is_stack_arg_stx()?
record_load_store_access() does this:
case BPF_STX:
if (BPF_MODE(insn->code) == BPF_ATOMIC) {
...
} else {
ptr = &at[insn->dst_reg];
sz = -sz;
}
If the instruction is a stack argument access, insn->dst_reg is BPF_REG_PARAMS
(which equals MAX_BPF_REG). It will fetch &at[MAX_BPF_REG]. If a BPF program
previously stored a frame pointer (FP) into slot 0, at[MAX_BPF_REG] will reflect
that FP.
Does this mean record_load_store_access() will spuriously treat this FP as the
base pointer and incorrectly mark the resulting local stack frame slot as
accessed, corrupting the local stack liveness mask?
[ ... ]
> @@ -1616,7 +1663,8 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
> memcpy(at_out, at_in[i], sizeof(at_out));
> memcpy(at_stack_out, at_stack_in[i], MAX_ARG_SPILL_SLOTS * sizeof(*at_stack_out));
>
> - arg_track_xfer(env, insn, idx, at_out, at_stack_out, instance, callsites);
> + arg_track_xfer(env, insn, idx, at_out, at_stack_out,
> + at_stack_arg_entry, instance, callsites);
> arg_track_log(env, insn, idx, at_in[i], at_stack_in[i], at_out, at_stack_out);
Are the state transitions for stack arguments logged here?
In arg_track_log(), the loop for logging register transitions is strictly
bounded by MAX_BPF_REG:
for (i = 0; i < MAX_BPF_REG; i++) {
if (arg_track_eq(&at_out[i], &at_in[i]))
continue;
Since the tracking arrays at_in and at_out were extended to MAX_AT_TRACK_REGS to
include state for outgoing stack arguments at indices MAX_BPF_REG through
MAX_AT_TRACK_REGS - 1, does arg_track_log() silently omit logging their state
transitions?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=9
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 17/23] selftests/bpf: Add tests for stack argument validation
2026-05-07 21:31 ` [PATCH bpf-next v2 17/23] selftests/bpf: Add tests for stack argument validation Yonghong Song
@ 2026-05-09 1:30 ` sashiko-bot
2026-05-10 17:23 ` Yonghong Song
0 siblings, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 1:30 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] Unconditional use of the 'r11' register in inline assembly will cause selftest build failures on older Clang compilers.
--
commit 6f66b8fd81d82b7d1f5f683d16976b8d486d440b
Author: Yonghong Song <yonghong.song@linux.dev>
selftests/bpf: Add tests for stack argument validation
This commit adds negative tests to verify that kfuncs properly reject large
structs passed as stack arguments. It also adds tests to ensure the verifier
rejects invalid uses of the r11 register for stack arguments.
> diff --git a/tools/testing/selftests/bpf/progs/stack_arg_fail.c b/tools/testing/selftests/bpf/progs/stack_arg_fail.c
> new file mode 100644
> index 0000000000000..c6db302549cbd
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/stack_arg_fail.c
[ ... ]
> +#else
> +
> +SEC("tc")
> +__description("stack_arg_fail: not supported, dummy test")
> +__success
> +int test_stack_arg_big(struct __sk_buff *skb)
> +{
> + return 0;
> +}
> +
> +#endif
> +
> +SEC("socket")
> +__description("r11 in ALU instruction")
> +__failure __msg("R11 is invalid")
> +__naked void r11_alu_reject(void)
> +{
> + asm volatile (
> + "r11 += 1;"
Could these inline assembly blocks using r11 cause build failures on older
Clang compilers?
The r11 register is a relatively new addition to the BPF ISA and LLVM backend.
Because these naked assembly functions are placed outside of the
__BPF_FEATURE_STACK_ARGUMENT guard, compiling this file with an older
toolchain might result in fatal assembler errors regarding an invalid operand.
Would it be safer to move these r11 tests inside the feature guard to
maintain selftest compatibility across toolchains?
[ ... ]
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=17
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation
2026-05-07 21:31 ` [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
@ 2026-05-09 1:38 ` sashiko-bot
2026-05-10 17:27 ` Yonghong Song
1 sibling, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 1:38 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] The tests in `verifier_stack_arg_order.c` are never executed because the file is not registered in the selftests runner.
--
commit 6834becc44d439e99d840e596d30164699006d53
Author: Yonghong Song <yonghong.song@linux.dev>
selftests/bpf: Add verifier tests for stack argument validation
This commit adds new BPF selftests using inline assembly to validate the
verifier's logic for stack arguments. It covers positive tests and various
negative tests for verifier rejection and pointer/reference tracking.
> diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> index a96b25ebff234..aef21cf2987be 100644
> --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> @@ -91,6 +91,7 @@
> #include "verifier_sockmap_mutate.skel.h"
> #include "verifier_spill_fill.skel.h"
> #include "verifier_spin_lock.skel.h"
> +#include "verifier_stack_arg.skel.h"
Did you intend to include verifier_stack_arg_order.skel.h here as well?
> #include "verifier_stack_ptr.skel.h"
> #include "verifier_store_release.skel.h"
> #include "verifier_subprog_precision.skel.h"
> @@ -238,6 +239,7 @@ void test_verifier_sock_addr(void) { RUN(verifier_sock_addr); }
> void test_verifier_sockmap_mutate(void) { RUN(verifier_sockmap_mutate); }
> void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); }
> void test_verifier_spin_lock(void) { RUN(verifier_spin_lock); }
> +void test_verifier_stack_arg(void) { RUN(verifier_stack_arg); }
Could this be missing a RUN(verifier_stack_arg_order) entry?
The patch introduces the new tests for load ordering validation below,
but without registering them in the selftests runner, it looks like
they will not be executed.
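If that is the intent, the additions would presumably just mirror the
existing entries (illustrative, following the pattern already in the file):

#include "verifier_stack_arg_order.skel.h"

void test_verifier_stack_arg_order(void) { RUN(verifier_stack_arg_order); }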
[ ... ]
> diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
> new file mode 100644
> index 0000000000000..671c79969c6cd
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
> @@ -0,0 +1,86 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=19
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 18/23] selftests/bpf: Add BTF fixup for __naked subprog parameter names
2026-05-07 21:31 ` [PATCH bpf-next v2 18/23] selftests/bpf: Add BTF fixup for __naked subprog parameter names Yonghong Song
@ 2026-05-09 1:40 ` sashiko-bot
2026-05-10 17:24 ` Yonghong Song
0 siblings, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 1:40 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] The BPF object 'tobj' is leaked when 'fixup_btf_from_path()' fails.
--
commit af529dd65e68c8ac479551b0ac6332c3b5f56771
Author: Yonghong Song <yonghong.song@linux.dev>
selftests/bpf: Add BTF fixup for __naked subprog parameter names
This commit adds a mechanism to fix up missing parameter names in the BTF
FUNC_PROTO entries of __naked subprogs by reading them from a separate BTF
file. This allows the verifier to resolve stack argument slots by name.
> diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
> --- a/tools/testing/selftests/bpf/test_loader.c
> +++ b/tools/testing/selftests/bpf/test_loader.c
[ ... ]
> @@ -1231,13 +1351,27 @@ void run_subtest(struct test_loader *tester,
> }
> }
>
> - /* Implicitly reset to NULL if next test case doesn't specify */
> + /* Implicitly reset to NULL if next test case doesn't specify.
> + * btf_custom_func_path also serves as btf_custom_path for kfunc resolution.
> + */
> open_opts->btf_custom_path = spec->btf_custom_path;
> + if (!open_opts->btf_custom_path)
> + open_opts->btf_custom_path = spec->btf_custom_func_path;
>
> tobj = bpf_object__open_mem(obj_bytes, obj_byte_cnt, open_opts);
> if (!ASSERT_OK_PTR(tobj, "obj_open_mem")) /* shouldn't happen */
> goto subtest_cleanup;
>
> + /* Fix up __naked subprog BTF using a separate file with named params */
> + if (spec->btf_custom_func_path) {
> + err = fixup_btf_from_path(tobj, spec->btf_custom_func_path);
> + if (err) {
> + PRINT_FAIL("failed to fixup BTF from %s: %d\n",
> + spec->btf_custom_func_path, err);
> + goto subtest_cleanup;
Does this error path leak tobj?
If fixup_btf_from_path() fails, the code jumps to subtest_cleanup, which
appears to bypass the tobj_cleanup label where tobj would normally be
closed and freed. Should this be goto tobj_cleanup instead?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=18
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 14/23] bpf: Reject stack arguments if tail call reachable
2026-05-07 21:30 ` [PATCH bpf-next v2 14/23] bpf: Reject stack arguments if tail call reachable Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
@ 2026-05-09 1:42 ` sashiko-bot
2026-05-10 17:15 ` Yonghong Song
1 sibling, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 1:42 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Order-dependent DFS logic and incomplete fallback checks allow bypassing the stack arguments restriction and cause false rejections.
--
commit d6cf15a73bfea82a8d40934257b79e361027a9b1
Author: Yonghong Song <yonghong.song@linux.dev>
bpf: Reject stack arguments if tail call reachable
Tail calls are deprecated and will be replaced by indirect calls
in the future. Reject programs that combine tail calls with stack
arguments rather than adding complexity for a deprecated feature.
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index b6d07f7a52cdf..95fe21d05c100 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
[ ... ]
> @@ -5270,14 +5275,23 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
> * this info will be utilized by JIT so that we will be preserving the
> * tail call counter throughout bpf2bpf calls combined with tailcalls
> */
> - if (tail_call_reachable)
> + if (tail_call_reachable) {
> for (tmp = idx; tmp >= 0; tmp = dinfo[tmp].caller) {
> if (subprog[tmp].is_exception_cb) {
> verbose(env, "cannot tail call within exception cb\n");
> return -EINVAL;
> }
> + if (subprog_has_stack_args(&subprog[tmp])) {
> + verbose(env, "tail_calls are not allowed in programs with stack args\n");
> + return -EINVAL;
> + }
Does this create an order dependence where the verifier might falsely reject
valid programs?
If the main program calls a subprogram with a tail call, the
tail_call_reachable flag becomes true. If the main program subsequently calls
an independent subprogram that uses stack arguments, the flag can leak into
that branch, causing the program to be rejected. Would reversing the call
order cause the same program to pass?
> subprog[tmp].tail_call_reachable = true;
> }
> + } else if (!idx && subprog[0].has_tail_call && subprog_has_stack_args(&subprog[0])) {
> + verbose(env, "tail_calls are not allowed in programs with stack args\n");
> + return -EINVAL;
> + }
Does this fallback check miss nested stack arguments?
If the main program directly contains a tail call, the tail_call_reachable
variable remains false. If it then calls a wrapper function that passes >5
arguments to another subprogram, won't the nested stack arguments be missed
because the fallback only checks subprog[0]?
Can this also bypass restrictions for callback roots?
Since the fallback hardcodes !idx and subprog[0], if an exception callback
(which is verified with idx > 0) contains a direct tail call and uses outgoing
stack arguments, it bypasses the main check loop because tail_call_reachable
is false. Will it also bypass this fallback block because !idx is false?
Could we decouple this restriction from the call graph traversal and globally
check env->subprog_info for the presence of both features in the program
instead?
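For illustration, a rough sketch of such a global check (names follow the ones
used in this patch; the exact placement in the verifier flow is an assumption):

	bool has_tail_call = false, has_stack_args = false;
	int i;

	for (i = 0; i < env->subprog_cnt; i++) {
		has_tail_call |= env->subprog_info[i].has_tail_call;
		has_stack_args |= subprog_has_stack_args(&env->subprog_info[i]);
	}
	if (has_tail_call && has_stack_args) {
		verbose(env, "tail_calls are not allowed in programs with stack args\n");
		return -EINVAL;
	}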
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=14
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 20/23] selftests/bpf: Add precision backtracking test for stack arguments
2026-05-07 21:31 ` [PATCH bpf-next v2 20/23] selftests/bpf: Add precision backtracking test for stack arguments Yonghong Song
@ 2026-05-09 1:52 ` sashiko-bot
2026-05-10 17:31 ` Yonghong Song
0 siblings, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 1:52 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Medium] Commit message and code comment incorrectly claim that the subprogram performs a bounds check.
- [Medium] Non-deterministic branching in the test program can lead to intermittent flaky failures by returning 36 instead of the asserted 25.
--
commit 953824567df7ff801d99eb6df49d76d95736325d
Author: Yonghong Song <yonghong.song@linux.dev>
selftests/bpf: Add precision backtracking test for stack arguments
This commit adds a test to verify precision backtracking across BPF-to-BPF
calls involving stack arguments. The test aims to ensure precision propagates
from a kfunc's size argument back through a subprogram frame.
> The test passes a size value as incoming stack arg (arg6) to a
> subprog, which bounds-checks it and forwards it as the mem__sz
> parameter (outgoing arg7) to bpf_kfunc_call_stack_arg_mem.
Is the description about bounds-checking accurate?
Looking at the inline assembly for subprog_call_mem_kfunc, it appears the
subprogram simply reads the size from the incoming stack slot and immediately
writes it to the outgoing stack slot without any bounds checking logic.
> diff --git a/tools/testing/selftests/bpf/progs/stack_arg_precision.c b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
> new file mode 100644
> index 0000000000000..c94905f07dcc1
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
> @@ -0,0 +1,137 @@
[ ... ]
> +/*
> + * Test precision backtracking across bpf-to-bpf call for kfunc stack arg.
> + * subprog_call_mem_kfunc receives a size as incoming stack arg (arg6),
> + * bounds-checks it, then passes it as mem__sz (arg7) to
> + * bpf_kfunc_call_stack_arg_mem.
> + *
> + * 1+2+3+4+5+(1+2+3+4) = 25
> + */
This isn't a bug, but should the comment be updated to match the
implementation? As mentioned above, the bounds check seems to be missing from
the assembly below.
> +__naked __noinline __used
> +static long subprog_call_mem_kfunc(long a, long b, long c, long d, long e, long size)
> +{
> + asm volatile (
> + "r1 = *(u64 *)(r11 + 8);" /* r1 = incoming arg6 (size) */
> + "r2 = 0x0807060504030201 ll;" /* r2 = buf contents */
> + "*(u64 *)(r10 - 8) = r2;" /* store buf to stack */
> + "r2 = r10;"
> + "r2 += -8;" /* r2 = &buf */
> + "*(u64 *)(r11 - 8) = r2;" /* outgoing arg6 = buf */
> + "*(u64 *)(r11 - 16) = r1;" /* outgoing arg7 = size */
[ ... ]
> +__naked void stack_arg_precision_bpf2bpf(void)
> +{
> + asm volatile (
> + "call %[bpf_get_prandom_u32];"
> + "r6 = r0;"
> + "r1 = 1;"
> + "r2 = 2;"
> + "r3 = 3;"
> + "r4 = 4;"
> + "r5 = 5;"
> + "if r6 < 2 goto l0_%=;"
> + "*(u64 *)(r11 - 8) = 4;"
> + "goto l1_%=;"
> + "l0_%=:"
> + "*(u64 *)(r11 - 8) = 6;"
> + "l1_%=:"
> + "call subprog_call_mem_kfunc;"
> + "exit;"
> + :: __imm(bpf_get_prandom_u32)
> + : __clobber_all
> + );
> +}
Could this lead to intermittent test failures?
The test asserts a strict return value of 25:
__success __retval(25)
If bpf_get_prandom_u32() happens to return 0 or 1, the size argument is set to
6 instead of 4. In that case, bpf_kfunc_call_stack_arg_mem() will sum up 6
bytes instead of 4, returning 36 instead of 25, which would cause the test to
fail.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=20
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 10/23] bpf: Reject stack arguments in non-JITed programs
2026-05-07 21:30 ` [PATCH bpf-next v2 10/23] bpf: Reject stack arguments in non-JITed programs Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
@ 2026-05-09 2:10 ` sashiko-bot
2026-05-10 16:59 ` Yonghong Song
1 sibling, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 2:10 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] The BPF verifier fails to reject the interpreter for a main program (i = 0) with stack arguments, leading to an uninitialized memory read of R11.
--
commit 4a48796d8e4e4607bc163fc6b59a84f3495c3d5c
Author: Yonghong Song <yonghong.song@linux.dev>
bpf: Reject stack arguments in non-JITed programs
This commit aims to prevent non-JITed programs from running if they use
stack arguments, since the BPF interpreter does not populate the register
used for stack arguments (R11). It updates the runtime selection and fixups
phases to reject such programs.
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index ae10b9ca018df..6f94b0ef3659f 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
[ ... ]
> @@ -2599,7 +2599,7 @@ struct bpf_prog *__bpf_prog_select_runtime(struct bpf_verifier_env *env, struct
> goto finalize;
>
> if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) ||
> - bpf_prog_has_kfunc_call(fp))
> + bpf_prog_has_kfunc_call(fp) || fp->aux->stack_arg_cnt)
Is fp->aux->stack_arg_cnt guaranteed to be initialized at this stage?
If JIT is disabled system-wide (e.g. sysctl net.core.bpf_jit_enable=0),
bpf_jit_subprogs() is skipped, leaving the count at 0. Furthermore, even if
JIT is enabled but the program has multiple subprograms, bpf_jit_subprogs()
only populates the cloned subprograms' auxiliary data, leaving the main
program's prog->aux->stack_arg_cnt as 0.
Could this allow the check to be bypassed for the main program?
> jit_needed = true;
>
> if (!bpf_prog_select_interpreter(fp))
> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
> index 0ff8cdf4394f4..4c630d97eaade 100644
> --- a/kernel/bpf/fixups.c
> +++ b/kernel/bpf/fixups.c
[ ... ]
> @@ -1412,6 +1412,12 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
> verbose(env, "calling kernel functions are not allowed in non-JITed programs\n");
> return -EINVAL;
> }
> + for (i = 1; i < env->subprog_cnt; i++) {
This loop explicitly starts at i = 1, which skips checking the main program.
If an extension program (BPF_PROG_TYPE_EXT) attaches to a function with more
than 5 arguments and has BTF_FUNC_STATIC linkage, it will inherit the incoming
stack arguments. Since the check in core.c might fail to catch the main
program due to the uninitialized stack_arg_cnt, can this extension program
bypass both rejection mechanisms and fall back to the interpreter?
> + if (bpf_in_stack_arg_cnt(&env->subprog_info[i])) {
> + verbose(env, "stack args are not supported in non-JITed programs\n");
> + return -EINVAL;
> + }
> + }
If the main program bypasses these checks and is run by the interpreter, it
will attempt to read incoming stack arguments from regs[11]. Because the
interpreter's setup in ___bpf_prog_run() only initializes R1-R5 and the frame
pointer, would reading regs[11] leak uninitialized kernel stack memory or
cause a crash if the uninitialized value is dereferenced as a pointer?
> if (env->subprog_cnt > 1 && env->prog->aux->tail_call_reachable) {
> /* When JIT fails the progs with bpf2bpf calls and tail_calls
> * have to be rejected, since interpreter doesn't support them yet.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=10
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 22/23] bpf, arm64: Add JIT support for stack arguments
2026-05-07 21:32 ` [PATCH bpf-next v2 22/23] bpf, arm64: Add JIT support for stack arguments Yonghong Song
@ 2026-05-09 2:15 ` sashiko-bot
2026-05-10 17:32 ` Yonghong Song
0 siblings, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 2:15 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] In `emit_stack_arg_load()`, `bpf_off` is not negated, causing a negative index that fails array bounds checks and results in the JIT embedding an invalid instruction (BRK), leading to a kernel panic.
--
commit 6ccf042549a3902757c6335313c905e99fb41292
Author: Puranjay Mohan <puranjay@kernel.org>
bpf, arm64: Add JIT support for stack arguments
This commit implements stack argument passing for BPF-to-BPF and kfunc
calls with more than 5 parameters on arm64, following the AAPCS64 calling
convention.
> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> index 085e650662e3d..9b9d2501d2d36 100644
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
[ ... ]
> @@ -1191,6 +1207,41 @@ static int add_exception_handler(const struct bpf_insn *insn,
> return 0;
> }
>
> +static const u8 stack_arg_reg[] = { A64_R(5), A64_R(6), A64_R(7) };
> +
> +#define NR_STACK_ARG_REGS ARRAY_SIZE(stack_arg_reg)
> +
> +static void emit_stack_arg_load(u8 dst, s16 bpf_off, struct jit_ctx *ctx)
> +{
> + int idx = bpf_off / sizeof(u64) - 1;
Does this missing negation for bpf_off lead to an invalid instruction being
emitted?
Since BPF stack arguments use negative offsets from BPF_REG_PARAMS (such as
-8 for arg 6), the calculation bpf_off / sizeof(u64) - 1 will result in a
negative idx.
Because NR_STACK_ARG_REGS uses ARRAY_SIZE, it evaluates to an unsigned
size_t. Comparing a negative idx to it will result in a false condition,
falling through to the else block.
In the else block, the negative idx generates a negative offset for
A64_LDR64I. Since A64_LDR64I expects an unsigned 12-bit scaled immediate,
could this cause the instruction encoder to fail and embed an
AARCH64_BREAK_FAULT into the JIT image?
> +
> + if (idx < NR_STACK_ARG_REGS)
> + emit(A64_MOV(1, dst, stack_arg_reg[idx]), ctx);
> + else
> + emit(A64_LDR64I(dst, A64_FP, (idx - NR_STACK_ARG_REGS) * sizeof(u64) + 16), ctx);
> +}
> +
> +static void emit_stack_arg_store(u8 src_a64, s16 bpf_off, struct jit_ctx *ctx)
> +{
> + int idx = -bpf_off / sizeof(u64) - 1;
For comparison, emit_stack_arg_store() correctly negates bpf_off here.
> +
> + if (idx < NR_STACK_ARG_REGS)
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=22
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 11/23] bpf: Prepare architecture JIT support for stack arguments
2026-05-07 21:30 ` [PATCH bpf-next v2 11/23] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
@ 2026-05-09 2:19 ` sashiko-bot
2026-05-10 17:05 ` Yonghong Song
0 siblings, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 2:19 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Static functions with more than 5 parameters are not actually rejected when the JIT does not support stack arguments. This bypasses verifier stack initialization checks and allows the program to run, leading to arbitrary kernel memory access at runtime.
- [Low] The patch returns the kernel-internal error code -ENOTSUPP instead of the standard -EOPNOTSUPP, causing generic 'Unknown error 524' messages in user-space.
--
commit e571d76e56fcb6ad5c6cc343d3221d66e6aec024
Author: Yonghong Song <yonghong.song@linux.dev>
bpf: Prepare architecture JIT support for stack arguments
This commit introduces a weak function bpf_jit_supports_stack_args() to allow
architectures to indicate support for BPF stack arguments. It also updates
the BTF preparation code to reject functions with more than 5 arguments if
the JIT does not support stack arguments.
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index ec3fb8c8f4eef..bf83b737d55cb 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
[ ... ]
> @@ -7886,8 +7886,14 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
> MAX_BPF_FUNC_ARGS, tname, nargs);
> return -EFAULT;
> }
> - if (nargs > MAX_BPF_FUNC_REG_ARGS)
> + if (nargs > MAX_BPF_FUNC_REG_ARGS) {
> + if (!bpf_jit_supports_stack_args()) {
> + bpf_log(log, "JIT does not support function %s() with %d args\n",
> + tname, nargs);
> + return -ENOTSUPP;
Should this return -EOPNOTSUPP instead of -ENOTSUPP?
-ENOTSUPP is a kernel-internal error code (524). If this error bubbles up
to user-space (for example, when loading a global BPF program), it might
result in a confusing "Unknown error 524" message rather than a standard
error.
Also, does returning an error here actually reject static functions with
more than 5 arguments?
Looking at the code just above this diff, sub->arg_cnt is set to nargs
before the check:
kernel/bpf/btf.c:btf_prepare_func_args() {
...
sub->arg_cnt = nargs;
...
if (nargs > MAX_BPF_FUNC_REG_ARGS) {
if (!bpf_jit_supports_stack_args()) {
...
return -ENOTSUPP;
}
...
}
If this returns -ENOTSUPP, it propagates to btf_check_subprog_call(),
which returns the error. However, check_func_call() ignores non-EFAULT
errors for static functions:
kernel/bpf/verifier.c:check_func_call() {
...
err = btf_check_subprog_call(env, subprog, caller, ...);
if (err == -EFAULT)
return err;
...
}
Since the error is ignored, verification continues without rejecting the
program. Because the error was returned early, btf_check_func_arg_match()
skips check_outgoing_stack_args(), meaning the caller is never forced to
properly initialize the stack slots according to BTF types.
Since sub->arg_cnt was already set to nargs, bpf_in_stack_arg_cnt() will
calculate that the function takes stack arguments. If the caller manually
initializes the stack argument slots by writing to r11, the verifier tracks
the register types and might accept the program.
At runtime, since the JIT does not support stack arguments, BPF_REG_PARAMS
(r11) is likely missing from the JIT's register translation maps.
Uninitialized entries default to 0, which often maps to the return value
register (like rax on x86 or x0 on arm64).
Could this cause reads or writes against r11 + offset to be translated
into memory operations against rax/x0 + offset? Since the return register
is easily controllable by the BPF program, might this allow arbitrary
kernel memory reads and writes?
> + }
> sub->stack_arg_cnt = nargs - MAX_BPF_FUNC_REG_ARGS;
> + }
>
> if (is_global && nargs > MAX_BPF_FUNC_REG_ARGS) {
> bpf_log(log, "global function %s has %d > %d args, stack args not supported\n",
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=11
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments
2026-05-07 21:30 ` [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments Yonghong Song
2026-05-07 22:26 ` bot+bpf-ci
@ 2026-05-09 2:21 ` sashiko-bot
2026-05-10 17:22 ` Yonghong Song
1 sibling, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 2:21 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Critical] The offset translation formula for outgoing stack arguments (Args 7+) negates the BPF offset, reversing the memory layout and causing partial accesses to corrupt the stack.
- [Critical] Immediate stores (`BPF_ST`) of 8, 16, or 32 bits to Arg 6 bypass the register routing check, writing out-of-bounds to the stack and leaving Arg 6 uninitialized.
- [Critical] Partial loads and stores (`BPF_LDX` / `BPF_STX`) to Arg 6 unconditionally emit 64-bit register moves, ignoring size semantics and breaking verifier bounds tracking.
- [Critical] Sub-register accesses to Arg 6 bypass exact-offset interception, falling through to memory operations that leak the kernel return address or corrupt the stack.
--
commit ee1dd512e669afb6db2ee6c05d9e78b43305dc79
Author: Yonghong Song <yonghong.song@linux.dev>
bpf,x86: Implement JIT support for stack arguments
This patch adds x86_64 JIT support for BPF functions and kfuncs with more
than 5 arguments. The extra arguments are passed through a stack area
addressed by the R11 register (BPF_REG_PARAMS), with argument 6 being routed
to the R9 register.
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index ea9e707e8abff..67c2f4a3b9cc3 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
[ ... ]
> @@ -2134,12 +2227,27 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
> EMIT1(0xC7);
> goto st;
> case BPF_ST | BPF_MEM | BPF_DW:
> + if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
> + /* Arg 6: store immediate in r9 register */
> + emit_mov_imm64(&prog, X86_REG_R9, imm32 >> 31, (u32)imm32);
> + break;
> + }
Are stores of 8, 16, or 32 bits to Arg 6 missing the register routing check?
Because BPF_B, BPF_H, and BPF_W sizes jump directly to the st: label, they
appear to skip this R9 check. Could this cause them to mistakenly apply the
stack argument translation and write out-of-bounds, leaving R9 uninitialized?
> EMIT2(add_1mod(0x48, dst_reg), 0xC7);
>
> -st: if (is_imm8(insn->off))
> - EMIT2(add_1reg(0x40, dst_reg), insn->off);
> +st: insn_off = insn->off;
> + if (dst_reg == BPF_REG_PARAMS) {
> + /*
> + * Args 7+: reverse BPF negative offsets to
> + * x86 positive rsp offsets.
> + * BPF off=-16 → [rsp+0], off=-24 → [rsp+8], ...
> + */
> + insn_off = outgoing_arg_base - outgoing_rsp - insn_off - 16;
> + dst_reg = BPF_REG_FP;
> + }
Does subtracting the BPF offset reverse the memory layout within a stack slot?
If a BPF program accesses the upper 4 bytes of Arg 7 (offset -12), the formula
evaluates to BASE - (-12) - 16 = BASE - 4. This points out-of-bounds below
the stack pointer. Similarly, writing to the upper 4 bytes of Arg 8 (-20)
would evaluate to BASE + 4, overwriting the upper half of Arg 7.
[ ... ]
> @@ -2149,7 +2257,17 @@ st: if (is_imm8(insn->off))
> case BPF_STX | BPF_MEM | BPF_H:
> case BPF_STX | BPF_MEM | BPF_W:
> case BPF_STX | BPF_MEM | BPF_DW:
> - emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
> + if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
> + /* Arg 6: store register value in r9 */
> + EMIT_mov(X86_REG_R9, src_reg);
> + break;
> + }
Will sub-register stores to Arg 6 bypass this check? If a program writes to
the upper 4 bytes of outgoing Arg 6 (insn->off == -4), the exact check fails
and it will fall through to memory translation, potentially writing
out-of-bounds.
Also, does EMIT_mov() unconditionally emit a 64-bit move, ignoring the BPF_B,
BPF_H, and BPF_W sizes? If a 1-byte store is performed, this might write 64
bits to R9 instead of the intended size.
[ ... ]
> @@ -2248,6 +2366,19 @@ st: if (is_imm8(insn->off))
> case BPF_LDX | BPF_PROBE_MEMSX | BPF_H:
> case BPF_LDX | BPF_PROBE_MEMSX | BPF_W:
> insn_off = insn->off;
> + if (src_reg == BPF_REG_PARAMS) {
> + if (insn_off == 8) {
> + /* Incoming arg 6: read from r9 */
> + EMIT_mov(dst_reg, X86_REG_R9);
> + break;
> + }
> + src_reg = BPF_REG_FP;
What happens if a program reads a sub-register of incoming Arg 6?
If insn_off == 12 (the upper 4 bytes), the exact check fails, and src_reg
becomes BPF_REG_FP (rbp). The resulting memory read from [rbp + 12]
corresponds to the upper 4 bytes of the kernel return address, which could
be leaked to the BPF program.
Similarly, if the program performs a partial load of Arg 6 (insn_off == 8),
EMIT_mov() moves the full 64-bit value from R9. Could this bypass the
verifier's bounds tracking if the verifier assumes the upper 56 bits are
properly zeroed for a 1-byte read?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=15
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 12/23] bpf: Enable r11 based insns
2026-05-07 21:30 ` [PATCH bpf-next v2 12/23] bpf: Enable r11 based insns Yonghong Song
@ 2026-05-09 2:59 ` sashiko-bot
2026-05-10 17:11 ` Yonghong Song
0 siblings, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 2:59 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Critical] The patch enables the use of `BPF_REG_PARAMS` (`r11`) in BPF instructions but fails to include the corresponding JIT compiler and interpreter support, allowing an attacker to achieve arbitrary kernel memory reads and writes.
--
commit ae81414063b0b3f7e1da69b7583407d2bc29cb7c
Author: Yonghong Song <yonghong.song@linux.dev>
bpf: Enable r11 based insns
This patch enables the use of BPF_REG_PARAMS (r11) for stack argument
accesses in the BPF verifier. It updates check_and_resolve_insns() to
accept ldx, st, and stx instructions that use r11 if they match the
stack argument formats.
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 4827df9f817c2..e208807e79959 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -18004,11 +18004,12 @@ static int check_and_resolve_insns(struct bpf_verifier_env *env)
> return err;
>
> for (i = 0; i < insn_cnt; i++, insn++) {
> - if (insn->dst_reg >= MAX_BPF_REG) {
> + if (insn->dst_reg >= MAX_BPF_REG &&
> + !is_stack_arg_st(insn) && !is_stack_arg_stx(insn)) {
Does enabling r11 here without corresponding JIT and interpreter updates
introduce a regression?
If a program uses r11, but the JIT backends lack a mapping for it, could
it map to an unexpected hardware register? For example, on x86-64, if
reg2hex[11] defaults to 0, it maps to %rax (BPF_REG_0).
Since a BPF program can control the value of BPF_REG_0, an instruction
like BPF_STX [r11 - 8], r1 might be compiled to mov %rdi, -8(%rax). Could
this allow an arbitrary memory write bypassing the verifier sandbox?
In non-JIT environments, could regs[11] in the interpreter be
uninitialized, leading to uninitialized pointer dereferences?
It appears this regression is addressed later in the series by commit
ee1dd512e669a ("bpf,x86: Implement JIT support for stack arguments")
and commit 6ccf042549a39 ("bpf, arm64: Add JIT support for stack
arguments"), but could applying this patch on its own break bisection?
> verbose(env, "R%d is invalid\n", insn->dst_reg);
> return -EINVAL;
> }
> - if (insn->src_reg >= MAX_BPF_REG) {
> + if (insn->src_reg >= MAX_BPF_REG && !is_stack_arg_ldx(insn)) {
> verbose(env, "R%d is invalid\n", insn->src_reg);
> return -EINVAL;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=12
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots
2026-05-07 21:30 ` [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
@ 2026-05-09 4:05 ` sashiko-bot
2026-05-10 16:41 ` Yonghong Song
1 sibling, 1 reply; 68+ messages in thread
From: sashiko-bot @ 2026-05-09 4:05 UTC (permalink / raw)
To: Yonghong Song; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Incorrect tracking and propagation of stack argument precision at subprogram boundaries in `backtrack_insn()`.
--
commit 82222fc9756a9a239fae1b810efc4aa155d2973f
Author: Yonghong Song <yonghong.song@linux.dev>
bpf: Add precision marking and backtracking for stack argument slots
This commit extends the precision marking and backtracking infrastructure to
support stack argument slots (r11-based accesses). It ensures precision
demands for scalar values passed through stack arguments are properly
tracked.
> diff --git a/kernel/bpf/backtrack.c b/kernel/bpf/backtrack.c
> index 93738a076c888..17f254a02831f 100644
> --- a/kernel/bpf/backtrack.c
> +++ b/kernel/bpf/backtrack.c
[ ... ]
> @@ -336,6 +351,19 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
> return 0;
> bt_clear_reg(bt, load_reg);
>
> + if (hist && hist->flags & INSN_F_STACK_ARG_ACCESS) {
> + spi = hist->spi;
> + /*
> + * Stack arg read: callee reads from r11+off, but
> + * the data lives in the caller's stack_arg_regs.
> + * Set the mask in the caller frame so precision
> + * is marked in the caller's slot at the callee
> + * entry checkpoint.
> + */
> + bt_set_frame_stack_arg_slot(bt, bt->frame - 1, spi);
> + return 0;
> + }
> +
When backtracking a callee's read of an incoming stack argument via BPF_LDX,
the logic correctly sets the mask in the caller frame (bt->frame - 1).
[ ... ]
> @@ -426,6 +465,18 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
> bpf_bt_set_frame_reg(bt, bt->frame - 1, i);
> }
> }
> + /*
> + * propagate callee's incoming stack arg precision
> + * to caller's outgoing stack arg slots
> + */
> + if (bt_stack_arg_mask(bt)) {
> + for (i = 0; i < MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS; i++) {
> + if (!bt_is_frame_stack_arg_slot_set(bt, bt->frame, i))
> + continue;
> + bt_clear_frame_stack_arg_slot(bt, bt->frame, i);
> + bt_set_frame_stack_arg_slot(bt, bt->frame - 1, i);
> + }
> + }
Does this propagation logic operate on the correct mask?
Since the callee's BPF_LDX instructions already marked the incoming stack
arguments directly in the caller's frame (bt->frame - 1),
bt_stack_arg_mask(bt) for the current frame (bt->frame) shouldn't contain
any incoming stack arguments here.
If bt_stack_arg_mask(bt) is non-zero at this point, wouldn't that indicate
leftover uninitialized outgoing stack arguments from the callee that were
never matched by a BPF_STX?
Instead of propagating these leftovers to the caller frame, should this
condition be treated as a verifier_bug(), similar to how leftover stack
slots are handled just above this code?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260507212942.1122000-1-yonghong.song@linux.dev?part=7
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions
2026-05-07 22:26 ` bot+bpf-ci
@ 2026-05-09 12:52 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-09 12:52 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/7/26 12:26 AM, bot+bpf-ci@kernel.org wrote:
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index 321b9d69cf9c..1fba16fced28 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -427,6 +427,10 @@ struct bpf_func_state {
>> * `stack`. allocated_stack is always a multiple of BPF_REG_SIZE.
>> */
>> int allocated_stack;
>> +
>> + bool no_stack_arg_load;
>> + u16 out_stack_arg_cnt; /* Number of outgoing on-stack argument slots */
>> + struct bpf_reg_state *stack_arg_regs; /* Outgoing on-stack arguments */
>> };
> The new field no_stack_arg_load is placed after allocated_stack, but
> copy_stack_state() never assigns it.
>
> The comment at line 417 says 'The following fields should be last. See
> copy_func_state()', and copy_func_state() does this:
>
> memcpy(dst, src, offsetof(struct bpf_func_state, stack));
>
> so any field after stack is not memcpy'd. copy_stack_state() copies
> stack, allocated_stack, out_stack_arg_cnt, and stack_arg_regs, but
> not no_stack_arg_load.
>
> Consequence: whenever a state is cloned (e.g., via push_stack() at
> conditional jumps), the pushed state inherits no_stack_arg_load=0
> regardless of the source state's value.
>
> This breaks the invariant the commit message describes: 'all stack arg
> reads must be ahead of any stack arg write' and 'all stack arg reads must
> be before any bpf func, kfunc and helpers'.
>
> A BPF program can write to an outgoing stack arg (which sets
> cur->no_stack_arg_load=true in check_stack_arg_write), then execute a
> conditional jump which calls push_stack(). The pushed state's
> no_stack_arg_load is reset to 0, so a subsequent stack arg read would be
> accepted by check_stack_arg_read() when it should be rejected.
>
> The same issue applies to state pruning: func_states_equal/stack_arg_safe
> does not compare no_stack_arg_load either, so a state with
> no_stack_arg_load=true can be pruned to an old state with
> no_stack_arg_load=false incorrectly.
>
> Should no_stack_arg_load be placed before stack (to be memcpy'd) or should
> copy_stack_state() explicitly assign it?
This is my fault. In this version, I tried to make the code easier to understand,
e.g. by putting related fields together. But in this particular case I should
not have done that, as I forgot these fields are involved in state copying. Will fix in the next revision.
>
> [ ... ]
>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 3bafb7ad2ba7..61d6663498f7 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -1362,6 +1362,18 @@ static int copy_stack_state(struct bpf_func_state *dst, const struct bpf_func_st
>> return -ENOMEM;
>>
>> dst->allocated_stack = src->allocated_stack;
>> +
>> + /* copy stack args state */
>> + n = src->out_stack_arg_cnt;
>> + if (n) {
>> + dst->stack_arg_regs = copy_array(dst->stack_arg_regs, src->stack_arg_regs, n,
>> + sizeof(struct bpf_reg_state),
>> + GFP_KERNEL_ACCOUNT);
>> + if (!dst->stack_arg_regs)
>> + return -ENOMEM;
>> + }
>> +
>> + dst->out_stack_arg_cnt = src->out_stack_arg_cnt;
>> return 0;
>> }
> copy_stack_state() copies stack, allocated_stack, stack_arg_regs, and
> out_stack_arg_cnt, but no_stack_arg_load is missing.
>
> The field is declared in struct bpf_func_state after allocated_stack, so
> it lives in the region that copy_func_state's memcpy excludes:
>
> static int copy_func_state(struct bpf_func_state *dst,
> const struct bpf_func_state *src)
> {
> memcpy(dst, src, offsetof(struct bpf_func_state, stack));
> return copy_stack_state(dst, src);
> }
>
> The memcpy stops at stack, so no_stack_arg_load is never copied. Every
> state copy path through bpf_copy_verifier_state(), push_stack(), and
> pop_stack() silently resets the flag.
>
> The commit enforces the invariant 'all stack arg reads must precede any
> stack arg write or any call within a frame' by setting
> state->no_stack_arg_load = true in check_stack_arg_write() and in
> do_check_insn()'s call branch, and checking the flag in
> check_stack_arg_read().
>
> When the verifier encounters a branch, push_stack() calls
> bpf_copy_verifier_state() which loses no_stack_arg_load. A BPF program can
> perform a write or call, conditionally branch, and after the branch perform
> a stack arg read that should be rejected. The read passes verification
> because the flag was reset to false on the restored state.
>
> The commit message says this invariant is required because 'JIT may emit
> mov insns for read/write with the same register and bpf function, kfunc and
> helper will invalidate all arguments immediately after the call'. Bypassing
> the flag allows JIT-generated code that reads stale or clobbered stack
> argument slots at runtime.
>
> Should copy_stack_state() add 'dst->no_stack_arg_load =
> src->no_stack_arg_load;'?
Again, will fix the no_stack_arg_load issue in the next revision.
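Concretely, the plan is either to move the field back before 'stack' so the
memcpy() in copy_func_state() covers it, or to copy it explicitly in
copy_stack_state(), e.g. (sketch):

	dst->allocated_stack = src->allocated_stack;
	dst->no_stack_arg_load = src->no_stack_arg_load;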
>
> [ ... ]
>
>> @@ -4050,6 +4080,105 @@ static int check_stack_write(struct bpf_verifier_env *env,
>> return err;
>> }
>>
>> +/*
>> + * Write a value to the outgoing stack arg area.
>> + * off is a negative offset from r11 (e.g. -8 for arg6, -16 for arg7).
>> + */
>> +static int check_stack_arg_write(struct bpf_verifier_env *env, struct bpf_func_state *state,
>> + int off, struct bpf_reg_state *value_reg)
>> +{
>> + int max_stack_arg_regs = MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS;
>> + struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
>> + int spi = -off / BPF_REG_SIZE - 1;
>> + struct bpf_func_state *cur;
>> + struct bpf_reg_state *arg;
>> + int err;
>> +
>> + if (spi >= max_stack_arg_regs) {
>> + verbose(env, "stack arg write offset %d exceeds max %d stack args\n",
>> + off, max_stack_arg_regs);
>> + return -EINVAL;
>> + }
>> +
>> + err = grow_stack_arg_slots(env, state, spi + 1);
>> + if (err)
>> + return err;
>> +
>> + /* Track the max outgoing stack arg slot count. */
>> + if (spi + 1 > subprog->max_out_stack_arg_cnt)
>> + subprog->max_out_stack_arg_cnt = spi + 1;
>> +
>> + cur = env->cur_state->frame[env->cur_state->curframe];
> This isn't a bug, but the local variable cur is assigned but never read.
> Both call sites already pass state = env->cur_state->frame[curframe], so
> cur is always identical to the state parameter. Should this line be
> removed?
Right, 'cur = ...' should be removed.
>
> [ ... ]
>
>> @@ -16961,6 +17135,14 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
> The register validation in check_and_resolve_insns() allows r11/BPF_REG_PARAMS
> through for all instruction classes, but only BPF_LDX, BPF_STX, and BPF_ST
> have explicit handling.
>
> In v4 of this series (April 11), bot+bpf-ci@kernel.org raised a concern:
>
> If a BPF program uses r11 as src_reg or dst_reg in an ALU or JMP
> instruction, it would cause an out-of-bounds access to regs[11] (array
> is only regs[0-10]).
>
> Would it make sense to restrict the r11 exemption here to only the
> instruction classes that actually handle it (BPF_LDX, BPF_STX, BPF_ST),
> rather than allowing it globally?
>
> The author acknowledged this was a bug and agreed to restrict the r11
> exemption to only BPF_LDX, BPF_STX, BPF_ST. However, the current code does
> not have any exemption for r11.
>
> Looking at check_and_resolve_insns(), it checks 'insn->dst_reg >=
> MAX_BPF_REG' without exception. Since BPF_REG_PARAMS == MAX_BPF_REG == 11,
> this will reject all uses of r11 including legitimate LDX/STX/ST operations
> on stack arguments.
>
> Should check_and_resolve_insns() allow r11 for the specific instruction
> classes that handle stack arguments (BPF_LDX, BPF_STX, BPF_ST)?
This is actually handled in a later patch, "bpf: Enable r11 based insns",
so we should be fine for this one.
>
> Reference:
> https://lore.kernel.org/bpf/bcd1b7033de7f597d73b3991f8fc4d8b502fce451fc288d909ae72a717578a0d@mail.kernel.org/
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions
2026-05-08 18:00 ` Alexei Starovoitov
@ 2026-05-09 12:55 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-09 12:55 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On 5/8/26 8:00 PM, Alexei Starovoitov wrote:
> On Thu May 7, 2026 at 2:30 PM PDT, Yonghong Song wrote:
>> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
>> index fba9e8c00878..0ff8cdf4394f 100644
>> --- a/kernel/bpf/fixups.c
>> +++ b/kernel/bpf/fixups.c
>> @@ -1123,6 +1123,9 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>
>> func[i]->aux->name[0] = 'F';
>> func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
>> + func[i]->aux->incoming_stack_arg_cnt =
>> + bpf_in_stack_arg_cnt(&env->subprog_info[i]);
>> + func[i]->aux->stack_arg_cnt = env->subprog_info[i].stack_arg_cnt;
> replied earlier. Should be no need for this copy.
Okay, will do.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 04/23] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
2026-05-07 22:11 ` bot+bpf-ci
@ 2026-05-09 13:05 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-09 13:05 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/7/26 12:11 AM, bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>> index 77af44d8a3ad..a33a5b4122f8 100644
>> --- a/kernel/bpf/btf.c
>> +++ b/kernel/bpf/btf.c
>> @@ -7880,6 +7880,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
>> }
>> args = (const struct btf_param *)(t + 1);
>> nargs = btf_type_vlen(t);
>> + sub->arg_cnt = nargs;
>> if (nargs > MAX_BPF_FUNC_REG_ARGS) {
>> if (!is_global)
>> return -EINVAL;
>> @@ -8067,7 +8068,6 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
>> return -EINVAL;
>> }
>>
>> - sub->arg_cnt = nargs;
>> sub->args_cached = true;
>>
>> return 0;
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
sub->arg_cnt is used in two places:
First, function btf_check_func_arg_match():
static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
const struct btf *btf,
struct bpf_reg_state *regs)
{
struct bpf_subprog_info *sub = subprog_info(env, subprog);
struct bpf_func_state *caller = cur_func(env);
struct bpf_verifier_log *log = &env->log;
u32 i;
int ret;
ret = btf_prepare_func_args(env, subprog);
if (ret)
return ret;
ret = check_outgoing_stack_args(env, caller, sub->arg_cnt);
if (ret)
return ret;
/* check that BTF function arguments match actual types that the
* verifier sees.
*/
for (i = 0; i < sub->arg_cnt; i++) {
...
}
...
}
In this case, if btf_prepare_func_args() returns a failure, sub->arg_cnt will
not be used any more, so the patch won't make a difference here.
Second, function do_check_common():
if (subprog || env->prog->type == BPF_PROG_TYPE_EXT) {
const char *sub_name = subprog_name(env, subprog);
struct bpf_subprog_arg_info *arg;
struct bpf_reg_state *reg;
if (env->log.level & BPF_LOG_LEVEL)
verbose(env, "Validating %s() func#%d...\n", sub_name, subprog);
ret = btf_prepare_func_args(env, subprog);
if (ret)
goto out;
if (subprog_is_exc_cb(env, subprog)) {
state->frame[0]->in_exception_callback_fn = true;
/*
* Global functions are scalar or void, make sure
* we return a scalar.
*/
if (subprog_returns_void(env, subprog)) {
verbose(env, "exception cb cannot return void\n");
ret = -EINVAL;
goto out;
}
/* Also ensure the callback only has a single scalar argument. */
if (sub->arg_cnt != 1 || sub->args[0].arg_type != ARG_ANYTHING) {
verbose(env, "exception cb only supports single integer argument\n");
ret = -EINVAL;
goto out;
}
}
for (i = BPF_REG_1; i <= min_t(u32, sub->arg_cnt, MAX_BPF_FUNC_REG_ARGS); i++) {
arg = &sub->args[i - BPF_REG_1];
reg = &regs[i];
if (arg->arg_type == ARG_PTR_TO_CTX) {
reg->type = PTR_TO_CTX;
mark_reg_known_zero(env, regs, i);
} else if (arg->arg_type == ARG_ANYTHING) {
reg->type = SCALAR_VALUE;
mark_reg_unknown(env, regs, i);
} else if (arg->arg_type == ARG_PTR_TO_DYNPTR) {
/* assume unspecial LOCAL dynptr type */
__mark_dynptr_reg(reg, BPF_DYNPTR_TYPE_LOCAL, true, ++env->id_gen);
} else if (base_type(arg->arg_type) == ARG_PTR_TO_MEM) {
reg->type = PTR_TO_MEM;
reg->type |= arg->arg_type &
(PTR_MAYBE_NULL | PTR_UNTRUSTED | MEM_RDONLY);
mark_reg_known_zero(env, regs, i);
reg->mem_size = arg->mem_size;
if (arg->arg_type & PTR_MAYBE_NULL)
reg->id = ++env->id_gen;
} else if (base_type(arg->arg_type) == ARG_PTR_TO_BTF_ID) {
reg->type = PTR_TO_BTF_ID;
if (arg->arg_type & PTR_MAYBE_NULL)
reg->type |= PTR_MAYBE_NULL;
if (arg->arg_type & PTR_UNTRUSTED)
reg->type |= PTR_UNTRUSTED;
if (arg->arg_type & PTR_TRUSTED)
reg->type |= PTR_TRUSTED;
mark_reg_known_zero(env, regs, i);
reg->btf = bpf_get_btf_vmlinux(); /* can't fail at this point */
reg->btf_id = arg->btf_id;
reg->id = ++env->id_gen;
} else if (base_type(arg->arg_type) == ARG_PTR_TO_ARENA) {
/* caller can pass either PTR_TO_ARENA or SCALAR */
mark_reg_unknown(env, regs, i);
} else {
verifier_bug(env, "unhandled arg#%d type %d",
i - BPF_REG_1 + 1, arg->arg_type);
ret = -EFAULT;
goto out;
}
}
if (env->prog->type == BPF_PROG_TYPE_EXT && sub->arg_cnt > MAX_BPF_FUNC_REG_ARGS) {
verbose(env, "freplace programs with >%d args not supported yet\n",
MAX_BPF_FUNC_REG_ARGS);
ret = -EINVAL;
goto out;
}
} else {
/* if main BPF program has associated BTF info, validate that
* it's matching expected signature, and otherwise mark BTF
* info for main program as unreliable
*/
if (env->prog->aux->func_info_aux) {
ret = btf_prepare_func_args(env, 0);
if (ret || sub->arg_cnt != 1 || sub->args[0].arg_type != ARG_PTR_TO_CTX)
env->prog->aux->func_info_aux[0].unreliable = true;
}
/* 1st arg to a function */
regs[BPF_REG_1].type = PTR_TO_CTX;
mark_reg_known_zero(env, regs, BPF_REG_1);
}
For the then branch of 'if (subprog || env->prog->type == BPF_PROG_TYPE_EXT)',
if btf_prepare_func_args() fails, the error is returned.
But in the else branch of 'if (subprog || env->prog->type == BPF_PROG_TYPE_EXT)',
if btf_prepare_func_args() fails, verification still continues.
So it makes sense to set sub->arg_cnt earlier in btf_prepare_func_args(),
since sub->arg_cnt is needed later on.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots
2026-05-07 22:11 ` bot+bpf-ci
@ 2026-05-09 13:08 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-09 13:08 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/7/26 12:11 AM, bot+bpf-ci@kernel.org wrote:
> commit 4c0111fe676cd1905f81c3bd7d7f87d41cc028f0
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> bpf: Add precision marking and backtracking for stack argument slots
>
> Extend the precision marking and backtracking infrastructure to
> support stack argument slots (r11-based accesses). Without this,
> precision demands for scalar values passed through stack arguments
> are silently dropped, which could allow the verifier to incorrectly
> prune states with different constant values in stack arg slots.
>
> This commit fixes a bug but lacks a Fixes: tag. Looking at the commit
> message, it addresses a verifier state pruning issue introduced when
> stack arguments were first supported in commit c364b7a0f7ae ("bpf:
> Support stack arguments for bpf functions").
>
> Should this include: Fixes: c364b7a0f7ae ("bpf: Support stack arguments
> for bpf functions")?
We should not add a 'Fixes' tag. At this stage, r11-based insns
are still rejected in check_and_resolve_insns(), so the precision marking
and backtracking handling won't be triggered.
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots
2026-05-07 22:11 ` bot+bpf-ci
@ 2026-05-09 13:29 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-09 13:29 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/7/26 12:11 AM, bot+bpf-ci@kernel.org wrote:
> commit 66a19f95afbdbdfe3c3c887c8656037ebcca1be1
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> bpf: Extend liveness analysis to track stack argument slots
>
> BPF_REG_PARAMS (R11) is at index MAX_BPF_REG, which is beyond the
> register tracking arrays in const_fold.c and liveness.c. Handle it
> explicitly to avoid out-of-bounds accesses.
>
> Extend the arg tracking dataflow to cover stack arg slots. Otherwise,
> pointers passed through stack args are invisible to liveness, causing
> the pointed-to stack slots to be incorrectly poisoned.
We should be okay since is_stack_arg_{st,stx,ldx} will explicitly
handle the liveness and then return. So there is no out-of-bounds access.
>
> This commit fixes a bug introduced in c364b7a0f7ae ("bpf: Support stack
> arguments for bpf functions"). That earlier commit added support for
> stack arguments but did not extend the liveness analysis to track them,
> resulting in incorrect poisoning of stack slots when pointers are passed
> through stack arguments.
>
> Should this commit include a Fixes tag referencing c364b7a0f7ae? The
> commit message describes fixing out-of-bounds accesses and incorrect
> poisoning that were introduced by the stack argument support.
>
> Suggested: Fixes: c364b7a0f7ae ("bpf: Support stack arguments for bpf functions")
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs
2026-05-08 18:06 ` [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Alexei Starovoitov
@ 2026-05-09 13:43 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-09 13:43 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau
On 5/8/26 8:06 PM, Alexei Starovoitov wrote:
> On Thu May 7, 2026 at 2:29 PM PDT, Yonghong Song wrote:
>> This patch set lifts the 5-argument limit by introducing stack-based
>> argument passing for BPF functions and kfunc's, coordinated with
>> compiler support in LLVM [1]. The compiler emits loads/stores through
> ...
>> [1] https://github.com/llvm/llvm-project/pull/189060
> Please land llvm bits.
Just merged.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 21/23] bpf, arm64: Map BPF_REG_0 to x8 instead of x7
2026-05-08 18:01 ` Alexei Starovoitov
@ 2026-05-09 13:44 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-09 13:44 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Jose E . Marchesi, kernel-team, Martin KaFai Lau, Puranjay Mohan
On 5/8/26 8:01 PM, Alexei Starovoitov wrote:
> On Thu May 7, 2026 at 2:31 PM PDT, Yonghong Song wrote:
>> From: Puranjay Mohan <puranjay@kernel.org>
>>
>> Move the BPF return value register from x7 to x8, freeing x7 for use
>> as an argument register. AAPCS64 designates x8 as the indirect result
>> location register; it is caller-saved and not used for argument
>> passing, making it a suitable home for BPF_REG_0.
>>
>> This is a prerequisite for stack argument support, which needs x5-x7
>> to pass arguments 6-8 to native kfuncs following the AAPCS64 calling
>> convention.
>>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> That's not right.
> It should probably be:
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
My mistake. Will fix.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots
2026-05-09 4:05 ` sashiko-bot
@ 2026-05-10 16:41 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 16:41 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 6:05 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [High] Incorrect tracking and propagation of stack argument precision at subprogram boundaries in `backtrack_insn()`.
> --
>
> commit 82222fc9756a9a239fae1b810efc4aa155d2973f
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> bpf: Add precision marking and backtracking for stack argument slots
>
> This commit extends the precision marking and backtracking infrastructure to
> support stack argument slots (r11-based accesses). It ensures precision
> demands for scalar values passed through stack arguments are properly
> tracked.
>
>> diff --git a/kernel/bpf/backtrack.c b/kernel/bpf/backtrack.c
>> index 93738a076c888..17f254a02831f 100644
>> --- a/kernel/bpf/backtrack.c
>> +++ b/kernel/bpf/backtrack.c
> [ ... ]
>> @@ -336,6 +351,19 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
>> return 0;
>> bt_clear_reg(bt, load_reg);
>>
>> + if (hist && hist->flags & INSN_F_STACK_ARG_ACCESS) {
>> + spi = hist->spi;
>> + /*
>> + * Stack arg read: callee reads from r11+off, but
>> + * the data lives in the caller's stack_arg_regs.
>> + * Set the mask in the caller frame so precision
>> + * is marked in the caller's slot at the callee
>> + * entry checkpoint.
>> + */
>> + bt_set_frame_stack_arg_slot(bt, bt->frame - 1, spi);
>> + return 0;
>> + }
>> +
> When backtracking a callee's read of an incoming stack argument via BPF_LDX,
> the logic correctly sets the mask in the caller frame (bt->frame - 1).
>
> [ ... ]
>> @@ -426,6 +465,18 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
>> bpf_bt_set_frame_reg(bt, bt->frame - 1, i);
>> }
>> }
>> + /*
>> + * propagate callee's incoming stack arg precision
>> + * to caller's outgoing stack arg slots
>> + */
>> + if (bt_stack_arg_mask(bt)) {
>> + for (i = 0; i < MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS; i++) {
>> + if (!bt_is_frame_stack_arg_slot_set(bt, bt->frame, i))
>> + continue;
>> + bt_clear_frame_stack_arg_slot(bt, bt->frame, i);
>> + bt_set_frame_stack_arg_slot(bt, bt->frame - 1, i);
>> + }
>> + }
> Does this propagation logic operate on the correct mask?
>
> Since the callee's BPF_LDX instructions already marked the incoming stack
> arguments directly in the caller's frame (bt->frame - 1),
> bt_stack_arg_mask(bt) for the current frame (bt->frame) shouldn't contain
> any incoming stack arguments here.
You are right here. bt_is_frame_stack_arg_slot_set(bt, bt->frame, i) should be 0,
so the whole loop above is not necessary.
>
> If bt_stack_arg_mask(bt) is non-zero at this point, wouldn't that indicate
> leftover uninitialized outgoing stack arguments from the callee that were
> never matched by a BPF_STX?
>
> Instead of propagating these leftovers to the caller frame, should this
> condition be treated as a verifier_bug(), similar to how leftover stack
> slots are handled just above this code?
Yes, verifier_bug() makes sense to capture potential issues. This will
be similar to potential bugs with register spills.
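Something along these lines (rough sketch for the next revision):

	/* Callee loads already marked incoming stack args in the caller
	 * frame, so nothing should remain in the current frame's mask.
	 */
	if (bt_stack_arg_mask(bt)) {
		verifier_bug(env, "backtracking: leftover stack arg precision mask");
		return -EFAULT;
	}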
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots
2026-05-09 0:59 ` sashiko-bot
@ 2026-05-10 16:47 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 16:47 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 2:59 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> - [High] record_load_store_access() mistakenly uses the tracking state of stack argument slot 0 as the base pointer for R11-based memory accesses, leading to spurious liveness marks.
> - [Low] Missing verifier logging for stack argument liveness state transitions.
> --
>
> commit 00a5fce117daa13f26c61ec5ffa3622241608fc1
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> bpf: Extend liveness analysis to track stack argument slots
>
> This commit extends liveness analysis to handle stack argument slots, preventing
> pointers passed through stack args from incorrectly poisoning the pointed-to
> stack slots. It explicitly handles BPF_REG_PARAMS (R11) to avoid out-of-bounds
> accesses and extends tracking arrays.
>
>> diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
>> index de0b4c6263ff2..eed7260f2bb0a 100644
>> --- a/kernel/bpf/liveness.c
>> +++ b/kernel/bpf/liveness.c
>> @@ -610,6 +610,24 @@ enum arg_track_state {
>> /* Track callee stack slots fp-8 through fp-512 (64 slots of 8 bytes each) */
>> #define MAX_ARG_SPILL_SLOTS 64
>>
>> +/* Track stack arg slots: outgoing starts at -(i+1)*8, incoming at +(i+1)*8 */
>> +#define MAX_STACK_ARG_SLOTS (MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS)
>> +
>> +/*
>> + * Combined register + stack arg tracking: R0-R10 at indices 0-10,
>> + * outgoing stack arg slots at indices MAX_BPF_REG..MAX_BPF_REG+6.
>> + */
>> +#define MAX_AT_TRACK_REGS (MAX_BPF_REG + MAX_STACK_ARG_SLOTS)
> Could this repurposed index cause a spurious liveness mark in
> record_load_store_access()?
>
> Because at[MAX_BPF_REG] is repurposed to track the value stored in outgoing
> stack argument slot 0, what happens when record_load_store_access() processes
> stack argument instructions like is_stack_arg_stx()?
>
> record_load_store_access() does this:
>
> case BPF_STX:
> if (BPF_MODE(insn->code) == BPF_ATOMIC) {
> ...
> } else {
> ptr = &at[insn->dst_reg];
> sz = -sz;
> }
>
> If the instruction is a stack argument access, insn->dst_reg is BPF_REG_PARAMS
> (which equals MAX_BPF_REG). It will fetch &at[MAX_BPF_REG]. If a BPF program
> previously stored a frame pointer (FP) into slot 0, at[MAX_BPF_REG] will reflect
> that FP.
>
> Does this mean record_load_store_access() will spuriously treat this FP as the
> base pointer and incorrectly mark the resulting local stack frame slot as
> accessed, corrupting the local stack liveness mask?
In this case, if the insn is an r11-based insn, record_load_store_access()
should just return early, since that function only tracks fp-based stack accesses.
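I.e., something like this near the top of record_load_store_access() (sketch):

	/* r11-based accesses target the stack arg area, not the fp-based
	 * program stack tracked here, so skip them.
	 */
	if (is_stack_arg_ldx(insn) || is_stack_arg_stx(insn) || is_stack_arg_st(insn))
		return;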
>
> [ ... ]
>> @@ -1616,7 +1663,8 @@ static int compute_subprog_args(struct bpf_verifier_env *env,
>> memcpy(at_out, at_in[i], sizeof(at_out));
>> memcpy(at_stack_out, at_stack_in[i], MAX_ARG_SPILL_SLOTS * sizeof(*at_stack_out));
>>
>> - arg_track_xfer(env, insn, idx, at_out, at_stack_out, instance, callsites);
>> + arg_track_xfer(env, insn, idx, at_out, at_stack_out,
>> + at_stack_arg_entry, instance, callsites);
>> arg_track_log(env, insn, idx, at_in[i], at_stack_in[i], at_out, at_stack_out);
> Are the state transitions for stack arguments logged here?
>
> In arg_track_log(), the loop for logging register transitions is strictly
> bounded by MAX_BPF_REG:
>
> for (i = 0; i < MAX_BPF_REG; i++) {
> if (arg_track_eq(&at_out[i], &at_in[i]))
> continue;
>
> Since the tracking arrays at_in and at_out were extended to MAX_AT_TRACK_REGS to
> include state for outgoing stack arguments at indices MAX_BPF_REG through
> MAX_AT_TRACK_REGS - 1, does arg_track_log() silently omit logging their state
> transitions?
Missed the logging issue for stack argument tracking. Will fix.
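Roughly, the loop bound in arg_track_log() needs to go from MAX_BPF_REG to
MAX_AT_TRACK_REGS (a sketch; the exact log text is still to be decided):

	for (i = 0; i < MAX_AT_TRACK_REGS; i++) {
		if (arg_track_eq(&at_out[i], &at_in[i]))
			continue;
		/* i < MAX_BPF_REG: log as a register transition (as today);
		 * i >= MAX_BPF_REG: log as outgoing stack arg slot
		 * (i - MAX_BPF_REG).
		 */
	}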
* Re: [PATCH bpf-next v2 10/23] bpf: Reject stack arguments in non-JITed programs
2026-05-09 2:10 ` sashiko-bot
@ 2026-05-10 16:59 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 16:59 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 4:10 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [High] The BPF verifier fails to reject the interpreter for a main program (i = 0) with stack arguments, leading to an uninitialized memory read of R11.
> --
>
> commit 4a48796d8e4e4607bc163fc6b59a84f3495c3d5c
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> bpf: Reject stack arguments in non-JITed programs
>
> This commit aims to prevent non-JITed programs from running if they use
> stack arguments, since the BPF interpreter does not populate the register
> used for stack arguments (R11). It updates the runtime selection and fixups
> phases to reject such programs.
>
>> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
>> index ae10b9ca018df..6f94b0ef3659f 100644
>> --- a/kernel/bpf/core.c
>> +++ b/kernel/bpf/core.c
> [ ... ]
>> @@ -2599,7 +2599,7 @@ struct bpf_prog *__bpf_prog_select_runtime(struct bpf_verifier_env *env, struct
>> goto finalize;
>>
>> if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) ||
>> - bpf_prog_has_kfunc_call(fp))
>> + bpf_prog_has_kfunc_call(fp) || fp->aux->stack_arg_cnt)
> Is fp->aux->stack_arg_cnt guaranteed to be initialized at this stage?
I would say yes. The main program is covered in bpf_jit_subprogs():
if (env->subprog_cnt <= 1) {
env->prog->aux->stack_arg_cnt = env->subprog_info[0].stack_arg_cnt;
return 0;
}
For two or more subprogs, stack_arg_cnt will be set up in jit_subprogs().
>
> If JIT is disabled system-wide (e.g. sysctl net.core.bpf_jit_enable=0),
> bpf_jit_subprogs() is skipped, leaving the count at 0. Furthermore, even if
> JIT is enabled but the program has multiple subprograms, bpf_jit_subprogs()
> only populates the cloned subprograms' auxiliary data, leaving the main
> program's prog->aux->stack_arg_cnt as 0.
>
> Could this allow the check to be bypassed for the main program?
If JIT is disabled, we have
for (i = 1; i < env->subprog_cnt; i++) {
if (bpf_in_stack_arg_cnt(&env->subprog_info[i])) {
verbose(env, "stack args are not supported in non-JITed programs\n");
return -EINVAL;
}
}
So for any subprog having >5 arguments, the interpreter will not run.
For the main program, the above env->prog->aux->stack_arg_cnt will later
be used in core.c to decide whether the interpreter can run or not.
>
>> jit_needed = true;
>>
>> if (!bpf_prog_select_interpreter(fp))
>> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
>> index 0ff8cdf4394f4..4c630d97eaade 100644
>> --- a/kernel/bpf/fixups.c
>> +++ b/kernel/bpf/fixups.c
> [ ... ]
>> @@ -1412,6 +1412,12 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
>> verbose(env, "calling kernel functions are not allowed in non-JITed programs\n");
>> return -EINVAL;
>> }
>> + for (i = 1; i < env->subprog_cnt; i++) {
> This loop explicitly starts at i = 1, which skips checking the main program.
>
> If an extension program (BPF_PROG_TYPE_EXT) attaches to a function with more
> than 5 arguments and has BTF_FUNC_STATIC linkage, it will inherit the incoming
> stack arguments. Since the check in core.c might fail to catch the main
> program due to the uninitialized stack_arg_cnt, can this extension program
> bypass both rejection mechanisms and fall back to the interpreter?
Currently, BPF_PROG_TYPE_EXT programs do not support more than 5 arguments.
>
>> + if (bpf_in_stack_arg_cnt(&env->subprog_info[i])) {
>> + verbose(env, "stack args are not supported in non-JITed programs\n");
>> + return -EINVAL;
>> + }
>> + }
> If the main program bypasses these checks and is run by the interpreter, it
> will attempt to read incoming stack arguments from regs[11]. Because the
> interpreter's setup in ___bpf_prog_run() only initializes R1-R5 and the frame
> pointer, would reading regs[11] leak uninitialized kernel stack memory or
> cause a crash if the uninitialized value is dereferenced as a pointer?
In the new revision, we will have
@@ -18160,8 +18347,11 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
*/
if (env->prog->aux->func_info_aux) {
ret = btf_prepare_func_args(env, 0);
- if (ret || sub->arg_cnt != 1 || sub->args[0].arg_type != ARG_PTR_TO_CTX)
+ if (ret || sub->arg_cnt != 1 || sub->args[0].arg_type != ARG_PTR_TO_CTX) {
env->prog->aux->func_info_aux[0].unreliable = true;
+ sub->arg_cnt = 1;
+ sub->stack_arg_cnt = 0;
+ }
}
In this case, stack_arg_cnt = 0 means the incoming stack argument count is 0,
so any read of incoming stack arguments will cause a verification failure.
>
>> if (env->subprog_cnt > 1 && env->prog->aux->tail_call_reachable) {
>> /* When JIT fails the progs with bpf2bpf calls and tail_calls
>> * have to be rejected, since interpreter doesn't support them yet.
* Re: [PATCH bpf-next v2 11/23] bpf: Prepare architecture JIT support for stack arguments
2026-05-09 2:19 ` sashiko-bot
@ 2026-05-10 17:05 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:05 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 4:19 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> - [Critical] Static functions with more than 5 parameters are not actually rejected when the JIT does not support stack arguments. This bypasses verifier stack initialization checks and allows the program to run, leading to arbitrary kernel memory access at runtime.
> - [Low] The patch returns the kernel-internal error code -ENOTSUPP instead of the standard -EOPNOTSUPP, causing generic 'Unknown error 524' messages in user-space.
> --
>
> commit e571d76e56fcb6ad5c6cc343d3221d66e6aec024
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> bpf: Prepare architecture JIT support for stack arguments
>
> This commit introduces a weak function bpf_jit_supports_stack_args() to allow
> architectures to indicate support for BPF stack arguments. It also updates
> the BTF preparation code to reject functions with more than 5 arguments if
> the JIT does not support stack arguments.
>
>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>> index ec3fb8c8f4eef..bf83b737d55cb 100644
>> --- a/kernel/bpf/btf.c
>> +++ b/kernel/bpf/btf.c
> [ ... ]
>> @@ -7886,8 +7886,14 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
>> MAX_BPF_FUNC_ARGS, tname, nargs);
>> return -EFAULT;
>> }
>> - if (nargs > MAX_BPF_FUNC_REG_ARGS)
>> + if (nargs > MAX_BPF_FUNC_REG_ARGS) {
>> + if (!bpf_jit_supports_stack_args()) {
>> + bpf_log(log, "JIT does not support function %s() with %d args\n",
>> + tname, nargs);
>> + return -ENOTSUPP;
> Should this return -EOPNOTSUPP instead of -ENOTSUPP?
>
> -ENOTSUPP is a kernel-internal error code (524). If this error bubbles up
> to user-space (for example, when loading a global BPF program), it might
> result in a confusing "Unknown error 524" message rather than a standard
> error.
>
> Also, does returning an error here actually reject static functions with
> more than 5 arguments?
>
> Looking at the code just above this diff, sub->arg_cnt is set to nargs
> before the check:
>
> kernel/bpf/btf.c:btf_prepare_func_args() {
> ...
> sub->arg_cnt = nargs;
> ...
> if (nargs > MAX_BPF_FUNC_REG_ARGS) {
> if (!bpf_jit_supports_stack_args()) {
> ...
> return -ENOTSUPP;
> }
> ...
> }
>
> If this returns -ENOTSUPP, it propagates to btf_check_subprog_call(),
> which returns the error. However, check_func_call() ignores non-EFAULT
> errors for static functions:
>
> kernel/bpf/verifier.c:check_func_call() {
> ...
> err = btf_check_subprog_call(env, subprog, caller, ...);
> if (err == -EFAULT)
> return err;
> ...
> }
>
> Since the error is ignored, verification continues without rejecting the
> program. Because the error was returned early, btf_check_func_arg_match()
> skips check_outgoing_stack_args(), meaning the caller is never forced to
> properly initialize the stack slots according to BTF types.
I will use the -EFAULT error code, so a prog with potential stack arguments
will be rejected due to -EFAULT. This also applies to the analysis below:
those scenarios will not happen since -EFAULT will prevent them.
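I.e. the hunk quoted above will become roughly:

	if (nargs > MAX_BPF_FUNC_REG_ARGS) {
		if (!bpf_jit_supports_stack_args()) {
			bpf_log(log, "JIT does not support function %s() with %d args\n",
				tname, nargs);
			/* -EFAULT instead of -ENOTSUPP so check_func_call()
			 * cannot ignore the error for static functions.
			 */
			return -EFAULT;
		}
		sub->stack_arg_cnt = nargs - MAX_BPF_FUNC_REG_ARGS;
	}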
>
> Since sub->arg_cnt was already set to nargs, bpf_in_stack_arg_cnt() will
> calculate that the function takes stack arguments. If the caller manually
> initializes the stack argument slots by writing to r11, the verifier tracks
> the register types and might accept the program.
>
> At runtime, since the JIT does not support stack arguments, BPF_REG_PARAMS
> (r11) is likely missing from the JIT's register translation maps.
> Uninitialized entries default to 0, which often maps to the return value
> register (like rax on x86 or x0 on arm64).
>
> Could this cause reads or writes against r11 + offset to be translated
> into memory operations against rax/x0 + offset? Since the return register
> is easily controllable by the BPF program, might this allow arbitrary
> kernel memory reads and writes?
>
>> + }
>> sub->stack_arg_cnt = nargs - MAX_BPF_FUNC_REG_ARGS;
>> + }
>>
>> if (is_global && nargs > MAX_BPF_FUNC_REG_ARGS) {
>> bpf_log(log, "global function %s has %d > %d args, stack args not supported\n",
* Re: [PATCH bpf-next v2 12/23] bpf: Enable r11 based insns
2026-05-09 2:59 ` sashiko-bot
@ 2026-05-10 17:11 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:11 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 4:59 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [Critical] The patch enables the use of `BPF_REG_PARAMS` (`r11`) in BPF instructions but fails to include the corresponding JIT compiler and interpreter support, allowing an attacker to achieve arbitrary kernel memory reads and writes.
> --
>
> commit ae81414063b0b3f7e1da69b7583407d2bc29cb7c
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> bpf: Enable r11 based insns
>
> This patch enables the use of BPF_REG_PARAMS (r11) for stack argument
> accesses in the BPF verifier. It updates check_and_resolve_insns() to
> accept ldx, st, and stx instructions that use r11 if they match the
> stack argument formats.
>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 4827df9f817c2..e208807e79959 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -18004,11 +18004,12 @@ static int check_and_resolve_insns(struct bpf_verifier_env *env)
>> return err;
>>
>> for (i = 0; i < insn_cnt; i++, insn++) {
>> - if (insn->dst_reg >= MAX_BPF_REG) {
>> + if (insn->dst_reg >= MAX_BPF_REG &&
>> + !is_stack_arg_st(insn) && !is_stack_arg_stx(insn)) {
> Does enabling r11 here without corresponding JIT and interpreter updates
> introduce a regression?
This is a new feature, so it won't cause a regression from this patch set's
perspective. Indeed, r11-based insns are enabled here while the JIT support
lands later in the series, but that is not a regression.
>
> If a program uses r11, but the JIT backends lack a mapping for it, could
> it map to an unexpected hardware register? For example, on x86-64, if
> reg2hex[11] defaults to 0, it maps to %rax (BPF_REG_0).
No. JIT support for stack arguments is not enabled yet. The same applies to
the example below.
>
> Since a BPF program can control the value of BPF_REG_0, an instruction
> like BPF_STX [r11 - 8], r1 might be compiled to mov %rdi, -8(%rax). Could
> this allow an arbitrary memory write bypassing the verifier sandbox?
>
> In non-JIT environments, could regs[11] in the interpreter be
> uninitialized, leading to uninitialized pointer dereferences?
The verifier should be able to prevent that. See previous comments
in "bpf: Reject stack arguments in non-JITed programs".
>
> It appears this regression is addressed later in the series by commit
> ee1dd512e669a ("bpf,x86: Implement JIT support for stack arguments")
> and commit 6ccf042549a39 ("bpf, arm64: Add JIT support for stack
> arguments"), but could applying this patch on its own break bisection?
I guess it should be okay. It won't break the kernel, although verification
will fail.
>
>> verbose(env, "R%d is invalid\n", insn->dst_reg);
>> return -EINVAL;
>> }
>> - if (insn->src_reg >= MAX_BPF_REG) {
>> + if (insn->src_reg >= MAX_BPF_REG && !is_stack_arg_ldx(insn)) {
>> verbose(env, "R%d is invalid\n", insn->src_reg);
>> return -EINVAL;
>> }
* Re: [PATCH bpf-next v2 14/23] bpf: Reject stack arguments if tail call reachable
2026-05-09 1:42 ` sashiko-bot
@ 2026-05-10 17:15 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:15 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 3:42 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [High] Order-dependent DFS logic and incomplete fallback checks allow bypassing the stack arguments restriction and cause false rejections.
> --
>
> commit d6cf15a73bfea82a8d40934257b79e361027a9b1
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> bpf: Reject stack arguments if tail call reachable
>
> Tail calls are deprecated and will be replaced by indirect calls
> in the future. Reject programs that combine tail calls with stack
> arguments rather than adding complexity for a deprecated feature.
>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index b6d07f7a52cdf..95fe21d05c100 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
> [ ... ]
>> @@ -5270,14 +5275,23 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
>> * this info will be utilized by JIT so that we will be preserving the
>> * tail call counter throughout bpf2bpf calls combined with tailcalls
>> */
>> - if (tail_call_reachable)
>> + if (tail_call_reachable) {
>> for (tmp = idx; tmp >= 0; tmp = dinfo[tmp].caller) {
>> if (subprog[tmp].is_exception_cb) {
>> verbose(env, "cannot tail call within exception cb\n");
>> return -EINVAL;
>> }
>> + if (subprog_has_stack_args(&subprog[tmp])) {
>> + verbose(env, "tail_calls are not allowed in programs with stack args\n");
>> + return -EINVAL;
>> + }
> Does this create an order dependence where the verifier might falsely reject
> valid programs?
>
> If the main program calls a subprogram with a tail call, the
> tail_call_reachable flag becomes true. If the main program subsequently calls
> an independent subprogram that uses stack arguments, the flag can leak into
> that branch, causing the program to be rejected. Would reversing the call
> order cause the same program to pass?
The requirement here is that if any subprog has both tail_call_reachable
and >5 arguments, we should reject the bpf program. So the ordering issue
does not really matter here.
>
>> subprog[tmp].tail_call_reachable = true;
>> }
>> + } else if (!idx && subprog[0].has_tail_call && subprog_has_stack_args(&subprog[0])) {
>> + verbose(env, "tail_calls are not allowed in programs with stack args\n");
>> + return -EINVAL;
>> + }
> Does this fallback check miss nested stack arguments?
>
> If the main program directly contains a tail call, the tail_call_reachable
> variable remains false. If it then calls a wrapper function that passes >5
> arguments to another subprogram, won't the nested stack arguments be missed
> because the fallback only checks subprog[0]?
>
> Can this also bypass restrictions for callback roots?
>
> Since the fallback hardcodes !idx and subprog[0], if an exception callback
> (which is verified with idx > 0) contains a direct tail call and uses outgoing
> stack arguments, it bypasses the main check loop because tail_call_reachable
> is false. Will it also bypass this fallback block because !idx is false?
>
> Could we decouple this restriction from the call graph traversal and globally
> check env->subprog_info for the presence of both features in the program
> instead?
Again, see the criterion above.
* Re: [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments
2026-05-07 22:26 ` bot+bpf-ci
@ 2026-05-10 17:21 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:21 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
puranjay, eddyz87, clm, ihor.solodrai
On 5/7/26 12:26 AM, bot+bpf-ci@kernel.org wrote:
>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>> index ea9e707e8abf..67c2f4a3b9cc 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
> [ ... ]
>
>> @@ -1664,16 +1692,45 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
>> int i, excnt = 0;
>> int ilen, proglen = 0;
>> u8 *ip, *prog = temp;
>> + u16 stack_arg_depth, incoming_stack_arg_depth, outgoing_stack_arg_depth; /* in bytes */
>> + u16 outgoing_rsp;
>> u32 stack_depth;
>> + int callee_saved_size;
>> + s32 outgoing_arg_base;
>> int err;
>>
>> stack_depth = bpf_prog->aux->stack_depth;
>> + stack_arg_depth = bpf_prog->aux->stack_arg_cnt * 8;
>> + incoming_stack_arg_depth = bpf_prog->aux->incoming_stack_arg_cnt * 8;
>> + outgoing_stack_arg_depth = stack_arg_depth - incoming_stack_arg_depth;
> Can this u16 subtraction underflow for leaf subprograms? When a leaf
The maximum number of stack arguments is 7, so the total depth is at most
56 bytes and underflow is not possible.
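For reference, the bound follows from the arg limits:

	/* MAX_STACK_ARG_SLOTS = MAX_BPF_FUNC_ARGS (12) - MAX_BPF_FUNC_REG_ARGS (5) = 7,
	 * so stack_arg_depth <= 7 * 8 = 56 bytes, far below the u16 limit.
	 */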
> function receives stack arguments but makes no calls with stack arguments,
> incoming_stack_arg_depth would be non-zero (e.g., 16 bytes for 2 args),
> but stack_arg_depth would be zero since the verifier never updates it
> for functions that don't make calls.
>
> The subtraction would wrap around modulo 65536:
> outgoing_stack_arg_depth = 0 - 16 = 0xFFF0 (about 65520 bytes)
This is fixed in this patch set. sub->stack_arg_cnt is set in
btf_prepare_func_args(), which captures the number of incoming args.
>
> Later in do_jit():
>
>> @@ -1700,6 +1757,42 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
>> push_r12(&prog);
>> push_callee_regs(&prog, callee_regs_used);
>> }
>> +
>> + /* Compute callee-saved register area size. */
> [ ... ]
>
>> + /*
>> + * Allocate outgoing stack arg area for args 7+ only.
>> + * Arg 6 goes into r9 register, not on stack.
>> + */
>> + outgoing_rsp = outgoing_stack_arg_depth > 8 ? outgoing_stack_arg_depth - 8 : 0;
>> + emit_sub_rsp(&prog, outgoing_rsp);
> This would compute outgoing_rsp = 0xFFF0 - 8 = 0xFFE8 (about 65512 bytes),
> and emit_sub_rsp() would generate:
> sub rsp, 0xFFE8
>
> allocating a 65 KB stack frame that overflows the kernel stack
> (THREAD_SIZE is only 16 KB on x86_64).
>
> This issue was identified by bot+bpf-ci@kernel.org in v1 (Fri, 24 Apr
> 2026 18:00:42 +0000) at
> https://lore.kernel.org/bpf/e1b7281b2a52b18ae2a9f7d41091994eec0bba81e96e64adbe4c072d915acfc2@mail.kernel.org/
>
> The author acknowledged this bug and promised to fix it with a selftest
> in the next revision, but the current code still contains the vulnerable
> subtraction without underflow protection.
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
* Re: [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments
2026-05-09 2:21 ` sashiko-bot
@ 2026-05-10 17:22 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:22 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 4:21 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
> - [Critical] The offset translation formula for outgoing stack arguments (Args 7+) negates the BPF offset, reversing the memory layout and causing partial accesses to corrupt the stack.
> - [Critical] Immediate stores (`BPF_ST`) of 8, 16, or 32 bits to Arg 6 bypass the register routing check, writing out-of-bounds to the stack and leaving Arg 6 uninitialized.
> - [Critical] Partial loads and stores (`BPF_LDX` / `BPF_STX`) to Arg 6 unconditionally emit 64-bit register moves, ignoring size semantics and breaking verifier bounds tracking.
> - [Critical] Sub-register accesses to Arg 6 bypass exact-offset interception, falling through to memory operations that leak the kernel return address or corrupt the stack.
> --
>
> commit ee1dd512e669afb6db2ee6c05d9e78b43305dc79
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> bpf,x86: Implement JIT support for stack arguments
>
> This patch adds x86_64 JIT support for BPF functions and kfuncs with more
> than 5 arguments. The extra arguments are passed through a stack area
> addressed by the R11 register (BPF_REG_PARAMS), with argument 6 being routed
> to the R9 register.
>
>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>> index ea9e707e8abff..67c2f4a3b9cc3 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
> [ ... ]
>> @@ -2134,12 +2227,27 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
>> EMIT1(0xC7);
>> goto st;
>> case BPF_ST | BPF_MEM | BPF_DW:
>> + if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
>> + /* Arg 6: store immediate in r9 register */
>> + emit_mov_imm64(&prog, X86_REG_R9, imm32 >> 31, (u32)imm32);
>> + break;
>> + }
> Are stores of 8, 16, or 32 bits to Arg 6 missing the register routing check?
> Because BPF_B, BPF_H, and BPF_W sizes jump directly to the st: label, they
> appear to skip this R9 check. Could this cause them to mistakenly apply the
> stack argument translation and write out-of-bounds, leaving R9 uninitialized?
Currently, all r11-based ldx/stx/st insns support only 8-byte values, so we
do not have the issues above or below.
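As a sketch of that constraint (is_stack_arg_stx() is this series' helper;
the exact body shown here is an assumption):

	static bool is_stack_arg_stx(const struct bpf_insn *insn)
	{
		return BPF_CLASS(insn->code) == BPF_STX &&
		       BPF_MODE(insn->code) == BPF_MEM &&
		       BPF_SIZE(insn->code) == BPF_DW &&
		       insn->dst_reg == BPF_REG_PARAMS;
	}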
>
>> EMIT2(add_1mod(0x48, dst_reg), 0xC7);
>>
>> -st: if (is_imm8(insn->off))
>> - EMIT2(add_1reg(0x40, dst_reg), insn->off);
>> +st: insn_off = insn->off;
>> + if (dst_reg == BPF_REG_PARAMS) {
>> + /*
>> + * Args 7+: reverse BPF negative offsets to
>> + * x86 positive rsp offsets.
>> + * BPF off=-16 → [rsp+0], off=-24 → [rsp+8], ...
>> + */
>> + insn_off = outgoing_arg_base - outgoing_rsp - insn_off - 16;
>> + dst_reg = BPF_REG_FP;
>> + }
> Does subtracting the BPF offset reverse the memory layout within a stack slot?
>
> If a BPF program accesses the upper 4 bytes of Arg 7 (offset -12), the formula
> evaluates to BASE - (-12) - 16 = BASE - 4. This points out-of-bounds below
> the stack pointer. Similarly, writing to the upper 4 bytes of Arg 8 (-20)
> would evaluate to BASE + 4, overwriting the upper half of Arg 7.
>
> [ ... ]
>> @@ -2149,7 +2257,17 @@ st: if (is_imm8(insn->off))
>> case BPF_STX | BPF_MEM | BPF_H:
>> case BPF_STX | BPF_MEM | BPF_W:
>> case BPF_STX | BPF_MEM | BPF_DW:
>> - emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
>> + if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
>> + /* Arg 6: store register value in r9 */
>> + EMIT_mov(X86_REG_R9, src_reg);
>> + break;
>> + }
> Will sub-register stores to Arg 6 bypass this check? If a program writes to
> the upper 4 bytes of outgoing Arg 6 (insn->off == -4), the exact check fails
> and it will fall through to memory translation, potentially writing
> out-of-bounds.
>
> Also, does EMIT_mov() unconditionally emit a 64-bit move, ignoring the BPF_B,
> BPF_H, and BPF_W sizes? If a 1-byte store is performed, this might write 64
> bits to R9 instead of the intended size.
>
> [ ... ]
>> @@ -2248,6 +2366,19 @@ st: if (is_imm8(insn->off))
>> case BPF_LDX | BPF_PROBE_MEMSX | BPF_H:
>> case BPF_LDX | BPF_PROBE_MEMSX | BPF_W:
>> insn_off = insn->off;
>> + if (src_reg == BPF_REG_PARAMS) {
>> + if (insn_off == 8) {
>> + /* Incoming arg 6: read from r9 */
>> + EMIT_mov(dst_reg, X86_REG_R9);
>> + break;
>> + }
>> + src_reg = BPF_REG_FP;
> What happens if a program reads a sub-register of incoming Arg 6?
> If insn_off == 12 (the upper 4 bytes), the exact check fails, and src_reg
> becomes BPF_REG_FP (rbp). The resulting memory read from [rbp + 12]
> corresponds to the upper 4 bytes of the kernel return address, which could
> be leaked to the BPF program.
>
> Similarly, if the program performs a partial load of Arg 6 (insn_off == 8),
> EMIT_mov() moves the full 64-bit value from R9. Could this bypass the
> verifier's bounds tracking if the verifier assumes the upper 56 bits are
> properly zeroed for a 1-byte read?
>
* Re: [PATCH bpf-next v2 17/23] selftests/bpf: Add tests for stack argument validation
2026-05-09 1:30 ` sashiko-bot
@ 2026-05-10 17:23 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:23 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 3:30 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [Low] Unconditional use of the 'r11' register in inline assembly will cause selftest build failures on older Clang compilers.
> --
>
> commit 6f66b8fd81d82b7d1f5f683d16976b8d486d440b
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> selftests/bpf: Add tests for stack argument validation
>
> This commit adds negative tests to verify that kfuncs properly reject large
> structs passed as stack arguments. It also adds tests to ensure the verifier
> rejects invalid uses of the r11 register for stack arguments.
>
>> diff --git a/tools/testing/selftests/bpf/progs/stack_arg_fail.c b/tools/testing/selftests/bpf/progs/stack_arg_fail.c
>> new file mode 100644
>> index 0000000000000..c6db302549cbd
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/stack_arg_fail.c
> [ ... ]
>
>> +#else
>> +
>> +SEC("tc")
>> +__description("stack_arg_fail: not supported, dummy test")
>> +__success
>> +int test_stack_arg_big(struct __sk_buff *skb)
>> +{
>> + return 0;
>> +}
>> +
>> +#endif
>> +
>> +SEC("socket")
>> +__description("r11 in ALU instruction")
>> +__failure __msg("R11 is invalid")
>> +__naked void r11_alu_reject(void)
>> +{
>> + asm volatile (
>> + "r11 += 1;"
> Could these inline assembly blocks using r11 cause build failures on older
> Clang compilers?
>
> The r11 register is a relatively new addition to the BPF ISA and LLVM backend.
> Because these naked assembly functions are placed outside of the
> __BPF_FEATURE_STACK_ARGUMENT guard, compiling this file with an older
> toolchain might result in fatal assembler errors regarding an invalid operand.
>
> Would it be safer to move these r11 tests inside the feature guard to
> maintain selftest compatibility across toolchains?
Indeed. Will fix.
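Something along these lines (a sketch; the guard macro is the one already
used in this file, and the test body is abbreviated):

	#if defined(__BPF_FEATURE_STACK_ARGUMENT)
	SEC("socket")
	__description("r11 in ALU instruction")
	__failure __msg("R11 is invalid")
	__naked void r11_alu_reject(void)
	{
		asm volatile (
			"r11 += 1;"
			"exit;"
			::: __clobber_all);
	}
	#endif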
>
> [ ... ]
>
* Re: [PATCH bpf-next v2 18/23] selftests/bpf: Add BTF fixup for __naked subprog parameter names
2026-05-09 1:40 ` sashiko-bot
@ 2026-05-10 17:24 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:24 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 3:40 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [Medium] The BPF object 'tobj' is leaked when 'fixup_btf_from_path()' fails.
> --
>
> commit af529dd65e68c8ac479551b0ac6332c3b5f56771
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> selftests/bpf: Add BTF fixup for __naked subprog parameter names
>
> This commit adds a mechanism to fix up missing parameter names in the BTF
> FUNC_PROTO entries of __naked subprogs by reading them from a separate BTF
> file. This allows the verifier to resolve stack argument slots by name.
>
>> diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
>> --- a/tools/testing/selftests/bpf/test_loader.c
>> +++ b/tools/testing/selftests/bpf/test_loader.c
> [ ... ]
>
>> @@ -1231,13 +1351,27 @@ void run_subtest(struct test_loader *tester,
>> }
>> }
>>
>> - /* Implicitly reset to NULL if next test case doesn't specify */
>> + /* Implicitly reset to NULL if next test case doesn't specify.
>> + * btf_custom_func_path also serves as btf_custom_path for kfunc resolution.
>> + */
>> open_opts->btf_custom_path = spec->btf_custom_path;
>> + if (!open_opts->btf_custom_path)
>> + open_opts->btf_custom_path = spec->btf_custom_func_path;
>>
>> tobj = bpf_object__open_mem(obj_bytes, obj_byte_cnt, open_opts);
>> if (!ASSERT_OK_PTR(tobj, "obj_open_mem")) /* shouldn't happen */
>> goto subtest_cleanup;
>>
>> + /* Fix up __naked subprog BTF using a separate file with named params */
>> + if (spec->btf_custom_func_path) {
>> + err = fixup_btf_from_path(tobj, spec->btf_custom_func_path);
>> + if (err) {
>> + PRINT_FAIL("failed to fixup BTF from %s: %d\n",
>> + spec->btf_custom_func_path, err);
>> + goto subtest_cleanup;
> Does this error path leak tobj?
>
> If fixup_btf_from_path() fails, the code jumps to subtest_cleanup, which
> appears to bypass the tobj_cleanup label where tobj would normally be
> closed and freed. Should this be goto tobj_cleanup instead?
Thanks for catching this bug. Will fix in the next revision.
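I.e. the error path will become:

	if (spec->btf_custom_func_path) {
		err = fixup_btf_from_path(tobj, spec->btf_custom_func_path);
		if (err) {
			PRINT_FAIL("failed to fixup BTF from %s: %d\n",
				   spec->btf_custom_func_path, err);
			/* close tobj instead of leaking it */
			goto tobj_cleanup;
		}
	}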
* Re: [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation
2026-05-07 22:11 ` bot+bpf-ci
@ 2026-05-10 17:27 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:27 UTC (permalink / raw)
To: bot+bpf-ci, bpf
Cc: ast, andrii, daniel, jose.marchesi, kernel-team, martin.lau,
eddyz87, clm, ihor.solodrai
On 5/7/26 12:11 AM, bot+bpf-ci@kernel.org wrote:
>> diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
>> new file mode 100644
>> index 000000000000..b4f680e3b6a1
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg.c
> [ ... ]
>
>> +__noinline __used
>> +static long subprog_deref_arg6(long a, long b, long c, long d, long e, long *f)
>> +{
>> + return *f;
>> +}
> [ ... ]
>
>> +SEC("tc")
>> +__description("stack_arg: pruning with different stack arg types")
>> +__failure
>> +__flag(BPF_F_TEST_STATE_FREQ)
>> +__msg("invalid mem access 'scalar'")
>> +__naked void stack_arg_pruning_type_mismatch(void)
>> +{
>> + asm volatile (
>> + "call %[bpf_get_prandom_u32];"
>> + "r6 = r0;"
>> + /* local = 0 on program stack */
>> + "r7 = 0;"
>> + "*(u64 *)(r10 - 8) = r7;"
>> + /* Branch based on random value */
>> + "if r6 s> 3 goto l0_%=;"
>> + /* Path 1: store stack pointer to outgoing arg6 */
>> + "r1 = r10;"
>> + "r1 += -8;"
>> + "*(u64 *)(r11 - 8) = r1;"
>> + "goto l1_%=;"
>> + "l0_%=:"
>> + /* Path 2: store scalar to outgoing arg6 */
>> + "*(u64 *)(r11 - 8) = 42;"
>> + "l1_%=:"
>> + /* Call subprog that dereferences arg6 */
>> + "r1 = r6;"
>> + "r2 = 0;"
>> + "r3 = 0;"
>> + "r4 = 0;"
>> + "r5 = 0;"
>> + "call subprog_deref_arg6;"
>> + "exit;"
>> + :: __imm(bpf_get_prandom_u32)
>> + : __clobber_all
>> + );
>> +}
> The Sashiko AI bot and bpf-ci raised concerns in v6 and April 24 versions
> about test fragility in this test and similar ones. Since
> subprog_deref_arg6 is standard C (not __naked), LLVM determines which
> register is used for dereferencing. If a future LLVM version chooses a
> different register, the tests will fail because the verifier error will
> mention the new register instead of R1.
>
> In v6, you mentioned using a regex like __msg("R{{[0-9]}} invalid mem
> access 'scalar'"), but the current code removed the register prefix
> entirely. While this may be intentional to avoid specifying a register,
> could the tests benefit from either using regex to match the register
> dynamically, or making subprog_deref_arg6 a __naked function to guarantee
> the register choice?
Okay, I will use a regex ("R{{[0-9]}}") to make the message clearer.
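I.e. the expectation will become:

	__msg("R{{[0-9]}} invalid mem access 'scalar'")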
>
> Reference: https://lore.kernel.org/bpf/20260419172100.A89FEC2BCAF@smtp.kernel.org/
>
>> +SEC("tc")
>> +__description("stack_arg: release_reference invalidates stack arg slot")
>> +__failure
>> +__msg("invalid mem access 'scalar'")
>> +__naked void stack_arg_release_ref(void)
> [ ... ]
>
>> +SEC("tc")
>> +__description("stack_arg: pkt pointer in stack arg slot invalidated after pull_data")
>> +__failure
>> +__msg("invalid mem access 'scalar'")
>> +__naked void stack_arg_stale_pkt_ptr(void)
> [ ... ]
>
>> +SEC("tc")
>> +__description("stack_arg: null propagation rejects deref on null branch")
>> +__failure
>> +__msg("invalid mem access 'scalar'")
>> +__naked void stack_arg_null_propagation_fail(void)
> [ ... ]
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323
* Re: [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation
2026-05-09 1:38 ` sashiko-bot
@ 2026-05-10 17:27 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:27 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 3:38 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [Medium] The tests in `verifier_stack_arg_order.c` are never executed because the file is not registered in the selftests runner.
> --
>
> commit 6834becc44d439e99d840e596d30164699006d53
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> selftests/bpf: Add verifier tests for stack argument validation
>
> This commit adds new BPF selftests using inline assembly to validate the
> verifier's logic for stack arguments. It covers positive tests and various
> negative tests for verifier rejection and pointer/reference tracking.
>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
>> index a96b25ebff234..aef21cf2987be 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
>> @@ -91,6 +91,7 @@
>> #include "verifier_sockmap_mutate.skel.h"
>> #include "verifier_spill_fill.skel.h"
>> #include "verifier_spin_lock.skel.h"
>> +#include "verifier_stack_arg.skel.h"
> Did you intend to include verifier_stack_arg_order.skel.h here as well?
Yes.
>
>> #include "verifier_stack_ptr.skel.h"
>> #include "verifier_store_release.skel.h"
>> #include "verifier_subprog_precision.skel.h"
>> @@ -238,6 +239,7 @@ void test_verifier_sock_addr(void) { RUN(verifier_sock_addr); }
>> void test_verifier_sockmap_mutate(void) { RUN(verifier_sockmap_mutate); }
>> void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); }
>> void test_verifier_spin_lock(void) { RUN(verifier_spin_lock); }
>> +void test_verifier_stack_arg(void) { RUN(verifier_stack_arg); }
> Could this be missing a RUN(verifier_stack_arg_order) entry?
Yes. Thanks for catching this. Will fix.
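Concretely, the following will be added next to the verifier_stack_arg
entries in the two lists above:

	#include "verifier_stack_arg_order.skel.h"

	void test_verifier_stack_arg_order(void) { RUN(verifier_stack_arg_order); }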
>
> The patch introduces the new tests for load ordering validation below,
> but without registering them in the selftests runner, it looks like
> they will not be executed.
>
> [ ... ]
>
>> diff --git a/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
>> new file mode 100644
>> index 0000000000000..671c79969c6cd
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/verifier_stack_arg_order.c
>> @@ -0,0 +1,86 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
* Re: [PATCH bpf-next v2 20/23] selftests/bpf: Add precision backtracking test for stack arguments
2026-05-09 1:52 ` sashiko-bot
@ 2026-05-10 17:31 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:31 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 3:52 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> - [Medium] Commit message and code comment incorrectly claim that the subprogram performs a bounds check.
> - [Medium] Non-deterministic branching in the test program can lead to intermittent flaky failures by returning 36 instead of the asserted 25.
> --
>
> commit 953824567df7ff801d99eb6df49d76d95736325d
> Author: Yonghong Song <yonghong.song@linux.dev>
>
> selftests/bpf: Add precision backtracking test for stack arguments
>
> This commit adds a test to verify precision backtracking across BPF-to-BPF
> calls involving stack arguments. The test aims to ensure precision propagates
> from a kfunc's size argument back through a subprogram frame.
>
>> The test passes a size value as incoming stack arg (arg6) to a
>> subprog, which bounds-checks it and forwards it as the mem__sz
>> parameter (outgoing arg7) to bpf_kfunc_call_stack_arg_mem.
> Is the description about bounds-checking accurate?
The description does not reflect the code: the code changed but the commit
message was not updated accordingly. Will fix.
>
> Looking at the inline assembly for subprog_call_mem_kfunc, it appears the
> subprogram simply reads the size from the incoming stack slot and immediately
> writes it to the outgoing stack slot without any bounds checking logic.
>
>> diff --git a/tools/testing/selftests/bpf/progs/stack_arg_precision.c b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
>> new file mode 100644
>> index 0000000000000..c94905f07dcc1
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/stack_arg_precision.c
>> @@ -0,0 +1,137 @@
> [ ... ]
>
>> +/*
>> + * Test precision backtracking across bpf-to-bpf call for kfunc stack arg.
>> + * subprog_call_mem_kfunc receives a size as incoming stack arg (arg6),
>> + * bounds-checks it, then passes it as mem__sz (arg7) to
>> + * bpf_kfunc_call_stack_arg_mem.
>> + *
>> + * 1+2+3+4+5+(1+2+3+4) = 25
>> + */
> This isn't a bug, but should the comment be updated to match the
> implementation? As mentioned above, the bounds check seems to be missing from
> the assembly below.
You are right. Because there are two different values (4 or 6), there is
no guarantee that the sum of values will be 25.
>
>> +__naked __noinline __used
>> +static long subprog_call_mem_kfunc(long a, long b, long c, long d, long e, long size)
>> +{
>> + asm volatile (
>> + "r1 = *(u64 *)(r11 + 8);" /* r1 = incoming arg6 (size) */
>> + "r2 = 0x0807060504030201 ll;" /* r2 = buf contents */
>> + "*(u64 *)(r10 - 8) = r2;" /* store buf to stack */
>> + "r2 = r10;"
>> + "r2 += -8;" /* r2 = &buf */
>> + "*(u64 *)(r11 - 8) = r2;" /* outgoing arg6 = buf */
>> + "*(u64 *)(r11 - 16) = r1;" /* outgoing arg7 = size */
> [ ... ]
>
>> +__naked void stack_arg_precision_bpf2bpf(void)
>> +{
>> + asm volatile (
>> + "call %[bpf_get_prandom_u32];"
>> + "r6 = r0;"
>> + "r1 = 1;"
>> + "r2 = 2;"
>> + "r3 = 3;"
>> + "r4 = 4;"
>> + "r5 = 5;"
>> + "if r6 < 2 goto l0_%=;"
>> + "*(u64 *)(r11 - 8) = 4;"
>> + "goto l1_%=;"
>> + "l0_%=:"
>> + "*(u64 *)(r11 - 8) = 6;"
>> + "l1_%=:"
>> + "call subprog_call_mem_kfunc;"
>> + "exit;"
>> + :: __imm(bpf_get_prandom_u32)
>> + : __clobber_all
>> + );
>> +}
> Could this lead to intermittent test failures?
>
> The test asserts a strict return value of 25:
>
> __success __retval(25)
>
> If bpf_get_prandom_u32() happens to return 0 or 1, the size argument is set to
> 6 instead of 4. In that case, bpf_kfunc_call_stack_arg_mem() will sum up 6
> bytes instead of 4, returning 36 instead of 25, which would cause the test to
> fail.
Yes, I will remove '__retval(25)'.
* Re: [PATCH bpf-next v2 22/23] bpf, arm64: Add JIT support for stack arguments
2026-05-09 2:15 ` sashiko-bot
@ 2026-05-10 17:32 ` Yonghong Song
0 siblings, 0 replies; 68+ messages in thread
From: Yonghong Song @ 2026-05-10 17:32 UTC (permalink / raw)
To: sashiko; +Cc: bpf
On 5/9/26 4:15 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [High] In `emit_stack_arg_load()`, `bpf_off` is not negated, causing a negative index that fails array bounds checks and results in the JIT embedding an invalid instruction (BRK), leading to a kernel panic.
> --
>
> commit 6ccf042549a3902757c6335313c905e99fb41292
> Author: Puranjay Mohan <puranjay@kernel.org>
>
> bpf, arm64: Add JIT support for stack arguments
>
> This commit implements stack argument passing for BPF-to-BPF and kfunc
> calls with more than 5 parameters on arm64, following the AAPCS64 calling
> convention.
>
>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>> index 085e650662e3d..9b9d2501d2d36 100644
>> --- a/arch/arm64/net/bpf_jit_comp.c
>> +++ b/arch/arm64/net/bpf_jit_comp.c
> [ ... ]
>
>> @@ -1191,6 +1207,41 @@ static int add_exception_handler(const struct bpf_insn *insn,
>> return 0;
>> }
>>
>> +static const u8 stack_arg_reg[] = { A64_R(5), A64_R(6), A64_R(7) };
>> +
>> +#define NR_STACK_ARG_REGS ARRAY_SIZE(stack_arg_reg)
>> +
>> +static void emit_stack_arg_load(u8 dst, s16 bpf_off, struct jit_ctx *ctx)
>> +{
>> + int idx = bpf_off / sizeof(u64) - 1;
> Does this missing negation for bpf_off lead to an invalid instruction being
> emitted?
For a stack argument load, the offset is a positive number and a multiple
of 8, so it is okay here.
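As a quick sanity check against the quoted code (incoming arg 6 is at
bpf_off = +8 in this series' layout):

	/* incoming arg 6: bpf_off = +8  -> idx = 0 -> mov from stack_arg_reg[0] (x5)
	 * incoming arg 8: bpf_off = +24 -> idx = 2 -> mov from stack_arg_reg[2] (x7)
	 * incoming arg 9: bpf_off = +32 -> idx = 3 -> ldr from [A64_FP + 16]
	 */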
>
> Since BPF stack arguments use negative offsets from BPF_REG_PARAMS (such as
> -8 for arg 6), the calculation bpf_off / sizeof(u64) - 1 will result in a
> negative idx.
>
> Because NR_STACK_ARG_REGS uses ARRAY_SIZE, it evaluates to an unsigned
> size_t. Comparing a negative idx to it will result in a false condition,
> falling through to the else block.
>
> In the else block, the negative idx generates a negative offset for
> A64_LDR64I. Since A64_LDR64I expects an unsigned 12-bit scaled immediate,
> could this cause the instruction encoder to fail and embed an
> AARCH64_BREAK_FAULT into the JIT image?
>
>> +
>> + if (idx < NR_STACK_ARG_REGS)
>> + emit(A64_MOV(1, dst, stack_arg_reg[idx]), ctx);
>> + else
>> + emit(A64_LDR64I(dst, A64_FP, (idx - NR_STACK_ARG_REGS) * sizeof(u64) + 16), ctx);
>> +}
>> +
>> +static void emit_stack_arg_store(u8 src_a64, s16 bpf_off, struct jit_ctx *ctx)
>> +{
>> + int idx = -bpf_off / sizeof(u64) - 1;
> For comparison, emit_stack_arg_store() correctly negates bpf_off here.
>
>> +
>> + if (idx < NR_STACK_ARG_REGS)
end of thread
Thread overview: 68+ messages
2026-05-07 21:29 [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-07 21:29 ` [PATCH bpf-next v2 01/23] bpf: Convert bpf_get_spilled_reg macro to static inline function Yonghong Song
2026-05-07 21:29 ` [PATCH bpf-next v2 02/23] bpf: Remove copy_register_state wrapper function Yonghong Song
2026-05-07 21:29 ` [PATCH bpf-next v2 03/23] bpf: Add helper functions for r11-based stack argument insns Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 04/23] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args() Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 13:05 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions Yonghong Song
2026-05-07 22:26 ` bot+bpf-ci
2026-05-09 12:52 ` Yonghong Song
2026-05-08 18:00 ` Alexei Starovoitov
2026-05-09 12:55 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 06/23] bpf: Refactor jmp history to use dedicated spi/frame fields Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 07/23] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 13:08 ` Yonghong Song
2026-05-09 4:05 ` sashiko-bot
2026-05-10 16:41 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 08/23] bpf: Refactor record_call_access() to extract per-arg logic Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 09/23] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 13:29 ` Yonghong Song
2026-05-09 0:59 ` sashiko-bot
2026-05-10 16:47 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 10/23] bpf: Reject stack arguments in non-JITed programs Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 2:10 ` sashiko-bot
2026-05-10 16:59 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 11/23] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
2026-05-09 2:19 ` sashiko-bot
2026-05-10 17:05 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 12/23] bpf: Enable r11 based insns Yonghong Song
2026-05-09 2:59 ` sashiko-bot
2026-05-10 17:11 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 13/23] bpf: Support stack arguments for kfunc calls Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 14/23] bpf: Reject stack arguments if tail call reachable Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-09 1:42 ` sashiko-bot
2026-05-10 17:15 ` Yonghong Song
2026-05-07 21:30 ` [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments Yonghong Song
2026-05-07 22:26 ` bot+bpf-ci
2026-05-10 17:21 ` Yonghong Song
2026-05-09 2:21 ` sashiko-bot
2026-05-10 17:22 ` Yonghong Song
2026-05-07 21:31 ` [PATCH bpf-next v2 16/23] selftests/bpf: Add tests for BPF function " Yonghong Song
2026-05-07 21:31 ` [PATCH bpf-next v2 17/23] selftests/bpf: Add tests for stack argument validation Yonghong Song
2026-05-09 1:30 ` sashiko-bot
2026-05-10 17:23 ` Yonghong Song
2026-05-07 21:31 ` [PATCH bpf-next v2 18/23] selftests/bpf: Add BTF fixup for __naked subprog parameter names Yonghong Song
2026-05-09 1:40 ` sashiko-bot
2026-05-10 17:24 ` Yonghong Song
2026-05-07 21:31 ` [PATCH bpf-next v2 19/23] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
2026-05-07 22:11 ` bot+bpf-ci
2026-05-10 17:27 ` Yonghong Song
2026-05-09 1:38 ` sashiko-bot
2026-05-10 17:27 ` Yonghong Song
2026-05-07 21:31 ` [PATCH bpf-next v2 20/23] selftests/bpf: Add precision backtracking test for stack arguments Yonghong Song
2026-05-09 1:52 ` sashiko-bot
2026-05-10 17:31 ` Yonghong Song
2026-05-07 21:31 ` [PATCH bpf-next v2 21/23] bpf, arm64: Map BPF_REG_0 to x8 instead of x7 Yonghong Song
2026-05-08 18:01 ` Alexei Starovoitov
2026-05-09 13:44 ` Yonghong Song
2026-05-07 21:32 ` [PATCH bpf-next v2 22/23] bpf, arm64: Add JIT support for stack arguments Yonghong Song
2026-05-09 2:15 ` sashiko-bot
2026-05-10 17:32 ` Yonghong Song
2026-05-07 21:32 ` [PATCH bpf-next v2 23/23] selftests/bpf: Enable stack argument tests for arm64 Yonghong Song
2026-05-08 18:06 ` [PATCH bpf-next v2 00/23] bpf: Support stack arguments for BPF functions and kfuncs Alexei Starovoitov
2026-05-09 13:43 ` Yonghong Song