From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
"Jose E . Marchesi" <jose.marchesi@oracle.com>,
kernel-team@fb.com, Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next v6 07/17] bpf: Support stack arguments for bpf functions
Date: Mon, 20 Apr 2026 21:15:13 -0700 [thread overview]
Message-ID: <e12c07df-b5d1-4d04-9028-23db01063138@linux.dev> (raw)
In-Reply-To: <DHYEVJXVGHP5.2CAI5PGRI0W8E@gmail.com>
On 4/20/26 5:37 PM, Alexei Starovoitov wrote:
> On Sun Apr 19, 2026 at 9:33 AM PDT, Yonghong Song wrote:
>> Currently BPF functions (subprogs) are limited to 5 register arguments.
>> With [1], the compiler can emit code that passes additional arguments
>> via a dedicated stack area through bpf register BPF_REG_PARAMS (r11),
>> introduced in the previous patch.
>>
>> The compiler uses positive r11 offsets for incoming (callee-side) args
>> and negative r11 offsets for outgoing (caller-side) args, following the
>> x86_64/arm64 calling convention direction. There is an 8-byte gap at
>> offset 0 separating the two regions:
>> Incoming (callee reads): r11+8 (arg6), r11+16 (arg7), ...
>> Outgoing (caller writes): r11-8 (arg6), r11-16 (arg7), ...
> This part looks clean now.
>
>> A per-state bitmask out_stack_arg_mask tracks which outgoing stack arg
>> slots have been written on the current path. Each bit corresponds to
>> an outgoing slot index (bit 0 = r11-8 = arg6, bit 1 = r11-16 = arg7,
>> etc.). At a call site, the verifier checks that all slots required by
>> the callee have their corresponding mask bits set. This enables
>> precise per-path tracking: if one branch of a conditional writes arg6
>> but another does not, the mask correctly reflects the difference and
>> the verifier rejects the uninitialized path. The mask is included in
>> stack_arg_safe() so that states with different sets of initialized
>> slots are not incorrectly pruned together.
> But this part I don't understand.
> why do you need this bitmask?
> Even when they're written out of order stack_arg_depth is all you need.
> Then compare old->stack_arg_regs vs cur->stack_arg_regs.
> If one is not written its state will be NOT_INIT.
> so
> *(u64 *)(r11 - 16) = r7;
> // without *(u64 *)(r11 - 8) = r6;
> call bar1; // arg6 = r6, arg7 = r7
>
> will fail the verification.
I added bitmask to make it easy for early comparison for pruning
since maintenance of bitmask is light. But I think this probably
not necessary. Will remove.
>
>> @@ -1669,6 +1669,8 @@ struct bpf_prog_aux {
>> u32 max_pkt_offset;
>> u32 max_tp_access;
>> u32 stack_depth;
>> + u16 incoming_stack_arg_depth;
>> + u16 stack_arg_depth; /* both incoming and max outgoing of stack arguments */
> these two are ok. I'm assuming you don't want JIT to recompute them.
>
>> u32 id;
>> u32 func_cnt; /* used by non-func prog as the number of func progs */
>> u32 real_func_cnt; /* includes hidden progs, only used for JIT and freeing progs */
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index 9fbbddc40d21..bb6d8cab3a35 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -372,6 +372,11 @@ struct bpf_func_state {
>> * `stack`. allocated_stack is always a multiple of BPF_REG_SIZE.
>> */
>> int allocated_stack;
>> +
>> + u16 stack_arg_depth; /* Size of incoming + max outgoing stack args in bytes. */
>> + u16 incoming_stack_arg_depth; /* Size of incoming stack args in bytes. */
> but incoming_stack_arg_depth looks odd.
>
> Callee should be accessing caller's stack_arg_depth and
> caller's stack_arg_regs.
Okay. Will reduce local caching and get the value from the caller.
>
>> + u16 out_stack_arg_mask; /* Bitmask of outgoing stack arg slots that have been written. */
>> + struct bpf_reg_state *stack_arg_regs; /* On-stack arguments */
>> };
>>
>> #define MAX_CALL_FRAMES 8
>> @@ -508,6 +513,17 @@ struct bpf_verifier_state {
>> iter < frame->allocated_stack / BPF_REG_SIZE; \
>> iter++, reg = bpf_get_spilled_reg(iter, frame, mask))
>>
>> +#define bpf_get_spilled_stack_arg(slot, frame, mask) \
>> + ((((slot) < frame->stack_arg_depth / BPF_REG_SIZE) && \
>> + (frame->stack_arg_regs[slot].type != NOT_INIT)) \
>> + ? &frame->stack_arg_regs[slot] : NULL)
>> +
>> +/* Iterate over 'frame', setting 'reg' to either NULL or a spilled stack arg. */
>> +#define bpf_for_each_spilled_stack_arg(iter, frame, reg, mask) \
>> + for (iter = 0, reg = bpf_get_spilled_stack_arg(iter, frame, mask); \
>> + iter < frame->stack_arg_depth / BPF_REG_SIZE; \
>> + iter++, reg = bpf_get_spilled_stack_arg(iter, frame, mask))
>> +
>> #define bpf_for_each_reg_in_vstate_mask(__vst, __state, __reg, __mask, __expr) \
>> ({ \
>> struct bpf_verifier_state *___vstate = __vst; \
>> @@ -525,6 +541,11 @@ struct bpf_verifier_state {
>> continue; \
>> (void)(__expr); \
>> } \
>> + bpf_for_each_spilled_stack_arg(___j, __state, __reg, __mask) { \
>> + if (!__reg) \
>> + continue; \
>> + (void)(__expr); \
>> + } \
>> } \
>> })
>>
>> @@ -739,10 +760,13 @@ struct bpf_subprog_info {
>> bool keep_fastcall_stack: 1;
>> bool changes_pkt_data: 1;
>> bool might_sleep: 1;
>> - u8 arg_cnt:3;
>> + u8 arg_cnt:4;
>>
>> enum priv_stack_mode priv_stack_mode;
>> - struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
>> + struct bpf_subprog_arg_info args[MAX_BPF_FUNC_ARGS];
>> + u16 incoming_stack_arg_depth;
>> + u16 outgoing_stack_arg_depth;
>> + u16 max_out_stack_arg_depth;
> but you already have them in prog_aux?! another copy in bpf_subprog_info?!
> Remove one of them. JIT only need one set.
Okay.
>
>> };
>>
>> struct bpf_verifier_env;
>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>> index a62d78581207..c5f3aa05d5a3 100644
>> --- a/kernel/bpf/btf.c
>> +++ b/kernel/bpf/btf.c
>> @@ -7887,13 +7887,19 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
>> }
>> args = (const struct btf_param *)(t + 1);
>> nargs = btf_type_vlen(t);
>> - if (nargs > MAX_BPF_FUNC_REG_ARGS) {
>> - if (!is_global)
>> - return -EINVAL;
>> - bpf_log(log, "Global function %s() with %d > %d args. Buggy compiler.\n",
>> + if (nargs > MAX_BPF_FUNC_ARGS) {
>> + bpf_log(log, "Function %s() with %d > %d args not supported.\n",
>> + tname, nargs, MAX_BPF_FUNC_ARGS);
>> + return -EINVAL;
>> + }
>> + if (is_global && nargs > MAX_BPF_FUNC_REG_ARGS) {
>> + bpf_log(log, "Global function %s() with %d > %d args not supported.\n",
>> tname, nargs, MAX_BPF_FUNC_REG_ARGS);
>> return -EINVAL;
>> }
>> + if (nargs > MAX_BPF_FUNC_REG_ARGS)
>> + sub->incoming_stack_arg_depth = (nargs - MAX_BPF_FUNC_REG_ARGS) * BPF_REG_SIZE;
>> +
>> /* check that function is void or returns int, exception cb also requires this */
>> t = btf_type_by_id(btf, t->type);
>> while (btf_type_is_modifier(t))
>> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
>> index fba9e8c00878..c4e0224ad2f2 100644
>> --- a/kernel/bpf/fixups.c
>> +++ b/kernel/bpf/fixups.c
>> @@ -1123,6 +1123,9 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>
>> func[i]->aux->name[0] = 'F';
>> func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
>> + func[i]->aux->incoming_stack_arg_depth = env->subprog_info[i].incoming_stack_arg_depth;
>> + func[i]->aux->stack_arg_depth = env->subprog_info[i].incoming_stack_arg_depth +
>> + env->subprog_info[i].outgoing_stack_arg_depth;
>> if (env->subprog_info[i].priv_stack_mode == PRIV_STACK_ADAPTIVE)
>> func[i]->aux->jits_use_priv_stack = true;
>>
>> @@ -1301,8 +1304,10 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env)
>> struct bpf_insn_aux_data *orig_insn_aux;
>> u32 *orig_subprog_starts;
>>
>> - if (env->subprog_cnt <= 1)
>> + if (env->subprog_cnt <= 1) {
>> + env->prog->aux->stack_arg_depth = env->subprog_info[0].outgoing_stack_arg_depth;
>> return 0;
>> + }
>>
>> prog = orig_prog = env->prog;
>> if (bpf_prog_need_blind(prog)) {
>> @@ -1378,9 +1383,20 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
>> struct bpf_prog *prog = env->prog;
>> struct bpf_insn *insn = prog->insnsi;
>> bool has_kfunc_call = bpf_prog_has_kfunc_call(prog);
>> - int i, depth;
>> + int depth;
>> #endif
>> - int err = 0;
>> + int i, err = 0;
>> +
>> + for (i = 0; i < env->subprog_cnt; i++) {
>> + struct bpf_subprog_info *subprog = &env->subprog_info[i];
>> +
>> + if (subprog->max_out_stack_arg_depth > subprog->outgoing_stack_arg_depth) {
>> + verbose(env,
>> + "func#%d writes stack arg slot at depth %u, but calls only require %u bytes\n",
>> + i, subprog->max_out_stack_arg_depth, subprog->outgoing_stack_arg_depth);
>> + return -EINVAL;
>> + }
>> + }
>>
>> if (env->prog->jit_requested &&
>> !bpf_prog_is_offloaded(env->prog->aux)) {
>> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
>> index 8478d2c6ed5b..235841d23fe3 100644
>> --- a/kernel/bpf/states.c
>> +++ b/kernel/bpf/states.c
>> @@ -838,6 +838,44 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>> return true;
>> }
>>
>> +/*
>> + * Compare stack arg slots between old and current states.
>> + * Outgoing stack args are path-local state and must agree for pruning.
>> + */
>> +static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>> + struct bpf_func_state *cur, struct bpf_idmap *idmap,
>> + enum exact_level exact)
>> +{
>> + int i, nslots;
>> +
>> + if (old->incoming_stack_arg_depth != cur->incoming_stack_arg_depth)
>> + return false;
>> +
>> + /* Compare both incoming and outgoing stack arg slots. */
>> + if (old->stack_arg_depth != cur->stack_arg_depth)
>> + return false;
>> +
>> + if (old->out_stack_arg_mask != cur->out_stack_arg_mask)
>> + return false;
> shouldn't be neccessary.
Okay.
>
>> +
>> + nslots = old->stack_arg_depth / BPF_REG_SIZE;
>> + for (i = 0; i < nslots; i++) {
>> + struct bpf_reg_state *old_arg = &old->stack_arg_regs[i];
>> + struct bpf_reg_state *cur_arg = &cur->stack_arg_regs[i];
>> +
>> + if (old_arg->type == NOT_INIT && cur_arg->type == NOT_INIT)
>> + continue;
>> +
>> + if (exact == EXACT && old_arg->type != cur_arg->type)
>> + return false;
>> +
>> + if (!regsafe(env, old_arg, cur_arg, idmap, exact))
>> + return false;
>> + }
>> +
>> + return true;
>> +}
>> +
>> static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur,
>> struct bpf_idmap *idmap)
>> {
>> @@ -929,6 +967,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
>> if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>> return false;
>>
>> + if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>> + return false;
>> +
>> return true;
>> }
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 6aa4dc161a56..78c9322870a5 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -1340,6 +1340,20 @@ static int copy_stack_state(struct bpf_func_state *dst, const struct bpf_func_st
>> return -ENOMEM;
>>
>> dst->allocated_stack = src->allocated_stack;
>> +
>> + /* copy stack args state */
>> + n = src->stack_arg_depth / BPF_REG_SIZE;
>> + if (n) {
>> + dst->stack_arg_regs = copy_array(dst->stack_arg_regs, src->stack_arg_regs, n,
>> + sizeof(struct bpf_reg_state),
>> + GFP_KERNEL_ACCOUNT);
> copy is unnecessary.
Okay.
>
>> @@ -4220,6 +4254,13 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
>> }
>> if (type == STACK_INVALID && env->allow_uninit_stack)
>> continue;
>> + /*
>> + * Cross-frame reads may hit slots poisoned by dead code elimination.
>> + * Static liveness can't track indirect references through pointers,
>> + * so allow the read conservatively.
>> + */
>> + if (type == STACK_POISON && reg_state != state)
>> + continue;
> wait what? Is this a real issue? Are you saying you have a test prog
> that passes FP derived pointers in arg 6, 7 and arg_tracking cannot detect it?
> Then it should be added properly in arg_tracking.
> This hack to allow reading poisoned slots is not ok.
> This is a serious issue.
Okay. The reason is that callee is not getting the proper state from caller.
I will do proper implementation here.
>
> pw-bot: cr
next prev parent reply other threads:[~2026-04-21 4:15 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-19 16:33 [PATCH bpf-next v6 00/17] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-04-19 16:33 ` [PATCH bpf-next v6 01/17] bpf: Remove unused parameter from check_map_kptr_access() Yonghong Song
2026-04-19 16:33 ` [PATCH bpf-next v6 02/17] bpf: Refactor to avoid redundant calculation of bpf_reg_state Yonghong Song
2026-04-19 16:33 ` [PATCH bpf-next v6 03/17] bpf: Refactor to handle memory and size together Yonghong Song
2026-04-20 23:58 ` Alexei Starovoitov
2026-04-21 4:04 ` Yonghong Song
2026-04-19 16:33 ` [PATCH bpf-next v6 04/17] bpf: Prepare verifier logs for upcoming kfunc stack arguments Yonghong Song
2026-04-21 0:03 ` Alexei Starovoitov
2026-04-21 4:06 ` Yonghong Song
2026-04-21 6:07 ` Yonghong Song
2026-04-21 13:48 ` Alexei Starovoitov
2026-04-21 15:41 ` Yonghong Song
2026-04-21 15:46 ` Alexei Starovoitov
2026-04-21 16:37 ` Yonghong Song
2026-04-21 17:24 ` Yonghong Song
2026-04-19 16:33 ` [PATCH bpf-next v6 05/17] bpf: Introduce bpf register BPF_REG_PARAMS Yonghong Song
2026-04-19 17:06 ` sashiko-bot
2026-04-19 18:14 ` Yonghong Song
2026-04-19 16:33 ` [PATCH bpf-next v6 06/17] bpf: Reuse MAX_BPF_FUNC_ARGS for maximum number of arguments Yonghong Song
2026-04-19 16:33 ` [PATCH bpf-next v6 07/17] bpf: Support stack arguments for bpf functions Yonghong Song
2026-04-19 19:15 ` sashiko-bot
2026-04-20 4:35 ` Yonghong Song
2026-04-21 0:37 ` Alexei Starovoitov
2026-04-21 4:15 ` Yonghong Song [this message]
2026-04-19 16:33 ` [PATCH bpf-next v6 08/17] bpf: Reject stack arguments in non-JITed programs Yonghong Song
2026-04-19 18:21 ` sashiko-bot
2026-04-20 4:23 ` Yonghong Song
2026-04-19 16:34 ` [PATCH bpf-next v6 09/17] bpf: Track r11 registers in const_fold and liveness Yonghong Song
2026-04-19 16:34 ` [PATCH bpf-next v6 10/17] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
2026-04-19 16:34 ` [PATCH bpf-next v6 11/17] bpf: Enable r11 based insns Yonghong Song
2026-04-19 16:34 ` [PATCH bpf-next v6 12/17] bpf: Support stack arguments for kfunc calls Yonghong Song
2026-04-19 17:08 ` sashiko-bot
2026-04-19 18:18 ` Yonghong Song
2026-04-19 16:34 ` [PATCH bpf-next v6 13/17] bpf: Reject stack arguments if tail call reachable Yonghong Song
2026-04-19 17:08 ` sashiko-bot
2026-04-19 18:20 ` Yonghong Song
2026-04-19 16:34 ` [PATCH bpf-next v6 14/17] bpf,x86: Implement JIT support for stack arguments Yonghong Song
2026-04-19 17:25 ` sashiko-bot
2026-04-19 18:55 ` Yonghong Song
2026-04-19 16:34 ` [PATCH bpf-next v6 15/17] selftests/bpf: Add tests for BPF function " Yonghong Song
2026-04-19 17:15 ` sashiko-bot
2026-04-20 5:52 ` Yonghong Song
2026-04-19 16:34 ` [PATCH bpf-next v6 16/17] selftests/bpf: Add tests for stack argument validation Yonghong Song
2026-04-19 16:34 ` [PATCH bpf-next v6 17/17] selftests/bpf: Add verifier " Yonghong Song
2026-04-19 17:21 ` sashiko-bot
2026-04-20 6:14 ` Yonghong Song
2026-04-20 15:41 ` [PATCH bpf-next v6 00/17] bpf: Support stack arguments for BPF functions and kfuncs Puranjay Mohan
2026-04-20 20:22 ` Yonghong Song
2026-04-20 20:25 ` Puranjay Mohan
2026-04-20 21:49 ` Alexei Starovoitov
2026-04-20 23:44 ` Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e12c07df-b5d1-4d04-9028-23db01063138@linux.dev \
--to=yonghong.song@linux.dev \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=jose.marchesi@oracle.com \
--cc=kernel-team@fb.com \
--cc=martin.lau@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.