Re: [PATCH bpf-next 01/18] bpf: Support stack arguments for bpf functions

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Yonghong Song <yonghong.song@linux.dev>
To: Eduard Zingerman <eddyz87@gmail.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"Jose E . Marchesi" <jose.marchesi@oracle.com>,
	kernel-team@fb.com, Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next 01/18] bpf: Support stack arguments for bpf functions
Date: Wed, 29 Apr 2026 00:50:22 +0100	[thread overview]
Message-ID: <440e435c-3739-4cec-821a-e765eda2a279@linux.dev> (raw)
In-Reply-To: <29308729-2a9c-4a4e-9b4f-a92bd185ee22@linux.dev>



On 4/28/26 9:47 AM, Yonghong Song wrote:
>
>
> On 4/28/26 7:29 AM, Eduard Zingerman wrote:
>> On Fri, 2026-04-24 at 10:14 -0700, Yonghong Song wrote:
>>
>> [...]
>>
>> I didn't see this in the patch, hence the question: should or should
>> not this feature be privileged bpf only?
>
> It is priviledged only. See add_subprog_and_kfunc().
> both bpf-to-bpf call and kfunc requires bpf_capable.
>
>>
>> [...]
>>
>>> diff --git a/include/linux/bpf_verifier.h 
>>> b/include/linux/bpf_verifier.h
>>> index d5b4303315dd..2cc349d7fc17 100644
>>> --- a/include/linux/bpf_verifier.h
>>> +++ b/include/linux/bpf_verifier.h
>> [...]
>>
>>> @@ -508,6 +512,17 @@ struct bpf_verifier_state {
>>>            iter < frame->allocated_stack / BPF_REG_SIZE;        \
>>>            iter++, reg = bpf_get_spilled_reg(iter, frame, mask))
>>>   +#define bpf_get_spilled_stack_arg(slot, frame, 
>>> mask)                   \
>>> +    ((((slot) < frame->out_stack_arg_depth / BPF_REG_SIZE) 
>>> &&           \
>>> +      (frame->stack_arg_regs[slot].type != NOT_INIT))               \
>>> +     ? &frame->stack_arg_regs[slot] : NULL)
>> can this be a static inline function?
>
> We could but we have
>
> #define bpf_get_spilled_reg(slot, frame, mask)                          \
>         (((slot < frame->allocated_stack / BPF_REG_SIZE) &&             \
>           ((1 << frame->stack[slot].slot_type[BPF_REG_SIZE - 1]) & 
> (mask))) \
>          ? &frame->stack[slot].spilled_ptr : NULL)
>
> Should we do the same (as static inline function)?
>
>>
>>> +
>>> +/* Iterate over 'frame', setting 'reg' to either NULL or a spilled 
>>> stack arg. */
>>> +#define bpf_for_each_spilled_stack_arg(iter, frame, reg, 
>>> mask)         \
>>> +    for (iter = 0, reg = bpf_get_spilled_stack_arg(iter, frame, 
>>> mask); \
>>> +         iter < frame->out_stack_arg_depth / 
>>> BPF_REG_SIZE;              \
>>> +         iter++, reg = bpf_get_spilled_stack_arg(iter, frame, mask))
>>> +
>>>   #define bpf_for_each_reg_in_vstate_mask(__vst, __state, __reg, 
>>> __mask, __expr)   \
>>> ({ \
>>>           struct bpf_verifier_state *___vstate = __vst;            \
>>> @@ -525,6 +540,11 @@ struct bpf_verifier_state {
>>>                       continue;                        \
>>>                   (void)(__expr);                          \
>>>               } \
>>> +            bpf_for_each_spilled_stack_arg(___j, __state, __reg, 
>>> __mask) { \
>>> +                if (!__reg)                              \
>>> +                    continue;                        \
>>> +                (void)(__expr);                          \
>>> +            }                         \
>>> }                                                        \
>>>       })
>> Tangential nit: I think this macro is getting a bit too complicated,
>> we might want to introduce some proper reg_state iterator at some
>> point, e.g.:
>>
>>    struct ret_iter it = new_reg_iter(state);
>>    while ((reg = next_reg(&it))) { ... }
>
> You mean have a static function with proper arguments and do the above?
> I guess can do a followup later to simplify it.
>
>>
>>
>>>   @@ -739,10 +759,13 @@ struct bpf_subprog_info {
>>>       bool keep_fastcall_stack: 1;
>>>       bool changes_pkt_data: 1;
>>>       bool might_sleep: 1;
>>> -    u8 arg_cnt:3;
>>> +    u8 arg_cnt:4;
>>>         enum priv_stack_mode priv_stack_mode;
>>> -    struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
>>> +    struct bpf_subprog_arg_info args[MAX_BPF_FUNC_ARGS];
>>> +    u16 incoming_stack_arg_depth;
>> Can this be inferred from arg_cnt?
>> Also, the verifier keeps doing '/ BPF_REG_SIZE' on this number,
>> would it be more convenient to keep it as count?
>
> This should work. Let me try.
>
>>
>>> +    u16 stack_arg_depth; /* incoming + max outgoing */
>>> +    u16 max_out_stack_arg_depth;
>>>   };
>>>     struct bpf_verifier_env;
>>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>>> index 77af44d8a3ad..cfb35a2decf6 100644
>>> --- a/kernel/bpf/btf.c
>>> +++ b/kernel/bpf/btf.c
>>> @@ -7880,13 +7880,19 @@ int btf_prepare_func_args(struct 
>>> bpf_verifier_env *env, int subprog)
>>>       }
>>>       args = (const struct btf_param *)(t + 1);
>>>       nargs = btf_type_vlen(t);
>>> -    if (nargs > MAX_BPF_FUNC_REG_ARGS) {
>>> -        if (!is_global)
>>> -            return -EINVAL;
>>> -        bpf_log(log, "Global function %s() with %d > %d args. Buggy 
>>> compiler.\n",
>>> +    if (nargs > MAX_BPF_FUNC_ARGS) {
>>> +        bpf_log(log, "Function %s() with %d > %d args not 
>>> supported.\n",
>>> +            tname, nargs, MAX_BPF_FUNC_ARGS);
>> Nit: I'd report it as "kernel supports at-most %d parameters for 
>> regular functions, while function %s is declared to accept %d 
>> parameters"
>>       just to make the rules a bit more explicit.
>
> Okay.
>
>>
>>> +        return -EINVAL;
>>> +    }
>> [...]
>>
>>> @@ -1378,9 +1382,21 @@ int bpf_fixup_call_args(struct 
>>> bpf_verifier_env *env)
>>>       struct bpf_prog *prog = env->prog;
>>>       struct bpf_insn *insn = prog->insnsi;
>>>       bool has_kfunc_call = bpf_prog_has_kfunc_call(prog);
>>> -    int i, depth;
>>> +    int depth;
>>>   #endif
>>> -    int err = 0;
>>> +    int i, err = 0;
>>> +
>>> +    for (i = 0; i < env->subprog_cnt; i++) {
>>> +        struct bpf_subprog_info *subprog = &env->subprog_info[i];
>>> +        u16 outgoing = subprog->stack_arg_depth - 
>>> subprog->incoming_stack_arg_depth;
>>> +
>>> +        if (subprog->max_out_stack_arg_depth > outgoing) {
>>> +            verbose(env,
>>> +                "func#%d writes stack arg slot at depth %u, but 
>>> calls only require %u bytes\n",
>>> +                i, subprog->max_out_stack_arg_depth, outgoing);
>>> +            return -EINVAL;
>> Is this an internal error condition?
>> If it is, maybe use verifier_bug()?
>
> It is not. For example,
>
> SEC("tc")
> __description("stack_arg: write unused stack arg slot")
> __failure
> __msg("func#0 writes stack arg slot at depth 40, but calls only 
> require 16 bytes")
> __naked void stack_arg_write_unused_slot(void)
> {
>         asm volatile (
>                 "r1 = 1;"
>                 "r2 = 2;"
>                 "r3 = 3;"
>                 "r4 = 4;"
>                 "r5 = 5;"
>                 /* Write to offset -40, unused for the callee */
>                 "*(u64 *)(r11 - 40) = 99;"
>                 "*(u64 *)(r11 - 16) = 20;"
>                 "*(u64 *)(r11 - 8) = 10;"
>                 "call subprog_7args;"
>                 "r0 = 0;"
>                 "exit;"
>                 ::: __clobber_all
>         );
> }
>
>>
>>> +        }
>>> +    }
>>>         if (env->prog->jit_requested &&
>>>           !bpf_prog_is_offloaded(env->prog->aux)) {
>>> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
>>> index 8478d2c6ed5b..3e59d1c3a726 100644
>>> --- a/kernel/bpf/states.c
>>> +++ b/kernel/bpf/states.c
>>> @@ -838,6 +838,34 @@ static bool stacksafe(struct bpf_verifier_env 
>>> *env, struct bpf_func_state *old,
>>>       return true;
>>>   }
>>>   +/*
>>> + * Compare stack arg slots between old and current states.
>>> + * Outgoing stack args are path-local state and must agree for 
>>> pruning.
>>> + */
>>> +static bool stack_arg_safe(struct bpf_verifier_env *env, struct 
>>> bpf_func_state *old,
>>> +               struct bpf_func_state *cur, struct bpf_idmap *idmap,
>>> +               enum exact_level exact)
>>> +{
>>> +    int i, nslots;
>>> +
>>> +    nslots = min(old->out_stack_arg_depth, 
>>> cur->out_stack_arg_depth) / BPF_REG_SIZE;
>> this is not safe, e.g. it will accept cur with one argument as
>> equivalent for old with two arguments.
>
> Good catch! Will fix.
>
>>
>>> +    for (i = 0; i < nslots; i++) {
>>> +        struct bpf_reg_state *old_arg = &old->stack_arg_regs[i];
>>> +        struct bpf_reg_state *cur_arg = &cur->stack_arg_regs[i];
>>> +
>>> +        if (old_arg->type == NOT_INIT && cur_arg->type == NOT_INIT)
>>> +            continue;
>>> +
>>> +        if (exact == EXACT && old_arg->type != cur_arg->type)
>>> +            return false;
>>> +
>>> +        if (!regsafe(env, old_arg, cur_arg, idmap, exact))
>>> +            return false;
>>> +    }
>> regsafe() seem handles NOT_INIT and EXACT in the same way,
>> I don't think there is a necessity to do the handling explicitly here.
>
> I agree that NOT_INIT and EXACT tracks are redundant. I keep them there
> for performance reason. But I guess the return is minimum, so I will just
> do regsafe() then.
>
>>
>>> +
>>> +    return true;
>>> +}
>>> +
>> [...]
>>
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>> index ff6ff1c27517..bcf81692a22b 100644
>>> --- a/kernel/bpf/verifier.c
>>> +++ b/kernel/bpf/verifier.c
>>> @@ -1361,6 +1361,18 @@ static int copy_stack_state(struct 
>>> bpf_func_state *dst, const struct bpf_func_st
>>>           return -ENOMEM;
>>>         dst->allocated_stack = src->allocated_stack;
>>> +
>>> +    /* copy stack args state */
>>> +    n = src->out_stack_arg_depth / BPF_REG_SIZE;
>>> +    if (n) {
>>> +        dst->stack_arg_regs = copy_array(dst->stack_arg_regs, 
>>> src->stack_arg_regs, n,
>>> +                         sizeof(struct bpf_reg_state),
>>> +                         GFP_KERNEL_ACCOUNT);
>>> +        if (!dst->stack_arg_regs)
>>> +            return -ENOMEM;
>>> +    }
>>> +
>>> +    dst->out_stack_arg_depth = src->out_stack_arg_depth;
>> Given that this is capped by 12, does it make sense to maintain the 
>> counter?
>> It might be simpler to always allocate an array of 12 elements.
>
> The number of stack arguments is most 7. So yes, we can do it.

I double checked this one. It would be expensive, e.g., allocating 7 bpf_reg_state
roughly consume 1KB. Considering maximum 8 frames and a lot of states in verifier,
it could have visible memory consumption in verifier.

I think we should use pointer in the beginning, if we do see lots of usage
with > 5 arguments, we can then consider allocating stack_arg_regs in bpf_func_state.

next prev parent reply	other threads:[~2026-04-28 23:50 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-24 17:14 [PATCH bpf-next 00/18] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-04-24 17:14 ` [PATCH bpf-next 01/18] bpf: Support stack arguments for bpf functions Yonghong Song
2026-04-24 18:13   ` bot+bpf-ci
2026-04-25  5:09     ` Yonghong Song
2026-04-27 20:40       ` Yonghong Song
2026-04-28 14:29   ` Eduard Zingerman
2026-04-28 16:47     ` Yonghong Song
2026-04-28 23:50       ` Yonghong Song [this message]
2026-04-29  0:28       ` Eduard Zingerman
2026-04-29 22:52         ` Yonghong Song
2026-04-30  1:38           ` Eduard Zingerman
2026-05-02 17:03   ` Alexei Starovoitov
2026-05-02 21:54     ` Yonghong Song
2026-05-08 17:33       ` Alexei Starovoitov
2026-04-24 17:14 ` [PATCH bpf-next 02/18] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
2026-04-24 18:00   ` bot+bpf-ci
2026-04-25  5:10     ` Yonghong Song
2026-04-28 16:46   ` Eduard Zingerman
2026-04-28 20:54     ` Yonghong Song
2026-04-24 17:14 ` [PATCH bpf-next 03/18] bpf: Refactor record_call_access() to extract per-arg logic Yonghong Song
2026-04-29  0:51   ` Eduard Zingerman
2026-04-29 22:55     ` Yonghong Song
2026-04-24 17:14 ` [PATCH bpf-next 04/18] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
2026-04-24 18:00   ` bot+bpf-ci
2026-04-25  5:11     ` Yonghong Song
2026-04-29 12:22   ` Eduard Zingerman
2026-04-29 22:55     ` Yonghong Song
2026-04-24 17:14 ` [PATCH bpf-next 05/18] bpf: Reject stack arguments in non-JITed programs Yonghong Song
2026-04-24 18:00   ` bot+bpf-ci
2026-04-29 12:27   ` Eduard Zingerman
2026-04-24 17:15 ` [PATCH bpf-next 06/18] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
2026-04-24 17:48   ` bot+bpf-ci
2026-04-25  5:17     ` Yonghong Song
2026-04-29 12:37   ` Eduard Zingerman
2026-04-24 17:15 ` [PATCH bpf-next 07/18] bpf: Enable r11 based insns Yonghong Song
2026-04-29 12:48   ` Eduard Zingerman
2026-04-24 17:15 ` [PATCH bpf-next 08/18] bpf: Support stack arguments for kfunc calls Yonghong Song
2026-04-24 18:00   ` bot+bpf-ci
2026-04-25  5:19     ` Yonghong Song
2026-04-24 17:15 ` [PATCH bpf-next 09/18] bpf: Reject stack arguments if tail call reachable Yonghong Song
2026-04-24 18:00   ` bot+bpf-ci
2026-04-24 17:15 ` [PATCH bpf-next 10/18] bpf,x86: Implement JIT support for stack arguments Yonghong Song
2026-04-24 18:00   ` bot+bpf-ci
2026-04-25  5:29     ` Yonghong Song
2026-04-24 17:16 ` [PATCH bpf-next 11/18] selftests/bpf: Add tests for BPF function " Yonghong Song
2026-04-24 17:16 ` [PATCH bpf-next 12/18] selftests/bpf: Add tests for stack argument validation Yonghong Song
2026-04-24 17:17 ` [PATCH bpf-next 13/18] selftests/bpf: Add verifier " Yonghong Song
2026-04-24 17:48   ` bot+bpf-ci
2026-04-25  5:33     ` Yonghong Song
2026-04-24 17:17 ` [PATCH bpf-next 14/18] selftests/bpf: Add BTF fixup for __naked subprog parameter names Yonghong Song
2026-04-24 17:17 ` [PATCH bpf-next 15/18] selftests/bpf: Add precision backtracking test for stack arguments Yonghong Song
2026-04-24 17:17 ` [PATCH bpf-next 16/18] bpf, arm64: Map BPF_REG_0 to x8 instead of x7 Yonghong Song
2026-04-24 17:17 ` [PATCH bpf-next 17/18] bpf, arm64: Add JIT support for stack arguments Yonghong Song
2026-04-24 18:00   ` bot+bpf-ci
2026-04-27  9:06     ` Puranjay Mohan
2026-04-27 20:42       ` Yonghong Song
2026-04-24 17:17 ` [PATCH bpf-next 18/18] selftests/bpf: Enable stack argument tests for arm64 Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=440e435c-3739-4cec-821a-e765eda2a279@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=jose.marchesi@oracle.com \
    --cc=kernel-team@fb.com \
    --cc=martin.lau@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.