Re: [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions

BPF List
 help / color / mirror / Atom feed

From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	bot+bpf-ci@kernel.org, bpf@vger.kernel.org
Cc: ast@kernel.org, andrii@kernel.org, daniel@iogearbox.net,
	jose.marchesi@oracle.com, kernel-team@fb.com,
	martin.lau@kernel.org, eddyz87@gmail.com, clm@meta.com,
	ihor.solodrai@linux.dev
Subject: Re: [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions
Date: Mon, 11 May 2026 21:17:19 -0700	[thread overview]
Message-ID: <5a0b1bdf-951f-4a0c-909a-8c7b0ed8cdc7@linux.dev> (raw)
In-Reply-To: <DIFZ4QX7T164.30KVT7BJ584K7@gmail.com>



On 5/11/26 6:05 PM, Alexei Starovoitov wrote:
> On Mon May 11, 2026 at 8:46 AM PDT, Yonghong Song wrote:
>>
>> On 5/11/26 8:19 AM, bot+bpf-ci@kernel.org wrote:
>>>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>>>> index 321b9d69cf9c..f9020a4ea005 100644
>>>> --- a/include/linux/bpf_verifier.h
>>>> +++ b/include/linux/bpf_verifier.h
>>>> @@ -402,6 +402,7 @@ struct bpf_func_state {
>>>>    	bool in_callback_fn;
>>>>    	bool in_async_callback_fn;
>>>>    	bool in_exception_callback_fn;
>>>> +	bool no_stack_arg_load;
>>> [ ... ]
>>>
>>>> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
>>>> index bd9c22945050..c249eb40c6d6 100644
>>>> --- a/kernel/bpf/states.c
>>>> +++ b/kernel/bpf/states.c
>>>> @@ -833,6 +833,32 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>>>>    	return true;
>>>>    }
>>>>
>>>> +/*
>>>> + * Compare stack arg slots between old and current states.
>>>> + * Outgoing stack args are path-local state and must agree for pruning.
>>>> + */
>>>> +static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>>>> +			   struct bpf_func_state *cur, struct bpf_idmap *idmap,
>>>> +			   enum exact_level exact)
>>>> +{
>>>> +	int i, nslots;
>>>> +
>>>> +	nslots = max(old->out_stack_arg_cnt, cur->out_stack_arg_cnt);
>>>> +	for (i = 0; i < nslots; i++) {
>>>> +		struct bpf_reg_state *old_arg, *cur_arg;
>>>> +		struct bpf_reg_state not_init = { .type = NOT_INIT };
>>>> +
>>>> +		old_arg = i < old->out_stack_arg_cnt ?
>>>> +			  &old->stack_arg_regs[i] : &not_init;
>>>> +		cur_arg = i < cur->out_stack_arg_cnt ?
>>>> +			  &cur->stack_arg_regs[i] : &not_init;
>>>> +		if (!regsafe(env, old_arg, cur_arg, idmap, exact))
>>>> +			return false;
>>>> +	}
>>>> +
>>>> +	return true;
>>>> +}
>>>> +
>>>>    static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur,
>>>>    		    struct bpf_idmap *idmap)
>>>>    {
>>>> @@ -924,6 +950,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
>>>>    	if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>>>>    		return false;
>>>>
>>>> +	if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>>>> +		return false;
>>>> +
>>>>    	return true;
>>>>    }
>>> In v2 (May 7), bot+bpf-ci@kernel.org raised a concern that
>>> no_stack_arg_load is not compared during state pruning in
>>> func_states_equal():
>>>
>>>       "The same issue applies to state pruning: func_states_equal/
>>>       stack_arg_safe does not compare no_stack_arg_load either, so a
>>>       state with no_stack_arg_load=true can be pruned to an old state
>>>       with no_stack_arg_load=false incorrectly. Should no_stack_arg_load
>>>       be placed before stack (to be memcpy'd) or should copy_stack_state()
>>>       explicitly assign it?"
>>>
>>> The field placement was fixed (moving no_stack_arg_load before 'stack'
>>> so it gets copied by memcpy), but func_states_equal() still does not
>>> explicitly compare this field.
>>>
>>> Could this allow incorrect pruning? If a BPF program writes to stack
>>> args and calls a function on one path (setting no_stack_arg_load=true),
>>> then branches and tries to read from stack args on another path, the
>>> verifier might incorrectly prune to the old state with
>>> no_stack_arg_load=false, missing the enforcement that stack arg reads
>>> must happen before writes/calls.
>>>
>>> Looking at func_states_equal():
>>>
>>> kernel/bpf/states.c:func_states_equal() {
>>>       if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
>>>           return false;
>>>
>>>       if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
>>>           return false;
>>>
>>>       return true;
>>> }
>>>
>>> Should no_stack_arg_load be compared here, similar to how other boolean
>>> flags in bpf_func_state are handled during state comparison?
>> You are right. The following is an example:
>>
>>     /* subprog with incoming stack arg */
>>     r1 = *(r11+8);         /* read incoming arg, no_stack_arg_load = false */
>>                                                                                                                      
>>     if cond goto path2;
>>                                                                                                                      
>>     path1:
>>       call some_helper;    /* sets no_stack_arg_load = true */
>>       goto join;
>>                                                                                                                      
>>     path2:
>>       /* no call, no_stack_arg_load = false */
>>       goto join;
>>                                                                                                                      
>>     join:
>>       r2 = *(r11+8);       /* read incoming arg again */
>>
>> In the above case, at 'join' point, we have no_stack_arg_load = true and false
>> respectively. In this case, we cannot do pruning.
>>
>> Will fix.
> Hold on. Didn't we agree that any call should scratch all arg slots?
> In the above example call some_helper will scratch it and last read shouldn't be allowed.

Looks like we may still need to compare no_stack_arg_load in func_states_equal().
The following is an example:

The selftest:

+__noinline __used __naked
+static int subprog_pruning_call_before_load_6args(int a, int b, int c, int d,
+                                                 int e, int f)
+{
+       asm volatile (
+               "if r1 s> 0 goto l0_%=;"
+               "goto l1_%=;"
+       "l0_%=:"
+               "call %[bpf_get_prandom_u32];"
+       "l1_%=:"
+               "r0 = *(u64 *)(r11 + 8);"
+               "exit;"
+               :: __imm(bpf_get_prandom_u32)
+               : __clobber_all
+       );
+}
+
+SEC("tc")
+__description("stack_arg: pruning keeps r11 load ordering")
+__failure
+__flag(BPF_F_TEST_STATE_FREQ)
+__msg("r11 load must be before any r11 store or call insn")
+__btf_func_path("btf__verifier_stack_arg_order.bpf.o")
+__naked void stack_arg_pruning_load_after_call(void)
+{
+       asm volatile (
+               "call %[bpf_get_prandom_u32];"
+               "r1 = r0;"
+               "r2 = 2;"
+               "r3 = 3;"
+               "r4 = 4;"
+               "r5 = 5;"
+               "*(u64 *)(r11 - 8) = 6;"
+               "call subprog_pruning_call_before_load_6args;"
+               "exit;"
+               :: __imm(bpf_get_prandom_u32)
+               : __clobber_all
+       );
+}

It needs the following

diff --git a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
index 2d5ddb24e241..83692570d5bc 100644
--- a/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
+++ b/tools/testing/selftests/bpf/progs/btf__verifier_stack_arg_order.c
@@ -15,6 +15,11 @@ int subprog_call_before_load_6args(int a, int b, int c, int d, int e, int f)
         return a + b + c + d + e + f;
  }
  
+int subprog_pruning_call_before_load_6args(int a, int b, int c, int d, int e, int f)
+{
+       return a + b + c + d + e + f;
+}
+
  #else
  
  int subprog_bad_order_6args(void)
@@ -27,4 +32,9 @@ int subprog_call_before_load_6args(void)
         return 0;
  }
  
+int subprog_pruning_call_before_load_6args(void)
+{
+       return 0;
+}
+
  #endif

in order to get proper BTF for subprog_pruning_call_before_load_6args().

With the above, the following is the verifier log:

func#0 @0
func#1 @9
topo_order[0] = subprog_pruning_call_before_load_6args
topo_order[1] = stack_arg_pruning_load_after_call
subprog#0: analyzing (depth 0)...
subprog#0 stack_arg_pruning_load_after_call:
   0: (85) call bpf_get_prandom_u32#7
   1: (bf) r1 = r0
   2: (b7) r2 = 2
   3: (b7) r3 = 3
   4: (b7) r4 = 4
   5: (b7) r5 = 5
   6: (7a) *(u64 *)(r11 -8) = 6
   7: (85) call pc+1
   8: (95) exit
subprog#1: analyzing (depth 0)...
subprog#1 subprog_pruning_call_before_load_6args:
   9: (65) if r1 s> 0x0 goto pc+1
  10: (05) goto pc+1
  11: (85) call bpf_get_prandom_u32#7
  12: (79) r0 = *(u64 *)(r11 +8)
  13: (95) exit
stack use/def subprog#0 stack_arg_pruning_load_after_call (d0,cs0):
   0: (85) call bpf_get_prandom_u32#7
   1: (bf) r1 = r0
   2: (b7) r2 = 2
   3: (b7) r3 = 3
   4: (b7) r4 = 4
   5: (b7) r5 = 5
   6: (7a) *(u64 *)(r11 -8) = 6
   7: (85) call pc+1
   8: (95) exit
stack use/def subprog#1 subprog_pruning_call_before_load_6args (d0,cs9):
   9: (65) if r1 s> 0x0 goto pc+1
  10: (05) goto pc+1
  11: (85) call bpf_get_prandom_u32#7
  12: (79) r0 = *(u64 *)(r11 +8)
  13: (95) exit
Live regs before insn:
       0: .......... (85) call bpf_get_prandom_u32#7
       1: 0......... (bf) r1 = r0
       2: .1........ (b7) r2 = 2
       3: .12....... (b7) r3 = 3
       4: .123...... (b7) r4 = 4
       5: .1234..... (b7) r5 = 5
       6: .12345.... (7a) *(u64 *)(r11 -8) = 6
       7: .12345.... (85) call pc+1
       8: 0......... (95) exit
       9: .1........ (65) if r1 s> 0x0 goto pc+1
      10: .......... (05) goto pc+1
      11: .......... (85) call bpf_get_prandom_u32#7
      12: .......... (79) r0 = *(u64 *)(r11 +8)
      13: 0......... (95) exit
0: R1=ctx() R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:99
0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
1: (bf) r1 = r0                       ; R0=scalar(id=1) R1=scalar(id=1)
2: (b7) r2 = 2                        ; R2=2
3: (b7) r3 = 3                        ; R3=3
4: (b7) r4 = 4                        ; R4=4
5: (b7) r5 = 5                        ; R5=5
6: (7a) *(u64 *)(r11 -8) = 6
7: (85) call pc+1
caller:
  R10=fp0
callee:
  frame1: R1=scalar() R2=2 R3=3 R4=4 R5=5 R10=fp0
9: frame1: R1=scalar() R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:78
9: (65) if r1 s> 0x0 goto pc+1        ; frame1: R1=scalar(smax=0)
10: (05) goto pc+1
12: (79) r0 = *(u64 *)(r11 +8)        ; frame1: R0=6
13: (95) exit
returning from callee:
  frame1: R0=6 R10=fp0
to caller at 8:
  R0=6 R10=fp0

from 13 to 8: R0=6 R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:99
8: (95) exit

from 9 to 11: frame1: R10=fp0
11: frame1: R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:78
11: (85) call bpf_get_prandom_u32#7
12: safe
processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 6 peak_states 6 mark_read 0
=============

The insn 12 (r0 = *(u64 *)(r11 +8)) is considered safe
and the verification succeeded.

But this is not correct. The verification should fail due to insn 11.

If we add the following in states.c:

diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
index 45d86bfe3b68..877338136009 100644
--- a/kernel/bpf/states.c
+++ b/kernel/bpf/states.c
@@ -941,6 +941,9 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
         if (old->callback_depth > cur->callback_depth)
                 return false;
  
+       if (!old->no_stack_arg_load && cur->no_stack_arg_load)
+               return false;
+
         for (i = 0; i < MAX_BPF_REG; i++)
                 if (((1 << i) & live_regs) &&
                     !regsafe(env, &old->regs[i], &cur->regs[i],


The verification will fail:

0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
1: (bf) r1 = r0                       ; R0=scalar(id=1) R1=scalar(id=1)
2: (b7) r2 = 2                        ; R2=2
3: (b7) r3 = 3                        ; R3=3
4: (b7) r4 = 4                        ; R4=4
5: (b7) r5 = 5                        ; R5=5
6: (7a) *(u64 *)(r11 -8) = 6
7: (85) call pc+1
caller:
  R10=fp0
callee:
  frame1: R1=scalar() R2=2 R3=3 R4=4 R5=5 R10=fp0
9: frame1: R1=scalar() R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:78
9: (65) if r1 s> 0x0 goto pc+1        ; frame1: R1=scalar(smax=0)
10: (05) goto pc+1
12: (79) r0 = *(u64 *)(r11 +8)        ; frame1: R0=6
13: (95) exit
returning from callee:
  frame1: R0=6 R10=fp0
to caller at 8:
  R0=6 R10=fp0

from 13 to 8: R0=6 R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:99
8: (95) exit

from 9 to 11: frame1: R10=fp0
11: frame1: R10=fp0
; asm volatile ( @ verifier_stack_arg_order.c:78
11: (85) call bpf_get_prandom_u32#7   ; frame1:
12: (79) r0 = *(u64 *)(r11 +8)
r11 load must be before any r11 store or call insn
processed 15 insns (limit 1000000) max_states_per_insn 1 total_states 7 peak_states 7 mark_read 0

Without the above states.c change, insn 12 does not have
chance to check load vs. store or call insn, and hence
considering state is equivalent.

With the above states.c change, old no_stack_arg_load is false and
cur no_stack_arg_load is true, func_states_equal returns false to
allow further for cur verification, which enters insn 12 which will
fail the verification.

What do you think about the above states.c change?

next prev parent reply	other threads:[~2026-05-12  4:17 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11  5:33 [PATCH bpf-next v3 00/24] bpf: Support stack arguments for BPF functions and kfuncs Yonghong Song
2026-05-11  5:33 ` [PATCH bpf-next v3 01/24] bpf: Convert bpf_get_spilled_reg macro to static inline function Yonghong Song
2026-05-11  5:33 ` [PATCH bpf-next v3 02/24] bpf: Remove copy_register_state wrapper function Yonghong Song
2026-05-11  5:33 ` [PATCH bpf-next v3 03/24] bpf: Add helper functions for r11-based stack argument insns Yonghong Song
2026-05-11  5:33 ` [PATCH bpf-next v3 04/24] bpf: Set sub->arg_cnt earlier in btf_prepare_func_args() Yonghong Song
2026-05-11  6:19   ` bot+bpf-ci
2026-05-11 16:29     ` Yonghong Song
2026-05-11 17:18       ` Yonghong Song
2026-05-11  5:33 ` [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions Yonghong Song
2026-05-11  6:19   ` bot+bpf-ci
2026-05-11 15:46     ` Yonghong Song
2026-05-11 16:05       ` Alexei Starovoitov
2026-05-11 16:21         ` Yonghong Song
2026-05-12  4:17         ` Yonghong Song [this message]
2026-05-11  5:33 ` [PATCH bpf-next v3 06/24] bpf: Refactor jmp history to use dedicated spi/frame fields Yonghong Song
2026-05-11 16:17   ` Alexei Starovoitov
2026-05-11 16:33     ` Yonghong Song
2026-05-11  5:33 ` [PATCH bpf-next v3 07/24] bpf: Add precision marking and backtracking for stack argument slots Yonghong Song
2026-05-11  6:19   ` bot+bpf-ci
2026-05-11  5:33 ` [PATCH bpf-next v3 08/24] bpf: Refactor record_call_access() to extract per-arg logic Yonghong Song
2026-05-11  5:33 ` [PATCH bpf-next v3 09/24] bpf: Extend liveness analysis to track stack argument slots Yonghong Song
2026-05-11  6:19   ` bot+bpf-ci
2026-05-11 16:35     ` Yonghong Song
2026-05-11 16:34   ` Alexei Starovoitov
2026-05-11 16:40     ` Yonghong Song
2026-05-11  5:33 ` [PATCH bpf-next v3 10/24] bpf: Reject stack arguments in non-JITed programs Yonghong Song
2026-05-11  6:19   ` bot+bpf-ci
2026-05-11 16:42     ` Yonghong Song
2026-05-11  5:33 ` [PATCH bpf-next v3 11/24] bpf: Prepare architecture JIT support for stack arguments Yonghong Song
2026-05-11  5:34 ` [PATCH bpf-next v3 12/24] bpf: Enable r11 based insns Yonghong Song
2026-05-11  5:34 ` [PATCH bpf-next v3 13/24] bpf: Support stack arguments for kfunc calls Yonghong Song
2026-05-11  5:34 ` [PATCH bpf-next v3 14/24] bpf: Reject stack arguments if tail call reachable Yonghong Song
2026-05-11  6:19   ` bot+bpf-ci
2026-05-11  5:34 ` [PATCH bpf-next v3 15/24] bpf: Pass bpf_subprog_info to bpf_int_jit_compile() Yonghong Song
2026-05-11 16:38   ` Alexei Starovoitov
2026-05-11 16:47     ` Yonghong Song
2026-05-11  5:34 ` [PATCH bpf-next v3 16/24] bpf,x86: Implement JIT support for stack arguments Yonghong Song
2026-05-11 16:39   ` Alexei Starovoitov
2026-05-11 16:47     ` Yonghong Song
2026-05-11  5:34 ` [PATCH bpf-next v3 17/24] selftests/bpf: Add tests for BPF function " Yonghong Song
2026-05-11  5:34 ` [PATCH bpf-next v3 18/24] selftests/bpf: Add tests for stack argument validation Yonghong Song
2026-05-11  5:34 ` [PATCH bpf-next v3 19/24] selftests/bpf: Add BTF fixup for __naked subprog parameter names Yonghong Song
2026-05-11  5:34 ` [PATCH bpf-next v3 20/24] selftests/bpf: Add verifier tests for stack argument validation Yonghong Song
2026-05-11  6:19   ` bot+bpf-ci
2026-05-11 16:49     ` Yonghong Song
2026-05-11  5:34 ` [PATCH bpf-next v3 21/24] selftests/bpf: Add precision backtracking test for stack arguments Yonghong Song
2026-05-11  5:35 ` [PATCH bpf-next v3 22/24] bpf, arm64: Map BPF_REG_0 to x8 instead of x7 Yonghong Song
2026-05-11  5:35 ` [PATCH bpf-next v3 23/24] bpf, arm64: Add JIT support for stack arguments Yonghong Song
2026-05-11  5:35 ` [PATCH bpf-next v3 24/24] selftests/bpf: Enable stack argument tests for arm64 Yonghong Song

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:2d5ddb24e24 dfblob:83692570d5b dfblob:45d86bfe3b6
dfblob:87733813600 )
 OR (
bs:"Re: [PATCH bpf-next v3 05/24] bpf: Support stack arguments for bpf functions" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5a0b1bdf-951f-4a0c-909a-8c7b0ed8cdc7@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bot+bpf-ci@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=ihor.solodrai@linux.dev \
    --cc=jose.marchesi@oracle.com \
    --cc=kernel-team@fb.com \
    --cc=martin.lau@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox