From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
"Jose E . Marchesi" <jose.marchesi@oracle.com>,
Kernel Team <kernel-team@fb.com>,
Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next v4 00/25] bpf: Support stack arguments for BPF functions and kfuncs
Date: Wed, 13 May 2026 10:41:53 -0700
Message-ID: <0b02e692-1ee1-42e3-a437-97a3a9ca2481@linux.dev>
In-Reply-To: <CAADnVQKjVKG8cfMBpWpmbtPToGPSyD13aXLNOzb2PTQm_bS_2w@mail.gmail.com>
On 5/13/26 6:33 PM, Alexei Starovoitov wrote:
> On Tue, May 12, 2026 at 9:50 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> Currently, bpf function calls and kfunc's are limited by 5 reg-level
>> parameters. For function calls with more than 5 parameters,
>> developers can use always inlining or pass a struct pointer
>> after packing more parameters in that struct although it may have
>> some inconvenience. But there is no workaround for kfunc if more
>> than 5 parameters is needed.
>>
>> This patch set lifts the 5-argument limit by introducing stack-based
>> argument passing for BPF functions and kfunc's, coordinated with
>> compiler support in LLVM [1]. The compiler emits loads/stores through
>> a new bpf register r11 (BPF_REG_PARAMS), to pass arguments beyond
>> the 5th, keeping the stack arg area separate from the r10-based program
>> stack. The current maximum number of arguments is capped at
>> MAX_BPF_FUNC_ARGS (12), which is sufficient for the vast majority of
>> use cases.
>>
>> All kfunc/bpf-function arguments are caller saved, including stack
>> arguments. For register arguments (r1-r5), the verifier already marks
>> them as clobbered after each call. For stack arguments, the verifier
>> invalidates all outgoing stack arg slots immediately after a call,
>> requiring the compiler to re-store them before any subsequent call.
>> This follows the native calling convention where all function
>> parameters are caller saved.
>>
>> The x86_64 JIT translates r11-relative accesses to RBP-relative
>> native instructions. Each function's stack allocation is extended
>> by 'max_outgoing' bytes to hold the outgoing arg area below the
>> callee-saved registers. This makes implementation easier as the r10
>> can be reused for stack argument access. At both BPF-to-BPF and kfunc
>> calls, outgoing args are pushed onto the expected calling convention
>> locations directly. The incoming parameters can directly get the value
>> from caller.
>>
>> Global subprogs and freplace progs with >5 args are not yet supported.
>> Only x86_64 and arm64 are supported for now. Same selftests are tested
>> by both x86_64 and arm64. Please see each individual patch for details.
>>
>> [1] https://github.com/llvm/llvm-project/pull/189060
>>
>> Changelogs:
>> v3 -> v4:
> Applied, but see veristat failures:
>
> |strobelight-server-monitors-gpu_mem_snapshot-gpu_mem_snapshot_bpf-gpu_mem_snapshot_bpf.o|handle_cu_mem_address_reserve_ret
> |success -> failure (!!)|-100.00 % |
> |strobelight-server-monitors-gpu_mem_snapshot-gpu_mem_snapshot_bpf-gpu_mem_snapshot_bpf.o|handle_cu_mem_create_ret
> |success -> failure (!!)|-100.00 % |
> |strobelight-server-monitors-gpu_mem_snapshot-gpu_mem_snapshot_bpf-gpu_mem_snapshot_bpf.o|handle_malloc_ret
> |success -> failure (!!)|-100.00 % |
> |strobelight-server-monitors-gpu_mem_snapshot-gpu_mem_snapshot_bpf-gpu_mem_snapshot_bpf.o|handle_mtia_allocate_exit
> |success -> failure (!!)|-83.67 %
>
> I couldn't see what might have caused it. Hopefully something minor.
> Pls follow up.
I found the reason. In gpu_mem_snapshot/bpf/gpu_mem_snapshot.bpf.c,
the source code is:
static int alloc_exit(
    struct pt_regs* ctx,
    uint8_t alloc_type,
    uint64_t direct_addr,
    uint16_t extra_arg_cnt,
    uint64_t extra_arg0,
    uint64_t extra_arg1) {
  uint64_t pid_tgid = bpf_get_current_pid_tgid();
  ...
} // 6 arguments

SEC("uretprobe")
int BPF_KPROBE(handle_malloc_ret) {
  return alloc_exit(ctx, STROBELIGHT_GPU_MEM_ALLOC_SAMPLE, 0, 0, 0, 0);
}
But the 'struct pt_regs* ctx' argument is actually a dead argument,
and the llvm compiler removed it. So in the final code, alloc_exit()
has 5 arguments, and this is what the bpf backend sees.
Note that the function signature recorded in BTF still has 6 arguments.
I am testing with llvm21, which is what CI currently uses.
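
To make the dead-argument situation concrete, here is a minimal sketch in plain C. The name alloc_exit_sketch() and its body are hypothetical stand-ins (the real body is elided above); the only point is that 'ctx' is never read, so an optimizing compiler is free to drop it from the emitted function while the declared signature keeps all 6 parameters.

```c
#include <stddef.h>
#include <stdint.h>

struct pt_regs; /* opaque: only a pointer to it is passed around */

/*
 * Hypothetical stand-in for alloc_exit(). The declared signature has
 * 6 parameters, but 'ctx' is never read, so only 5 are live.
 * A static function like this is a candidate for dead argument
 * elimination: the optimizer may emit a 5-argument version.
 */
static __attribute__((noinline)) int alloc_exit_sketch(
    struct pt_regs *ctx,
    uint8_t alloc_type,
    uint64_t direct_addr,
    uint16_t extra_arg_cnt,
    uint64_t extra_arg0,
    uint64_t extra_arg1)
{
    /* 'ctx' is intentionally unused; only the other 5 arguments matter. */
    return (int)(alloc_type + direct_addr + extra_arg_cnt +
                 extra_arg0 + extra_arg1);
}
```

Calling alloc_exit_sketch(NULL, 1, 2, 3, 4, 5) gives the same result whatever is passed as the first argument, which is why the compiler can legally remove it.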
Without kernel stack argument support
=====================================
we have
btf_check_func_arg_match()
  btf_prepare_func_args()
    ...
    nargs = btf_type_vlen(t);    // nargs = 6
    if (nargs > MAX_BPF_FUNC_REG_ARGS) {
        if (!is_global)
            return -EINVAL;
        bpf_log(log, "Global function %s() with %d > %d args. Buggy compiler.\n",
                tname, nargs, MAX_BPF_FUNC_REG_ARGS);
        return -EINVAL;
    }
So btf_prepare_func_args() returns -EINVAL.
Eventually the error code is propagated back to check_func_call():
    err = btf_check_subprog_call(env, subprog, caller->regs);
    if (err == -EFAULT)
        return err;
    ...
Since the return value is not -EFAULT, verification continues
even though the BTF type does not agree with the actual number of
parameters.
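
The old behavior can be summarized with a small model. This is only a sketch of the control flow quoted above for a static subprog, not the kernel code: prepare_func_args_old() and verification_continues_old() are made-up names, and MAX_REG_ARGS stands in for MAX_BPF_FUNC_REG_ARGS.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_REG_ARGS 5 /* stand-in for MAX_BPF_FUNC_REG_ARGS */

/* Model of the pre-series argument check for a static subprog:
 * more than 5 BTF arguments yields -EINVAL. */
static int prepare_func_args_old(int btf_nargs)
{
    if (btf_nargs > MAX_REG_ARGS)
        return -EINVAL;
    return 0;
}

/* Model of the caller: only -EFAULT aborts verification here, so
 * -EINVAL from the BTF check does not stop the verifier. */
static bool verification_continues_old(int btf_nargs)
{
    int err = prepare_func_args_old(btf_nargs);

    if (err == -EFAULT)
        return false;
    return true;
}
```

With btf_nargs = 6 the check returns -EINVAL, yet verification_continues_old(6) is still true, which matches the "success" result seen before the series.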
With kernel stack argument support
==================================
we have
btf_check_func_arg_match()
  btf_prepare_func_args()
    ...
    nargs = btf_type_vlen(t);    // nargs = 6
    ... // all arguments validated properly
    return 0
// error message: callee expects 6 args, stack arg1 is not initialized
This is expected since the optimized function signature has only
5 parameters in the llvm bpf backend, so the caller never stores a
6th (stack) argument.
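
The mismatch behind this error can be sketched with a toy model. It is not the verifier's implementation: the helper names are hypothetical, and it assumes only what is stated above, namely that arguments beyond the 5th are passed on the stack and that the callee's expected argument count comes from BTF.

```c
#define MAX_REG_ARGS 5 /* arguments 1..5 are passed in r1-r5 */

/* Number of stack arguments the callee expects according to BTF. */
static int expected_stack_args(int btf_nargs)
{
    return btf_nargs > MAX_REG_ARGS ? btf_nargs - MAX_REG_ARGS : 0;
}

/*
 * Returns 0 if the caller stored every expected stack argument,
 * otherwise the 1-based index of the first missing one
 * (as in "stack arg1 is not initialized").
 */
static int first_missing_stack_arg(int btf_nargs, int stack_args_stored)
{
    int expected = expected_stack_args(btf_nargs);

    if (stack_args_stored >= expected)
        return 0;
    return stack_args_stored + 1;
}
```

For this program BTF says 6 arguments while the compiled caller passes only 5 in registers and stores nothing on the stack, so first_missing_stack_arg(6, 0) reports stack arg 1. Once the source drops 'ctx', BTF says 5 and first_missing_stack_arg(5, 0) reports no problem.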
How to resolve this issue
=========================
I have merged a patch internally with the following manual change:
 static int alloc_exit(
-    struct pt_regs* ctx,
     uint8_t alloc_type,
     uint64_t direct_addr,
     uint16_t extra_arg_cnt,
@@ -728,7 +727,7 @@
 SEC("uretprobe")
 int BPF_KPROBE(handle_malloc_ret) {
-  return alloc_exit(ctx, STROBELIGHT_GPU_MEM_ALLOC_SAMPLE, 0, 0, 0, 0);
+  return alloc_exit(STROBELIGHT_GPU_MEM_ALLOC_SAMPLE, 0, 0, 0, 0);
 }
 SEC("uprobe")
@@ -768,7 +767,7 @@
 SEC("uretprobe")
 int BPF_KPROBE(handle_cu_mem_create_ret) {
-  return alloc_exit(ctx, STROBELIGHT_CU_MEM_CREATE_SAMPLE, 0, 0, 0, 0);
+  return alloc_exit(STROBELIGHT_CU_MEM_CREATE_SAMPLE, 0, 0, 0, 0);
 }
 ...
This should fix the issue: once the next CI run picks up the updated
Meta bpf progs, the error should be gone.
To really fix this issue, we should encode BTF with true signatures, i.e.,
BTF should match the actual function parameters. This should be done
in llvm.