From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
"Jose E . Marchesi" <jose.marchesi@oracle.com>,
Kernel Team <kernel-team@fb.com>,
Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next v3 08/11] bpf,x86: Implement JIT support for stack arguments
Date: Sun, 5 Apr 2026 21:59:02 -0700
Message-ID: <c8a3bea8-213d-461e-a647-306ff0df566d@linux.dev>
In-Reply-To: <CAADnVQKyoAWQ8z6gyeBiBqor6rNrL+F_C3VY_1XMDqaQ7g81GQ@mail.gmail.com>
On 4/5/26 9:54 PM, Alexei Starovoitov wrote:
> On Sun, Apr 5, 2026 at 9:14 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>>
>>
>> On 4/5/26 1:36 PM, Alexei Starovoitov wrote:
>>> On Sun, Apr 5, 2026 at 10:26 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>>>> Add x86_64 JIT support for BPF functions and kfuncs with more than
>>>> 5 arguments. The extra arguments are passed through a stack area
>>>> addressed by register r12 (BPF_REG_STACK_ARG_BASE) in BPF bytecode,
>>>> which the JIT translates to RBP-relative accesses in native code.
>>>>
>>>> The JIT follows the native x86_64 calling convention for stack
>>>> argument placement. Incoming stack args from the caller sit above
>>>> the callee's frame pointer at [rbp + 16], [rbp + 24], etc., exactly
>>>> where x86_64 expects them after CALL + PUSH RBP. Only the outgoing
>>>> stack arg area is allocated below the program stack in the prologue.
>>>>
>>>> The native x86_64 stack layout for a function with incoming and
>>>> outgoing stack args:
>>>>
>>>>   high address
>>>>   ┌─────────────────────────┐
>>>>   │ incoming stack arg N    │ [rbp + 16 + (N-1)*8]  (from caller)
>>>>   │ ...                     │
>>>>   │ incoming stack arg 1    │ [rbp + 16]
>>>>   ├─────────────────────────┤
>>>>   │ return address          │ [rbp + 8]
>>>>   │ saved rbp               │ [rbp]
>>>>   ├─────────────────────────┤
>>>>   │ BPF program stack       │ (stack_depth bytes)
>>>>   ├─────────────────────────┤
>>>>   │ outgoing stack arg 1    │ [rbp - prog_stack_depth - outgoing_depth]
>>>>   │ ...                     │   (written via r12-relative STX/ST)
>>>>   │ outgoing stack arg M    │ [rbp - prog_stack_depth - 8]
>>>>   ├─────────────────────────┤
>>>>   │ callee-saved regs ...   │ (pushed after sub rsp)
>>>>   └─────────────────────────┘  rsp
>>>>   low address
>>>>
>>>> BPF r12-relative offsets are translated to native RBP-relative
>>>> offsets with two formulas:
>>>>   - Incoming args (load: -off <= incoming_depth):
>>>>       native_off = 8 - bpf_off  →  [rbp + 16 + ...]
>>>>   - Outgoing args (store: -off > incoming_depth):
>>>>       native_off = -(bpf_prog_stack + stack_arg_depth + 8) - bpf_off
>>>>
>>>> Since callee-saved registers are pushed below the outgoing area,
>>>> outgoing args are not at [rsp] at call time. Therefore, for both BPF-to-BPF
>>>> calls and kfunc calls, outgoing args are explicitly pushed from the
>>>> outgoing area onto the stack before CALL and rsp is restored after return.
>>>>
>>>> For kfunc calls specifically, arg 6 is loaded into R9 and args 7+
>>>> are pushed onto the native stack, per the x86_64 calling convention.
>>>>
>>>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>>>> ---
>>>> arch/x86/net/bpf_jit_comp.c | 135 ++++++++++++++++++++++++++++++++++--
>>>> 1 file changed, 129 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>>>> index 32864dbc2c4e..206f342a0ca0 100644
>>>> --- a/arch/x86/net/bpf_jit_comp.c
>>>> +++ b/arch/x86/net/bpf_jit_comp.c
>>>> @@ -390,6 +390,28 @@ static void pop_callee_regs(u8 **pprog, bool *callee_regs_used)
>>>>  	*pprog = prog;
>>>>  }
>>>>
>>>> +/* Push stack args from [rbp + outgoing_base + (k - 1) * 8] in reverse order. */
>>>> +static int push_stack_args(u8 **pprog, s32 outgoing_base, int from, int to)
>>>> +{
>>>> +	u8 *prog = *pprog;
>>>> +	int k, bytes = 0;
>>>> +	s32 off;
>>>> +
>>>> +	for (k = from; k >= to; k--) {
>>>> +		off = outgoing_base + (k - 1) * 8;
>>>> +		/* push qword [rbp + off] */
>>>> +		if (is_imm8(off)) {
>>>> +			EMIT3(0xFF, 0x75, off);
>>>> +			bytes += 3;
>>>> +		} else {
>>>> +			EMIT2_off32(0xFF, 0xB5, off);
>>>> +			bytes += 6;
>>>> +		}
>>>> +	}
>>> This is not any better than v1.
>>> It is still a copy.
>>> As I said earlier:
>>> https://lore.kernel.org/bpf/CAADnVQ+5Aqxpk1bTw47xZQ5E0HOtf0-HHjmDFHaay7CDJ-7aKQ@mail.gmail.com/
>>> It has to be zero overhead. Copy pasting:
>>>
>>> "
>>> bpf calling convention for 6+ args needs to match x86.
>>> With an exception of 6th arg.
>>> All bpf insn need to remain as-is when calling another bpf prog
>>> or kfunc. There should be no additional moves.
>>> JIT should only special case 6th arg and convert bpf's STX [r12-N], src_reg
>>> into 'mov r9, src_reg', since r9 is used to pass 6th argument on x86.
>>> The rest of STX needs to be jitted pretty much as-is
>>> with a twist that bpf's r12 becomes %rbp on x86.
>>> And similar things in the callee.
>>> Instead of LDX [r12+N] it will be a 'mov dst_reg, r9' where r9 is x86's r9.
>>> Other LDX from [r12+M] will remain as-is, but r12->%rbp.
>>> On arm64 more of the STX/LDX insns become native 'mov'-s
>>> because arm64 has more registers for arguments.
>>> "
>>>
>>> Remapping in earlier patches is unnecessary.
>>> These STX [r12-N], src_reg emitted by LLVM will be JITed as-is into
>>> store of src_reg into %rbp-M slot.
>>> Only shift by 8 bytes is necessary for N to become M.
>>> where STX of 6th argument becomes 'mov' from one register to x86's r9.
>> Okay, I will do the following jit stack layout:
>>
>> incoming stack arg N -> 1
>> return address
>> saved rbp
>> BPF program stack
>> tail call cnt <== if tail call reachable
>> callee-saved regs
>> r9 <== if priv_frame_ptr is not null
>> outgoing stack arg M -> 1
>> call ...
>> undo stack of outgoing stack arg + r9
> It looks like you're trying to preserve r9 as an auxiliary register.
> If it's in the way, rewrite JIT handling. The size of the diff
> doesn't matter.
Actually, I will save r9 (priv_frame_ptr) on the stack. The stack
layout is as follows:

  incoming stack arg N -> 1
  return address
  saved rbp
  BPF program stack
  tail call cnt <== if tail call reachable
  callee-saved regs
  r9 <== if priv_frame_ptr is not null
  outgoing stack arg M -> 1

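
To make the layout above concrete, here is a rough sketch of how the
rbp-relative offset of each outgoing stack arg could be derived from it.
This is only an illustration under stated assumptions: the struct, its
field names and the helper are hypothetical and are not taken from
arch/x86/net/bpf_jit_comp.c, and the sizes of the tail call and
callee-saved areas are left as inputs rather than guessed.

```c
#include <assert.h>

/*
 * Hypothetical description of one JITed frame. Every field name here is
 * illustrative only; the sizes are inputs, not values read from the JIT.
 */
struct frame_layout {
	int prog_stack_depth;   /* BPF program stack, in bytes */
	int tail_call_bytes;    /* 0 if tail calls are not reachable */
	int callee_saved_bytes; /* 8 bytes per pushed callee-saved register */
	int priv_fp_bytes;      /* 8 if r9 (priv_frame_ptr) is spilled, else 0 */
	int nr_outgoing;        /* number of outgoing stack args, 8 bytes each */
};

/*
 * rbp-relative offset of outgoing stack arg k (1-based).
 * Assumption: arg M sits right below the r9 slot and arg 1 sits at the
 * lowest address, so arg 1 is at [rsp] immediately before CALL.
 */
static int outgoing_arg_off(const struct frame_layout *f, int k)
{
	int below_rbp = f->prog_stack_depth + f->tail_call_bytes +
			f->callee_saved_bytes + f->priv_fp_bytes;

	assert(k >= 1 && k <= f->nr_outgoing);
	return -(below_rbp + (f->nr_outgoing - k + 1) * 8);
}
```

For example, with a 64-byte program stack, no tail call slot, two
callee-saved registers, the r9 spill and two outgoing args, this places
arg 2 at [rbp - 96] and arg 1 at [rbp - 104].
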
> r9 should be the 6th argument.
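
For reference, the two r12-to-rbp offset formulas from the quoted v3
commit message can be written out as the small sketch below. The helper
name is made up for illustration, and it assumes that stack_arg_depth is
the sum of the incoming and outgoing depths, which the commit message
does not state explicitly but which agrees with the offsets shown in the
v3 diagram. It classifies by offset only and does not model the
load/store distinction.

```c
#include <assert.h>

/*
 * Translate a BPF r12-relative offset (always negative) into an
 * rbp-relative offset using the two formulas from the v3 commit message.
 *
 * Assumption (not stated in the patch text):
 *   stack_arg_depth == incoming_depth + outgoing_depth
 */
static int stack_arg_native_off(int bpf_off, int prog_stack_depth,
				int incoming_depth, int outgoing_depth)
{
	int stack_arg_depth = incoming_depth + outgoing_depth;

	assert(bpf_off < 0 && -bpf_off <= stack_arg_depth);

	/* Incoming args live above the saved rbp and return address. */
	if (-bpf_off <= incoming_depth)
		return 8 - bpf_off;

	/* Outgoing args live below the BPF program stack. */
	return -(prog_stack_depth + stack_arg_depth + 8) - bpf_off;
}
```

With a 64-byte program stack, one incoming stack arg (8 bytes) and two
outgoing stack args (16 bytes), r12 - 8 maps to [rbp + 16], r12 - 16
maps to [rbp - 80] and r12 - 24 maps to [rbp - 72], matching the
positions in the v3 diagram.
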