Re: [PATCH bpf-next 07/10] bpf,x86: Implement JIT support for stack arguments

public inbox for bpf@vger.kernel.org

From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"Jose E . Marchesi" <jose.marchesi@oracle.com>,
	Kernel Team <kernel-team@fb.com>,
	Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next 07/10] bpf,x86: Implement JIT support for stack arguments
Date: Thu, 2 Apr 2026 21:13:15 -0700
Message-ID: <72f47124-1cab-4406-a6c1-3bed0c3579e8@linux.dev>
In-Reply-To: <CAADnVQ+5Aqxpk1bTw47xZQ5E0HOtf0-HHjmDFHaay7CDJ-7aKQ@mail.gmail.com>



On 4/2/26 4:51 PM, Alexei Starovoitov wrote:
> On Wed, Apr 1, 2026 at 6:28 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> Add x86_64 JIT support for BPF functions and kfuncs with more than
>> 5 arguments. The extra arguments are passed through a stack area
>> addressed by register r12 (BPF_REG_STACK_ARG_BASE) in BPF bytecode,
>> which the JIT translates to RBP-relative accesses in native code.
>>
>> There are two possible approaches to allocate the stack arg area:
>>
>>    Option 1: Allocate a single combined region (incoming + max_outgoing)
>>      below the program stack in the function prologue. All r12-relative
>>      accesses become [rbp - prog_stack_depth - offset] where the 'offset'
>>      is the offset value in (incoming + max_outgoing) region. This is
>>      simple because the area is always at a fixed offset from RBP.
>>      The tradeoff is slightly higher stack usage when multiple callees
>>      have different stack arg counts — the area is sized to the maximum.
>>
>>    Option 2: Allocate each outgoing area individually at the call
>>      site, sized exactly to the callee's needs. This minimizes
>>      stack usage but significantly complicates the JIT: each call
>>      site must dynamically adjust RSP, and addresses of stack args
>>      would shift depending on context, making the offset
>>      calculations harder.
>>
>> This patch uses Option 1 for simplicity.
>>
>> The native x86_64 stack layout for a function with incoming and
>> outgoing stack args:
>>
>>    high address
>>    ┌─────────────────────────┐
>>    │ incoming stack arg N    │  [rbp + 16 + (N - 1) * 8]  (pushed by caller)
>>    │ ...                     │
>>    │ incoming stack arg 1    │  [rbp + 16]
>>    ├─────────────────────────┤
>>    │ return address          │  [rbp + 8]
>>    │ saved rbp               │  [rbp]
>>    ├─────────────────────────┤
>>    │ callee-saved regs       │
>>    │ BPF program stack       │  (stack_depth bytes)
>>    ├─────────────────────────┤
>>    │ incoming stack arg 1    │  [rbp - prog_stack_depth - 8]
>>    │ ...   (copied from      │   (copied in prologue)
>>    │        caller's push)   │
>>    │ incoming stack arg N    │  [rbp - prog_stack_depth - N * 8]
>>    ├─────────────────────────┤
>>    │ outgoing stack arg 1    │  (written via r12-relative STX/ST,
>>    │ ...                     │   JIT translates to RBP-relative)
>>    │ outgoing stack arg M    │
>>    └─────────────────────────┘
>>      ...                        Other stack usage
>>    ┌─────────────────────────┐
>>    │ incoming stack arg M    │ (copy from outgoing stack arg to
>>    │ ...                     │  incoming stack arg)
>>    │ incoming stack arg 1    │
>>    ├─────────────────────────┤
>>    │ return address          │
>>    │ saved rbp               │
>>    ├─────────────────────────┤
>>    │ ...                     │
>>    └─────────────────────────┘
>>    low address
>>
>> In prologue, the caller's incoming stack arguments are copied to callee's
>> incoming stack arguments, which will be fetched by later load insns.
>> The outgoing stack arguments are written by JIT RBP-relative STX or ST.
>>
>> For each bpf-to-bpf call, push outgoing stack args onto the native
>> stack before CALL, pop them after return. So the same 'outgoing stack arg'
>> area is used by all bpf-to-bpf functions.
>>
>> For kfunc calls, push stack args (arg 7+) onto the native stack
>> and load arg 6 into R9 per the x86_64 calling convention,
>> then clean up RSP after return.
>>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>>   arch/x86/net/bpf_jit_comp.c | 145 ++++++++++++++++++++++++++++++++++--
>>   1 file changed, 138 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>> index 32864dbc2c4e..807493f109e5 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
>> @@ -367,6 +367,27 @@ static void push_callee_regs(u8 **pprog, bool *callee_regs_used)
>>          *pprog = prog;
>>   }
>>
>> +static int push_stack_args(u8 **pprog, s32 base_off, int from, int to)
>> +{
>> +       u8 *prog = *pprog;
>> +       int j, off, cnt = 0;
>> +
>> +       for (j = from; j >= to; j--) {
>> +               off = base_off - j * 8;
>> +
>> +               /* push qword [rbp + off] */
>> +               if (is_imm8(off)) {
>> +                       EMIT3(0xFF, 0x75, off);
>> +                       cnt += 3;
>> +               } else {
>> +                       EMIT2_off32(0xFF, 0xB5, off);
>> +                       cnt += 6;
>> +               }
>> +       }
>> +       *pprog = prog;
>> +       return cnt;
>> +}
>> +
>>   static void pop_r12(u8 **pprog)
>>   {
>>          u8 *prog = *pprog;
>> @@ -1664,19 +1685,35 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
>>          int i, excnt = 0;
>>          int ilen, proglen = 0;
>>          u8 *prog = temp;
>> -       u32 stack_depth;
>> +       u16 stack_arg_depth, incoming_stack_arg_depth;
>> +       u32 prog_stack_depth, stack_depth;
>> +       bool has_stack_args;
>>          int err;
>>
>>          stack_depth = bpf_prog->aux->stack_depth;
>> +       stack_arg_depth = bpf_prog->aux->stack_arg_depth;
>> +       incoming_stack_arg_depth = bpf_prog->aux->incoming_stack_arg_depth;
>>          priv_stack_ptr = bpf_prog->aux->priv_stack_ptr;
>>          if (priv_stack_ptr) {
>>                  priv_frame_ptr = priv_stack_ptr + PRIV_STACK_GUARD_SZ + round_up(stack_depth, 8);
>>                  stack_depth = 0;
>>          }
>>
>> +       /*
>> +        * Save program stack depth before adding stack arg space.
>> +        * Each function allocates its own stack arg space
>> +        * (incoming + outgoing) below its BPF stack.
>> +        * Stack args are accessed via RBP-based addressing.
>> +        */
>> +       prog_stack_depth = round_up(stack_depth, 8);
>> +       if (stack_arg_depth)
>> +               stack_depth += stack_arg_depth;
>> +       has_stack_args = stack_arg_depth > 0;
>> +
>>          arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
>>          user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);
>>
>> +
>>          detect_reg_usage(insn, insn_cnt, callee_regs_used);
>>
>>          emit_prologue(&prog, image, stack_depth,
>> @@ -1704,6 +1741,38 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
>>                  emit_mov_imm64(&prog, X86_REG_R12,
>>                                 arena_vm_start >> 32, (u32) arena_vm_start);
>>
>> +       if (incoming_stack_arg_depth && bpf_is_subprog(bpf_prog)) {
>> +               int n = incoming_stack_arg_depth / 8;
>> +
>> +               /*
>> +                * Caller pushed stack args before CALL, so after prologue
>> +                * (CALL saves ret addr, then PUSH saves old RBP) they sit
>> +                * above RBP:
>> +                *
>> +                *   [rbp + 16 + (n - 1) * 8]  stack_arg n
>> +                *   ...
>> +                *   [rbp + 24]                stack_arg 2
>> +                *   [rbp + 16]                stack_arg 1
>> +                *   [rbp +  8]                return address
>> +                *   [rbp +  0]                saved rbp
>> +                *
>> +                * Copy each into callee's own region below the program stack:
>> +                *   [rbp - prog_stack_depth - i * 8]
>> +                */
>> +               for (i = 0; i < n; i++) {
>> +                       s32 src = 16 + i * 8;
>> +                       s32 dst = -prog_stack_depth - (i + 1) * 8;
>> +
>> +                       /* mov rax, [rbp + src] */
>> +                       EMIT4(0x48, 0x8B, 0x45, src);
>> +                       /* mov [rbp + dst], rax */
>> +                       if (is_imm8(dst))
>> +                               EMIT4(0x48, 0x89, 0x45, dst);
>> +                       else
>> +                               EMIT3_off32(0x48, 0x89, 0x85, dst);
>> +               }
> This is really suboptimal.
> bpf calling convention for 6+ args needs to match x86.
> With an exception of 6th arg.
> All bpf insn need to remain as-is when calling another bpf prog
> or kfunc. There should be no additional moves.
> JIT should only special case 6th arg and convert bpf's STX [r12-N], src_reg
> into 'mov r9, src_reg', since r9 is used to pass 6th argument on x86.
> The rest of STX needs to be jitted pretty much as-is
> with a twist that bpf's r12 becomes %rbp on x86.
> And similar things in the callee.
> Instead of LDX [r12+N] it will be a 'mov dst_reg, r9' where r9 is x86's r9.
> Other LDX from [r12+M] will remain as-is, but r12->%rbp.
> On arm64 more of the STX/LDX insns become native 'mov'-s
> because arm64 has more registers for arguments.

Good point. I will try to simplify the JIT by following the x86_64
calling convention.
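
For illustration only, and not taken from the patch: a minimal C sketch of
where arguments 6 and up end up if the BPF convention simply mirrors the
x86_64 System V one, as suggested in the review above. The names arg_loc,
callee_incoming_arg_loc() and caller_outgoing_rsp_off() are hypothetical
and do not exist in bpf_jit_comp.c. The sketch assumes the callee uses the
standard 'push rbp; mov rbp, rsp' frame, so the return address sits at
[rbp + 8] and the first stack-passed argument (arg 7) at [rbp + 16].

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of where a BPF argument (6 or higher) lives. */
enum arg_loc_kind {
	ARG_IN_R9,	/* arg 6: passed in x86 register r9 */
	ARG_ON_STACK,	/* arg 7+: passed on the native stack */
};

struct arg_loc {
	enum arg_loc_kind kind;
	int32_t off;	/* byte offset, meaningful only for ARG_ON_STACK */
};

/*
 * Callee view, after 'push rbp; mov rbp, rsp':
 *   [rbp +  0] saved rbp
 *   [rbp +  8] return address
 *   [rbp + 16] arg 7
 *   [rbp + 24] arg 8, and so on.
 * arg_no is 1-based and must be >= 6 (args 1-5 stay in registers).
 */
static struct arg_loc callee_incoming_arg_loc(int arg_no)
{
	struct arg_loc loc = { ARG_IN_R9, 0 };

	assert(arg_no >= 6);
	if (arg_no == 6)
		return loc;

	loc.kind = ARG_ON_STACK;
	loc.off = 16 + (arg_no - 7) * 8;	/* RBP-relative */
	return loc;
}

/*
 * Caller view at the moment of CALL: arg 7 is at [rsp], arg 8 at
 * [rsp + 8], and so on. arg_no must be >= 7.
 */
static int32_t caller_outgoing_rsp_off(int arg_no)
{
	assert(arg_no >= 7);
	return (arg_no - 7) * 8;
}
```

Under these assumptions, a callee load of arg 6 would become a plain
'mov dst_reg, r9', and a load of arg 8 would read [rbp + 24] directly,
with no copy in the prologue. How the BPF-side r12 offsets map onto these
slots is left open here, since that depends on the next revision.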

>
> pw-bot: cr

