From: Yonghong Song <yonghong.song@linux.dev>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"Jose E . Marchesi" <jose.marchesi@oracle.com>,
	kernel-team@fb.com, Martin KaFai Lau <martin.lau@kernel.org>,
	Puranjay Mohan <puranjay@kernel.org>
Subject: [PATCH bpf-next v2 15/23] bpf,x86: Implement JIT support for stack arguments
Date: Thu,  7 May 2026 14:30:59 -0700	[thread overview]
Message-ID: <20260507213124.1132088-1-yonghong.song@linux.dev> (raw)
In-Reply-To: <20260507212942.1122000-1-yonghong.song@linux.dev>

Add x86_64 JIT support for BPF functions and kfuncs with more than
5 arguments. The extra arguments are passed through a stack area
addressed by register r11 (BPF_REG_PARAMS) in BPF bytecode,
which the JIT translates to native code.

The JIT follows the x86-64 calling convention for both BPF-to-BPF
and kfunc calls:
  - Arg 6 is passed in the R9 register
  - Args 7+ are passed on the stack

Incoming arg 6 (BPF r11+8) is translated to a MOV from R9 rather
than a memory load. Incoming args 7+ (BPF r11+16, r11+24, ...) map
directly to [rbp + 16], [rbp + 24], ..., matching the x86-64 stack
layout after CALL + PUSH RBP, so no offset adjustment is needed.

The verifier rejects tail_call_reachable programs that use stack
args, and the JIT disables priv_stack for them, so R9 is always
available. When BPF bytecode writes to the arg-6 stack slot
(offset -8), the JIT emits a MOV into R9 instead of a memory store.
Outgoing args 7+ are placed starting at [rsp] in a pre-allocated
area below the callee-saved registers, using:
  native_off = outgoing_arg_base - outgoing_rsp - bpf_off - 16

The native x86_64 stack layout with stack arguments:

  high address
  +-------------------------+
  | incoming stack arg N    |  [rbp + 16 + (N-7)*8]  (from caller)
  | ...                     |
  | incoming stack arg 7    |  [rbp + 16]
  +-------------------------+
  | return address          |  [rbp + 8]
  | saved rbp               |  [rbp]
  +-------------------------+
  | BPF program stack       |  (round_up(stack_depth, 8) bytes)
  +-------------------------+
  | callee-saved regs       |  (r12, rbx, r13, r14, r15 as needed)
  +-------------------------+
  | outgoing arg M          |  [rsp + (M-7)*8]
  | ...                     |
  | outgoing arg 7          |  [rsp]
  +-------------------------+  rsp
  low address

Acked-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 arch/x86/net/bpf_jit_comp.c | 155 ++++++++++++++++++++++++++++++++++--
 1 file changed, 149 insertions(+), 6 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index ea9e707e8abf..67c2f4a3b9cc 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -390,6 +390,34 @@ static void pop_callee_regs(u8 **pprog, bool *callee_regs_used)
 	*pprog = prog;
 }
 
+/* add rsp, depth */
+static void emit_add_rsp(u8 **pprog, u16 depth)
+{
+	u8 *prog = *pprog;
+
+	if (!depth)
+		return;
+	if (is_imm8(depth))
+		EMIT4(0x48, 0x83, 0xC4, depth); /* add rsp, imm8 */
+	else
+		EMIT3_off32(0x48, 0x81, 0xC4, depth); /* add rsp, imm32 */
+	*pprog = prog;
+}
+
+/* sub rsp, depth */
+static void emit_sub_rsp(u8 **pprog, u16 depth)
+{
+	u8 *prog = *pprog;
+
+	if (!depth)
+		return;
+	if (is_imm8(depth))
+		EMIT4(0x48, 0x83, 0xEC, depth); /* sub rsp, imm8 */
+	else
+		EMIT3_off32(0x48, 0x81, 0xEC, depth); /* sub rsp, imm32 */
+	*pprog = prog;
+}
+
 static void emit_nops(u8 **pprog, int len)
 {
 	u8 *prog = *pprog;
@@ -1664,16 +1692,45 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
 	int i, excnt = 0;
 	int ilen, proglen = 0;
 	u8 *ip, *prog = temp;
+	u16 stack_arg_depth, incoming_stack_arg_depth, outgoing_stack_arg_depth; /* in bytes */
+	u16 outgoing_rsp;
 	u32 stack_depth;
+	int callee_saved_size;
+	s32 outgoing_arg_base;
 	int err;
 
 	stack_depth = bpf_prog->aux->stack_depth;
+	stack_arg_depth = bpf_prog->aux->stack_arg_cnt * 8;
+	incoming_stack_arg_depth = bpf_prog->aux->incoming_stack_arg_cnt * 8;
+	outgoing_stack_arg_depth = stack_arg_depth - incoming_stack_arg_depth;
 	priv_stack_ptr = bpf_prog->aux->priv_stack_ptr;
 	if (priv_stack_ptr) {
 		priv_frame_ptr = priv_stack_ptr + PRIV_STACK_GUARD_SZ + round_up(stack_depth, 8);
 		stack_depth = 0;
 	}
 
+	/*
+	 * Follow x86-64 calling convention for both BPF-to-BPF and
+	 * kfunc calls:
+	 *   - Arg 6 is passed in R9 register
+	 *   - Args 7+ are passed on the stack at [rsp]
+	 *
+	 * Incoming arg 6 is read from R9 (BPF r11+8 -> MOV from R9).
+	 * Incoming args 7+ are read from [rbp + 16], [rbp + 24], ...
+	 * (BPF r11+16, r11+24, ... map directly with no offset change).
+	 *
+	 * tail_call_reachable is rejected by the verifier and priv_stack
+	 * is disabled by the JIT when stack args exist, so R9 is always
+	 * available.
+	 *
+	 * Stack layout (high to low):
+	 *   [rbp + 16 + ...]    incoming stack args 7+ (from caller)
+	 *   [rbp + 8]           return address
+	 *   [rbp]               saved rbp
+	 *   [rbp - prog_stack]  program stack
+	 *   [below]             callee-saved regs
+	 *   [below]             outgoing args 7+ (= rsp)
+	 */
 	arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
 	user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);
 
@@ -1700,6 +1757,42 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
 			push_r12(&prog);
 		push_callee_regs(&prog, callee_regs_used);
 	}
+
+	/* Compute callee-saved register area size. */
+	callee_saved_size = 0;
+	if (bpf_prog->aux->exception_boundary || arena_vm_start)
+		callee_saved_size += 8; /* r12 */
+	if (bpf_prog->aux->exception_boundary) {
+		callee_saved_size += 4 * 8; /* rbx, r13, r14, r15 */
+	} else {
+		int j;
+
+		for (j = 0; j < 4; j++)
+			if (callee_regs_used[j])
+				callee_saved_size += 8;
+	}
+	/*
+	 * Base offset from rbp for translating BPF outgoing args 7+
+	 * to native offsets. BPF uses negative offsets from r11
+	 * (r11-8 for arg6, r11-16 for arg7, ...) while x86 uses
+	 * positive offsets from rsp ([rsp+0] for arg7, [rsp+8] for
+	 * arg8, ...). Arg 6 goes to R9 directly.
+	 *
+	 * The translation reverses direction:
+	 *   native_off = outgoing_arg_base - outgoing_rsp - bpf_off - 16
+	 *
+	 * Note that tail_call_reachable is guaranteed to be false when
+	 * stack args exist, so tcc pushes need not be accounted for.
+	 */
+	outgoing_arg_base = -(round_up(stack_depth, 8) + callee_saved_size);
+
+	/*
+	 * Allocate outgoing stack arg area for args 7+ only.
+	 * Arg 6 goes into r9 register, not on stack.
+	 */
+	outgoing_rsp = outgoing_stack_arg_depth > 8 ? outgoing_stack_arg_depth - 8 : 0;
+	emit_sub_rsp(&prog, outgoing_rsp);
+
 	if (arena_vm_start)
 		emit_mov_imm64(&prog, X86_REG_R12,
 			       arena_vm_start >> 32, (u32) arena_vm_start);
@@ -1721,7 +1814,7 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
 		u8 b2 = 0, b3 = 0;
 		u8 *start_of_ldx;
 		s64 jmp_offset;
-		s16 insn_off;
+		s32 insn_off;
 		u8 jmp_cond;
 		u8 *func;
 		int nops;
@@ -2134,12 +2227,27 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
 				EMIT1(0xC7);
 			goto st;
 		case BPF_ST | BPF_MEM | BPF_DW:
+			if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
+				/* Arg 6: store immediate in r9 register */
+				emit_mov_imm64(&prog, X86_REG_R9, imm32 >> 31, (u32)imm32);
+				break;
+			}
 			EMIT2(add_1mod(0x48, dst_reg), 0xC7);
 
-st:			if (is_imm8(insn->off))
-				EMIT2(add_1reg(0x40, dst_reg), insn->off);
+st:			insn_off = insn->off;
+			if (dst_reg == BPF_REG_PARAMS) {
+				/*
+				 * Args 7+: reverse BPF negative offsets to
+				 * x86 positive rsp offsets.
+				 * BPF off=-16 -> [rsp+0], off=-24 -> [rsp+8], ...
+				 */
+				insn_off = outgoing_arg_base - outgoing_rsp - insn_off - 16;
+				dst_reg = BPF_REG_FP;
+			}
+			if (is_imm8(insn_off))
+				EMIT2(add_1reg(0x40, dst_reg), insn_off);
 			else
-				EMIT1_off32(add_1reg(0x80, dst_reg), insn->off);
+				EMIT1_off32(add_1reg(0x80, dst_reg), insn_off);
 
 			EMIT(imm32, bpf_size_to_x86_bytes(BPF_SIZE(insn->code)));
 			break;
@@ -2149,7 +2257,17 @@ st:			if (is_imm8(insn->off))
 		case BPF_STX | BPF_MEM | BPF_H:
 		case BPF_STX | BPF_MEM | BPF_W:
 		case BPF_STX | BPF_MEM | BPF_DW:
-			emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
+			if (dst_reg == BPF_REG_PARAMS && insn->off == -8) {
+				/* Arg 6: store register value in r9 */
+				EMIT_mov(X86_REG_R9, src_reg);
+				break;
+			}
+			insn_off = insn->off;
+			if (dst_reg == BPF_REG_PARAMS) {
+				insn_off = outgoing_arg_base - outgoing_rsp - insn_off - 16;
+				dst_reg = BPF_REG_FP;
+			}
+			emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn_off);
 			break;
 
 		case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
@@ -2248,6 +2366,19 @@ st:			if (is_imm8(insn->off))
 		case BPF_LDX | BPF_PROBE_MEMSX | BPF_H:
 		case BPF_LDX | BPF_PROBE_MEMSX | BPF_W:
 			insn_off = insn->off;
+			if (src_reg == BPF_REG_PARAMS) {
+				if (insn_off == 8) {
+					/* Incoming arg 6: read from r9 */
+					EMIT_mov(dst_reg, X86_REG_R9);
+					break;
+				}
+				src_reg = BPF_REG_FP;
+				/*
+				 * Incoming args 7+: native_off == bpf_off
+				 * (r11+16 -> [rbp+16], r11+24 -> [rbp+24], ...)
+				 * No offset adjustment needed.
+				 */
+			}
 
 			if (BPF_MODE(insn->code) == BPF_PROBE_MEM ||
 			    BPF_MODE(insn->code) == BPF_PROBE_MEMSX) {
@@ -2736,6 +2867,8 @@ st:			if (is_imm8(insn->off))
 				if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog))
 					return -EINVAL;
 			}
+			/* Deallocate outgoing args 7+ area. */
+			emit_add_rsp(&prog, outgoing_rsp);
 			if (bpf_prog->aux->exception_boundary) {
 				pop_callee_regs(&prog, all_callee_regs_used);
 				pop_r12(&prog);
@@ -3743,7 +3876,12 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
 		prog->aux->jit_data = jit_data;
 	}
 	priv_stack_ptr = prog->aux->priv_stack_ptr;
-	if (!priv_stack_ptr && prog->aux->jits_use_priv_stack) {
+	/*
+	 * x86-64 uses R9 for both private stack frame pointer and arg 6,
+	 * so disable private stack when stack args are present.
+	 */
+	if (!priv_stack_ptr && prog->aux->jits_use_priv_stack &&
+	    prog->aux->stack_arg_cnt == 0) {
 		/* Allocate actual private stack size with verifier-calculated
 		 * stack size plus two memory guards to protect overflow and
 		 * underflow.
@@ -3910,6 +4048,11 @@ bool bpf_jit_supports_kfunc_call(void)
 	return true;
 }
 
+bool bpf_jit_supports_stack_args(void)
+{
+	return true;
+}
+
 void *bpf_arch_text_copy(void *dst, void *src, size_t len)
 {
 	if (text_poke_copy(dst, src, len) == NULL)
-- 
2.53.0-Meta

