netdev.vger.kernel.org archive mirror
* [PATCH bpf-next] bpf,x86: do RSB balance for trampoline
@ 2025-11-04 10:49 Menglong Dong
From: Menglong Dong @ 2025-11-04 10:49 UTC
  To: ast
  Cc: daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, davem, dsahern, tglx,
	mingo, bp, dave.hansen, x86, hpa, jiang.biao, menglong.dong, bpf,
	netdev, linux-kernel

In the origin call case, we skip our return address (the "rip" on the
stack) right before we return. This breaks the RSB, because there are
two "call"s but only one "ret".

Balance the RSB with a pseudo "ret". Instead of skipping the "rip", we
overwrite it with the address of a "ret" insn that we generate.
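
To make the imbalance concrete, here is a toy model of the RSB as a small
stack of predicted return addresses. It is only a sketch: the struct, the
helper names and the addresses are made up for illustration, and real
hardware is more complicated. In this model the SKIP_FRAME path leaves a
stale entry behind, so the later "ret"s in the callers mispredict too,
while the pseudo "ret" costs one mispredict and leaves the callers in sync.

```c
#include <assert.h>
#include <stddef.h>

#define RSB_DEPTH 16

/* Hypothetical return addresses, chosen only for this illustration. */
#define RET_GRANDPARENT	0x100UL	/* where the parent returns to */
#define RET_PARENT	0x200UL	/* after the parent's call to the traced func */
#define RET_FUNC	0x300UL	/* after the traced func's call to the trampoline */
#define RET_TRAMP	0x400UL	/* after the trampoline's call to the origin body */
#define SECOND_RET	0x500UL	/* address of the extra "ret" in the trampoline */

/* Toy RSB: a stack of predicted return targets plus a mispredict counter. */
struct rsb {
	unsigned long entry[RSB_DEPTH];
	size_t top;
	unsigned int mispredicts;
};

/* A "call" pushes the address its matching "ret" is expected to return to. */
static void rsb_call(struct rsb *r, unsigned long ret_addr)
{
	assert(r->top < RSB_DEPTH);
	r->entry[r->top++] = ret_addr;
}

/* A "ret" pops one prediction and compares it with the real target. */
static void rsb_ret(struct rsb *r, unsigned long real_target)
{
	assert(r->top > 0);
	if (r->entry[--r->top] != real_target)
		r->mispredicts++;
}

/* Common prefix: calls down to the origin body, and its return. */
static void enter_and_run_origin(struct rsb *r)
{
	rsb_call(r, RET_GRANDPARENT);
	rsb_call(r, RET_PARENT);
	rsb_call(r, RET_FUNC);
	rsb_call(r, RET_TRAMP);
	rsb_ret(r, RET_TRAMP);		/* origin body returns to the trampoline */
}

/* "add rsp, 8; ret": one "ret" serves two "call"s. */
static unsigned int run_skip_frame(void)
{
	struct rsb r = { .top = 0, .mispredicts = 0 };

	enter_and_run_origin(&r);
	rsb_ret(&r, RET_PARENT);	/* prediction was RET_FUNC */
	rsb_ret(&r, RET_GRANDPARENT);	/* prediction was RET_PARENT */
	return r.mispredicts;
}

/* Pseudo "ret" to SECOND_RET, then the real "ret" to the parent. */
static unsigned int run_pseudo_ret(void)
{
	struct rsb r = { .top = 0, .mispredicts = 0 };

	enter_and_run_origin(&r);
	rsb_ret(&r, SECOND_RET);	/* prediction was RET_FUNC */
	rsb_ret(&r, RET_PARENT);	/* back in sync */
	rsb_ret(&r, RET_GRANDPARENT);	/* still in sync */
	return r.mispredicts;
}
```

With only two callers above the traced function, run_skip_frame() counts
two mispredicts and run_pseudo_ret() counts one. In the toy model the gap
grows with every additional caller that returns afterwards.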

The performance of "fexit" increases from roughly 76M/s to roughly 84M/s.
Before this optimization, the bench results for fexit are:

fexit          :   76.494 ± 0.216M/s
fexit          :   76.319 ± 0.097M/s
fexit          :   70.680 ± 0.060M/s
fexit          :   75.509 ± 0.039M/s
fexit          :   76.392 ± 0.049M/s

After this optimization:

fexit          :   86.023 ± 0.518M/s
fexit          :   83.388 ± 0.021M/s
fexit          :   85.146 ± 0.058M/s
fexit          :   85.646 ± 0.136M/s
fexit          :   84.040 ± 0.045M/s
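
For reference, the two sets of runs above can be averaged with a few
lines of C. This is just a sketch of the arithmetic: the array and
function names are made up here, and the values are copied from the
tables above. The means come out at about 75.1M/s before and 84.8M/s
after, a ratio of about 1.13.

```c
#include <assert.h>
#include <stddef.h>

/* fexit throughput in M/s, copied from the bench output above. */
static const double fexit_before[] = { 76.494, 76.319, 70.680, 75.509, 76.392 };
static const double fexit_after[]  = { 86.023, 83.388, 85.146, 85.646, 84.040 };

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Ratio of the mean throughput after the patch to the mean before it. */
static double fexit_speedup(void)
{
	return mean(fexit_after, 5) / mean(fexit_before, 5);
}
```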

This makes things a little more complex, and I'm not sure the benefit is
worth it :/

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
 arch/x86/net/bpf_jit_comp.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index d4c93d9e73e4..a9c2142a84d0 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -3185,6 +3185,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
 	void *orig_call = func_addr;
 	u8 **branches = NULL;
+	u8 *rsb_pos;
 	u8 *prog;
 	bool save_ret;
 
@@ -3431,17 +3432,42 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 		LOAD_TRAMP_TAIL_CALL_CNT_PTR(stack_size);
 	}
 
+	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
+		u64 ret_addr = (u64)(image + (prog - (u8 *)rw_image));
+
+		rsb_pos = prog;
+		/*
+		 * Reserve room for loading the return address into rax:
+		 *   movabs rax, imm64
+		 *
+		 * This is used to balance the RSB. In the SKIP_FRAME
+		 * case there are two "call"s but only one "ret", which
+		 * can break the RSB.
+		 *
+		 * Therefore, instead of skipping the "rip", turn it into
+		 * a pseudo return: overwrite the "rip" on the stack with
+		 * the address of the second "ret" that we emit below.
+		 */
+		emit_mov_imm64(&prog, BPF_REG_0, ret_addr >> 32, (u32)ret_addr);
+		/* mov [rbp + 8], rax */
+		EMIT4(0x48, 0x89, 0x45, 0x08);
+	}
+
 	/* restore return value of orig_call or fentry prog back into RAX */
 	if (save_ret)
 		emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8);
 
 	emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, -rbx_off);
 	EMIT1(0xC9); /* leave */
+	emit_return(&prog, image + (prog - (u8 *)rw_image));
 	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
-		/* skip our return address and return to parent */
-		EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
+		u64 ret_addr = (u64)(image + (prog - (u8 *)rw_image));
+
+		/* patch the reserved movabs with the second return address */
+		emit_mov_imm64(&rsb_pos, BPF_REG_0, ret_addr >> 32, (u32)ret_addr);
+		/* this is the second (real) return */
+		emit_return(&prog, image + (prog - (u8 *)rw_image));
 	}
-	emit_return(&prog, image + (prog - (u8 *)rw_image));
 	/* Make sure the trampoline generation logic doesn't overflow */
 	if (WARN_ON_ONCE(prog > (u8 *)rw_image_end - BPF_INSN_SAFETY)) {
 		ret = -EFAULT;
-- 
2.51.2



end of thread, newest message: 2025-11-11  2:41 UTC

Thread overview: 16+ messages
2025-11-04 10:49 [PATCH bpf-next] bpf,x86: do RSB balance for trampoline Menglong Dong
2025-11-04 18:56 ` Alexei Starovoitov
2025-11-05  1:30   ` Menglong Dong
2025-11-05  2:12     ` Alexei Starovoitov
2025-11-05  7:13       ` Menglong Dong
2025-11-05  7:46         ` Menglong Dong
2025-11-05 23:31           ` Alexei Starovoitov
2025-11-06  1:40             ` Menglong Dong
2025-11-06  2:49               ` Menglong Dong
2025-11-06  2:56                 ` Alexei Starovoitov
2025-11-06  3:00                   ` Menglong Dong
2025-11-10 11:43                   ` Menglong Dong
2025-11-10 16:32                     ` Alexei Starovoitov
2025-11-11  1:28                       ` Menglong Dong
2025-11-11  2:41                         ` Alexei Starovoitov
2025-11-06 12:03           ` Peter Zijlstra
