linux-trace-kernel.vger.kernel.org archive mirror
* [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode
@ 2025-11-17  3:49 Menglong Dong
  2025-11-17  3:49 ` [PATCH bpf-next v2 1/6] ftrace: introduce FTRACE_OPS_FL_JMP Menglong Dong
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Menglong Dong @ 2025-11-17  3:49 UTC (permalink / raw)
  To: ast, rostedt
  Cc: daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

For now, the bpf trampoline is invoked with the "call" instruction. However,
this breaks the RSB (Return Stack Buffer) and introduces extra overhead on x86_64.

For example, when we hook the function "foo" with fexit, the call and return
flow looks like this:
  call foo -> call trampoline -> call foo-body ->
  return foo-body -> return foo

As we can see above, there are 3 calls but only 2 returns, which breaks the
RSB balance. We could fake a "return" here, but it's not the best choice,
as it would still cause one RSB miss:
  call foo -> call trampoline -> call foo-body ->
  return foo-body -> return dummy -> return foo

The "return dummy" doesn't pair the "call trampoline", which can also
cause the RSB miss.

Therefore, we introduce the "jmp" mode for the bpf trampoline, as suggested
by Alexei in [1]. The flow then becomes:
  call foo -> jmp trampoline -> call foo-body ->
  return foo-body -> return foo

As we can see above, the RSB is fully balanced. With this change, the
performance of fexit increases from 76M/s to 130M/s.

In this series, we introduce FTRACE_OPS_FL_JMP for ftrace to make it use
the "jmp" instruction instead of "call".

We also adjust bpf_arch_text_poke() so that the old and new poke_type can
be specified separately.
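
As a rough illustration (not the exact call sites, which live in patch 6),
switching a fentry site from "call old_image" to "jmp new_image" would then
look roughly like this, where ip, old_image and new_image are placeholders:

  err = bpf_arch_text_poke(ip, BPF_MOD_CALL, BPF_MOD_JUMP,
                           old_image, new_image);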

Link: https://lore.kernel.org/bpf/20251114092450.172024-1-dongml2@chinatelecom.cn/
Changes since v1:
* change the bool parameter added to save_args() to "u32 flags"
* rename bpf_trampoline_need_jmp() to bpf_trampoline_use_jmp()
* add a new function parameter to bpf_arch_text_poke() instead of
  introducing bpf_arch_text_poke_type()
* rename bpf_text_poke to bpf_trampoline_update_fentry
* remove BPF_TRAMP_F_JMPED and check the current mode against the original
  flags instead.

Link: https://lore.kernel.org/bpf/CAADnVQLX54sVi1oaHrkSiLqjJaJdm3TQjoVrgU-LZimK6iDcSA@mail.gmail.com/ [1]
Menglong Dong (6):
  ftrace: introduce FTRACE_OPS_FL_JMP
  x86/ftrace: implement DYNAMIC_FTRACE_WITH_JMP
  bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME
  bpf,x86: adjust the "jmp" mode for bpf trampoline
  bpf: specify the old and new poke_type for bpf_arch_text_poke
  bpf: implement "jmp" mode for trampoline

 arch/arm64/net/bpf_jit_comp.c   | 14 +++---
 arch/loongarch/net/bpf_jit.c    |  9 ++--
 arch/powerpc/net/bpf_jit_comp.c |  8 ++--
 arch/riscv/net/bpf_jit_comp64.c | 11 +++--
 arch/s390/net/bpf_jit_comp.c    |  7 +--
 arch/x86/Kconfig                |  1 +
 arch/x86/kernel/ftrace.c        |  7 ++-
 arch/x86/kernel/ftrace_64.S     | 12 ++++-
 arch/x86/net/bpf_jit_comp.c     | 55 +++++++++++++----------
 include/linux/bpf.h             | 18 +++++++-
 include/linux/ftrace.h          | 33 ++++++++++++++
 kernel/bpf/core.c               |  5 ++-
 kernel/bpf/trampoline.c         | 78 +++++++++++++++++++++++++--------
 kernel/trace/Kconfig            | 12 +++++
 kernel/trace/ftrace.c           |  9 +++-
 15 files changed, 212 insertions(+), 67 deletions(-)

-- 
2.51.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 1/6] ftrace: introduce FTRACE_OPS_FL_JMP
  2025-11-17  3:49 [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Menglong Dong
@ 2025-11-17  3:49 ` Menglong Dong
  2025-11-18  5:19   ` Masami Hiramatsu
  2025-11-17  3:49 ` [PATCH bpf-next v2 2/6] x86/ftrace: implement DYNAMIC_FTRACE_WITH_JMP Menglong Dong
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Menglong Dong @ 2025-11-17  3:49 UTC (permalink / raw)
  To: ast, rostedt
  Cc: daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

For now, the "nop" will be replaced with a "call" instruction when a
function is hooked by the ftrace. However, sometimes the "call" can break
the RSB and introduce extra overhead. Therefore, introduce the flag
FTRACE_OPS_FL_JMP, which indicate that the ftrace_ops should be called
with a "jmp" instead of "call". For now, it is only used by the direct
call case.

When a direct ftrace_ops is marked with FTRACE_OPS_FL_JMP, the lowest bit
of ops->direct_call is set to 1. This lets ftrace_call_replace() tell
whether the callback should be reached with a "jmp".

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
 include/linux/ftrace.h | 33 +++++++++++++++++++++++++++++++++
 kernel/trace/Kconfig   | 12 ++++++++++++
 kernel/trace/ftrace.c  |  9 ++++++++-
 3 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 07f8c309e432..015dd1049bea 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -359,6 +359,7 @@ enum {
 	FTRACE_OPS_FL_DIRECT			= BIT(17),
 	FTRACE_OPS_FL_SUBOP			= BIT(18),
 	FTRACE_OPS_FL_GRAPH			= BIT(19),
+	FTRACE_OPS_FL_JMP			= BIT(20),
 };
 
 #ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
@@ -577,6 +578,38 @@ static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
 						 unsigned long addr) { }
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_JMP
+static inline bool ftrace_is_jmp(unsigned long addr)
+{
+	return addr & 1;
+}
+
+static inline unsigned long ftrace_jmp_set(unsigned long addr)
+{
+	return addr | 1UL;
+}
+
+static inline unsigned long ftrace_jmp_get(unsigned long addr)
+{
+	return addr & ~1UL;
+}
+#else
+static inline bool ftrace_is_jmp(unsigned long addr)
+{
+	return false;
+}
+
+static inline unsigned long ftrace_jmp_set(unsigned long addr)
+{
+	return addr;
+}
+
+static inline unsigned long ftrace_jmp_get(unsigned long addr)
+{
+	return addr;
+}
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_JMP */
+
 #ifdef CONFIG_STACK_TRACER
 
 int stack_trace_sysctl(const struct ctl_table *table, int write, void *buffer,
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index d2c79da81e4f..4661b9e606e0 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -80,6 +80,12 @@ config HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
 	  If the architecture generates __patchable_function_entries sections
 	  but does not want them included in the ftrace locations.
 
+config HAVE_DYNAMIC_FTRACE_WITH_JMP
+	bool
+	help
+	  If the architecture supports to replace the __fentry__ with a
+	  "jmp" instruction.
+
 config HAVE_SYSCALL_TRACEPOINTS
 	bool
 	help
@@ -330,6 +336,12 @@ config DYNAMIC_FTRACE_WITH_ARGS
 	depends on DYNAMIC_FTRACE
 	depends on HAVE_DYNAMIC_FTRACE_WITH_ARGS
 
+config DYNAMIC_FTRACE_WITH_JMP
+	def_bool y
+	depends on DYNAMIC_FTRACE
+	depends on DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	depends on HAVE_DYNAMIC_FTRACE_WITH_JMP
+
 config FPROBE
 	bool "Kernel Function Probe (fprobe)"
 	depends on HAVE_FUNCTION_GRAPH_FREGS && HAVE_FTRACE_GRAPH_FUNC
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 59cfacb8a5bb..a6c060a4f50b 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5951,7 +5951,8 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long
 	for (i = 0; i < size; i++) {
 		hlist_for_each_entry(entry, &hash->buckets[i], hlist) {
 			del = __ftrace_lookup_ip(direct_functions, entry->ip);
-			if (del && del->direct == addr) {
+			if (del && ftrace_jmp_get(del->direct) ==
+				   ftrace_jmp_get(addr)) {
 				remove_hash_entry(direct_functions, del);
 				kfree(del);
 			}
@@ -6018,6 +6019,9 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 
 	mutex_lock(&direct_mutex);
 
+	if (ops->flags & FTRACE_OPS_FL_JMP)
+		addr = ftrace_jmp_set(addr);
+
 	/* Make sure requested entries are not already registered.. */
 	size = 1 << hash->size_bits;
 	for (i = 0; i < size; i++) {
@@ -6138,6 +6142,9 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 
 	lockdep_assert_held_once(&direct_mutex);
 
+	if (ops->flags & FTRACE_OPS_FL_JMP)
+		addr = ftrace_jmp_set(addr);
+
 	/* Enable the tmp_ops to have the same functions as the direct ops */
 	ftrace_ops_init(&tmp_ops);
 	tmp_ops.func_hash = ops->func_hash;
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 2/6] x86/ftrace: implement DYNAMIC_FTRACE_WITH_JMP
  2025-11-17  3:49 [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Menglong Dong
  2025-11-17  3:49 ` [PATCH bpf-next v2 1/6] ftrace: introduce FTRACE_OPS_FL_JMP Menglong Dong
@ 2025-11-17  3:49 ` Menglong Dong
  2025-11-17  3:49 ` [PATCH bpf-next v2 3/6] bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME Menglong Dong
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Menglong Dong @ 2025-11-17  3:49 UTC (permalink / raw)
  To: ast, rostedt
  Cc: daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

Implement DYNAMIC_FTRACE_WITH_JMP for x86_64. In ftrace_call_replace(),
use JMP32_INSN_OPCODE instead of CALL_INSN_OPCODE if the address should be
reached with a "jmp".

Meanwhile, adjust the direct call handling in ftrace_regs_caller. The RSB
stays balanced in "jmp" mode. Take the function "foo" for example:

 original_caller:
 call foo -> foo:
         call fentry -> fentry:
                 [do ftrace callbacks ]
                 move tramp_addr to stack
                 RET -> tramp_addr
                         tramp_addr:
                         [..]
                         call foo_body -> foo_body:
                                 [..]
                                 RET -> back to tramp_addr
                         [..]
                         RET -> back to original_caller

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
 arch/x86/Kconfig            |  1 +
 arch/x86/kernel/ftrace.c    |  7 ++++++-
 arch/x86/kernel/ftrace_64.S | 12 +++++++++++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index fa3b616af03a..462250a20311 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -230,6 +230,7 @@ config X86
 	select HAVE_DYNAMIC_FTRACE_WITH_ARGS	if X86_64
 	select HAVE_FTRACE_REGS_HAVING_PT_REGS	if X86_64
 	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	select HAVE_DYNAMIC_FTRACE_WITH_JMP	if X86_64
 	select HAVE_SAMPLE_FTRACE_DIRECT	if X86_64
 	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI	if X86_64
 	select HAVE_EBPF_JIT
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 4450acec9390..0543b57f54ee 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -74,7 +74,12 @@ static const char *ftrace_call_replace(unsigned long ip, unsigned long addr)
 	 * No need to translate into a callthunk. The trampoline does
 	 * the depth accounting itself.
 	 */
-	return text_gen_insn(CALL_INSN_OPCODE, (void *)ip, (void *)addr);
+	if (ftrace_is_jmp(addr)) {
+		addr = ftrace_jmp_get(addr);
+		return text_gen_insn(JMP32_INSN_OPCODE, (void *)ip, (void *)addr);
+	} else {
+		return text_gen_insn(CALL_INSN_OPCODE, (void *)ip, (void *)addr);
+	}
 }
 
 static int ftrace_verify_code(unsigned long ip, const char *old_code)
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 823dbdd0eb41..a132608265f6 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -285,8 +285,18 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL)
 	ANNOTATE_NOENDBR
 	RET
 
+1:
+	testb	$1, %al
+	jz	2f
+	andq $0xfffffffffffffffe, %rax
+	movq %rax, MCOUNT_REG_SIZE+8(%rsp)
+	restore_mcount_regs
+	/* Restore flags */
+	popfq
+	RET
+
 	/* Swap the flags with orig_rax */
-1:	movq MCOUNT_REG_SIZE(%rsp), %rdi
+2:	movq MCOUNT_REG_SIZE(%rsp), %rdi
 	movq %rdi, MCOUNT_REG_SIZE-8(%rsp)
 	movq %rax, MCOUNT_REG_SIZE(%rsp)
 
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 3/6] bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME
  2025-11-17  3:49 [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Menglong Dong
  2025-11-17  3:49 ` [PATCH bpf-next v2 1/6] ftrace: introduce FTRACE_OPS_FL_JMP Menglong Dong
  2025-11-17  3:49 ` [PATCH bpf-next v2 2/6] x86/ftrace: implement DYNAMIC_FTRACE_WITH_JMP Menglong Dong
@ 2025-11-17  3:49 ` Menglong Dong
  2025-11-17  3:49 ` [PATCH bpf-next v2 4/6] bpf,x86: adjust the "jmp" mode for bpf trampoline Menglong Dong
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Menglong Dong @ 2025-11-17  3:49 UTC (permalink / raw)
  To: ast, rostedt
  Cc: daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

Some places calculate orig_call by checking whether BPF_TRAMP_F_SKIP_FRAME
is set. However, BPF_TRAMP_F_ORIG_STACK should be used for this purpose.
Just fix them.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
 arch/riscv/net/bpf_jit_comp64.c | 2 +-
 arch/x86/net/bpf_jit_comp.c     | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 45cbc7c6fe49..21c70ae3296b 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -1131,7 +1131,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 	store_args(nr_arg_slots, args_off, ctx);
 
 	/* skip to actual body of traced function */
-	if (flags & BPF_TRAMP_F_SKIP_FRAME)
+	if (flags & BPF_TRAMP_F_ORIG_STACK)
 		orig_call += RV_FENTRY_NINSNS * 4;
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 36a0d4db9f68..808d4343f6cf 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -3289,7 +3289,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 
 	arg_stack_off = stack_size;
 
-	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
+	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		/* skip patched call instruction and point orig_call to actual
 		 * body of the kernel function.
 		 */
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 4/6] bpf,x86: adjust the "jmp" mode for bpf trampoline
  2025-11-17  3:49 [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Menglong Dong
                   ` (2 preceding siblings ...)
  2025-11-17  3:49 ` [PATCH bpf-next v2 3/6] bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME Menglong Dong
@ 2025-11-17  3:49 ` Menglong Dong
  2025-11-17  3:49 ` [PATCH bpf-next v2 5/6] bpf: specify the old and new poke_type for bpf_arch_text_poke Menglong Dong
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Menglong Dong @ 2025-11-17  3:49 UTC (permalink / raw)
  To: ast, rostedt
  Cc: daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

In the origin call case, if BPF_TRAMP_F_SKIP_FRAME is not set, it means
that the trampoline is reached with a "jmp" rather than a "call".

Introduce the function bpf_trampoline_use_jmp() to check whether the
trampoline is in "jmp" mode.

Adjust the x86_64 JIT for the "jmp" mode. The main adjustment is for the
on-stack parameter passing case: without the return address ("rip") on the
stack, the stack alignment logic changes, and the location of the
parameters on the stack changes as well.
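
To make the alignment difference concrete, a rough picture (comments only;
the code is the same hunk as below): in "call" mode the return address
("rip") already occupies 8 bytes on the stack when the trampoline starts
executing, in "jmp" mode it does not, so the padding condition flips:

  if (bpf_trampoline_use_jmp(flags))
          /* no rip on the stack: pad unless stack_size is 16-byte aligned */
          stack_size += (stack_size % 16) ? 8 : 0;
  else
          /* rip adds 8 bytes: pad when stack_size itself is 16-byte aligned */
          stack_size += (stack_size % 16) ? 0 : 8;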

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
v2:
- rename bpf_trampoline_need_jmp() to bpf_trampoline_use_jmp()
---
 arch/x86/net/bpf_jit_comp.c | 16 +++++++++++-----
 include/linux/bpf.h         | 12 ++++++++++++
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 808d4343f6cf..632a83381c2d 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2847,9 +2847,10 @@ static int get_nr_used_regs(const struct btf_func_model *m)
 }
 
 static void save_args(const struct btf_func_model *m, u8 **prog,
-		      int stack_size, bool for_call_origin)
+		      int stack_size, bool for_call_origin, u32 flags)
 {
 	int arg_regs, first_off = 0, nr_regs = 0, nr_stack_slots = 0;
+	bool use_jmp = bpf_trampoline_use_jmp(flags);
 	int i, j;
 
 	/* Store function arguments to stack.
@@ -2890,7 +2891,7 @@ static void save_args(const struct btf_func_model *m, u8 **prog,
 			 */
 			for (j = 0; j < arg_regs; j++) {
 				emit_ldx(prog, BPF_DW, BPF_REG_0, BPF_REG_FP,
-					 nr_stack_slots * 8 + 0x18);
+					 nr_stack_slots * 8 + 16 + (!use_jmp) * 8);
 				emit_stx(prog, BPF_DW, BPF_REG_FP, BPF_REG_0,
 					 -stack_size);
 
@@ -3284,7 +3285,12 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 		 * should be 16-byte aligned. Following code depend on
 		 * that stack_size is already 8-byte aligned.
 		 */
-		stack_size += (stack_size % 16) ? 0 : 8;
+		if (bpf_trampoline_use_jmp(flags)) {
+			/* no rip in the "jmp" case */
+			stack_size += (stack_size % 16) ? 8 : 0;
+		} else {
+			stack_size += (stack_size % 16) ? 0 : 8;
+		}
 	}
 
 	arg_stack_off = stack_size;
@@ -3344,7 +3350,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -ip_off);
 	}
 
-	save_args(m, &prog, regs_off, false);
+	save_args(m, &prog, regs_off, false, flags);
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		/* arg1: mov rdi, im */
@@ -3377,7 +3383,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		restore_regs(m, &prog, regs_off);
-		save_args(m, &prog, arg_stack_off, true);
+		save_args(m, &prog, arg_stack_off, true, flags);
 
 		if (flags & BPF_TRAMP_F_TAIL_CALL_CTX) {
 			/* Before calling the original function, load the
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 09d5dc541d1c..4187b7578580 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1264,6 +1264,18 @@ typedef void (*bpf_trampoline_exit_t)(struct bpf_prog *prog, u64 start,
 bpf_trampoline_enter_t bpf_trampoline_enter(const struct bpf_prog *prog);
 bpf_trampoline_exit_t bpf_trampoline_exit(const struct bpf_prog *prog);
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_JMP
+static inline bool bpf_trampoline_use_jmp(u64 flags)
+{
+	return flags & BPF_TRAMP_F_CALL_ORIG && !(flags & BPF_TRAMP_F_SKIP_FRAME);
+}
+#else
+static inline bool bpf_trampoline_use_jmp(u64 flags)
+{
+	return false;
+}
+#endif
+
 struct bpf_ksym {
 	unsigned long		 start;
 	unsigned long		 end;
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 5/6] bpf: specify the old and new poke_type for bpf_arch_text_poke
  2025-11-17  3:49 [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Menglong Dong
                   ` (3 preceding siblings ...)
  2025-11-17  3:49 ` [PATCH bpf-next v2 4/6] bpf,x86: adjust the "jmp" mode for bpf trampoline Menglong Dong
@ 2025-11-17  3:49 ` Menglong Dong
  2025-11-17 20:55   ` kernel test robot
  2025-11-17  3:49 ` [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline Menglong Dong
  2025-11-18  6:31 ` [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Alexei Starovoitov
  6 siblings, 1 reply; 17+ messages in thread
From: Menglong Dong @ 2025-11-17  3:49 UTC (permalink / raw)
  To: ast, rostedt
  Cc: daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

In the original logic, bpf_arch_text_poke() assumes that the old and new
instructions have the same opcode. However, they can have different opcodes
if we want to replace a "call" insn with a "jmp" insn.

Therefore, add the new function parameters "old_t" and "new_t", which
indicate the old and new poke type. Meanwhile, adjust the implementation
of bpf_arch_text_poke() for all the archs.

"BPF_MOD_NOP" is added to make the code more readable. In
bpf_arch_text_poke(), we still check if the new and old address is NULL to
determine if nop insn should be used, which I think is more safe.
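
As a before/after sketch of a typical caller (attaching a trampoline at a
currently nop'ed fentry site, as register_fentry() does below):

  /* before this patch */
  err = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr);

  /* after this patch: the old insn is a nop, the new insn is a call */
  err = bpf_arch_text_poke(ip, BPF_MOD_NOP, BPF_MOD_CALL, NULL, new_addr);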

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
v2:
- add a new function parameter to bpf_arch_text_poke instead of
  introducing bpf_arch_text_poke_type()
---
 arch/arm64/net/bpf_jit_comp.c   | 14 ++++++-------
 arch/loongarch/net/bpf_jit.c    |  9 +++++---
 arch/powerpc/net/bpf_jit_comp.c |  8 ++++---
 arch/riscv/net/bpf_jit_comp64.c |  9 +++++---
 arch/s390/net/bpf_jit_comp.c    |  7 ++++---
 arch/x86/net/bpf_jit_comp.c     | 37 +++++++++++++++++++--------------
 include/linux/bpf.h             |  6 ++++--
 kernel/bpf/core.c               |  5 +++--
 kernel/bpf/trampoline.c         | 20 ++++++++++++------
 9 files changed, 70 insertions(+), 45 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 0c9a50a1e73e..c64df579b7e0 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -2923,8 +2923,9 @@ static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
  * The dummy_tramp is used to prevent another CPU from jumping to unknown
  * locations during the patching process, making the patching process easier.
  */
-int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
-		       void *old_addr, void *new_addr)
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
+		       enum bpf_text_poke_type new_t, void *old_addr,
+		       void *new_addr)
 {
 	int ret;
 	u32 old_insn;
@@ -2968,14 +2969,13 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
 		    !poking_bpf_entry))
 		return -EINVAL;
 
-	if (poke_type == BPF_MOD_CALL)
-		branch_type = AARCH64_INSN_BRANCH_LINK;
-	else
-		branch_type = AARCH64_INSN_BRANCH_NOLINK;
-
+	branch_type = old_t == BPF_MOD_CALL ? AARCH64_INSN_BRANCH_LINK :
+					      AARCH64_INSN_BRANCH_NOLINK;
 	if (gen_branch_or_nop(branch_type, ip, old_addr, plt, &old_insn) < 0)
 		return -EFAULT;
 
+	branch_type = new_t == BPF_MOD_CALL ? AARCH64_INSN_BRANCH_LINK :
+					      AARCH64_INSN_BRANCH_NOLINK;
 	if (gen_branch_or_nop(branch_type, ip, new_addr, plt, &new_insn) < 0)
 		return -EFAULT;
 
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index cbe53d0b7fb0..2e7dacbbef5c 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -1284,11 +1284,12 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
 	return ret ? ERR_PTR(-EINVAL) : dst;
 }
 
-int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
-		       void *old_addr, void *new_addr)
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
+		       enum bpf_text_poke_type new_t, void *old_addr,
+		       void *new_addr)
 {
 	int ret;
-	bool is_call = (poke_type == BPF_MOD_CALL);
+	bool is_call;
 	u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
 	u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
 
@@ -1298,6 +1299,7 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
 	if (!is_bpf_text_address((unsigned long)ip))
 		return -ENOTSUPP;
 
+	is_call = old_t == BPF_MOD_CALL;
 	ret = emit_jump_or_nops(old_addr, ip, old_insns, is_call);
 	if (ret)
 		return ret;
@@ -1305,6 +1307,7 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
 	if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
 		return -EFAULT;
 
+	is_call = new_t == BPF_MOD_CALL;
 	ret = emit_jump_or_nops(new_addr, ip, new_insns, is_call);
 	if (ret)
 		return ret;
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 88ad5ba7b87f..28faf721ea64 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -1107,8 +1107,9 @@ static void do_isync(void *info __maybe_unused)
  * execute isync (or some CSI) so that they don't go back into the
  * trampoline again.
  */
-int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
-		       void *old_addr, void *new_addr)
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
+		       enum bpf_text_poke_type new_t, void *old_addr,
+		       void *new_addr)
 {
 	unsigned long bpf_func, bpf_func_end, size, offset;
 	ppc_inst_t old_inst, new_inst;
@@ -1119,7 +1120,6 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
 		return -EOPNOTSUPP;
 
 	bpf_func = (unsigned long)ip;
-	branch_flags = poke_type == BPF_MOD_CALL ? BRANCH_SET_LINK : 0;
 
 	/* We currently only support poking bpf programs */
 	if (!__bpf_address_lookup(bpf_func, &size, &offset, name)) {
@@ -1166,6 +1166,7 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
 	}
 
 	old_inst = ppc_inst(PPC_RAW_NOP());
+	branch_flags = old_t == BPF_MOD_CALL ? BRANCH_SET_LINK : 0;
 	if (old_addr) {
 		if (is_offset_in_branch_range(ip - old_addr))
 			create_branch(&old_inst, ip, (unsigned long)old_addr, branch_flags);
@@ -1174,6 +1175,7 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
 				      branch_flags);
 	}
 	new_inst = ppc_inst(PPC_RAW_NOP());
+	branch_flags = new_t == BPF_MOD_CALL ? BRANCH_SET_LINK : 0;
 	if (new_addr) {
 		if (is_offset_in_branch_range(ip - new_addr))
 			create_branch(&new_inst, ip, (unsigned long)new_addr, branch_flags);
diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 21c70ae3296b..5f9457e910e8 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -852,17 +852,19 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
 	return emit_jump_and_link(is_call ? RV_REG_T0 : RV_REG_ZERO, rvoff, false, &ctx);
 }
 
-int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
-		       void *old_addr, void *new_addr)
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
+		       enum bpf_text_poke_type new_t, void *old_addr,
+		       void *new_addr)
 {
 	u32 old_insns[RV_FENTRY_NINSNS], new_insns[RV_FENTRY_NINSNS];
-	bool is_call = poke_type == BPF_MOD_CALL;
+	bool is_call;
 	int ret;
 
 	if (!is_kernel_text((unsigned long)ip) &&
 	    !is_bpf_text_address((unsigned long)ip))
 		return -ENOTSUPP;
 
+	is_call = old_t == BPF_MOD_CALL;
 	ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
 	if (ret)
 		return ret;
@@ -870,6 +872,7 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
 	if (memcmp(ip, old_insns, RV_FENTRY_NBYTES))
 		return -EFAULT;
 
+	is_call = new_t == BPF_MOD_CALL;
 	ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
 	if (ret)
 		return ret;
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index cf461d76e9da..1eb441098fd8 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -2413,8 +2413,9 @@ bool bpf_jit_supports_far_kfunc_call(void)
 	return true;
 }
 
-int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
-		       void *old_addr, void *new_addr)
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
+		       enum bpf_text_poke_type new_t, void *old_addr,
+		       void *new_addr)
 {
 	struct bpf_plt expected_plt, current_plt, new_plt, *plt;
 	struct {
@@ -2431,7 +2432,7 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 	if (insn.opc != (0xc004 | (old_addr ? 0xf0 : 0)))
 		return -EINVAL;
 
-	if (t == BPF_MOD_JUMP &&
+	if (old_t == BPF_MOD_JUMP && new_t == BPF_MOD_JUMP &&
 	    insn.disp == ((char *)new_addr - (char *)ip) >> 1) {
 		/*
 		 * The branch already points to the destination,
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 632a83381c2d..b69dc7194e2c 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -597,7 +597,8 @@ static int emit_jump(u8 **pprog, void *func, void *ip)
 	return emit_patch(pprog, func, ip, 0xE9);
 }
 
-static int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
+static int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
+				enum bpf_text_poke_type new_t,
 				void *old_addr, void *new_addr)
 {
 	const u8 *nop_insn = x86_nops[5];
@@ -607,9 +608,9 @@ static int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 	int ret;
 
 	memcpy(old_insn, nop_insn, X86_PATCH_SIZE);
-	if (old_addr) {
+	if (old_t != BPF_MOD_NOP && old_addr) {
 		prog = old_insn;
-		ret = t == BPF_MOD_CALL ?
+		ret = old_t == BPF_MOD_CALL ?
 		      emit_call(&prog, old_addr, ip) :
 		      emit_jump(&prog, old_addr, ip);
 		if (ret)
@@ -617,9 +618,9 @@ static int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 	}
 
 	memcpy(new_insn, nop_insn, X86_PATCH_SIZE);
-	if (new_addr) {
+	if (new_t != BPF_MOD_NOP && new_addr) {
 		prog = new_insn;
-		ret = t == BPF_MOD_CALL ?
+		ret = new_t == BPF_MOD_CALL ?
 		      emit_call(&prog, new_addr, ip) :
 		      emit_jump(&prog, new_addr, ip);
 		if (ret)
@@ -640,8 +641,9 @@ static int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 	return ret;
 }
 
-int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
-		       void *old_addr, void *new_addr)
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
+		       enum bpf_text_poke_type new_t, void *old_addr,
+		       void *new_addr)
 {
 	if (!is_kernel_text((long)ip) &&
 	    !is_bpf_text_address((long)ip))
@@ -655,7 +657,7 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 	if (is_endbr(ip))
 		ip += ENDBR_INSN_SIZE;
 
-	return __bpf_arch_text_poke(ip, t, old_addr, new_addr);
+	return __bpf_arch_text_poke(ip, old_t, new_t, old_addr, new_addr);
 }
 
 #define EMIT_LFENCE()	EMIT3(0x0F, 0xAE, 0xE8)
@@ -897,12 +899,13 @@ static void bpf_tail_call_direct_fixup(struct bpf_prog *prog)
 		target = array->ptrs[poke->tail_call.key];
 		if (target) {
 			ret = __bpf_arch_text_poke(poke->tailcall_target,
-						   BPF_MOD_JUMP, NULL,
+						   BPF_MOD_NOP, BPF_MOD_JUMP,
+						   NULL,
 						   (u8 *)target->bpf_func +
 						   poke->adj_off);
 			BUG_ON(ret < 0);
 			ret = __bpf_arch_text_poke(poke->tailcall_bypass,
-						   BPF_MOD_JUMP,
+						   BPF_MOD_JUMP, BPF_MOD_NOP,
 						   (u8 *)poke->tailcall_target +
 						   X86_PATCH_SIZE, NULL);
 			BUG_ON(ret < 0);
@@ -3985,6 +3988,7 @@ void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
 			       struct bpf_prog *new, struct bpf_prog *old)
 {
 	u8 *old_addr, *new_addr, *old_bypass_addr;
+	enum bpf_text_poke_type t;
 	int ret;
 
 	old_bypass_addr = old ? NULL : poke->bypass_addr;
@@ -3997,21 +4001,22 @@ void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
 	 * the kallsyms check.
 	 */
 	if (new) {
+		t = old_addr ? BPF_MOD_JUMP : BPF_MOD_NOP;
 		ret = __bpf_arch_text_poke(poke->tailcall_target,
-					   BPF_MOD_JUMP,
+					   t, BPF_MOD_JUMP,
 					   old_addr, new_addr);
 		BUG_ON(ret < 0);
 		if (!old) {
 			ret = __bpf_arch_text_poke(poke->tailcall_bypass,
-						   BPF_MOD_JUMP,
+						   BPF_MOD_JUMP, BPF_MOD_NOP,
 						   poke->bypass_addr,
 						   NULL);
 			BUG_ON(ret < 0);
 		}
 	} else {
+		t = old_bypass_addr ? BPF_MOD_JUMP : BPF_MOD_NOP;
 		ret = __bpf_arch_text_poke(poke->tailcall_bypass,
-					   BPF_MOD_JUMP,
-					   old_bypass_addr,
+					   t, BPF_MOD_JUMP, old_bypass_addr,
 					   poke->bypass_addr);
 		BUG_ON(ret < 0);
 		/* let other CPUs finish the execution of program
@@ -4020,9 +4025,9 @@ void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
 		 */
 		if (!ret)
 			synchronize_rcu();
+		t = old_addr ? BPF_MOD_JUMP : BPF_MOD_NOP;
 		ret = __bpf_arch_text_poke(poke->tailcall_target,
-					   BPF_MOD_JUMP,
-					   old_addr, NULL);
+					   t, BPF_MOD_NOP, old_addr, NULL);
 		BUG_ON(ret < 0);
 	}
 }
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 4187b7578580..d5e2af29c7c8 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -3708,12 +3708,14 @@ static inline u32 bpf_xdp_sock_convert_ctx_access(enum bpf_access_type type,
 #endif /* CONFIG_INET */
 
 enum bpf_text_poke_type {
+	BPF_MOD_NOP,
 	BPF_MOD_CALL,
 	BPF_MOD_JUMP,
 };
 
-int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
-		       void *addr1, void *addr2);
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
+		       enum bpf_text_poke_type new_t, void *old_addr,
+		       void *new_addr);
 
 void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
 			       struct bpf_prog *new, struct bpf_prog *old);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index ef4448f18aad..c8ae6ab31651 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -3150,8 +3150,9 @@ int __weak skb_copy_bits(const struct sk_buff *skb, int offset, void *to,
 	return -EFAULT;
 }
 
-int __weak bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
-			      void *addr1, void *addr2)
+int __weak bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
+			      enum bpf_text_poke_type new_t, void *old_addr,
+			      void *new_addr)
 {
 	return -ENOTSUPP;
 }
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 04104397c432..2dcc999a411f 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -183,7 +183,8 @@ static int unregister_fentry(struct bpf_trampoline *tr, void *old_addr)
 	if (tr->func.ftrace_managed)
 		ret = unregister_ftrace_direct(tr->fops, (long)old_addr, false);
 	else
-		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, NULL);
+		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, BPF_MOD_NOP,
+					 old_addr, NULL);
 
 	return ret;
 }
@@ -200,7 +201,10 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
 		else
 			ret = modify_ftrace_direct_nolock(tr->fops, (long)new_addr);
 	} else {
-		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, new_addr);
+		ret = bpf_arch_text_poke(ip,
+					 old_addr ? BPF_MOD_CALL : BPF_MOD_NOP,
+					 new_addr ? BPF_MOD_CALL : BPF_MOD_NOP,
+					 old_addr, new_addr);
 	}
 	return ret;
 }
@@ -225,7 +229,8 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
 			return ret;
 		ret = register_ftrace_direct(tr->fops, (long)new_addr);
 	} else {
-		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr);
+		ret = bpf_arch_text_poke(ip, BPF_MOD_NOP, BPF_MOD_CALL,
+					 NULL, new_addr);
 	}
 
 	return ret;
@@ -336,8 +341,9 @@ static void bpf_tramp_image_put(struct bpf_tramp_image *im)
 	 * call_rcu_tasks() is not necessary.
 	 */
 	if (im->ip_after_call) {
-		int err = bpf_arch_text_poke(im->ip_after_call, BPF_MOD_JUMP,
-					     NULL, im->ip_epilogue);
+		int err = bpf_arch_text_poke(im->ip_after_call, BPF_MOD_NOP,
+					      BPF_MOD_JUMP, NULL,
+					      im->ip_epilogue);
 		WARN_ON(err);
 		if (IS_ENABLED(CONFIG_TASKS_RCU))
 			call_rcu_tasks(&im->rcu, __bpf_tramp_image_put_rcu_tasks);
@@ -570,7 +576,8 @@ static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
 		if (err)
 			return err;
 		tr->extension_prog = link->link.prog;
-		return bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
+		return bpf_arch_text_poke(tr->func.addr, BPF_MOD_NOP,
+					  BPF_MOD_JUMP, NULL,
 					  link->link.prog->bpf_func);
 	}
 	if (cnt >= BPF_MAX_TRAMP_LINKS)
@@ -618,6 +625,7 @@ static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link,
 	if (kind == BPF_TRAMP_REPLACE) {
 		WARN_ON_ONCE(!tr->extension_prog);
 		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP,
+					 BPF_MOD_NOP,
 					 tr->extension_prog->bpf_func, NULL);
 		tr->extension_prog = NULL;
 		guard(mutex)(&tgt_prog->aux->ext_mutex);
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline
  2025-11-17  3:49 [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Menglong Dong
                   ` (4 preceding siblings ...)
  2025-11-17  3:49 ` [PATCH bpf-next v2 5/6] bpf: specify the old and new poke_type for bpf_arch_text_poke Menglong Dong
@ 2025-11-17  3:49 ` Menglong Dong
  2025-11-17 22:49   ` kernel test robot
                     ` (2 more replies)
  2025-11-18  6:31 ` [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Alexei Starovoitov
  6 siblings, 3 replies; 17+ messages in thread
From: Menglong Dong @ 2025-11-17  3:49 UTC (permalink / raw)
  To: ast, rostedt
  Cc: daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

Implement the "jmp" mode for the bpf trampoline. For the ftrace_managed
case, we need only to set the FTRACE_OPS_FL_JMP on the tr->fops if "jmp"
is needed.

For the bpf poke case, we determine the old poke type from "orig_flags"
and the current poke type from "tr->flags". The function
bpf_trampoline_update_fentry() is introduced to do the job.

The "jmp" mode will only be enabled with CONFIG_DYNAMIC_FTRACE_WITH_JMP
enabled and BPF_TRAMP_F_SHARE_IPMODIFY is not set. With
BPF_TRAMP_F_SHARE_IPMODIFY, we need to get the origin call ip from the
stack, so we can't use the "jmp" mode.
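
For the ftrace_managed path, the mode simply follows the trampoline flags,
roughly like this (the real code lives in bpf_trampoline_update() below):

  if (bpf_trampoline_use_jmp(tr->flags))
          /* the low bit of the direct address will be set */
          tr->fops->flags |= FTRACE_OPS_FL_JMP;
  else
          tr->fops->flags &= ~FTRACE_OPS_FL_JMP;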

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
v2:
- rename bpf_text_poke to bpf_trampoline_update_fentry
- remove the BPF_TRAMP_F_JMPED and check the current mode with the origin
  flags instead.
---
 kernel/bpf/trampoline.c | 68 ++++++++++++++++++++++++++++++-----------
 1 file changed, 51 insertions(+), 17 deletions(-)

diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 2dcc999a411f..80ab435d6e00 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -175,24 +175,42 @@ static struct bpf_trampoline *bpf_trampoline_lookup(u64 key)
 	return tr;
 }
 
-static int unregister_fentry(struct bpf_trampoline *tr, void *old_addr)
+static int bpf_trampoline_update_fentry(struct bpf_trampoline *tr, u32 orig_flags,
+					void *old_addr, void *new_addr)
 {
+	enum bpf_text_poke_type new_t = BPF_MOD_CALL, old_t = BPF_MOD_CALL;
 	void *ip = tr->func.addr;
+
+	if (!new_addr)
+		new_t = BPF_MOD_NOP;
+	else if (bpf_trampoline_use_jmp(tr->flags))
+		new_t = BPF_MOD_JUMP;
+
+	if (!old_addr)
+		old_t = BPF_MOD_NOP;
+	else if (bpf_trampoline_use_jmp(orig_flags))
+		old_t = BPF_MOD_JUMP;
+
+	return bpf_arch_text_poke(ip, old_t, new_t, old_addr, new_addr);
+}
+
+static int unregister_fentry(struct bpf_trampoline *tr, u32 orig_flags,
+			     void *old_addr)
+{
 	int ret;
 
 	if (tr->func.ftrace_managed)
 		ret = unregister_ftrace_direct(tr->fops, (long)old_addr, false);
 	else
-		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, BPF_MOD_NOP,
-					 old_addr, NULL);
+		ret = bpf_trampoline_update_fentry(tr, orig_flags, old_addr, NULL);
 
 	return ret;
 }
 
-static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_addr,
+static int modify_fentry(struct bpf_trampoline *tr, u32 orig_flags,
+			 void *old_addr, void *new_addr,
 			 bool lock_direct_mutex)
 {
-	void *ip = tr->func.addr;
 	int ret;
 
 	if (tr->func.ftrace_managed) {
@@ -201,10 +219,8 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
 		else
 			ret = modify_ftrace_direct_nolock(tr->fops, (long)new_addr);
 	} else {
-		ret = bpf_arch_text_poke(ip,
-					 old_addr ? BPF_MOD_CALL : BPF_MOD_NOP,
-					 new_addr ? BPF_MOD_CALL : BPF_MOD_NOP,
-					 old_addr, new_addr);
+		ret = bpf_trampoline_update_fentry(tr, orig_flags, old_addr,
+						   new_addr);
 	}
 	return ret;
 }
@@ -229,8 +245,7 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
 			return ret;
 		ret = register_ftrace_direct(tr->fops, (long)new_addr);
 	} else {
-		ret = bpf_arch_text_poke(ip, BPF_MOD_NOP, BPF_MOD_CALL,
-					 NULL, new_addr);
+		ret = bpf_trampoline_update_fentry(tr, 0, NULL, new_addr);
 	}
 
 	return ret;
@@ -416,7 +431,7 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
 		return PTR_ERR(tlinks);
 
 	if (total == 0) {
-		err = unregister_fentry(tr, tr->cur_image->image);
+		err = unregister_fentry(tr, orig_flags, tr->cur_image->image);
 		bpf_tramp_image_put(tr->cur_image);
 		tr->cur_image = NULL;
 		goto out;
@@ -440,9 +455,17 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 again:
-	if ((tr->flags & BPF_TRAMP_F_SHARE_IPMODIFY) &&
-	    (tr->flags & BPF_TRAMP_F_CALL_ORIG))
-		tr->flags |= BPF_TRAMP_F_ORIG_STACK;
+	if (tr->flags & BPF_TRAMP_F_CALL_ORIG) {
+		if (tr->flags & BPF_TRAMP_F_SHARE_IPMODIFY) {
+			tr->flags |= BPF_TRAMP_F_ORIG_STACK;
+		} else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_JMP)) {
+			/* Use "jmp" instead of "call" for the trampoline
+			 * in the origin call case, and we don't need to
+			 * skip the frame.
+			 */
+			tr->flags &= ~BPF_TRAMP_F_SKIP_FRAME;
+		}
+	}
 #endif
 
 	size = arch_bpf_trampoline_size(&tr->func.model, tr->flags,
@@ -473,10 +496,16 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
 	if (err)
 		goto out_free;
 
+	if (bpf_trampoline_use_jmp(tr->flags))
+		tr->fops->flags |= FTRACE_OPS_FL_JMP;
+	else
+		tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
+
 	WARN_ON(tr->cur_image && total == 0);
 	if (tr->cur_image)
 		/* progs already running at this address */
-		err = modify_fentry(tr, tr->cur_image->image, im->image, lock_direct_mutex);
+		err = modify_fentry(tr, orig_flags, tr->cur_image->image,
+				    im->image, lock_direct_mutex);
 	else
 		/* first time registering */
 		err = register_fentry(tr, im->image);
@@ -499,8 +528,13 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
 	tr->cur_image = im;
 out:
 	/* If any error happens, restore previous flags */
-	if (err)
+	if (err) {
 		tr->flags = orig_flags;
+		if (bpf_trampoline_use_jmp(tr->flags))
+			tr->fops->flags |= FTRACE_OPS_FL_JMP;
+		else
+			tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
+	}
 	kfree(tlinks);
 	return err;
 
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 5/6] bpf: specify the old and new poke_type for bpf_arch_text_poke
  2025-11-17  3:49 ` [PATCH bpf-next v2 5/6] bpf: specify the old and new poke_type for bpf_arch_text_poke Menglong Dong
@ 2025-11-17 20:55   ` kernel test robot
  0 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2025-11-17 20:55 UTC (permalink / raw)
  To: Menglong Dong, ast, rostedt
  Cc: oe-kbuild-all, daniel, john.fastabend, andrii, martin.lau,
	eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	mhiramat, mark.rutland, mathieu.desnoyers, jiang.biao, bpf,
	linux-kernel, linux-trace-kernel

Hi Menglong,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Menglong-Dong/ftrace-introduce-FTRACE_OPS_FL_JMP/20251117-115243
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20251117034906.32036-6-dongml2%40chinatelecom.cn
patch subject: [PATCH bpf-next v2 5/6] bpf: specify the old and new poke_type for bpf_arch_text_poke
config: powerpc64-randconfig-002-20251118 (https://download.01.org/0day-ci/archive/20251118/202511180431.JVOEm6SO-lkp@intel.com/config)
compiler: powerpc64-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251118/202511180431.JVOEm6SO-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511180431.JVOEm6SO-lkp@intel.com/

All errors (new ones prefixed by >>):

   arch/powerpc/net/bpf_jit_comp.c: In function 'bpf_arch_text_poke':
>> arch/powerpc/net/bpf_jit_comp.c:1135:7: error: 'poke_type' undeclared (first use in this function); did you mean 'probe_type'?
      if (poke_type != BPF_MOD_JUMP) {
          ^~~~~~~~~
          probe_type
   arch/powerpc/net/bpf_jit_comp.c:1135:7: note: each undeclared identifier is reported only once for each function it appears in


vim +1135 arch/powerpc/net/bpf_jit_comp.c

d243b62b7bd3d5 Naveen N Rao  2024-10-30  1070  
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1071  /*
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1072   * A 3-step process for bpf prog entry:
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1073   * 1. At bpf prog entry, a single nop/b:
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1074   * bpf_func:
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1075   *	[nop|b]	ool_stub
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1076   * 2. Out-of-line stub:
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1077   * ool_stub:
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1078   *	mflr	r0
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1079   *	[b|bl]	<bpf_prog>/<long_branch_stub>
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1080   *	mtlr	r0 // CONFIG_PPC_FTRACE_OUT_OF_LINE only
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1081   *	b	bpf_func + 4
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1082   * 3. Long branch stub:
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1083   * long_branch_stub:
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1084   *	.long	<branch_addr>/<dummy_tramp>
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1085   *	mflr	r11
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1086   *	bcl	20,31,$+4
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1087   *	mflr	r12
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1088   *	ld	r12, -16(r12)
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1089   *	mtctr	r12
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1090   *	mtlr	r11 // needed to retain ftrace ABI
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1091   *	bctr
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1092   *
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1093   * dummy_tramp is used to reduce synchronization requirements.
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1094   *
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1095   * When attaching a bpf trampoline to a bpf prog, we do not need any
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1096   * synchronization here since we always have a valid branch target regardless
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1097   * of the order in which the above stores are seen. dummy_tramp ensures that
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1098   * the long_branch stub goes to a valid destination on other cpus, even when
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1099   * the branch to the long_branch stub is seen before the updated trampoline
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1100   * address.
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1101   *
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1102   * However, when detaching a bpf trampoline from a bpf prog, or if changing
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1103   * the bpf trampoline address, we need synchronization to ensure that other
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1104   * cpus can no longer branch into the older trampoline so that it can be
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1105   * safely freed. bpf_tramp_image_put() uses rcu_tasks to ensure all cpus
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1106   * make forward progress, but we still need to ensure that other cpus
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1107   * execute isync (or some CSI) so that they don't go back into the
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1108   * trampoline again.
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1109   */
7cc5910294285d Menglong Dong 2025-11-17  1110  int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
7cc5910294285d Menglong Dong 2025-11-17  1111  		       enum bpf_text_poke_type new_t, void *old_addr,
7cc5910294285d Menglong Dong 2025-11-17  1112  		       void *new_addr)
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1113  {
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1114  	unsigned long bpf_func, bpf_func_end, size, offset;
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1115  	ppc_inst_t old_inst, new_inst;
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1116  	int ret = 0, branch_flags;
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1117  	char name[KSYM_NAME_LEN];
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1118  
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1119  	if (IS_ENABLED(CONFIG_PPC32))
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1120  		return -EOPNOTSUPP;
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1121  
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1122  	bpf_func = (unsigned long)ip;
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1123  
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1124  	/* We currently only support poking bpf programs */
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1125  	if (!__bpf_address_lookup(bpf_func, &size, &offset, name)) {
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1126  		pr_err("%s (0x%lx): kernel/modules are not supported\n", __func__, bpf_func);
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1127  		return -EOPNOTSUPP;
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1128  	}
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1129  
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1130  	/*
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1131  	 * If we are not poking at bpf prog entry, then we are simply patching in/out
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1132  	 * an unconditional branch instruction at im->ip_after_call
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1133  	 */
d243b62b7bd3d5 Naveen N Rao  2024-10-30  1134  	if (offset) {
d243b62b7bd3d5 Naveen N Rao  2024-10-30 @1135  		if (poke_type != BPF_MOD_JUMP) {

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline
  2025-11-17  3:49 ` [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline Menglong Dong
@ 2025-11-17 22:49   ` kernel test robot
  2025-11-18  1:20   ` Menglong Dong
  2025-11-18  5:09   ` kernel test robot
  2 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2025-11-17 22:49 UTC (permalink / raw)
  To: Menglong Dong, ast, rostedt
  Cc: llvm, oe-kbuild-all, daniel, john.fastabend, andrii, martin.lau,
	eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	mhiramat, mark.rutland, mathieu.desnoyers, jiang.biao, bpf,
	linux-kernel, linux-trace-kernel

Hi Menglong,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Menglong-Dong/ftrace-introduce-FTRACE_OPS_FL_JMP/20251117-115243
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20251117034906.32036-7-dongml2%40chinatelecom.cn
patch subject: [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline
config: arm64-randconfig-002-20251118 (https://download.01.org/0day-ci/archive/20251118/202511180613.Om7k1nP1-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 0bba1e76581bad04e7d7f09f5115ae5e2989e0d9)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251118/202511180613.Om7k1nP1-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511180613.Om7k1nP1-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/bpf/trampoline.c:500:11: error: incomplete definition of type 'struct ftrace_ops'
     500 |                 tr->fops->flags |= FTRACE_OPS_FL_JMP;
         |                 ~~~~~~~~^
   include/linux/ftrace.h:40:8: note: forward declaration of 'struct ftrace_ops'
      40 | struct ftrace_ops;
         |        ^
>> kernel/bpf/trampoline.c:500:22: error: use of undeclared identifier 'FTRACE_OPS_FL_JMP'
     500 |                 tr->fops->flags |= FTRACE_OPS_FL_JMP;
         |                                    ^~~~~~~~~~~~~~~~~
   kernel/bpf/trampoline.c:502:11: error: incomplete definition of type 'struct ftrace_ops'
     502 |                 tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
         |                 ~~~~~~~~^
   include/linux/ftrace.h:40:8: note: forward declaration of 'struct ftrace_ops'
      40 | struct ftrace_ops;
         |        ^
   kernel/bpf/trampoline.c:502:23: error: use of undeclared identifier 'FTRACE_OPS_FL_JMP'
     502 |                 tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
         |                                     ^~~~~~~~~~~~~~~~~
   kernel/bpf/trampoline.c:534:12: error: incomplete definition of type 'struct ftrace_ops'
     534 |                         tr->fops->flags |= FTRACE_OPS_FL_JMP;
         |                         ~~~~~~~~^
   include/linux/ftrace.h:40:8: note: forward declaration of 'struct ftrace_ops'
      40 | struct ftrace_ops;
         |        ^
   kernel/bpf/trampoline.c:534:23: error: use of undeclared identifier 'FTRACE_OPS_FL_JMP'
     534 |                         tr->fops->flags |= FTRACE_OPS_FL_JMP;
         |                                            ^~~~~~~~~~~~~~~~~
   kernel/bpf/trampoline.c:536:12: error: incomplete definition of type 'struct ftrace_ops'
     536 |                         tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
         |                         ~~~~~~~~^
   include/linux/ftrace.h:40:8: note: forward declaration of 'struct ftrace_ops'
      40 | struct ftrace_ops;
         |        ^
   kernel/bpf/trampoline.c:536:24: error: use of undeclared identifier 'FTRACE_OPS_FL_JMP'
     536 |                         tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
         |                                             ^~~~~~~~~~~~~~~~~
   8 errors generated.


vim +/FTRACE_OPS_FL_JMP +500 kernel/bpf/trampoline.c

   470	
   471		size = arch_bpf_trampoline_size(&tr->func.model, tr->flags,
   472						tlinks, tr->func.addr);
   473		if (size < 0) {
   474			err = size;
   475			goto out;
   476		}
   477	
   478		if (size > PAGE_SIZE) {
   479			err = -E2BIG;
   480			goto out;
   481		}
   482	
   483		im = bpf_tramp_image_alloc(tr->key, size);
   484		if (IS_ERR(im)) {
   485			err = PTR_ERR(im);
   486			goto out;
   487		}
   488	
   489		err = arch_prepare_bpf_trampoline(im, im->image, im->image + size,
   490						  &tr->func.model, tr->flags, tlinks,
   491						  tr->func.addr);
   492		if (err < 0)
   493			goto out_free;
   494	
   495		err = arch_protect_bpf_trampoline(im->image, im->size);
   496		if (err)
   497			goto out_free;
   498	
   499		if (bpf_trampoline_use_jmp(tr->flags))
 > 500			tr->fops->flags |= FTRACE_OPS_FL_JMP;
   501		else
   502			tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
   503	
   504		WARN_ON(tr->cur_image && total == 0);
   505		if (tr->cur_image)
   506			/* progs already running at this address */
   507			err = modify_fentry(tr, orig_flags, tr->cur_image->image,
   508					    im->image, lock_direct_mutex);
   509		else
   510			/* first time registering */
   511			err = register_fentry(tr, im->image);
   512	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline
  2025-11-17  3:49 ` [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline Menglong Dong
  2025-11-17 22:49   ` kernel test robot
@ 2025-11-18  1:20   ` Menglong Dong
  2025-11-18  5:09   ` kernel test robot
  2 siblings, 0 replies; 17+ messages in thread
From: Menglong Dong @ 2025-11-18  1:20 UTC (permalink / raw)
  To: ast, rostedt, Menglong Dong
  Cc: daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

On 2025/11/17 11:49, Menglong Dong wrote:
> Implement the "jmp" mode for the bpf trampoline. For the ftrace_managed
> case, we only need to set FTRACE_OPS_FL_JMP on tr->fops when "jmp" is
> needed.
> 
> For the bpf poke case, we check the original poke type against
> "origin_flags" and the current poke type against "tr->flags". The
> function bpf_trampoline_update_fentry() is introduced to do the job.
> 
> The "jmp" mode is only enabled when CONFIG_DYNAMIC_FTRACE_WITH_JMP is
> enabled and BPF_TRAMP_F_SHARE_IPMODIFY is not set. With
> BPF_TRAMP_F_SHARE_IPMODIFY, we need to get the original call ip from
> the stack, so we can't use the "jmp" mode.
> 
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---
> v2:
> - rename bpf_text_poke to bpf_trampoline_update_fentry
> - remove the BPF_TRAMP_F_JMPED and check the current mode with the origin
>   flags instead.
> ---
>  kernel/bpf/trampoline.c | 68 ++++++++++++++++++++++++++++++-----------
>  1 file changed, 51 insertions(+), 17 deletions(-)
> 
> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> index 2dcc999a411f..80ab435d6e00 100644
> --- a/kernel/bpf/trampoline.c
> +++ b/kernel/bpf/trampoline.c
> @@ -175,24 +175,42 @@ static struct bpf_trampoline *bpf_trampoline_lookup(u64 key)
>  	return tr;
>  }
>  
> -static int unregister_fentry(struct bpf_trampoline *tr, void *old_addr)
> +static int bpf_trampoline_update_fentry(struct bpf_trampoline *tr, u32 orig_flags,
> +					void *old_addr, void *new_addr)
>  {
> +	enum bpf_text_poke_type new_t = BPF_MOD_CALL, old_t = BPF_MOD_CALL;
>  	void *ip = tr->func.addr;
> +
> +	if (!new_addr)
> +		new_t = BPF_MOD_NOP;
> +	else if (bpf_trampoline_use_jmp(tr->flags))
> +		new_t = BPF_MOD_JUMP;
> +
> +	if (!old_addr)
> +		old_t = BPF_MOD_NOP;
> +	else if (bpf_trampoline_use_jmp(orig_flags))
> +		old_t = BPF_MOD_JUMP;
> +
> +	return bpf_arch_text_poke(ip, old_t, new_t, old_addr, new_addr);
> +}
> +
> +static int unregister_fentry(struct bpf_trampoline *tr, u32 orig_flags,
> +			     void *old_addr)
> +{
>  	int ret;
>  
>  	if (tr->func.ftrace_managed)
>  		ret = unregister_ftrace_direct(tr->fops, (long)old_addr, false);
>  	else
> -		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, BPF_MOD_NOP,
> -					 old_addr, NULL);
> +		ret = bpf_trampoline_update_fentry(tr, orig_flags, old_addr, NULL);
>  
>  	return ret;
>  }
>  
> -static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_addr,
> +static int modify_fentry(struct bpf_trampoline *tr, u32 orig_flags,
> +			 void *old_addr, void *new_addr,
>  			 bool lock_direct_mutex)
>  {
> -	void *ip = tr->func.addr;
>  	int ret;
>  
>  	if (tr->func.ftrace_managed) {
> @@ -201,10 +219,8 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
>  		else
>  			ret = modify_ftrace_direct_nolock(tr->fops, (long)new_addr);
>  	} else {
> -		ret = bpf_arch_text_poke(ip,
> -					 old_addr ? BPF_MOD_CALL : BPF_MOD_NOP,
> -					 new_addr ? BPF_MOD_CALL : BPF_MOD_NOP,
> -					 old_addr, new_addr);
> +		ret = bpf_trampoline_update_fentry(tr, orig_flags, old_addr,
> +						   new_addr);
>  	}
>  	return ret;
>  }
> @@ -229,8 +245,7 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
>  			return ret;
>  		ret = register_ftrace_direct(tr->fops, (long)new_addr);
>  	} else {
> -		ret = bpf_arch_text_poke(ip, BPF_MOD_NOP, BPF_MOD_CALL,
> -					 NULL, new_addr);
> +		ret = bpf_trampoline_update_fentry(tr, 0, NULL, new_addr);
>  	}
>  
>  	return ret;
> @@ -416,7 +431,7 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
>  		return PTR_ERR(tlinks);
>  
>  	if (total == 0) {
> -		err = unregister_fentry(tr, tr->cur_image->image);
> +		err = unregister_fentry(tr, orig_flags, tr->cur_image->image);
>  		bpf_tramp_image_put(tr->cur_image);
>  		tr->cur_image = NULL;
>  		goto out;
> @@ -440,9 +455,17 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>  again:
> -	if ((tr->flags & BPF_TRAMP_F_SHARE_IPMODIFY) &&
> -	    (tr->flags & BPF_TRAMP_F_CALL_ORIG))
> -		tr->flags |= BPF_TRAMP_F_ORIG_STACK;
> +	if (tr->flags & BPF_TRAMP_F_CALL_ORIG) {
> +		if (tr->flags & BPF_TRAMP_F_SHARE_IPMODIFY) {
> +			tr->flags |= BPF_TRAMP_F_ORIG_STACK;
> +		} else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_JMP)) {
> +			/* Use "jmp" instead of "call" for the trampoline
> +			 * in the origin call case, and we don't need to
> +			 * skip the frame.
> +			 */
> +			tr->flags &= ~BPF_TRAMP_F_SKIP_FRAME;
> +		}
> +	}
>  #endif
>  
>  	size = arch_bpf_trampoline_size(&tr->func.model, tr->flags,
> @@ -473,10 +496,16 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
>  	if (err)
>  		goto out_free;
>  
> +	if (bpf_trampoline_use_jmp(tr->flags))
> +		tr->fops->flags |= FTRACE_OPS_FL_JMP;
> +	else
> +		tr->fops->flags &= ~FTRACE_OPS_FL_JMP;

This should be wrapped in "#ifdef CONFIG_DYNAMIC_FTRACE_WITH_JMP"
(see the sketch below). I'll change it in v3 once more human review
comments come in.
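
Something like this (sketch only; the same wrapping, or a small shared
helper, would also be needed for the error-path copy further down,
since in the failing config struct ftrace_ops is only a forward
declaration and FTRACE_OPS_FL_JMP is undeclared):

#ifdef CONFIG_DYNAMIC_FTRACE_WITH_JMP
	if (bpf_trampoline_use_jmp(tr->flags))
		tr->fops->flags |= FTRACE_OPS_FL_JMP;
	else
		tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
#endif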

> +
>  	WARN_ON(tr->cur_image && total == 0);
>  	if (tr->cur_image)
>  		/* progs already running at this address */
> -		err = modify_fentry(tr, tr->cur_image->image, im->image, lock_direct_mutex);
> +		err = modify_fentry(tr, orig_flags, tr->cur_image->image,
> +				    im->image, lock_direct_mutex);
>  	else
>  		/* first time registering */
>  		err = register_fentry(tr, im->image);
> @@ -499,8 +528,13 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
>  	tr->cur_image = im;
>  out:
>  	/* If any error happens, restore previous flags */
> -	if (err)
> +	if (err) {
>  		tr->flags = orig_flags;
> +		if (bpf_trampoline_use_jmp(tr->flags))
> +			tr->fops->flags |= FTRACE_OPS_FL_JMP;
> +		else
> +			tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
> +	}
>  	kfree(tlinks);
>  	return err;
>  
> 





^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline
  2025-11-17  3:49 ` [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline Menglong Dong
  2025-11-17 22:49   ` kernel test robot
  2025-11-18  1:20   ` Menglong Dong
@ 2025-11-18  5:09   ` kernel test robot
  2 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2025-11-18  5:09 UTC (permalink / raw)
  To: Menglong Dong, ast, rostedt
  Cc: oe-kbuild-all, daniel, john.fastabend, andrii, martin.lau,
	eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	mhiramat, mark.rutland, mathieu.desnoyers, jiang.biao, bpf,
	linux-kernel, linux-trace-kernel

Hi Menglong,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Menglong-Dong/ftrace-introduce-FTRACE_OPS_FL_JMP/20251117-115243
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20251117034906.32036-7-dongml2%40chinatelecom.cn
patch subject: [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline
config: riscv-randconfig-001-20251118 (https://download.01.org/0day-ci/archive/20251118/202511181238.cVO5ERaA-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 10.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251118/202511181238.cVO5ERaA-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511181238.cVO5ERaA-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/bpf/trampoline.c: In function 'bpf_trampoline_update':
>> kernel/bpf/trampoline.c:500:11: error: invalid use of undefined type 'struct ftrace_ops'
     500 |   tr->fops->flags |= FTRACE_OPS_FL_JMP;
         |           ^~
>> kernel/bpf/trampoline.c:500:22: error: 'FTRACE_OPS_FL_JMP' undeclared (first use in this function)
     500 |   tr->fops->flags |= FTRACE_OPS_FL_JMP;
         |                      ^~~~~~~~~~~~~~~~~
   kernel/bpf/trampoline.c:500:22: note: each undeclared identifier is reported only once for each function it appears in
   kernel/bpf/trampoline.c:502:11: error: invalid use of undefined type 'struct ftrace_ops'
     502 |   tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
         |           ^~
   kernel/bpf/trampoline.c:534:12: error: invalid use of undefined type 'struct ftrace_ops'
     534 |    tr->fops->flags |= FTRACE_OPS_FL_JMP;
         |            ^~
   kernel/bpf/trampoline.c:536:12: error: invalid use of undefined type 'struct ftrace_ops'
     536 |    tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
         |            ^~


vim +500 kernel/bpf/trampoline.c

   470	
   471		size = arch_bpf_trampoline_size(&tr->func.model, tr->flags,
   472						tlinks, tr->func.addr);
   473		if (size < 0) {
   474			err = size;
   475			goto out;
   476		}
   477	
   478		if (size > PAGE_SIZE) {
   479			err = -E2BIG;
   480			goto out;
   481		}
   482	
   483		im = bpf_tramp_image_alloc(tr->key, size);
   484		if (IS_ERR(im)) {
   485			err = PTR_ERR(im);
   486			goto out;
   487		}
   488	
   489		err = arch_prepare_bpf_trampoline(im, im->image, im->image + size,
   490						  &tr->func.model, tr->flags, tlinks,
   491						  tr->func.addr);
   492		if (err < 0)
   493			goto out_free;
   494	
   495		err = arch_protect_bpf_trampoline(im->image, im->size);
   496		if (err)
   497			goto out_free;
   498	
   499		if (bpf_trampoline_use_jmp(tr->flags))
 > 500			tr->fops->flags |= FTRACE_OPS_FL_JMP;
   501		else
   502			tr->fops->flags &= ~FTRACE_OPS_FL_JMP;
   503	
   504		WARN_ON(tr->cur_image && total == 0);
   505		if (tr->cur_image)
   506			/* progs already running at this address */
   507			err = modify_fentry(tr, orig_flags, tr->cur_image->image,
   508					    im->image, lock_direct_mutex);
   509		else
   510			/* first time registering */
   511			err = register_fentry(tr, im->image);
   512	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 1/6] ftrace: introduce FTRACE_OPS_FL_JMP
  2025-11-17  3:49 ` [PATCH bpf-next v2 1/6] ftrace: introduce FTRACE_OPS_FL_JMP Menglong Dong
@ 2025-11-18  5:19   ` Masami Hiramatsu
  2025-11-18  6:14     ` Menglong Dong
  0 siblings, 1 reply; 17+ messages in thread
From: Masami Hiramatsu @ 2025-11-18  5:19 UTC (permalink / raw)
  To: Menglong Dong
  Cc: ast, rostedt, daniel, john.fastabend, andrii, martin.lau, eddyz87,
	song, yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

On Mon, 17 Nov 2025 11:49:01 +0800
Menglong Dong <menglong8.dong@gmail.com> wrote:

> For now, the "nop" will be replaced with a "call" instruction when a
> function is hooked by ftrace. However, sometimes the "call" can break
> the RSB and introduce extra overhead. Therefore, introduce the flag
> FTRACE_OPS_FL_JMP, which indicates that the ftrace_ops should be called
> with a "jmp" instead of a "call". For now, it is only used in the
> direct call case.
> 
> When a direct ftrace_ops is marked with FTRACE_OPS_FL_JMP, the last bit
> of ops->direct_call is set to 1. Therefore, we can tell whether we
> should use "jmp" for the callback in ftrace_call_replace().
> 

nit: Is it guaranteed that the last bit is always 0?
At least register_ftrace_direct() needs to reject @addr
if its last bit is already set.

Thanks,


> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---
>  include/linux/ftrace.h | 33 +++++++++++++++++++++++++++++++++
>  kernel/trace/Kconfig   | 12 ++++++++++++
>  kernel/trace/ftrace.c  |  9 ++++++++-
>  3 files changed, 53 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 07f8c309e432..015dd1049bea 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -359,6 +359,7 @@ enum {
>  	FTRACE_OPS_FL_DIRECT			= BIT(17),
>  	FTRACE_OPS_FL_SUBOP			= BIT(18),
>  	FTRACE_OPS_FL_GRAPH			= BIT(19),
> +	FTRACE_OPS_FL_JMP			= BIT(20),
>  };
>  
>  #ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> @@ -577,6 +578,38 @@ static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
>  						 unsigned long addr) { }
>  #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
>  
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_JMP
> +static inline bool ftrace_is_jmp(unsigned long addr)
> +{
> +	return addr & 1;
> +}
> +
> +static inline unsigned long ftrace_jmp_set(unsigned long addr)
> +{
> +	return addr | 1UL;
> +}
> +
> +static inline unsigned long ftrace_jmp_get(unsigned long addr)
> +{
> +	return addr & ~1UL;
> +}
> +#else
> +static inline bool ftrace_is_jmp(unsigned long addr)
> +{
> +	return false;
> +}
> +
> +static inline unsigned long ftrace_jmp_set(unsigned long addr)
> +{
> +	return addr;
> +}
> +
> +static inline unsigned long ftrace_jmp_get(unsigned long addr)
> +{
> +	return addr;
> +}
> +#endif /* CONFIG_DYNAMIC_FTRACE_WITH_JMP */
> +
>  #ifdef CONFIG_STACK_TRACER
>  
>  int stack_trace_sysctl(const struct ctl_table *table, int write, void *buffer,
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index d2c79da81e4f..4661b9e606e0 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -80,6 +80,12 @@ config HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
>  	  If the architecture generates __patchable_function_entries sections
>  	  but does not want them included in the ftrace locations.
>  
> +config HAVE_DYNAMIC_FTRACE_WITH_JMP
> +	bool
> +	help
> +	  If the architecture supports to replace the __fentry__ with a
> +	  "jmp" instruction.
> +
>  config HAVE_SYSCALL_TRACEPOINTS
>  	bool
>  	help
> @@ -330,6 +336,12 @@ config DYNAMIC_FTRACE_WITH_ARGS
>  	depends on DYNAMIC_FTRACE
>  	depends on HAVE_DYNAMIC_FTRACE_WITH_ARGS
>  
> +config DYNAMIC_FTRACE_WITH_JMP
> +	def_bool y
> +	depends on DYNAMIC_FTRACE
> +	depends on DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +	depends on HAVE_DYNAMIC_FTRACE_WITH_JMP
> +
>  config FPROBE
>  	bool "Kernel Function Probe (fprobe)"
>  	depends on HAVE_FUNCTION_GRAPH_FREGS && HAVE_FTRACE_GRAPH_FUNC
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 59cfacb8a5bb..a6c060a4f50b 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -5951,7 +5951,8 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long
>  	for (i = 0; i < size; i++) {
>  		hlist_for_each_entry(entry, &hash->buckets[i], hlist) {
>  			del = __ftrace_lookup_ip(direct_functions, entry->ip);
> -			if (del && del->direct == addr) {
> +			if (del && ftrace_jmp_get(del->direct) ==
> +				   ftrace_jmp_get(addr)) {
>  				remove_hash_entry(direct_functions, del);
>  				kfree(del);
>  			}
> @@ -6018,6 +6019,9 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  
>  	mutex_lock(&direct_mutex);
>  
> +	if (ops->flags & FTRACE_OPS_FL_JMP)
> +		addr = ftrace_jmp_set(addr);
> +
>  	/* Make sure requested entries are not already registered.. */
>  	size = 1 << hash->size_bits;
>  	for (i = 0; i < size; i++) {
> @@ -6138,6 +6142,9 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  
>  	lockdep_assert_held_once(&direct_mutex);
>  
> +	if (ops->flags & FTRACE_OPS_FL_JMP)
> +		addr = ftrace_jmp_set(addr);
> +
>  	/* Enable the tmp_ops to have the same functions as the direct ops */
>  	ftrace_ops_init(&tmp_ops);
>  	tmp_ops.func_hash = ops->func_hash;
> -- 
> 2.51.2
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 1/6] ftrace: introduce FTRACE_OPS_FL_JMP
  2025-11-18  5:19   ` Masami Hiramatsu
@ 2025-11-18  6:14     ` Menglong Dong
  0 siblings, 0 replies; 17+ messages in thread
From: Menglong Dong @ 2025-11-18  6:14 UTC (permalink / raw)
  To: Menglong Dong, Masami Hiramatsu
  Cc: ast, rostedt, daniel, john.fastabend, andrii, martin.lau, eddyz87,
	song, yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

On 2025/11/18 13:19, Masami Hiramatsu wrote:
> On Mon, 17 Nov 2025 11:49:01 +0800
> Menglong Dong <menglong8.dong@gmail.com> wrote:
> 
> > For now, the "nop" will be replaced with a "call" instruction when a
> > function is hooked by ftrace. However, sometimes the "call" can break
> > the RSB and introduce extra overhead. Therefore, introduce the flag
> > FTRACE_OPS_FL_JMP, which indicates that the ftrace_ops should be called
> > with a "jmp" instead of a "call". For now, it is only used in the
> > direct call case.
> > 
> > When a direct ftrace_ops is marked with FTRACE_OPS_FL_JMP, the last bit
> > of ops->direct_call is set to 1. Therefore, we can tell whether we
> > should use "jmp" for the callback in ftrace_call_replace().
> > 
> 
> nit: Is it guaranteed that the last bit is always 0?

AFAIK, function addresses are 16-byte aligned on x86_64 and
8-byte aligned on arm64, so I believe it is.

In the future, if there is an exception, we can make ftrace_jmp_set()
and ftrace_jmp_get() arch-specific, e.g. with the usual override
pattern sketched below.
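
Something along these lines (untested, just to illustrate the idea;
the generic definitions would stay as the fallback):

/* Generic fallback in include/linux/ftrace.h; an arch that needs a
 * different marker would provide its own inline and add
 * "#define ftrace_jmp_set ftrace_jmp_set" before this point.
 */
#ifndef ftrace_jmp_set
static inline unsigned long ftrace_jmp_set(unsigned long addr)
{
	return addr | 1UL;
}
#endif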

> At least register_ftrace_direct() needs to reject @addr
> if its last bit is already set.

That makes sense. I'll add such a check in the next version.
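
Probably something as simple as this near the top of
register_ftrace_direct(), before the address is stored
(untested sketch):

	/* The low bit of a direct callback address is reused as the
	 * "use jmp" marker, so reject an address that already has it
	 * set.
	 */
	if (ftrace_is_jmp(addr))
		return -EINVAL;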

Thanks!
Menglong Dong

> 
> Thanks,
> 
> 
> > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > ---
> >  include/linux/ftrace.h | 33 +++++++++++++++++++++++++++++++++
> >  kernel/trace/Kconfig   | 12 ++++++++++++
> >  kernel/trace/ftrace.c  |  9 ++++++++-
> >  3 files changed, 53 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index 07f8c309e432..015dd1049bea 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -359,6 +359,7 @@ enum {
> >  	FTRACE_OPS_FL_DIRECT			= BIT(17),
> >  	FTRACE_OPS_FL_SUBOP			= BIT(18),
> >  	FTRACE_OPS_FL_GRAPH			= BIT(19),
> > +	FTRACE_OPS_FL_JMP			= BIT(20),
> >  };
> >  
> >  #ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> > @@ -577,6 +578,38 @@ static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
> >  						 unsigned long addr) { }
> >  #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
> >  
> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_JMP
> > +static inline bool ftrace_is_jmp(unsigned long addr)
> > +{
> > +	return addr & 1;
> > +}
> > +
> > +static inline unsigned long ftrace_jmp_set(unsigned long addr)
> > +{
> > +	return addr | 1UL;
> > +}
> > +
> > +static inline unsigned long ftrace_jmp_get(unsigned long addr)
> > +{
> > +	return addr & ~1UL;
> > +}
> > +#else
> > +static inline bool ftrace_is_jmp(unsigned long addr)
> > +{
> > +	return false;
> > +}
> > +
> > +static inline unsigned long ftrace_jmp_set(unsigned long addr)
> > +{
> > +	return addr;
> > +}
> > +
> > +static inline unsigned long ftrace_jmp_get(unsigned long addr)
> > +{
> > +	return addr;
> > +}
> > +#endif /* CONFIG_DYNAMIC_FTRACE_WITH_JMP */
> > +
> >  #ifdef CONFIG_STACK_TRACER
> >  
> >  int stack_trace_sysctl(const struct ctl_table *table, int write, void *buffer,
> > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> > index d2c79da81e4f..4661b9e606e0 100644
> > --- a/kernel/trace/Kconfig
> > +++ b/kernel/trace/Kconfig
> > @@ -80,6 +80,12 @@ config HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
> >  	  If the architecture generates __patchable_function_entries sections
> >  	  but does not want them included in the ftrace locations.
> >  
> > +config HAVE_DYNAMIC_FTRACE_WITH_JMP
> > +	bool
> > +	help
> > +	  If the architecture supports to replace the __fentry__ with a
> > +	  "jmp" instruction.
> > +
> >  config HAVE_SYSCALL_TRACEPOINTS
> >  	bool
> >  	help
> > @@ -330,6 +336,12 @@ config DYNAMIC_FTRACE_WITH_ARGS
> >  	depends on DYNAMIC_FTRACE
> >  	depends on HAVE_DYNAMIC_FTRACE_WITH_ARGS
> >  
> > +config DYNAMIC_FTRACE_WITH_JMP
> > +	def_bool y
> > +	depends on DYNAMIC_FTRACE
> > +	depends on DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> > +	depends on HAVE_DYNAMIC_FTRACE_WITH_JMP
> > +
> >  config FPROBE
> >  	bool "Kernel Function Probe (fprobe)"
> >  	depends on HAVE_FUNCTION_GRAPH_FREGS && HAVE_FTRACE_GRAPH_FUNC
> > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > index 59cfacb8a5bb..a6c060a4f50b 100644
> > --- a/kernel/trace/ftrace.c
> > +++ b/kernel/trace/ftrace.c
> > @@ -5951,7 +5951,8 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long
> >  	for (i = 0; i < size; i++) {
> >  		hlist_for_each_entry(entry, &hash->buckets[i], hlist) {
> >  			del = __ftrace_lookup_ip(direct_functions, entry->ip);
> > -			if (del && del->direct == addr) {
> > +			if (del && ftrace_jmp_get(del->direct) ==
> > +				   ftrace_jmp_get(addr)) {
> >  				remove_hash_entry(direct_functions, del);
> >  				kfree(del);
> >  			}
> > @@ -6018,6 +6019,9 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> >  
> >  	mutex_lock(&direct_mutex);
> >  
> > +	if (ops->flags & FTRACE_OPS_FL_JMP)
> > +		addr = ftrace_jmp_set(addr);
> > +
> >  	/* Make sure requested entries are not already registered.. */
> >  	size = 1 << hash->size_bits;
> >  	for (i = 0; i < size; i++) {
> > @@ -6138,6 +6142,9 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> >  
> >  	lockdep_assert_held_once(&direct_mutex);
> >  
> > +	if (ops->flags & FTRACE_OPS_FL_JMP)
> > +		addr = ftrace_jmp_set(addr);
> > +
> >  	/* Enable the tmp_ops to have the same functions as the direct ops */
> >  	ftrace_ops_init(&tmp_ops);
> >  	tmp_ops.func_hash = ops->func_hash;
> 
> 
> 





^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode
  2025-11-17  3:49 [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Menglong Dong
                   ` (5 preceding siblings ...)
  2025-11-17  3:49 ` [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline Menglong Dong
@ 2025-11-18  6:31 ` Alexei Starovoitov
  2025-11-18  6:34   ` Menglong Dong
  6 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2025-11-18  6:31 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	John Fastabend, Andrii Nakryiko, Martin KaFai Lau, Eduard,
	Song Liu, Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo,
	Jiri Olsa, Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers,
	jiang.biao, bpf, LKML, linux-trace-kernel

On Sun, Nov 16, 2025 at 7:49 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> For now, the bpf trampoline is called by the "call" instruction. However,
> it break the RSB and introduce extra overhead in x86_64 arch.

Please include performance numbers in the cover letter when you respin.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode
  2025-11-18  6:31 ` [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Alexei Starovoitov
@ 2025-11-18  6:34   ` Menglong Dong
  2025-11-18  6:41     ` Alexei Starovoitov
  0 siblings, 1 reply; 17+ messages in thread
From: Menglong Dong @ 2025-11-18  6:34 UTC (permalink / raw)
  To: Menglong Dong, Alexei Starovoitov
  Cc: Alexei Starovoitov, Steven Rostedt, Daniel Borkmann,
	John Fastabend, Andrii Nakryiko, Martin KaFai Lau, Eduard,
	Song Liu, Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo,
	Jiri Olsa, Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers,
	jiang.biao, bpf, LKML, linux-trace-kernel

On 2025/11/18 14:31, Alexei Starovoitov wrote:
> On Sun, Nov 16, 2025 at 7:49 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > For now, the bpf trampoline is called by the "call" instruction. However,
> > it break the RSB and introduce extra overhead in x86_64 arch.
> 
> Please include performance numbers in the cover letter when you respin.

Hmm... I included a bit of performance data; do you mean more detailed
numbers? The current description is:

As we can see above, the RSB is totally balanced. After the modification,
the performance of fexit increases from 76M/s to 130M/s.

> 
> 





^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode
  2025-11-18  6:34   ` Menglong Dong
@ 2025-11-18  6:41     ` Alexei Starovoitov
  2025-11-18  6:46       ` Menglong Dong
  0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2025-11-18  6:41 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Menglong Dong, Alexei Starovoitov, Steven Rostedt,
	Daniel Borkmann, John Fastabend, Andrii Nakryiko,
	Martin KaFai Lau, Eduard, Song Liu, Yonghong Song, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Masami Hiramatsu,
	Mark Rutland, Mathieu Desnoyers, jiang.biao, bpf, LKML,
	linux-trace-kernel

On Mon, Nov 17, 2025 at 10:34 PM Menglong Dong <menglong.dong@linux.dev> wrote:
>
> On 2025/11/18 14:31, Alexei Starovoitov wrote:
> > On Sun, Nov 16, 2025 at 7:49 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > >
> > > For now, the bpf trampoline is called by the "call" instruction. However,
> > > it break the RSB and introduce extra overhead in x86_64 arch.
> >
> > Please include performance numbers in the cover letter when you respin.
>
> Hmm... I included a bit of performance data; do you mean more detailed
> numbers? The current description is:
>
> As we can see above, the RSB is totally balanced. After the modification,
> the performance of fexit increases from 76M/s to 130M/s.

I saw that. I meant a full comparison with fentry and fmodret.
I suspect fmodret improved as well, right?
And include the command line that you used to measure.
selftests/bpf/bench...
so there is a way to reproduce what the patchset claims.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode
  2025-11-18  6:41     ` Alexei Starovoitov
@ 2025-11-18  6:46       ` Menglong Dong
  0 siblings, 0 replies; 17+ messages in thread
From: Menglong Dong @ 2025-11-18  6:46 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Menglong Dong, Alexei Starovoitov, Steven Rostedt,
	Daniel Borkmann, John Fastabend, Andrii Nakryiko,
	Martin KaFai Lau, Eduard, Song Liu, Yonghong Song, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Masami Hiramatsu,
	Mark Rutland, Mathieu Desnoyers, jiang.biao, bpf, LKML,
	linux-trace-kernel

On Tue, Nov 18, 2025 at 2:41 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Nov 17, 2025 at 10:34 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> >
> > On 2025/11/18 14:31, Alexei Starovoitov wrote:
> > > On Sun, Nov 16, 2025 at 7:49 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > >
> > > > For now, the bpf trampoline is called by the "call" instruction. However,
> > > > it break the RSB and introduce extra overhead in x86_64 arch.
> > >
> > > Please include performance numbers in the cover letter when you respin.
> >
> > Hmm... I included a bit of performance data; do you mean more detailed
> > numbers? The current description is:
> >
> > As we can see above, the RSB is totally balanced. After the modification,
> > the performance of fexit increases from 76M/s to 130M/s.
>
> I saw that. I meant a full comparison with fentry and fmodret.
> I suspect fmodret improved as well, right?
> And include the command line that you used to measure.
> selftests/bpf/bench...
> so there is a way to reproduce what the patchset claims.

I see. "fmodret" improved too, and all the BPF prog that based on
bpf trampoline origin call have a performance improvement.

I'll add the full comparison results in the next version.
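
I'll also spell out the exact invocation in the v3 cover letter; the
numbers come from the trigger benchmarks under
tools/testing/selftests/bpf, roughly along the lines of:

	./bench trig-fentry
	./bench trig-fexit
	./bench trig-fmodret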

Thanks!
Menglong Dong

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-11-18  6:46 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-17  3:49 [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Menglong Dong
2025-11-17  3:49 ` [PATCH bpf-next v2 1/6] ftrace: introduce FTRACE_OPS_FL_JMP Menglong Dong
2025-11-18  5:19   ` Masami Hiramatsu
2025-11-18  6:14     ` Menglong Dong
2025-11-17  3:49 ` [PATCH bpf-next v2 2/6] x86/ftrace: implement DYNAMIC_FTRACE_WITH_JMP Menglong Dong
2025-11-17  3:49 ` [PATCH bpf-next v2 3/6] bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME Menglong Dong
2025-11-17  3:49 ` [PATCH bpf-next v2 4/6] bpf,x86: adjust the "jmp" mode for bpf trampoline Menglong Dong
2025-11-17  3:49 ` [PATCH bpf-next v2 5/6] bpf: specify the old and new poke_type for bpf_arch_text_poke Menglong Dong
2025-11-17 20:55   ` kernel test robot
2025-11-17  3:49 ` [PATCH bpf-next v2 6/6] bpf: implement "jmp" mode for trampoline Menglong Dong
2025-11-17 22:49   ` kernel test robot
2025-11-18  1:20   ` Menglong Dong
2025-11-18  5:09   ` kernel test robot
2025-11-18  6:31 ` [PATCH bpf-next v2 0/6] bpf trampoline support "jmp" mode Alexei Starovoitov
2025-11-18  6:34   ` Menglong Dong
2025-11-18  6:41     ` Alexei Starovoitov
2025-11-18  6:46       ` Menglong Dong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).