linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next v3 0/6] bpf trampoline support "jmp" mode
@ 2025-11-18 12:36 Menglong Dong
  2025-11-18 12:36 ` [PATCH bpf-next v3 1/6] ftrace: introduce FTRACE_OPS_FL_JMP Menglong Dong
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread
From: Menglong Dong @ 2025-11-18 12:36 UTC (permalink / raw)
  To: ast, rostedt
  Cc: daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mhiramat,
	mark.rutland, mathieu.desnoyers, jiang.biao, bpf, linux-kernel,
	linux-trace-kernel

For now, the bpf trampoline is called by the "call" instruction. However,
it break the RSB and introduce extra overhead in x86_64 arch.

For example, we hook the function "foo" with fexit, the call and return
logic will be like this:
  call foo -> call trampoline -> call foo-body ->
  return foo-body -> return foo

As we can see above, there are 3 call, but 2 return, which break the RSB
balance. We can pseudo a "return" here, but it's not the best choice,
as it will still cause once RSB miss:
  call foo -> call trampoline -> call foo-body ->
  return foo-body -> return dummy -> return foo

The "return dummy" doesn't pair the "call trampoline", which can also
cause the RSB miss.

Therefore, we introduce the "jmp" mode for bpf trampoline, as advised by
Alexei in [1]. And the logic will become this:
  call foo -> jmp trampoline -> call foo-body ->
  return foo-body -> return foo

As we can see above, the RSB is totally balanced after this series.

In this series, we introduce the FTRACE_OPS_FL_JMP for ftrace to make it
use the "jmp" instruction instead of "call".

And we also do some adjustment to bpf_arch_text_poke() to allow us specify
the old and new poke_type.

For the BPF_TRAMP_F_SHARE_IPMODIFY case, we will fallback to the "call"
mode, as it need to get the function address from the stack, which is not
supported in "jmp" mode.

Before this series, we have the following performance with the bpf
benchmark:

  $ cd tools/testing/selftests/bpf
  $ ./benchs/run_bench_trigger.sh
  usermode-count :  890.171 ± 1.522M/s
  kernel-count   :  409.184 ± 0.330M/s
  syscall-count  :   26.792 ± 0.010M/s
  fentry         :  171.242 ± 0.322M/s
  fexit          :   80.544 ± 0.045M/s
  fmodret        :   78.301 ± 0.065M/s
  rawtp          :  192.906 ± 0.900M/s
  tp             :   81.883 ± 0.209M/s
  kprobe         :   52.029 ± 0.113M/s
  kprobe-multi   :   62.237 ± 0.060M/s
  kprobe-multi-all:    4.761 ± 0.014M/s
  kretprobe      :   23.779 ± 0.046M/s
  kretprobe-multi:   29.134 ± 0.012M/s
  kretprobe-multi-all:    3.822 ± 0.003M/

And after this series, we have the following performance:

  usermode-count :  890.443 ± 0.307M/s
  kernel-count   :  416.139 ± 0.055M/s
  syscall-count  :   31.037 ± 0.813M/s
  fentry         :  169.549 ± 0.519M/s
  fexit          :  136.540 ± 0.518M/s
  fmodret        :  159.248 ± 0.188M/s
  rawtp          :  194.475 ± 0.144M/s
  tp             :   84.505 ± 0.041M/s
  kprobe         :   59.951 ± 0.071M/s
  kprobe-multi   :   63.153 ± 0.177M/s
  kprobe-multi-all:    4.699 ± 0.012M/s
  kretprobe      :   23.740 ± 0.015M/s
  kretprobe-multi:   29.301 ± 0.022M/s
  kretprobe-multi-all:    3.869 ± 0.005M/s

As we can see above, the performance of fexit increase from 80.544M/s to
136.540M/s, and the "fmodret" increase from 78.301M/s to 159.248M/s.

Link: https://lore.kernel.org/bpf/20251117034906.32036-1-dongml2@chinatelecom.cn/
Changes since v2:
* reject if the addr is already "jmp" in register_ftrace_direct() and
  __modify_ftrace_direct() in the 1st patch.
* fix compile error in powerpc in the 5th patch.
* changes in the 6th patch:
  - fix the compile error by wrapping the write to tr->fops->flags with
    CONFIG_DYNAMIC_FTRACE_WITH_JMP
  - reset BPF_TRAMP_F_SKIP_FRAME when the second try of modify_fentry in
    bpf_trampoline_update()

Link: https://lore.kernel.org/bpf/20251114092450.172024-1-dongml2@chinatelecom.cn/
Changes since v1:
* change the bool parameter that we add to save_args() to "u32 flags"
* rename bpf_trampoline_need_jmp() to bpf_trampoline_use_jmp()
* add new function parameter to bpf_arch_text_poke instead of introduce
  bpf_arch_text_poke_type()
* rename bpf_text_poke to bpf_trampoline_update_fentry
* remove the BPF_TRAMP_F_JMPED and check the current mode with the origin
  flags instead.

Link: https://lore.kernel.org/bpf/CAADnVQLX54sVi1oaHrkSiLqjJaJdm3TQjoVrgU-LZimK6iDcSA@mail.gmail.com/[1]
Menglong Dong (6):
  ftrace: introduce FTRACE_OPS_FL_JMP
  x86/ftrace: implement DYNAMIC_FTRACE_WITH_JMP
  bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME
  bpf,x86: adjust the "jmp" mode for bpf trampoline
  bpf: specify the old and new poke_type for bpf_arch_text_poke
  bpf: implement "jmp" mode for trampoline

 arch/arm64/net/bpf_jit_comp.c   | 14 +++---
 arch/loongarch/net/bpf_jit.c    |  9 ++--
 arch/powerpc/net/bpf_jit_comp.c | 10 +++--
 arch/riscv/net/bpf_jit_comp64.c | 11 +++--
 arch/s390/net/bpf_jit_comp.c    |  7 +--
 arch/x86/Kconfig                |  1 +
 arch/x86/kernel/ftrace.c        |  7 ++-
 arch/x86/kernel/ftrace_64.S     | 12 +++++-
 arch/x86/net/bpf_jit_comp.c     | 55 ++++++++++++++----------
 include/linux/bpf.h             | 18 +++++++-
 include/linux/ftrace.h          | 33 ++++++++++++++
 kernel/bpf/core.c               |  5 ++-
 kernel/bpf/trampoline.c         | 76 ++++++++++++++++++++++++++-------
 kernel/trace/Kconfig            | 12 ++++++
 kernel/trace/ftrace.c           | 14 +++++-
 15 files changed, 219 insertions(+), 65 deletions(-)

-- 
2.51.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2025-11-24 18:00 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-18 12:36 [PATCH bpf-next v3 0/6] bpf trampoline support "jmp" mode Menglong Dong
2025-11-18 12:36 ` [PATCH bpf-next v3 1/6] ftrace: introduce FTRACE_OPS_FL_JMP Menglong Dong
2025-11-18 13:25   ` bot+bpf-ci
2025-11-18 13:51     ` Steven Rostedt
2025-11-18 12:36 ` [PATCH bpf-next v3 2/6] x86/ftrace: implement DYNAMIC_FTRACE_WITH_JMP Menglong Dong
2025-11-18 22:01   ` Jiri Olsa
2025-11-19  1:05     ` Menglong Dong
2025-11-18 12:36 ` [PATCH bpf-next v3 3/6] bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME Menglong Dong
2025-11-18 12:36 ` [PATCH bpf-next v3 4/6] bpf,x86: adjust the "jmp" mode for bpf trampoline Menglong Dong
2025-11-18 12:36 ` [PATCH bpf-next v3 5/6] bpf: specify the old and new poke_type for bpf_arch_text_poke Menglong Dong
2025-11-18 12:36 ` [PATCH bpf-next v3 6/6] bpf: implement "jmp" mode for trampoline Menglong Dong
2025-11-19  0:59   ` Alexei Starovoitov
2025-11-19  1:03     ` Steven Rostedt
2025-11-22  2:37       ` Alexei Starovoitov
2025-11-24 14:50         ` Steven Rostedt
2025-11-19  0:28 ` [PATCH bpf-next v3 0/6] bpf trampoline support "jmp" mode Alexei Starovoitov
2025-11-19  2:47   ` Menglong Dong
2025-11-19  2:55     ` Leon Hwang
2025-11-19 12:36       ` Xu Kuohai
2025-11-20  2:07         ` Leon Hwang
2025-11-20  3:24           ` Xu Kuohai
2025-11-24 18:00 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).