linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv2 bpf-next 0/9] ftrace,bpf: Use single direct ops for bpf trampolines
@ 2025-11-13 12:37 Jiri Olsa
  2025-11-13 12:37 ` [PATCHv2 bpf-next 1/8] ftrace: Make alloc_and_copy_ftrace_hash direct friendly Jiri Olsa
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Jiri Olsa @ 2025-11-13 12:37 UTC (permalink / raw)
  To: Steven Rostedt, Florent Revest, Mark Rutland
  Cc: bpf, linux-kernel, linux-trace-kernel, linux-arm-kernel,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Menglong Dong, Song Liu

hi,
while poking the multi-tracing interface I ended up with just one ftrace_ops
object to attach all trampolines.

This change allows to use less direct API calls during the attachment changes
in the future code, so in effect speeding up the attachment.

In current code we get a speed up from using just a single ftrace_ops object.

- with current code:

  Performance counter stats for 'bpftrace -e fentry:vmlinux:ksys_* {} -c true':

     6,364,157,902      cycles:k
       828,728,902      cycles:u
     1,064,803,824      instructions:u                   #    1.28  insn per cycle
    23,797,500,067      instructions:k                   #    3.74  insn per cycle

       4.416004987 seconds time elapsed

       0.164121000 seconds user
       1.289550000 seconds sys


- with the fix:

   Performance counter stats for 'bpftrace -e fentry:vmlinux:ksys_* {} -c true':

     6,535,857,905      cycles:k
       810,809,429      cycles:u
     1,064,594,027      instructions:u                   #    1.31  insn per cycle
    23,962,552,894      instructions:k                   #    3.67  insn per cycle

       1.666961239 seconds time elapsed

       0.157412000 seconds user
       1.283396000 seconds sys



The speedup seems to be related to the fact that with single ftrace_ops object
we don't call ftrace_shutdown anymore (we use ftrace_update_ops instead) and
we skip the synchronize rcu calls (each ~100ms) at the end of that function.


rfc: https://lore.kernel.org/bpf/20250729102813.1531457-1-jolsa@kernel.org/
v1:  https://lore.kernel.org/bpf/20250923215147.1571952-1-jolsa@kernel.org/

v2 changes:
- rebased on top fo bpf-next/master plus Song's livepatch fixes [1] 
- renamed the API functions [2] [Steven]
- do not export the new api [Steven]
- kept the original direct interface:

  I'm not sure if we want to melt both *_ftrace_direct and the new interface
  into single one. It's bit different in semantic (hence the name change as
  Steven suggested [2]) and I don't think the changes are not that big so
  we could easily keep both APIs.

v1 changes:
- make the change x86 specific, after discussing with Mark options for
  arm64 [Mark]

thanks,
jirka


[1] https://lore.kernel.org/bpf/20251027175023.1521602-1-song@kernel.org/
[2] https://lore.kernel.org/bpf/20250924050415.4aefcb91@batman.local.home/
---
Jiri Olsa (8):
      ftrace: Make alloc_and_copy_ftrace_hash direct friendly
      ftrace: Export some of hash related functions
      ftrace: Add update_ftrace_direct_add function
      ftrace: Add update_ftrace_direct_del function
      ftrace: Add update_ftrace_direct_mod function
      bpf: Add trampoline ip hash table
      ftrace: Factor ftrace_ops ops_func interface
      bpf, x86: Use single ftrace_ops for direct calls

 arch/x86/Kconfig        |   1 +
 include/linux/bpf.h     |   7 ++-
 include/linux/ftrace.h  |  37 ++++++++++++++-
 kernel/bpf/trampoline.c | 199 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------
 kernel/trace/Kconfig    |   3 ++
 kernel/trace/ftrace.c   | 326 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 kernel/trace/trace.h    |   8 ----
 7 files changed, 532 insertions(+), 49 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-11-13 17:57 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-13 12:37 [PATCHv2 bpf-next 0/9] ftrace,bpf: Use single direct ops for bpf trampolines Jiri Olsa
2025-11-13 12:37 ` [PATCHv2 bpf-next 1/8] ftrace: Make alloc_and_copy_ftrace_hash direct friendly Jiri Olsa
2025-11-13 12:37 ` [PATCHv2 bpf-next 2/8] ftrace: Export some of hash related functions Jiri Olsa
2025-11-13 12:37 ` [PATCHv2 bpf-next 3/8] ftrace: Add update_ftrace_direct_add function Jiri Olsa
2025-11-13 13:02   ` bot+bpf-ci
2025-11-13 15:59     ` Jiri Olsa
2025-11-13 12:37 ` [PATCHv2 bpf-next 4/8] ftrace: Add update_ftrace_direct_del function Jiri Olsa
2025-11-13 13:02   ` bot+bpf-ci
2025-11-13 16:00     ` Jiri Olsa
2025-11-13 12:37 ` [PATCHv2 bpf-next 5/8] ftrace: Add update_ftrace_direct_mod function Jiri Olsa
2025-11-13 13:02   ` bot+bpf-ci
2025-11-13 16:00     ` Jiri Olsa
2025-11-13 17:57       ` Alexei Starovoitov
2025-11-13 12:37 ` [PATCHv2 bpf-next 6/8] bpf: Add trampoline ip hash table Jiri Olsa
2025-11-13 12:37 ` [PATCHv2 bpf-next 7/8] ftrace: Factor ftrace_ops ops_func interface Jiri Olsa
2025-11-13 12:37 ` [PATCHv2 bpf-next 8/8] bpf, x86: Use single ftrace_ops for direct calls Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).