From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
To: Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Shuah Khan <shuah@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org, bpf@vger.kernel.org
Subject: [RFC PATCH 0/4] tracing/probes: Optimize fetcharg with BPF
Date: Wed, 1 Jul 2026 22:45:22 +0900
Message-ID: <178291352217.1566898.14481561093843379745.stgit@devnote2>

Hi,

I investigated the feasibility of optimizing `fetcharg` in probe events
by converting it to BPF. The result looks promising: it can reduce the
overhead by about 30% (and maybe more if we have more than 3 arguments).

I did not expect such a big difference, because I guessed that the major
source of overhead is unsafe pointer dereferencing (e.g.
copy_from_kernel_nofault()). In fact, without CONFIG_BPF_JIT the overhead
is more than double. But with the JIT compiler it shows better performance.

The basic concept is quite simple. The process remains the same up until
the point where user input is converted into `fetcharg` code. Some of the
fundamental `fetcharg` operations can then be converted into an equivalent
sequence of BPF instructions. This creates a single `bpf_prog` for each
probe event (rather than one per argument). The program executes within
the event handler, reads `pt_regs` directly, and stores the results in the
ftrace ring buffer, just as `fetcharg` does.
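
To make the "one program per event" idea concrete, here is a toy model in
Python. It is only a sketch: the opcode names ("reg", "deref", "store"), the
accumulator design, and the fake register and memory dictionaries are all
assumptions made for illustration. They are not the kernel's FETCH_OP codes,
not real BPF instructions, and not the code in this series.

```python
# Toy model only. The opcode names ("reg", "deref", "store") are hypothetical;
# they are not the kernel's FETCH_OP codes or real BPF instructions.

def compile_event(args):
    """Flatten per-argument op lists into one program for the whole event."""
    program = []
    for slot, ops in enumerate(args):
        program.extend(ops)
        program.append(("store", slot))  # write the accumulator to output slot
    return program


def run(program, regs, memory, nslots):
    """Interpret the flattened program against fake registers and memory."""
    out = [None] * nslots
    acc = 0
    for op, operand in program:
        if op == "reg":      # load a register value into the accumulator
            acc = regs[operand]
        elif op == "deref":  # read memory at accumulator + offset
            acc = memory[acc + operand]
        elif op == "store":  # emit the accumulator as one event field
            out[operand] = acc
        else:
            raise ValueError(f"unknown op: {op}")
    return out


regs = {"di": 0x1000, "si": 7}      # stand-in for pt_regs
memory = {0x1008: 42}               # stand-in for kernel memory
args = [
    [("reg", "si")],                # roughly a "%si" fetcharg
    [("reg", "di"), ("deref", 8)],  # roughly a "+8(%di)" fetcharg
]

program = compile_event(args)
fields = run(program, regs, memory, len(args))
print(len(program), fields)  # prints: 5 [7, 42]
```

The point of the sketch is only the shape: two arguments become one flat
program of five steps that runs once per event hit, instead of one
interpreter pass per argument.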

Here are the benchmark results, measured on QEMU (KVM) on an Intel Core
i7-8565U.

When enabling BPF with JIT:
--------------------------------------------------------------------------------
Configuration   0 Fetchargs   1 Fetcharg  2 Fetchargs  3 Fetchargs
--------------------------------------------------------------------------------
Baseline          298882359            -            -            -  loops/sec
                          -            -            -            -  overhead
Kprobe              9740841      8664195      7944956      7608274  loops/sec
                   99.31 ns     12.76 ns     23.21 ns     28.78 ns  overhead
Fprobe             10827749      9220918      7992512      7683757  loops/sec
                   89.01 ns     16.09 ns     32.76 ns     37.79 ns  overhead
Eprobe              6746389      6245994      5319037      4845406  loops/sec
                  144.88 ns     11.88 ns     39.78 ns     58.15 ns  overhead
--------------------------------------------------------------------------------
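
The "overhead" rows are not defined explicitly in this message. The figures
in the table above are consistent with the following reading, which is an
inference from the numbers rather than something taken from the benchmark
script: the 0-fetcharg overhead is the per-loop time added relative to the
baseline, and the 1 to 3 fetcharg overheads are the per-loop time added
relative to the same probe with 0 fetchargs. The helper names below are
made up for this sketch.

```python
NS_PER_SEC = 1e9


def ns_per_loop(loops_per_sec):
    """Average duration of one loop iteration in nanoseconds."""
    return NS_PER_SEC / loops_per_sec


def overhead_ns(loops_per_sec, reference_loops_per_sec):
    """Extra time per loop relative to a reference run, in nanoseconds."""
    return ns_per_loop(loops_per_sec) - ns_per_loop(reference_loops_per_sec)


# loops/sec values copied from the BPF+JIT table.
baseline = 298882359
kprobe = [9740841, 8664195, 7944956, 7608274]  # 0, 1, 2, 3 fetchargs

# Cost of the probe itself: compared against the unprobed baseline.
probe_cost = overhead_ns(kprobe[0], baseline)

# Cost of the fetchargs: compared against the same probe with 0 fetchargs.
fetcharg_costs = [overhead_ns(v, kprobe[0]) for v in kprobe[1:]]

print(f"probe: {probe_cost:.2f} ns")
print("fetchargs: " + ", ".join(f"{c:.2f} ns" for c in fetcharg_costs))
```

Under that reading, the computed values reproduce the Kprobe overhead row
(99.31, 12.76, 23.21 and 28.78 ns).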

When enabling BPF without JIT:
--------------------------------------------------------------------------------
Configuration   0 Fetchargs   1 Fetcharg  2 Fetchargs  3 Fetchargs
--------------------------------------------------------------------------------
Baseline           84067374            -            -            -  loops/sec
                          -            -            -            -  overhead
Kprobe              7092949      5834913      3848776      3443408  loops/sec
                  129.09 ns     30.40 ns    118.84 ns    149.42 ns  overhead
Fprobe              9426302      6441734      4350313      3710814  loops/sec
                   94.19 ns     49.15 ns    123.78 ns    163.40 ns  overhead
Eprobe              5681716      4958113      3940999      3953434  loops/sec
                  164.11 ns     25.69 ns     77.74 ns     76.94 ns  overhead
--------------------------------------------------------------------------------

When disabling BPF (legacy fetcharg):
--------------------------------------------------------------------------------
Configuration   0 Fetchargs   1 Fetcharg  2 Fetchargs  3 Fetchargs
--------------------------------------------------------------------------------
Baseline          245433525            -            -            -  loops/sec
                          -            -            -            -  overhead
Kprobe              9055348      8488351      7219595      6453928  loops/sec
                  106.36 ns      7.38 ns     28.08 ns     44.51 ns  overhead
Fprobe             10859326      9288801      7492518      6607046  loops/sec
                   88.01 ns     15.57 ns     41.38 ns     59.27 ns  overhead
Eprobe              6987128      5114526      5055084      4803759  loops/sec
                  139.05 ns     52.40 ns     54.70 ns     65.05 ns  overhead
--------------------------------------------------------------------------------

The numbers are still unstable (because of a problem in the benchmark), but
the trend shows that BPF+JIT is the winner.
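
As a rough cross-check of that trend, the relative reduction can be computed
from the 3-fetcharg overhead figures in the tables above. This is only a
sketch: the choice of the 3-fetcharg column is an assumption (the message
does not say which column the "about 30%" figure refers to), and the helper
name is invented here.

```python
# Overhead in ns with 3 fetchargs, copied from the tables above.
legacy = {"Kprobe": 44.51, "Fprobe": 59.27, "Eprobe": 65.05}
bpf_jit = {"Kprobe": 28.78, "Fprobe": 37.79, "Eprobe": 58.15}


def reduction_percent(before, after):
    """Relative reduction of `after` compared with `before`, in percent."""
    return (before - after) / before * 100.0


reductions = {
    name: reduction_percent(legacy[name], bpf_jit[name]) for name in legacy
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% less fetcharg overhead with BPF+JIT")
```

With these inputs the reduction is about 35% for Kprobe, 36% for Fprobe and
11% for Eprobe. Since the measurements are still unstable, these should be
read as indicative only.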

TODOs:
- Add a new Kconfig option which depends on CONFIG_BPF_JIT=y.
- Ensure that, even if a single dereference operation fails, processing
  of subsequent arguments continues.
- Allow mixing with unsupported FETCH_OPs on the same event.

Thank you,

---
base-commit: c0c56fe6fb52cfb28419242cfa6235125f818f94

Masami Hiramatsu (Google) (4):
      tools/tracing: Add fetcharg performance micro-benchmark
      tracing/probes: Compile all fetchargs into a single BPF program per event
      tracing: Add disable_bpf trace option to ignore eBPF for fetchargs
      selftests/ftrace: Add a test for eBPF compiled fetchargs

 kernel/trace/trace.c                             |    7 +
 kernel/trace/trace.h                             |    8 +
 kernel/trace/trace_probe.c                       |  249 ++++++++++++++++++++
 kernel/trace/trace_probe.h                       |   15 +
 kernel/trace/trace_probe_tmpl.h                  |   13 +
 .../ftrace/test.d/dynevent/test_bpf_fetchargs.tc |   51 ++++
 tools/tracing/benchmark/Kbuild                   |    3
 tools/tracing/benchmark/Makefile                 |   12 +
 tools/tracing/benchmark/bench_fetcharg.sh        |  195 ++++++++++++++++
 tools/tracing/benchmark/fetcharg_bench.c         |   98 ++++++++
 tools/tracing/benchmark/fetcharg_bench_trace.h   |   37 +++
 11 files changed, 684 insertions(+), 4 deletions(-)
 create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/test_bpf_fetchargs.tc
 create mode 100644 tools/tracing/benchmark/Kbuild
 create mode 100644 tools/tracing/benchmark/Makefile
 create mode 100755 tools/tracing/benchmark/bench_fetcharg.sh
 create mode 100644 tools/tracing/benchmark/fetcharg_bench.c
 create mode 100644 tools/tracing/benchmark/fetcharg_bench_trace.h

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>