From: Jiri Olsa <jolsa@kernel.org>
To: Oleg Nesterov <oleg@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
Hao Luo <haoluo@google.com>, Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Alan Maguire <alan.maguire@oracle.com>,
linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Subject: [RFC 00/11] uprobes: Add support to optimize usdt probes on x86_64
Date: Tue, 5 Nov 2024 14:33:54 +0100
Message-ID: <20241105133405.2703607-1-jolsa@kernel.org>
hi,
this patchset adds support for optimizing usdt probes that are placed
on top of a 5-byte nop instruction.
The generic approach (optimizing all uprobes) is hard, because it may
require emulating multiple original instructions and dealing with the
related issues. The usdt case, which stores a 5-byte nop, seems much
easier, so I'm starting with that.
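To make the "5-byte nop" concrete, here is a minimal sketch that checks
whether a probe site holds the standard x86_64 5-byte nop
(nopl 0x0(%rax,%rax,1), bytes 0f 1f 44 00 00). The helper name is_nop5
and its signature are made up for illustration and are not taken from
the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* x86_64 5-byte nop: nopl 0x0(%rax,%rax,1) */
static const unsigned char nop5[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Hypothetical helper (not from the patchset): return true if the
 * buffer starts with the 5-byte nop. len is the number of valid
 * bytes available at insn.
 */
static bool is_nop5(const unsigned char *insn, size_t len)
{
	assert(insn != NULL);
	return len >= sizeof(nop5) && memcmp(insn, nop5, sizeof(nop5)) == 0;
}
```

A site filled with five single-byte nops (0x90) would not match, since
that is five instructions rather than one.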
The basic idea is to replace the breakpoint exception with a syscall,
which is faster on x86_64. For more details please see the changelog
of patch 7.
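As a rough illustration only: one plausible way to get from a 5-byte
nop to a syscall is to overwrite the nop with a 5-byte "call rel32"
that jumps to a trampoline issuing the syscall. That reading is an
assumption based on the patch titles, not something stated in this
cover letter, and the helper below (encode_call_rel32) is hypothetical.
It only shows the standard x86_64 encoding of a near call and the
+-2GB reach limit that comes with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CALL_INSN_SIZE	5
#define CALL_OPCODE	0xe8

/*
 * Hypothetical helper (not from the patchset): encode "call rel32"
 * for a call located at address site that targets address tramp.
 * Returns false if tramp is not reachable with a 32-bit displacement.
 */
static bool encode_call_rel32(uint64_t site, uint64_t tramp,
			      unsigned char out[CALL_INSN_SIZE])
{
	/* The displacement is relative to the end of the call insn. */
	int64_t delta = (int64_t)(tramp - (site + CALL_INSN_SIZE));
	uint32_t rel32;

	assert(out != NULL);
	if (delta < INT32_MIN || delta > INT32_MAX)
		return false;

	rel32 = (uint32_t)(int32_t)delta;
	out[0] = CALL_OPCODE;
	/* The immediate is stored little-endian. */
	out[1] = rel32 & 0xff;
	out[2] = (rel32 >> 8) & 0xff;
	out[3] = (rel32 >> 16) & 0xff;
	out[4] = (rel32 >> 24) & 0xff;
	return true;
}
```

The range check is why a trampoline would have to be mapped within 2GB
of the probe site for this kind of rewrite to work.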
The first benchmark shows about a 68% speedup, from 4.893M/s to
8.249M/s (see below). The benchmark triggers a usdt probe in a loop
and counts how many hits happen per second.
It's still in RFC state with some loose ends, but I'd be interested in
any feedback on the direction.
It's based on tip/perf/core with bpf-next/master merged on top,
together with the uprobe session patchset.
thanks,
jirka
current:
# ./bench -w2 -d5 -a trig-usdt
Setting up benchmark 'trig-usdt'...
Benchmark 'trig-usdt' started.
Iter 0 ( 46.982us): hits 4.893M/s ( 4.893M/prod), drops 0.000M/s, total operations 4.893M/s
Iter 1 ( -5.967us): hits 4.892M/s ( 4.892M/prod), drops 0.000M/s, total operations 4.892M/s
Iter 2 ( -2.771us): hits 4.899M/s ( 4.899M/prod), drops 0.000M/s, total operations 4.899M/s
Iter 3 ( 1.286us): hits 4.889M/s ( 4.889M/prod), drops 0.000M/s, total operations 4.889M/s
Iter 4 ( -2.871us): hits 4.881M/s ( 4.881M/prod), drops 0.000M/s, total operations 4.881M/s
Iter 5 ( 1.005us): hits 4.886M/s ( 4.886M/prod), drops 0.000M/s, total operations 4.886M/s
Iter 6 ( 11.626us): hits 4.906M/s ( 4.906M/prod), drops 0.000M/s, total operations 4.906M/s
Iter 7 ( -6.638us): hits 4.896M/s ( 4.896M/prod), drops 0.000M/s, total operations 4.896M/s
Summary: hits 4.893 +- 0.009M/s ( 4.893M/prod), drops 0.000 +- 0.000M/s, total operations 4.893 +- 0.009M/s
optimized:
# ./bench -w2 -d5 -a trig-usdt
Setting up benchmark 'trig-usdt'...
Benchmark 'trig-usdt' started.
Iter 0 ( 46.073us): hits 8.258M/s ( 8.258M/prod), drops 0.000M/s, total operations 8.258M/s
Iter 1 ( -5.752us): hits 8.264M/s ( 8.264M/prod), drops 0.000M/s, total operations 8.264M/s
Iter 2 ( -1.333us): hits 8.263M/s ( 8.263M/prod), drops 0.000M/s, total operations 8.263M/s
Iter 3 ( -2.996us): hits 8.265M/s ( 8.265M/prod), drops 0.000M/s, total operations 8.265M/s
Iter 4 ( -0.620us): hits 8.264M/s ( 8.264M/prod), drops 0.000M/s, total operations 8.264M/s
Iter 5 ( -2.624us): hits 8.236M/s ( 8.236M/prod), drops 0.000M/s, total operations 8.236M/s
Iter 6 ( -0.840us): hits 8.232M/s ( 8.232M/prod), drops 0.000M/s, total operations 8.232M/s
Iter 7 ( -1.783us): hits 8.235M/s ( 8.235M/prod), drops 0.000M/s, total operations 8.235M/s
Summary: hits 8.249 +- 0.016M/s ( 8.249M/prod), drops 0.000 +- 0.000M/s, total operations 8.249 +- 0.016M/s
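For reference, the speedup can be recomputed from the two Summary
lines above. The helpers below are illustrative only (the names are
made up); they turn the reported M/s figures into a relative speedup
and an approximate per-hit cost.

```c
#include <assert.h>

/* Relative speedup in percent when throughput goes from before to after. */
static double speedup_percent(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* Average cost of one hit in nanoseconds, given millions of hits per second. */
static double ns_per_hit(double mhits_per_sec)
{
	assert(mhits_per_sec > 0.0);
	return 1000.0 / mhits_per_sec;
}
```

With the numbers above, speedup_percent(4.893, 8.249) comes out at
roughly 68.6%, and the per-hit cost drops from about 204ns to about
121ns.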
---
Jiri Olsa (11):
uprobes: Rename arch_uretprobe_trampoline function
uprobes: Make copy_from_page global
uprobes: Add len argument to uprobe_write_opcode
uprobes: Add data argument to uprobe_write_opcode function
uprobes: Add mapping for optimized uprobe trampolines
uprobes: Add uprobe syscall to speed up uprobe
uprobes/x86: Add support to optimize uprobes
selftests/bpf: Use 5-byte nop for x86 usdt probes
selftests/bpf: Add usdt trigger bench
selftests/bpf: Add uprobe/usdt optimized test
selftests/bpf: Add hit/attach/detach race optimized uprobe test
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/x86/include/asm/uprobes.h | 7 +++
arch/x86/kernel/uprobes.c | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
include/linux/syscalls.h | 2 +
include/linux/uprobes.h | 25 +++++++++-
kernel/events/uprobes.c | 222 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
kernel/fork.c | 2 +
kernel/sys_ni.c | 1 +
tools/testing/selftests/bpf/bench.c | 2 +
tools/testing/selftests/bpf/benchs/bench_trigger.c | 45 +++++++++++++++++
tools/testing/selftests/bpf/prog_tests/uprobe_optimized.c | 252 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
tools/testing/selftests/bpf/progs/trigger_bench.c | 10 +++-
tools/testing/selftests/bpf/progs/uprobe_optimized.c | 29 +++++++++++
tools/testing/selftests/bpf/sdt.h | 9 +++-
14 files changed, 768 insertions(+), 19 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/uprobe_optimized.c
create mode 100644 tools/testing/selftests/bpf/progs/uprobe_optimized.c