* [PATCHv3 perf/core 2/2] selftests/bpf: Add 5-byte nop uprobe trigger bench
2025-04-14 8:36 [PATCHv3 perf/core 1/2] uprobes/x86: Add support to emulate nop instructions Jiri Olsa
@ 2025-04-14 8:36 ` Jiri Olsa
2025-04-14 16:13 ` Andrii Nakryiko
2025-04-14 12:55 ` [PATCHv3 perf/core 1/2] uprobes/x86: Add support to emulate nop instructions Oleg Nesterov
2025-04-14 16:13 ` Andrii Nakryiko
2 siblings, 1 reply; 6+ messages in thread
From: Jiri Olsa @ 2025-04-14 8:36 UTC
To: Oleg Nesterov, Peter Zijlstra, Ingo Molnar, Andrii Nakryiko
Cc: bpf, linux-kernel, linux-trace-kernel, x86, Song Liu,
Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt,
Masami Hiramatsu, Alan Maguire
Add an x86_64-specific uprobe trigger benchmark that measures
uprobes and uretprobes attached on top of a 5-byte nop (nop5) instruction.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/testing/selftests/bpf/bench.c | 12 ++++++
.../selftests/bpf/benchs/bench_trigger.c | 42 +++++++++++++++++++
.../selftests/bpf/benchs/run_bench_uprobes.sh | 2 +-
3 files changed, 55 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index 1bd403a5ef7b..0fd8c9b0d38f 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -526,6 +526,12 @@ extern const struct bench bench_trig_uprobe_multi_push;
extern const struct bench bench_trig_uretprobe_multi_push;
extern const struct bench bench_trig_uprobe_multi_ret;
extern const struct bench bench_trig_uretprobe_multi_ret;
+#ifdef __x86_64__
+extern const struct bench bench_trig_uprobe_nop5;
+extern const struct bench bench_trig_uretprobe_nop5;
+extern const struct bench bench_trig_uprobe_multi_nop5;
+extern const struct bench bench_trig_uretprobe_multi_nop5;
+#endif
extern const struct bench bench_rb_libbpf;
extern const struct bench bench_rb_custom;
@@ -586,6 +592,12 @@ static const struct bench *benchs[] = {
&bench_trig_uretprobe_multi_push,
&bench_trig_uprobe_multi_ret,
&bench_trig_uretprobe_multi_ret,
+#ifdef __x86_64__
+ &bench_trig_uprobe_nop5,
+ &bench_trig_uretprobe_nop5,
+ &bench_trig_uprobe_multi_nop5,
+ &bench_trig_uretprobe_multi_nop5,
+#endif
/* ringbuf/perfbuf benchmarks */
&bench_rb_libbpf,
&bench_rb_custom,
diff --git a/tools/testing/selftests/bpf/benchs/bench_trigger.c b/tools/testing/selftests/bpf/benchs/bench_trigger.c
index 32e9f194d449..82327657846e 100644
--- a/tools/testing/selftests/bpf/benchs/bench_trigger.c
+++ b/tools/testing/selftests/bpf/benchs/bench_trigger.c
@@ -333,6 +333,20 @@ static void *uprobe_producer_ret(void *input)
return NULL;
}
+#ifdef __x86_64__
+__nocf_check __weak void uprobe_target_nop5(void)
+{
+ asm volatile (".byte 0x0f, 0x1f, 0x44, 0x00, 0x00");
+}
+
+static void *uprobe_producer_nop5(void *input)
+{
+ while (true)
+ uprobe_target_nop5();
+ return NULL;
+}
+#endif
+
static void usetup(bool use_retprobe, bool use_multi, void *target_addr)
{
size_t uprobe_offset;
@@ -448,6 +462,28 @@ static void uretprobe_multi_ret_setup(void)
usetup(true, true /* use_multi */, &uprobe_target_ret);
}
+#ifdef __x86_64__
+static void uprobe_nop5_setup(void)
+{
+ usetup(false, false /* !use_multi */, &uprobe_target_nop5);
+}
+
+static void uretprobe_nop5_setup(void)
+{
+ usetup(true, false /* !use_multi */, &uprobe_target_nop5);
+}
+
+static void uprobe_multi_nop5_setup(void)
+{
+ usetup(false, true /* use_multi */, &uprobe_target_nop5);
+}
+
+static void uretprobe_multi_nop5_setup(void)
+{
+ usetup(true, true /* use_multi */, &uprobe_target_nop5);
+}
+#endif
+
const struct bench bench_trig_syscall_count = {
.name = "trig-syscall-count",
.validate = trigger_validate,
@@ -506,3 +542,9 @@ BENCH_TRIG_USERMODE(uprobe_multi_ret, ret, "uprobe-multi-ret");
BENCH_TRIG_USERMODE(uretprobe_multi_nop, nop, "uretprobe-multi-nop");
BENCH_TRIG_USERMODE(uretprobe_multi_push, push, "uretprobe-multi-push");
BENCH_TRIG_USERMODE(uretprobe_multi_ret, ret, "uretprobe-multi-ret");
+#ifdef __x86_64__
+BENCH_TRIG_USERMODE(uprobe_nop5, nop5, "uprobe-nop5");
+BENCH_TRIG_USERMODE(uretprobe_nop5, nop5, "uretprobe-nop5");
+BENCH_TRIG_USERMODE(uprobe_multi_nop5, nop5, "uprobe-multi-nop5");
+BENCH_TRIG_USERMODE(uretprobe_multi_nop5, nop5, "uretprobe-multi-nop5");
+#endif
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh b/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh
index af169f831f2f..03f55405484b 100755
--- a/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh
+++ b/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh
@@ -2,7 +2,7 @@
set -eufo pipefail
-for i in usermode-count syscall-count {uprobe,uretprobe}-{nop,push,ret}
+for i in usermode-count syscall-count {uprobe,uretprobe}-{nop,push,ret,nop5}
do
summary=$(sudo ./bench -w2 -d5 -a trig-$i | tail -n1 | cut -d'(' -f1 | cut -d' ' -f3-)
printf "%-15s: %s\n" $i "$summary"
--
2.49.0
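
For readers who want to try the trigger pattern outside the selftest harness, here is a minimal standalone sketch. It is not part of the patch. The names target_nop5 and run_producer are hypothetical, the loop is bounded (the benchmark's producer loops forever), and the non-x86_64 fallback is an assumption added only so the sketch builds elsewhere. No uprobe is attached here, so this only shows the function being probed and the loop that drives it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the patch's uprobe_target_nop5(): a function
 * whose body starts with a 5-byte nop, so a probe placed on that
 * instruction exercises the nop5 path.
 */
#if defined(__x86_64__)
__attribute__((noinline)) void target_nop5(void)
{
	/* Same byte sequence as in the patch: nopl 0x0(%rax,%rax,1) */
	__asm__ __volatile__ (".byte 0x0f, 0x1f, 0x44, 0x00, 0x00");
}
#else
/* Assumption: on other architectures keep an empty, non-inlined target. */
__attribute__((noinline)) void target_nop5(void)
{
	__asm__ __volatile__ ("");
}
#endif

/*
 * Bounded variant of the producer loop: call the target a fixed number
 * of times and report how many calls were made.
 */
uint64_t run_producer(uint64_t iterations)
{
	uint64_t calls = 0;

	assert(iterations > 0);
	for (uint64_t i = 0; i < iterations; i++) {
		target_nop5();
		calls++;
	}
	return calls;
}
```

With a uprobe attached to target_nop5 (for example through the bench binary's own setup code), each call in the loop would hit the probe once, and that per-call cost is what the trig-uprobe-nop5 benchmark measures.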
* Re: [PATCHv3 perf/core 1/2] uprobes/x86: Add support to emulate nop instructions
From: Oleg Nesterov @ 2025-04-14 12:55 UTC
To: Jiri Olsa
Cc: Peter Zijlstra, Ingo Molnar, Andrii Nakryiko, bpf, linux-kernel,
linux-trace-kernel, x86, Song Liu, Yonghong Song, John Fastabend,
Hao Luo, Steven Rostedt, Masami Hiramatsu, Alan Maguire
On 04/14, Jiri Olsa wrote:
>
> @@ -840,6 +840,11 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
> insn_byte_t p;
> int i;
>
> + /* x86_nops[insn->length]; same as jmp with .offs = 0 */
> + if (insn->length <= ASM_NOP_MAX &&
> + !memcmp(insn->kaddr, x86_nops[insn->length], insn->length))
> + goto setup;
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
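
The check quoted above can be illustrated in userspace. The sketch below is a rough model, not the kernel code: nop_table and NOP_MAX are hypothetical stand-ins for x86_nops and ASM_NOP_MAX, the table stops at 8 bytes as an assumption (the kernel's limit may differ), and is_canonical_nop is an invented helper name. The byte sequences are the multi-byte nops recommended in the Intel SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NOP_MAX 8	/* hypothetical stand-in for ASM_NOP_MAX */

static const unsigned char nop1[] = { 0x90 };
static const unsigned char nop2[] = { 0x66, 0x90 };
static const unsigned char nop3[] = { 0x0f, 0x1f, 0x00 };
static const unsigned char nop4[] = { 0x0f, 0x1f, 0x40, 0x00 };
static const unsigned char nop5[] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
static const unsigned char nop6[] = { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 };
static const unsigned char nop7[] = { 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 };
static const unsigned char nop8[] = { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 };

/* Indexed by instruction length, like x86_nops[]; entry 0 is unused. */
static const unsigned char *const nop_table[NOP_MAX + 1] = {
	NULL, nop1, nop2, nop3, nop4, nop5, nop6, nop7, nop8,
};

/*
 * Return true when the len bytes at insn are exactly the table's nop of
 * that length. The length check comes first so the table is never
 * indexed out of bounds, mirroring the order of the quoted condition.
 */
bool is_canonical_nop(const unsigned char *insn, size_t len)
{
	assert(insn != NULL);
	if (len == 0 || len > NOP_MAX)
		return false;
	return memcmp(insn, nop_table[len], len) == 0;
}
```

Using the decoded instruction length as the table index means a single memcmp() decides the match, with no loop over all nop sizes. An instruction with the same length but different bytes, such as a 5-byte jmp, does not match and falls through to the existing opcode handling.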
* Re: [PATCHv3 perf/core 1/2] uprobes/x86: Add support to emulate nop instructions
From: Andrii Nakryiko @ 2025-04-14 16:13 UTC
To: Jiri Olsa
Cc: Oleg Nesterov, Peter Zijlstra, Ingo Molnar, Andrii Nakryiko, bpf,
linux-kernel, linux-trace-kernel, x86, Song Liu, Yonghong Song,
John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu,
Alan Maguire
On Mon, Apr 14, 2025 at 1:36 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Add support for emulating all nop instructions when one of them is the
> original instruction at the uprobe site.
>
> This change speeds up uprobes placed on top of any nop instruction and
> is a preparation for the usdt probe optimization, which will be done on
> top of the nop5 instruction.
>
> With this change, a usdt probe on top of nop5 no longer takes a performance
> hit compared to a usdt probe on top of the standard nop instruction.
>
> Suggested-by: Oleg Nesterov <oleg@redhat.com>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> v3 changes:
> - use insn->length as index to x86_nops [Andrii]
>
> arch/x86/kernel/uprobes.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index 9194695662b2..6d383839e839 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -840,6 +840,11 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
> insn_byte_t p;
> int i;
>
> + /* x86_nops[insn->length]; same as jmp with .offs = 0 */
> + if (insn->length <= ASM_NOP_MAX &&
> + !memcmp(insn->kaddr, x86_nops[insn->length], insn->length))
> + goto setup;
> +
LGTM, thanks!
Acked-by: Andrii Nakryiko <andrii@kernel.org>
> switch (opc1) {
> case 0xeb: /* jmp 8 */
> case 0xe9: /* jmp 32 */
> --
> 2.49.0
>