* [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction @ 2025-04-08 21:13 Jiri Olsa 2025-04-08 21:13 ` [PATCH 2/2] selftests/bpf: Add 5-byte nop uprobe trigger bench Jiri Olsa 2025-04-09 11:28 ` [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction Oleg Nesterov 0 siblings, 2 replies; 8+ messages in thread From: Jiri Olsa @ 2025-04-08 21:13 UTC (permalink / raw) To: Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko Cc: bpf, linux-kernel, linux-trace-kernel, x86, Song Liu, Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu, Alan Maguire Adding support to emulate nop5 as the original uprobe instruction. This change speeds up uprobe on top of nop5 and is a preparation for usdt probe optimization, that will be done on top of nop5 instruction. With this change the usdt probe on top of nop5 won't take the performance hit compared to usdt probe on top of standard nop instruction. Signed-off-by: Jiri Olsa <jolsa@kernel.org> --- arch/x86/kernel/uprobes.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index 9194695662b2..63cc68e19da6 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -608,6 +608,16 @@ static void riprel_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) *sr = utask->autask.saved_scratch_register; } } + +static int is_nop5_insn(uprobe_opcode_t *insn) +{ + return !memcmp(insn, x86_nops[5], 5); +} + +static bool emulate_nop5_insn(struct arch_uprobe *auprobe) +{ + return is_nop5_insn((uprobe_opcode_t *) &auprobe->insn); +} #else /* 32-bit: */ /* * No RIP-relative addressing on 32-bit @@ -621,6 +631,10 @@ static void riprel_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) static void riprel_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) { } +static bool emulate_nop5_insn(struct arch_uprobe *auprobe) +{ + return false; +} #endif /* CONFIG_X86_64 */ struct uprobe_xol_ops { @@ -852,6 +866,8 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) break; case 0x0f: + if (emulate_nop5_insn(auprobe)) + goto setup; if (insn->opcode.nbytes != 2) return -ENOSYS; /* -- 2.49.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/2] selftests/bpf: Add 5-byte nop uprobe trigger bench 2025-04-08 21:13 [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction Jiri Olsa @ 2025-04-08 21:13 ` Jiri Olsa 2025-04-09 11:28 ` [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction Oleg Nesterov 1 sibling, 0 replies; 8+ messages in thread From: Jiri Olsa @ 2025-04-08 21:13 UTC (permalink / raw) To: Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko Cc: bpf, linux-kernel, linux-trace-kernel, x86, Song Liu, Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu, Alan Maguire Add 5-byte nop uprobe trigger bench (x86_64 specific) to measure uprobes/uretprobes on top of nop5 instruction. Signed-off-by: Jiri Olsa <jolsa@kernel.org> --- tools/testing/selftests/bpf/bench.c | 12 ++++++ .../selftests/bpf/benchs/bench_trigger.c | 42 +++++++++++++++++++ .../selftests/bpf/benchs/run_bench_uprobes.sh | 2 +- 3 files changed, 55 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c index 1bd403a5ef7b..0fd8c9b0d38f 100644 --- a/tools/testing/selftests/bpf/bench.c +++ b/tools/testing/selftests/bpf/bench.c @@ -526,6 +526,12 @@ extern const struct bench bench_trig_uprobe_multi_push; extern const struct bench bench_trig_uretprobe_multi_push; extern const struct bench bench_trig_uprobe_multi_ret; extern const struct bench bench_trig_uretprobe_multi_ret; +#ifdef __x86_64__ +extern const struct bench bench_trig_uprobe_nop5; +extern const struct bench bench_trig_uretprobe_nop5; +extern const struct bench bench_trig_uprobe_multi_nop5; +extern const struct bench bench_trig_uretprobe_multi_nop5; +#endif extern const struct bench bench_rb_libbpf; extern const struct bench bench_rb_custom; @@ -586,6 +592,12 @@ static const struct bench *benchs[] = { &bench_trig_uretprobe_multi_push, &bench_trig_uprobe_multi_ret, &bench_trig_uretprobe_multi_ret, +#ifdef __x86_64__ + &bench_trig_uprobe_nop5, + &bench_trig_uretprobe_nop5, + &bench_trig_uprobe_multi_nop5, + &bench_trig_uretprobe_multi_nop5, +#endif /* ringbuf/perfbuf benchmarks */ &bench_rb_libbpf, &bench_rb_custom, diff --git a/tools/testing/selftests/bpf/benchs/bench_trigger.c b/tools/testing/selftests/bpf/benchs/bench_trigger.c index 32e9f194d449..82327657846e 100644 --- a/tools/testing/selftests/bpf/benchs/bench_trigger.c +++ b/tools/testing/selftests/bpf/benchs/bench_trigger.c @@ -333,6 +333,20 @@ static void *uprobe_producer_ret(void *input) return NULL; } +#ifdef __x86_64__ +__nocf_check __weak void uprobe_target_nop5(void) +{ + asm volatile (".byte 0x0f, 0x1f, 0x44, 0x00, 0x00"); +} + +static void *uprobe_producer_nop5(void *input) +{ + while (true) + uprobe_target_nop5(); + return NULL; +} +#endif + static void usetup(bool use_retprobe, bool use_multi, void *target_addr) { size_t uprobe_offset; @@ -448,6 +462,28 @@ static void uretprobe_multi_ret_setup(void) usetup(true, true /* use_multi */, &uprobe_target_ret); } +#ifdef __x86_64__ +static void uprobe_nop5_setup(void) +{ + usetup(false, false /* !use_multi */, &uprobe_target_nop5); +} + +static void uretprobe_nop5_setup(void) +{ + usetup(true, false /* !use_multi */, &uprobe_target_nop5); +} + +static void uprobe_multi_nop5_setup(void) +{ + usetup(false, true /* use_multi */, &uprobe_target_nop5); +} + +static void uretprobe_multi_nop5_setup(void) +{ + usetup(true, true /* use_multi */, &uprobe_target_nop5); +} +#endif + const struct bench bench_trig_syscall_count = { .name = "trig-syscall-count", .validate = trigger_validate, @@ -506,3 +542,9 @@ BENCH_TRIG_USERMODE(uprobe_multi_ret, ret, "uprobe-multi-ret"); BENCH_TRIG_USERMODE(uretprobe_multi_nop, nop, "uretprobe-multi-nop"); BENCH_TRIG_USERMODE(uretprobe_multi_push, push, "uretprobe-multi-push"); BENCH_TRIG_USERMODE(uretprobe_multi_ret, ret, "uretprobe-multi-ret"); +#ifdef __x86_64__ +BENCH_TRIG_USERMODE(uprobe_nop5, nop5, "uprobe-nop5"); +BENCH_TRIG_USERMODE(uretprobe_nop5, nop5, "uretprobe-nop5"); +BENCH_TRIG_USERMODE(uprobe_multi_nop5, nop5, "uprobe-multi-nop5"); +BENCH_TRIG_USERMODE(uretprobe_multi_nop5, nop5, "uretprobe-multi-nop5"); +#endif diff --git a/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh b/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh index af169f831f2f..03f55405484b 100755 --- a/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh +++ b/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh @@ -2,7 +2,7 @@ set -eufo pipefail -for i in usermode-count syscall-count {uprobe,uretprobe}-{nop,push,ret} +for i in usermode-count syscall-count {uprobe,uretprobe}-{nop,push,ret,nop5} do summary=$(sudo ./bench -w2 -d5 -a trig-$i | tail -n1 | cut -d'(' -f1 | cut -d' ' -f3-) printf "%-15s: %s\n" $i "$summary" -- 2.49.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction 2025-04-08 21:13 [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction Jiri Olsa 2025-04-08 21:13 ` [PATCH 2/2] selftests/bpf: Add 5-byte nop uprobe trigger bench Jiri Olsa @ 2025-04-09 11:28 ` Oleg Nesterov 2025-04-09 11:49 ` Oleg Nesterov 2025-04-09 12:08 ` Jiri Olsa 1 sibling, 2 replies; 8+ messages in thread From: Oleg Nesterov @ 2025-04-09 11:28 UTC (permalink / raw) To: Jiri Olsa Cc: Peter Zijlstra, Andrii Nakryiko, bpf, linux-kernel, linux-trace-kernel, x86, Song Liu, Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu, Alan Maguire On 04/08, Jiri Olsa wrote: > > --- a/arch/x86/kernel/uprobes.c > +++ b/arch/x86/kernel/uprobes.c > @@ -608,6 +608,16 @@ static void riprel_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > *sr = utask->autask.saved_scratch_register; > } > } > + > +static int is_nop5_insn(uprobe_opcode_t *insn) > +{ > + return !memcmp(insn, x86_nops[5], 5); > +} > + > +static bool emulate_nop5_insn(struct arch_uprobe *auprobe) > +{ > + return is_nop5_insn((uprobe_opcode_t *) &auprobe->insn); > +} Why do we need 2 functions? Can't branch_setup_xol_ops() just use is_nop5_insn(insn->kaddr) ? > #else /* 32-bit: */ > /* > * No RIP-relative addressing on 32-bit > @@ -621,6 +631,10 @@ static void riprel_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > static void riprel_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > { > } > +static bool emulate_nop5_insn(struct arch_uprobe *auprobe) > +{ > + return false; > +} Hmm, why? I mean, why we can't emulate x86_nops[5] if !CONFIG_X86_64 ? OTOH. What if the kernel is 64-bit, but the probed task is 32-bit and it uses the 64-bit version of BYTES_NOP5? Perhaps this is fine, I simply don't know, so let me ask... > @@ -852,6 +866,8 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) > break; > > case 0x0f: > + if (emulate_nop5_insn(auprobe)) > + goto setup; I think this will work, but if we want to emulate nop5, then perhaps we can do the same for other nops? For the moment, lets forget about compat tasks on a 64-bit kernel, can't we simply do something like below? Oleg. --- diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index 9194695662b2..76d2cceca6c4 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -840,12 +840,16 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) insn_byte_t p; int i; + /* prefix* + nop[i]; same as jmp with .offs = 0 */ + for (i = 1; i <= ASM_NOP_MAX; ++i) { + if (!memcmp(insn->kaddr, x86_nops[i], i)) + goto setup; + } + switch (opc1) { case 0xeb: /* jmp 8 */ case 0xe9: /* jmp 32 */ break; - case 0x90: /* prefix* + nop; same as jmp with .offs = 0 */ - goto setup; case 0xe8: /* call relative */ branch_clear_offset(auprobe, insn); ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction 2025-04-09 11:28 ` [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction Oleg Nesterov @ 2025-04-09 11:49 ` Oleg Nesterov 2025-04-09 12:08 ` Jiri Olsa 1 sibling, 0 replies; 8+ messages in thread From: Oleg Nesterov @ 2025-04-09 11:49 UTC (permalink / raw) To: Jiri Olsa Cc: Peter Zijlstra, Andrii Nakryiko, bpf, linux-kernel, linux-trace-kernel, x86, Song Liu, Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu, Alan Maguire On 04/09, Oleg Nesterov wrote: > > For the moment, lets forget about compat tasks on a 64-bit kernel, can't > we simply do something like below? ... > --- a/arch/x86/kernel/uprobes.c > +++ b/arch/x86/kernel/uprobes.c > @@ -840,12 +840,16 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) > insn_byte_t p; > int i; > > + /* prefix* + nop[i]; same as jmp with .offs = 0 */ > + for (i = 1; i <= ASM_NOP_MAX; ++i) { > + if (!memcmp(insn->kaddr, x86_nops[i], i)) > + goto setup; > + } > + > switch (opc1) { > case 0xeb: /* jmp 8 */ > case 0xe9: /* jmp 32 */ > break; > - case 0x90: /* prefix* + nop; same as jmp with .offs = 0 */ > - goto setup; OK, I guess we can't remove this "case 0x90" because of prefixes, please ignore this part. Oleg. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction 2025-04-09 11:28 ` [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction Oleg Nesterov 2025-04-09 11:49 ` Oleg Nesterov @ 2025-04-09 12:08 ` Jiri Olsa 2025-04-09 13:11 ` Oleg Nesterov 1 sibling, 1 reply; 8+ messages in thread From: Jiri Olsa @ 2025-04-09 12:08 UTC (permalink / raw) To: Oleg Nesterov Cc: Peter Zijlstra, Andrii Nakryiko, bpf, linux-kernel, linux-trace-kernel, x86, Song Liu, Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu, Alan Maguire On Wed, Apr 09, 2025 at 01:28:39PM +0200, Oleg Nesterov wrote: > On 04/08, Jiri Olsa wrote: > > > > --- a/arch/x86/kernel/uprobes.c > > +++ b/arch/x86/kernel/uprobes.c > > @@ -608,6 +608,16 @@ static void riprel_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > > *sr = utask->autask.saved_scratch_register; > > } > > } > > + > > +static int is_nop5_insn(uprobe_opcode_t *insn) > > +{ > > + return !memcmp(insn, x86_nops[5], 5); > > +} > > + > > +static bool emulate_nop5_insn(struct arch_uprobe *auprobe) > > +{ > > + return is_nop5_insn((uprobe_opcode_t *) &auprobe->insn); > > +} > > Why do we need 2 functions? Can't branch_setup_xol_ops() just use > is_nop5_insn(insn->kaddr) ? I need is_nop5_insn in other changes I have in queue, so did not want to introduce extra changes > > > #else /* 32-bit: */ > > /* > > * No RIP-relative addressing on 32-bit > > @@ -621,6 +631,10 @@ static void riprel_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > > static void riprel_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > > { > > } > > +static bool emulate_nop5_insn(struct arch_uprobe *auprobe) > > +{ > > + return false; > > +} > > Hmm, why? I mean, why we can't emulate x86_nops[5] if !CONFIG_X86_64 ? ok, so the following uprobe optimization is for CONFIG_X86_64 only, so I followed that, but I guess we might emulate nop5 for !CONFIG_X86_64 > > OTOH. What if the kernel is 64-bit, but the probed task is 32-bit and it > uses the 64-bit version of BYTES_NOP5? > > Perhaps this is fine, I simply don't know, so let me ask... hum, did not think of that, let me try it > > > @@ -852,6 +866,8 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) > > break; > > > > case 0x0f: > > + if (emulate_nop5_insn(auprobe)) > > + goto setup; > > I think this will work, but if we want to emulate nop5, then perhaps > we can do the same for other nops? > > For the moment, lets forget about compat tasks on a 64-bit kernel, can't > we simply do something like below? I sent similar change (CONFIG_X86_64 only) in this thread: https://lore.kernel.org/bpf/Z_O0Z1ON1YlRqyny@krava/T/#m59c430fb5a30cb9faeb9587fd672ea0adbf3ef4f uprobe won't attach on nop9/10/11 atm, also I don't have practical justification for doing that.. nop5 seems to have future, because of the optimization thanks, jirka > > Oleg. > --- > > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c > index 9194695662b2..76d2cceca6c4 100644 > --- a/arch/x86/kernel/uprobes.c > +++ b/arch/x86/kernel/uprobes.c > @@ -840,12 +840,16 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) > insn_byte_t p; > int i; > > + /* prefix* + nop[i]; same as jmp with .offs = 0 */ > + for (i = 1; i <= ASM_NOP_MAX; ++i) { > + if (!memcmp(insn->kaddr, x86_nops[i], i)) > + goto setup; > + } > + > switch (opc1) { > case 0xeb: /* jmp 8 */ > case 0xe9: /* jmp 32 */ > break; > - case 0x90: /* prefix* + nop; same as jmp with .offs = 0 */ > - goto setup; > > case 0xe8: /* call relative */ > branch_clear_offset(auprobe, insn); > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction 2025-04-09 12:08 ` Jiri Olsa @ 2025-04-09 13:11 ` Oleg Nesterov 2025-04-09 16:37 ` Jiri Olsa 0 siblings, 1 reply; 8+ messages in thread From: Oleg Nesterov @ 2025-04-09 13:11 UTC (permalink / raw) To: Jiri Olsa Cc: Peter Zijlstra, Andrii Nakryiko, bpf, linux-kernel, linux-trace-kernel, x86, Song Liu, Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu, Alan Maguire On 04/09, Jiri Olsa wrote: > > On Wed, Apr 09, 2025 at 01:28:39PM +0200, Oleg Nesterov wrote: > > On 04/08, Jiri Olsa wrote: > > > > > > --- a/arch/x86/kernel/uprobes.c > > > +++ b/arch/x86/kernel/uprobes.c > > > @@ -608,6 +608,16 @@ static void riprel_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > > > *sr = utask->autask.saved_scratch_register; > > > } > > > } > > > + > > > +static int is_nop5_insn(uprobe_opcode_t *insn) > > > +{ > > > + return !memcmp(insn, x86_nops[5], 5); > > > +} > > > + > > > +static bool emulate_nop5_insn(struct arch_uprobe *auprobe) > > > +{ > > > + return is_nop5_insn((uprobe_opcode_t *) &auprobe->insn); > > > +} > > > > Why do we need 2 functions? Can't branch_setup_xol_ops() just use > > is_nop5_insn(insn->kaddr) ? > > I need is_nop5_insn in other changes I have in queue, so did not want > to introduce extra changes But I didn't suggest to remove is_nop5_insn(), I meant that branch_setup_xol_ops() can do if (is_nop5_insn(insn->kaddr)) goto setup; or if (is_nop5_insn(auprobe->insn)) goto setup; this even looks more readable to me. but I won't insist. > > For the moment, lets forget about compat tasks on a 64-bit kernel, can't > > we simply do something like below? > > I sent similar change (CONFIG_X86_64 only) in this thread: > https://lore.kernel.org/bpf/Z_O0Z1ON1YlRqyny@krava/T/#m59c430fb5a30cb9faeb9587fd672ea0adbf3ef4f > > uprobe won't attach on nop9/10/11 atm, Ah, OK, I didn't know. But this means that nop9/10/11 will be rejected by uprobe_init_insn() -> is_prefix_bad() before branch_setup_xol_ops() is called, right? So I guess it is safe to use ASM_NOP_MAX. Nevermind. > also I don't have practical justification > for doing that.. nop5 seems to have future, because of the optimization OK, I won't insist, up to you. Just it looks a bit strange to me. Even if we do not have a use-case for other nops, why we can't emulate them all just for consistency? And given that emulate_nop5_insn() compares the whole insn with x86_nops[5], I guess we don't even need to check OPCODE1(insn)... Nevermind. So, once again, I won't argue. Oleg. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction 2025-04-09 13:11 ` Oleg Nesterov @ 2025-04-09 16:37 ` Jiri Olsa 2025-04-09 17:58 ` Oleg Nesterov 0 siblings, 1 reply; 8+ messages in thread From: Jiri Olsa @ 2025-04-09 16:37 UTC (permalink / raw) To: Oleg Nesterov Cc: Jiri Olsa, Peter Zijlstra, Andrii Nakryiko, bpf, linux-kernel, linux-trace-kernel, x86, Song Liu, Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu, Alan Maguire On Wed, Apr 09, 2025 at 03:11:16PM +0200, Oleg Nesterov wrote: > On 04/09, Jiri Olsa wrote: > > > > On Wed, Apr 09, 2025 at 01:28:39PM +0200, Oleg Nesterov wrote: > > > On 04/08, Jiri Olsa wrote: > > > > > > > > --- a/arch/x86/kernel/uprobes.c > > > > +++ b/arch/x86/kernel/uprobes.c > > > > @@ -608,6 +608,16 @@ static void riprel_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > > > > *sr = utask->autask.saved_scratch_register; > > > > } > > > > } > > > > + > > > > +static int is_nop5_insn(uprobe_opcode_t *insn) > > > > +{ > > > > + return !memcmp(insn, x86_nops[5], 5); > > > > +} > > > > + > > > > +static bool emulate_nop5_insn(struct arch_uprobe *auprobe) > > > > +{ > > > > + return is_nop5_insn((uprobe_opcode_t *) &auprobe->insn); > > > > +} > > > > > > Why do we need 2 functions? Can't branch_setup_xol_ops() just use > > > is_nop5_insn(insn->kaddr) ? > > > > I need is_nop5_insn in other changes I have in queue, so did not want > > to introduce extra changes > > But I didn't suggest to remove is_nop5_insn(), I meant that > branch_setup_xol_ops() can do > > if (is_nop5_insn(insn->kaddr)) > goto setup; > or > if (is_nop5_insn(auprobe->insn)) > goto setup; > > this even looks more readable to me. but I won't insist. > > > > For the moment, lets forget about compat tasks on a 64-bit kernel, can't > > > we simply do something like below? > > > > I sent similar change (CONFIG_X86_64 only) in this thread: > > https://lore.kernel.org/bpf/Z_O0Z1ON1YlRqyny@krava/T/#m59c430fb5a30cb9faeb9587fd672ea0adbf3ef4f > > > > uprobe won't attach on nop9/10/11 atm, > > Ah, OK, I didn't know. But this means that nop9/10/11 will be rejected > by uprobe_init_insn() -> is_prefix_bad() before branch_setup_xol_ops() is > called, right? So I guess it is safe to use ASM_NOP_MAX. Nevermind. > > > also I don't have practical justification > > for doing that.. nop5 seems to have future, because of the optimization > > OK, I won't insist, up to you. > > Just it looks a bit strange to me. Even if we do not have a use-case > for other nops, why we can't emulate them all just for consistency? we can, I went with nop5 just for simplicity, if you think having all nops support is better, let's do that I checked and compact process executes 64bit nops just fine, so we should be ok there > > And given that emulate_nop5_insn() compares the whole insn with > x86_nops[5], I guess we don't even need to check OPCODE1(insn)... right > Nevermind. > > So, once again, I won't argue. I'm happy to go with your version, wdyt? thanks, jirka --- diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index 9194695662b2..63ecc5f6c235 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -840,12 +840,16 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) insn_byte_t p; int i; + /* x86_nops[i]; same as jmp with .offs = 0 */ + for (i = 1; i <= ASM_NOP_MAX; ++i) { + if (!memcmp(insn->kaddr, x86_nops[i], i)) + goto setup; + } + switch (opc1) { case 0xeb: /* jmp 8 */ case 0xe9: /* jmp 32 */ break; - case 0x90: /* prefix* + nop; same as jmp with .offs = 0 */ - goto setup; case 0xe8: /* call relative */ branch_clear_offset(auprobe, insn); ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction 2025-04-09 16:37 ` Jiri Olsa @ 2025-04-09 17:58 ` Oleg Nesterov 0 siblings, 0 replies; 8+ messages in thread From: Oleg Nesterov @ 2025-04-09 17:58 UTC (permalink / raw) To: Jiri Olsa Cc: Peter Zijlstra, Andrii Nakryiko, bpf, linux-kernel, linux-trace-kernel, x86, Song Liu, Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu, Alan Maguire On 04/09, Jiri Olsa wrote: > > > Just it looks a bit strange to me. Even if we do not have a use-case > > for other nops, why we can't emulate them all just for consistency? > > we can, I went with nop5 just for simplicity, if you think > having all nops support is better, let's do that Well... Let me repeat, I am not really arguing and I do not want to delay your next changes. We can always cleanup this code later. Please see below. > I checked and compact process executes 64bit nops just fine, > so we should be ok there OK. Then, for your original patch: Acked-by: Oleg Nesterov <oleg@redhat.com> I'd only ask to define is_nop5_insn/emulate_nop5_insn regardless of CONFIG_X86_64. I understand that we have no reason to emulate nop5 on the 32-bit kernel, but at the same time I don't see any reason to complicate this code to explicitly "nack" nop5 in this case. As for the new version below: > --- a/arch/x86/kernel/uprobes.c > +++ b/arch/x86/kernel/uprobes.c > @@ -840,12 +840,16 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn) > insn_byte_t p; > int i; > > + /* x86_nops[i]; same as jmp with .offs = 0 */ > + for (i = 1; i <= ASM_NOP_MAX; ++i) { > + if (!memcmp(insn->kaddr, x86_nops[i], i)) > + goto setup; > + } Well, yes, I'd personally obviously prefer this version ;) Just because it looks a bit more clear/consistent to me. But this is subjective. And, > - case 0x90: /* prefix* + nop; same as jmp with .offs = 0 */ > - goto setup; No, this is wrong. Please see my reply to myself, https://lore.kernel.org/all/20250409114950.GB32748@redhat.com/ This way we can no longer emulate, say, "rep; nop". Exactly because either way memcmp(x86_nops[i]) checks the whole instruction. Probably we don't really care, but still this patch shouldn't add any "regression". So, let me repeat. Up to you. Whatever you prefer. I just tried to understand your patch. You have my ACK in any case. Oleg. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-04-09 17:59 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-04-08 21:13 [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction Jiri Olsa 2025-04-08 21:13 ` [PATCH 2/2] selftests/bpf: Add 5-byte nop uprobe trigger bench Jiri Olsa 2025-04-09 11:28 ` [PATCH 1/2] uprobes/x86: Add support to emulate nop5 instruction Oleg Nesterov 2025-04-09 11:49 ` Oleg Nesterov 2025-04-09 12:08 ` Jiri Olsa 2025-04-09 13:11 ` Oleg Nesterov 2025-04-09 16:37 ` Jiri Olsa 2025-04-09 17:58 ` Oleg Nesterov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).