From: Jiri Olsa <olsajiri@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Oleg Nesterov" <oleg@redhat.com>,
"Andrii Nakryiko" <andrii@kernel.org>,
bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, x86@kernel.org,
"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"Hao Luo" <haoluo@google.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Masami Hiramatsu" <mhiramat@kernel.org>,
"Alan Maguire" <alan.maguire@oracle.com>,
"David Laight" <David.Laight@aculab.com>,
"Thomas Weißschuh" <thomas@t-8ch.de>,
"Ingo Molnar" <mingo@kernel.org>
Subject: Re: [PATCHv5 perf/core 10/22] uprobes/x86: Add support to optimize uprobes
Date: Mon, 14 Jul 2025 23:29:07 +0200
Message-ID: <aHV2o4SXbnRZdQSu@krava>
In-Reply-To: <20250714094824.GQ905792@noisy.programming.kicks-ass.net>

On Mon, Jul 14, 2025 at 11:48:24AM +0200, Peter Zijlstra wrote:
> On Fri, Jul 11, 2025 at 10:29:18AM +0200, Jiri Olsa wrote:
> > +enum {
> > + OPT_PART,
> > + OPT_INSN,
> > + UNOPT_INT3,
> > + UNOPT_PART,
> > +};
> > +
> > +struct write_opcode_ctx {
> > + unsigned long base;
> > + int update;
> > +};
> > +
> > +static int is_call_insn(uprobe_opcode_t *insn)
> > +{
> > + return *insn == CALL_INSN_OPCODE;
> > +}
> > +
> > +static int verify_insn(struct page *page, unsigned long vaddr, uprobe_opcode_t *new_opcode,
> > + int nbytes, void *data)
> > +{
> > + struct write_opcode_ctx *ctx = data;
> > + uprobe_opcode_t old_opcode[5];
> > +
> > + uprobe_copy_from_page(page, ctx->base, (uprobe_opcode_t *) &old_opcode, 5);
> > +
> > + switch (ctx->update) {
> > + case OPT_PART:
> > + case OPT_INSN:
> > + if (is_swbp_insn(&old_opcode[0]))
> > + return 1;
> > + break;
> > + case UNOPT_INT3:
> > + if (is_call_insn(&old_opcode[0]))
> > + return 1;
> > + break;
> > + case UNOPT_PART:
> > + if (is_swbp_insn(&old_opcode[0]))
> > + return 1;
> > + break;
> > + }
> > +
> > + return -1;
> > +}
> > +
> > +static int write_insn(struct arch_uprobe *auprobe, struct vm_area_struct *vma, unsigned long vaddr,
> > + uprobe_opcode_t *insn, int nbytes, void *ctx)
> > +{
> > + return uprobe_write(auprobe, vma, vaddr, insn, nbytes, verify_insn,
> > + true /* is_register */, false /* do_update_ref_ctr */, ctx);
> > +}
> > +
> > +static void relative_call(void *dest, long from, long to)
> > +{
> > + struct __packed __arch_relative_insn {
> > + u8 op;
> > + s32 raddr;
> > + } *insn;
> > +
> > + insn = (struct __arch_relative_insn *)dest;
> > + insn->raddr = (s32)(to - (from + 5));
> > + insn->op = CALL_INSN_OPCODE;
> > +}
>
> We already have this in asm/text-patching.h, it's called
> __text_gen_insn().
>
> > +
> > +static int swbp_optimize(struct arch_uprobe *auprobe, struct vm_area_struct *vma,
> > + unsigned long vaddr, unsigned long tramp)
> > +{
> > + struct write_opcode_ctx ctx = {
> > + .base = vaddr,
> > + };
> > + char call[5];
> > + int err;
> > +
> > + relative_call(call, vaddr, tramp);
>
> __text_gen_insn(call, CALL_INSN_OPCODE, vaddr, tramp, CALL_INSN_SIZE);

ok, will use that
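
For readers following along, here is a minimal userspace sketch of the encoding that relative_call() in the quoted patch produces: a one-byte call opcode followed by a 32-bit displacement measured from the end of the 5-byte instruction. The names encode_call() and decode_call_target() are hypothetical helpers made up for this illustration. This is not the kernel's __text_gen_insn(), which handles more opcodes and lives in asm/text-patching.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CALL_OPCODE 0xE8	/* x86 near call with a rel32 operand */
#define CALL_SIZE   5		/* 1 opcode byte + 4 displacement bytes */

/* Hypothetical helper: encode "call to" as it would sit at address from. */
static void encode_call(uint8_t buf[CALL_SIZE], uint64_t from, uint64_t to)
{
	/* The displacement is relative to the end of the instruction. */
	int64_t disp = (int64_t)(to - (from + CALL_SIZE));
	int32_t rel;

	/* A rel32 call cannot reach targets further than +/- 2GB away. */
	assert(disp >= INT32_MIN && disp <= INT32_MAX);
	rel = (int32_t)disp;

	buf[0] = CALL_OPCODE;
	/* Stored in host byte order, which is little-endian on x86. */
	memcpy(&buf[1], &rel, sizeof(rel));
}

/* Hypothetical helper: recover the call target from an encoded buffer. */
static uint64_t decode_call_target(const uint8_t buf[CALL_SIZE], uint64_t from)
{
	int32_t rel;

	memcpy(&rel, &buf[1], sizeof(rel));
	return from + CALL_SIZE + (int64_t)rel;
}
```

Decoding what was encoded should give back the original target for both forward and backward calls, which is an easy way to sanity-check the "from + 5" arithmetic.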
>
> > +
> > + /*
> > + * We are in state where breakpoint (int3) is installed on top of first
> > + * byte of the nop5 instruction. We will do following steps to overwrite
> > + * this to call instruction:
> > + *
> > + * - sync cores
> > + * - write last 4 bytes of the call instruction
> > + * - sync cores
> > + * - update the call instruction opcode
>
> The sanctioned text poke sequence has another sync-core at the end.
> Please also do this.

ok
>
> > + */
> > +
> > + smp_text_poke_sync_each_cpu();
> > +
> > + ctx.update = OPT_PART;
> > + err = write_insn(auprobe, vma, vaddr + 1, call + 1, 4, &ctx);
> > + if (err)
> > + return err;
> > +
> > + smp_text_poke_sync_each_cpu();
> > +
> > + ctx.update = OPT_INSN;
> > + return write_insn(auprobe, vma, vaddr, call, 1, &ctx);
> > +}
> > +
> > +static int swbp_unoptimize(struct arch_uprobe *auprobe, struct vm_area_struct *vma,
> > + unsigned long vaddr)
> > +{
> > + uprobe_opcode_t int3 = UPROBE_SWBP_INSN;
> > + struct write_opcode_ctx ctx = {
> > + .base = vaddr,
> > + };
> > + int err;
> > +
> > + /*
> > + * We need to overwrite call instruction into nop5 instruction with
> > + * breakpoint (int3) installed on top of its first byte. We will:
> > + *
> > + * - overwrite call opcode with breakpoint (int3)
> > + * - sync cores
> > + * - write last 4 bytes of the nop5 instruction
> > + * - sync cores
> > + */
> > +
> > + ctx.update = UNOPT_INT3;
> > + err = write_insn(auprobe, vma, vaddr, &int3, 1, &ctx);
> > + if (err)
> > + return err;
> > +
> > + smp_text_poke_sync_each_cpu();
> > +
> > + ctx.update = UNOPT_PART;
> > + err = write_insn(auprobe, vma, vaddr + 1, (uprobe_opcode_t *) auprobe->insn + 1, 4, &ctx);
> > +
> > + smp_text_poke_sync_each_cpu();
> > + return err;
> > +}
>
> Please unify these two functions; it makes absolutely no sense to have
> two copies of this logic around.

will try to come up with something

thanks,
jirka
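
One possible shape for the unification requested above, written as a userspace model so it can be run outside the kernel. It is only a sketch under assumptions: poke_update() and sync_cores() are hypothetical names, sync_cores() merely counts calls where the kernel would use smp_text_poke_sync_each_cpu(), and plain byte stores stand in for write_insn(). It also includes the trailing sync asked for in the review. The actual follow-up patch may look quite different.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INSN_SIZE   5
#define INT3_OPCODE 0xCC
#define CALL_OPCODE 0xE8

static int sync_count;	/* number of stand-in core syncs performed */

/* Stand-in for smp_text_poke_sync_each_cpu(); only counts invocations. */
static void sync_cores(void)
{
	sync_count++;
}

/*
 * Hypothetical unified helper. Rewrites the 5 bytes at text so that the
 * tail (bytes 1..4) matches insn while the first byte holds int3.
 *   optimize != 0: text starts as int3 + nop5 tail, ends as the full call
 *   optimize == 0: text starts as the call, ends as int3 + nop5 tail
 * Returns 0 on success, -1 if the first byte is not in the expected state.
 */
static int poke_update(uint8_t *text, const uint8_t *insn, int optimize)
{
	if (optimize) {
		/* The breakpoint must already cover the first byte. */
		if (text[0] != INT3_OPCODE)
			return -1;
	} else {
		/* Put the breakpoint on top of the call opcode first. */
		if (text[0] != CALL_OPCODE)
			return -1;
		text[0] = INT3_OPCODE;
	}
	sync_cores();

	/* Safe to change the tail while int3 guards the first byte. */
	memcpy(text + 1, insn + 1, INSN_SIZE - 1);
	sync_cores();

	if (optimize) {
		/* Finally replace int3 with the real opcode and sync again. */
		text[0] = insn[0];
		sync_cores();
	}
	return 0;
}
```

Both directions share the same middle step (sync, write the tail, sync). Only the handling of the first byte differs, which is what lets a single flag select between them.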