From: Jiri Olsa <olsajiri@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Masami Hiramatsu" <mhiramat@kernel.org>,
	"Jiri Olsa" <olsajiri@gmail.com>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org, x86@kernel.org,
	"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Hao Luo" <haoluo@google.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Alan Maguire" <alan.maguire@oracle.com>,
	"David Laight" <David.Laight@aculab.com>,
	"Thomas Weißschuh" <thomas@t-8ch.de>,
	"Ingo Molnar" <mingo@kernel.org>
Subject: Re: [PATCHv6 perf/core 10/22] uprobes/x86: Add support to optimize uprobes
Date: Fri, 8 Aug 2025 19:44:29 +0200
Message-ID: <aJY3fXqnD7MkxDMm@krava>
In-Reply-To: <aIftAJg1hZGYp4NF@krava>

ping, thanks

On Mon, Jul 28, 2025 at 11:34:56PM +0200, Jiri Olsa wrote:
> On Fri, Jul 25, 2025 at 07:13:18PM +0900, Masami Hiramatsu wrote:
> > On Sun, 20 Jul 2025 13:21:20 +0200
> > Jiri Olsa <jolsa@kernel.org> wrote:
> > 
> > > Putting together all the previously added pieces to support optimized
> > > uprobes on top of 5-byte nop instruction.
> > > 
> > > The current uprobe execution goes through the following:
> > > 
> > >   - installs breakpoint instruction over original instruction
> > >   - the exception handler is hit and calls related uprobe consumers
> > >   - and either simulates the original instruction or does out of line single
> > >     step execution of it
> > >   - returns to user space
> > > 
> > > The optimized uprobe path does the following:
> > > 
> > >   - checks the original instruction is 5-byte nop (plus other checks)
> > >   - adds (or uses existing) user space trampoline with uprobe syscall
> > >   - overwrites original instruction (5-byte nop) with call to user space
> > >     trampoline
> > >   - the user space trampoline executes uprobe syscall that calls related uprobe
> > >     consumers
> > >   - trampoline returns back to next instruction
> > > 
> > > This approach won't speed up all uprobes as it's limited to using nop5 as
> > > the original instruction, but we plan to use nop5 as the USDT probe
> > > instruction (which currently uses a single byte nop) and speed up the USDT
> > > probes.
> > > 
> > > The arch_uprobe_optimize function triggers the uprobe optimization and is
> > > called after the first uprobe hit. I originally had it called on uprobe
> > > installation, but then it clashed with the elf loader, because the user
> > > space trampoline was added in a place where the loader might need to put
> > > elf segments, so I decided to do it after the first uprobe hit, when
> > > loading is done.
> > > 
> > > The uprobe is un-optimized in the arch specific set_orig_insn call.
> > > 
> > > The instruction overwrite is x86 arch specific and needs to go through 3 updates:
> > > (on top of nop5 instruction)
> > > 
> > >   - write int3 into 1st byte
> > >   - write last 4 bytes of the call instruction
> > >   - update the call instruction opcode
> > > 
> > > And cleanup goes through similar reverse stages:
> > > 
> > >   - overwrite call opcode with breakpoint (int3)
> > >   - write last 4 bytes of the nop5 instruction
> > >   - write the nop5 first instruction byte
> > > 
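The two three-step sequences quoted above can be sketched in user space as plain byte writes. This is only an illustration of the ordering, not code from the patch: build_call(), optimize() and unoptimize() are hypothetical names, and the sketch ignores everything the kernel has to do around each write (writing into another process's pages, any synchronization between the steps). The byte values are the standard x86-64 encodings: 0f 1f 44 00 00 for nop5, cc for int3, e8 plus a 32-bit displacement for call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INSN_SIZE   5
#define INT3_OPCODE 0xcc
#define CALL_OPCODE 0xe8

/* 5-byte nop: nopl 0x0(%rax,%rax,1) */
static const uint8_t nop5[INSN_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Encode "call rel32" placed at address 'site' that targets 'tramp'.
 * The displacement is relative to the end of the call instruction.
 */
static void build_call(uint8_t out[INSN_SIZE], uint64_t site, uint64_t tramp)
{
	int64_t delta = (int64_t)tramp - (int64_t)(site + INSN_SIZE);
	uint32_t rel;
	int i;

	/* The trampoline has to be reachable with a 32-bit displacement. */
	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	rel = (uint32_t)(int32_t)delta;

	out[0] = CALL_OPCODE;
	for (i = 0; i < 4; i++)		/* little-endian displacement */
		out[1 + i] = (uint8_t)(rel >> (8 * i));
}

/* nop5 -> call: every intermediate state starts with int3. */
static void optimize(uint8_t insn[INSN_SIZE], const uint8_t call[INSN_SIZE])
{
	insn[0] = INT3_OPCODE;				/* 1: int3 into 1st byte */
	memcpy(&insn[1], &call[1], INSN_SIZE - 1);	/* 2: last 4 bytes of call */
	insn[0] = call[0];				/* 3: call opcode */
}

/* call -> nop5: the same stages in reverse. */
static void unoptimize(uint8_t insn[INSN_SIZE])
{
	insn[0] = INT3_OPCODE;				/* 1: call opcode -> int3 */
	memcpy(&insn[1], &nop5[1], INSN_SIZE - 1);	/* 2: last 4 bytes of nop5 */
	insn[0] = nop5[0];				/* 3: first nop5 byte */
}
```

At every point in either sequence the first byte is the old opcode, int3, or the new opcode, so a thread that fetches the instruction mid-update sees either a complete instruction or a breakpoint.
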
> > > We do not unmap and release uprobe trampoline when it's no longer needed,
> > > because there's no easy way to make sure none of the threads is still
> > > inside the trampoline. But we do not waste memory, because there's just
> > > a single page for all the uprobe trampoline mappings.
> > > 
> > > We do waste a frame on page mapping for every 4GB by keeping the uprobe
> > > trampoline page mapped, but that seems ok.
> > > 
> > > We take advantage of the fact that set_swbp and set_orig_insn are
> > > called under mmap_write_lock(mm), so we can use the current instruction
> > > as the state the uprobe is in - nop5/breakpoint/call trampoline -
> > > and decide the needed action (optimize/un-optimize) based on that.
> > > 
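The idea of using the instruction bytes themselves as the state can be sketched as below. This is an assumption-laden illustration, not the patch's code: classify() and needs_unoptimize() are hypothetical names, and the sketch only looks at the opcode byte, whereas a real check would presumably also have to confirm that the call actually targets a uprobe trampoline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INSN_SIZE   5
#define INT3_OPCODE 0xcc
#define CALL_OPCODE 0xe8

/* The three states named in the description, plus a catch-all. */
enum probe_state {
	PROBE_NOP5,		/* original instruction, no probe installed */
	PROBE_BREAKPOINT,	/* int3 installed, not optimized */
	PROBE_CALL,		/* call to trampoline, optimized */
	PROBE_UNKNOWN,
};

static const uint8_t nop5[INSN_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Derive the probe state from the bytes currently at the probe site. */
static enum probe_state classify(const uint8_t insn[INSN_SIZE])
{
	assert(insn != NULL);

	if (memcmp(insn, nop5, INSN_SIZE) == 0)
		return PROBE_NOP5;
	if (insn[0] == INT3_OPCODE)
		return PROBE_BREAKPOINT;
	if (insn[0] == CALL_OPCODE)
		return PROBE_CALL;
	return PROBE_UNKNOWN;
}

/* Hypothetical helper: does removing the probe need the un-optimize path? */
static int needs_unoptimize(const uint8_t insn[INSN_SIZE])
{
	return classify(insn) == PROBE_CALL;
}
```

Reading the state this way is only safe because, as the description says, both callers run under mmap_write_lock(mm), so the bytes cannot change between the check and the write.
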
> > > Attaching the speed up from benchs/run_bench_uprobes.sh script:
> > > 
> > > current:
> > >         usermode-count :  152.604 ± 0.044M/s
> > >         syscall-count  :   13.359 ± 0.042M/s
> > > -->     uprobe-nop     :    3.229 ± 0.002M/s
> > >         uprobe-push    :    3.086 ± 0.004M/s
> > >         uprobe-ret     :    1.114 ± 0.004M/s
> > >         uprobe-nop5    :    1.121 ± 0.005M/s
> > >         uretprobe-nop  :    2.145 ± 0.002M/s
> > >         uretprobe-push :    2.070 ± 0.001M/s
> > >         uretprobe-ret  :    0.931 ± 0.001M/s
> > >         uretprobe-nop5 :    0.957 ± 0.001M/s
> > > 
> > > after the change:
> > >         usermode-count :  152.448 ± 0.244M/s
> > >         syscall-count  :   14.321 ± 0.059M/s
> > >         uprobe-nop     :    3.148 ± 0.007M/s
> > >         uprobe-push    :    2.976 ± 0.004M/s
> > >         uprobe-ret     :    1.068 ± 0.003M/s
> > > -->     uprobe-nop5    :    7.038 ± 0.007M/s
> > >         uretprobe-nop  :    2.109 ± 0.004M/s
> > >         uretprobe-push :    2.035 ± 0.001M/s
> > >         uretprobe-ret  :    0.908 ± 0.001M/s
> > >         uretprobe-nop5 :    3.377 ± 0.009M/s
> > > 
> > > I see a bit more speed up on Intel (above) compared to AMD. The big nop5
> > > speed up is partly due to emulating nop5 and partly due to optimization.
> > > 
> > > The key speed up we do this for is the USDT switch from nop to nop5:
> > >         uprobe-nop     :    3.148 ± 0.007M/s
> > >         uprobe-nop5    :    7.038 ± 0.007M/s
> > > 
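For reference, the ratios behind the quoted figures can be worked out with a trivial helper (speedup() is a hypothetical name; the inputs are the M/s values from the tables above):

```c
#include <assert.h>

/* Ratio of two throughput figures, both in million hits per second. */
static double speedup(double before_mps, double after_mps)
{
	assert(before_mps > 0.0);
	return after_mps / before_mps;
}
```

speedup(3.148, 7.038) is roughly 2.2, the gain for a USDT switching from nop to nop5, and speedup(1.121, 7.038) is roughly 6.3, the gain for uprobe-nop5 itself between the two runs.
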
> > 
> > This also looks good to me.
> > 
> > Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> thanks!
> 
> Peter, do you have more comments?
> 
> thanks,
> jirka

Thread overview: 53+ messages
2025-07-20 11:21 [PATCHv6 perf/core 00/22] uprobes: Add support to optimize usdt probes on x86_64 Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 01/22] uprobes: Remove breakpoint in unapply_uprobe under mmap_write_lock Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 02/22] uprobes: Rename arch_uretprobe_trampoline function Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 03/22] uprobes: Make copy_from_page global Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 04/22] uprobes: Add uprobe_write function Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 05/22] uprobes: Add nbytes argument to uprobe_write Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 06/22] uprobes: Add is_register argument to uprobe_write and uprobe_write_opcode Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 07/22] uprobes: Add do_ref_ctr argument to uprobe_write function Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 08/22] uprobes/x86: Add mapping for optimized uprobe trampolines Jiri Olsa
2025-08-19 14:53   ` Peter Zijlstra
2025-08-20 12:18     ` Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 09/22] uprobes/x86: Add uprobe syscall to speed up uprobe Jiri Olsa
2025-07-20 11:38   ` Oleg Nesterov
2025-07-25 10:11   ` Masami Hiramatsu
2025-09-03 18:24   ` Andrii Nakryiko
2025-09-03 20:56     ` Jiri Olsa
2025-09-03 21:01       ` Peter Zijlstra
2025-09-03 23:12         ` Andrii Nakryiko
2025-09-04  7:56           ` Jiri Olsa
2025-09-04  9:39             ` Jann Horn
2025-09-04 14:03               ` Jiri Olsa
2025-09-04 18:32                 ` Andrii Nakryiko
2025-09-04  8:13     ` Jiri Olsa
2025-09-04 18:27       ` nop5-optimized USDTs WAS: " Andrii Nakryiko
2025-07-20 11:21 ` [PATCHv6 perf/core 10/22] uprobes/x86: Add support to optimize uprobes Jiri Olsa
2025-07-25 10:13   ` Masami Hiramatsu
2025-07-28 21:34     ` Jiri Olsa
2025-08-08 17:44       ` Jiri Olsa [this message]
2025-08-19 19:17       ` Peter Zijlstra
2025-08-20 12:19         ` Jiri Olsa
2025-08-19 19:15   ` Peter Zijlstra
2025-08-20 12:19     ` Jiri Olsa
2025-08-20 13:01       ` Peter Zijlstra
2025-08-20 12:30     ` Peter Zijlstra
2025-08-20 15:58       ` Edgecombe, Rick P
2025-08-20 17:12         ` Peter Zijlstra
2025-08-20 17:26           ` Edgecombe, Rick P
2025-08-20 17:43             ` Peter Zijlstra
2025-08-20 18:04               ` Edgecombe, Rick P
2025-08-20 21:38       ` Jiri Olsa
2025-09-03  6:48     ` Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 11/22] selftests/bpf: Import usdt.h from libbpf/usdt project Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 12/22] selftests/bpf: Reorg the uprobe_syscall test function Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 13/22] selftests/bpf: Rename uprobe_syscall_executed prog to test_uretprobe_multi Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 14/22] selftests/bpf: Add uprobe/usdt syscall tests Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 15/22] selftests/bpf: Add hit/attach/detach race optimized uprobe test Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 16/22] selftests/bpf: Add uprobe syscall sigill signal test Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 17/22] selftests/bpf: Add optimized usdt variant for basic usdt test Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 18/22] selftests/bpf: Add uprobe_regs_equal test Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 19/22] selftests/bpf: Change test_uretprobe_regs_change for uprobe and uretprobe Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 20/22] seccomp: passthrough uprobe systemcall without filtering Jiri Olsa
2025-07-20 11:21 ` [PATCHv6 perf/core 21/22] selftests/seccomp: validate uprobe syscall passes through seccomp Jiri Olsa
2025-07-20 11:21 ` [PATCHv5 22/22] man2: Add uprobe syscall page Jiri Olsa
