From: Jiri Olsa <olsajiri@gmail.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: "Jiri Olsa" <olsajiri@gmail.com>,
"Oleg Nesterov" <oleg@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Andrii Nakryiko" <andrii@kernel.org>,
bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, x86@kernel.org,
"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"Hao Luo" <haoluo@google.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Alan Maguire" <alan.maguire@oracle.com>,
"David Laight" <David.Laight@aculab.com>,
"Thomas Weißschuh" <thomas@t-8ch.de>,
"Ingo Molnar" <mingo@kernel.org>
Subject: Re: [PATCHv3 perf/core 08/22] uprobes/x86: Add mapping for optimized uprobe trampolines
Date: Fri, 27 Jun 2025 14:39:16 +0200
Message-ID: <aF6Q9NgCJx5p0MNJ@krava>
In-Reply-To: <20250627150145.15cdec0f4991a99f997a8168@kernel.org>
On Fri, Jun 27, 2025 at 03:01:45PM +0900, Masami Hiramatsu wrote:
SNIP
> > >
> > > > + return tramp;
> > > > + }
> > > > +
> > > > + tramp = create_uprobe_trampoline(vaddr);
> > > > + if (!tramp)
> > > > + return NULL;
> > > > +
> > > > + *new = true;
> > > > + hlist_add_head(&tramp->node, &state->head_tramps);
> > > > + return tramp;
> > > > +}
> > > > +
> > > > +static void destroy_uprobe_trampoline(struct uprobe_trampoline *tramp)
> > > > +{
> > > > + hlist_del(&tramp->node);
> > > > + kfree(tramp);
> > >
> > > Don't we need to unmap the tramp->vaddr?
> >
> > that's tricky because we have no way to make sure the application is
> > no longer executing the trampoline, it's described in the changelog
> > of following patch:
> >
> > uprobes/x86: Add support to optimize uprobes
> >
> > ...
> >
> > We do not unmap and release uprobe trampoline when it's no longer needed,
> > because there's no easy way to make sure none of the threads is still
> > inside the trampoline. But we do not waste memory, because there's just
> > single page for all the uprobe trampoline mappings.
> >
>
> I think we should put this as a code comment.
ok
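
A rough userspace sketch of where such a comment could sit. The names here (struct tramp, struct tramp_state, mapped_pages, tramp_create, tramp_destroy) are made up for illustration and are not the patch code; mapped_pages only stands in for the page that stays mapped in the process:

```c
#include <stdlib.h>

/* Hypothetical bookkeeping entry for one trampoline page. */
struct tramp {
	unsigned long vaddr;
	struct tramp *next;
};

/* Hypothetical per-process state; mapped_pages models the address space. */
struct tramp_state {
	struct tramp *head;
	unsigned int mapped_pages;
};

static struct tramp *tramp_create(struct tramp_state *st, unsigned long vaddr)
{
	struct tramp *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->vaddr = vaddr;
	t->next = st->head;
	st->head = t;
	st->mapped_pages++;	/* the page gets mapped once */
	return t;
}

static void tramp_destroy(struct tramp_state *st, struct tramp *t)
{
	struct tramp **pp;

	/*
	 * We do not unmap the trampoline page here, because there is no
	 * easy way to make sure none of the threads is still executing
	 * inside it. Only the bookkeeping entry is unlinked and freed.
	 */
	for (pp = &st->head; *pp; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			break;
		}
	}
	free(t);
}
```

In this model the entry is gone after tramp_destroy(), while mapped_pages is left untouched on purpose.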
>
> > We do waste frame on page mapping for every 4GB by keeping the uprobe
> > trampoline page mapped, but that seems ok.
>
> Hmm, this is not right with the current find_nearest_page(), because
> it always finds a page from the farthest +2GB range until it is full.
> Thus, in the worst case, if we hit uprobes in the order
> uprobe0 -> 1 -> 2, which are placed as below:
>
> 0x0abc0004 [uprobe2]
> ...
> 0x0abc2004 [uprobe1]
> ...
> 0x0abc4004 [uprobe0]
>
> Then the trampoline pages can be allocated as below.
>
> 0x8abc0000 [uprobe_tramp2]
> [gap]
> 0x8abc2000 [uprobe_tramp1]
> [gap]
> 0x8abc4000 [uprobe_tramp0]
>
> Using a true "find_nearest_page()", this will be mitigated. But the
> trampoline pages are not allocated for "every 4GB". So I think we
> should drop that part from the comment :)
I think you're right, it's better to start with the nearest page,
will change it in the new version
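
To double check the example above, here is a small userspace model of the two placement policies. It is only a sketch under assumptions: pick_farthest(), pick_nearest() and count_tramp_pages() are hypothetical helpers, the candidate page is assumed to be free, and the reachability check assumes a 5-byte call whose rel32 is relative to the end of the instruction. It does not reproduce the kernel's find_nearest_page():

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE	0x1000ULL
#define DEMO_PAGE_MASK	(~(DEMO_PAGE_SIZE - 1))
#define CALL_INSN_SIZE	5
#define SZ_2G		0x80000000ULL
#define MAX_TRAMPS	16

/* The call target must fit in a signed 32-bit offset from the next insn. */
static bool is_reachable_by_call(uint64_t tramp, uint64_t vaddr)
{
	int64_t delta = (int64_t)(tramp - (vaddr + CALL_INSN_SIZE));

	return delta >= INT32_MIN && delta <= INT32_MAX;
}

/* Model of the farthest-first placement: page at the far end of +2GB. */
static uint64_t pick_farthest(uint64_t vaddr)
{
	return (vaddr & DEMO_PAGE_MASK) + SZ_2G;
}

/* Model of a nearest-first placement: the page right after the probe. */
static uint64_t pick_nearest(uint64_t vaddr)
{
	return (vaddr & DEMO_PAGE_MASK) + DEMO_PAGE_SIZE;
}

/* Reuse a reachable trampoline page if there is one, else add a new one. */
static size_t count_tramp_pages(const uint64_t *probes, size_t n,
				uint64_t (*pick)(uint64_t))
{
	uint64_t tramps[MAX_TRAMPS];
	size_t nr = 0;

	for (size_t i = 0; i < n; i++) {
		bool found = false;

		for (size_t j = 0; j < nr && !found; j++)
			found = is_reachable_by_call(tramps[j], probes[i]);
		if (!found && nr < MAX_TRAMPS)
			tramps[nr++] = pick(probes[i]);
	}
	return nr;
}
```

With the probes hit in the order 0x0abc4004, 0x0abc2004, 0x0abc0004, the farthest-first model ends up with three trampoline pages (0x8abc4000, 0x8abc2000, 0x8abc0000), while the nearest-first model keeps a single page that all three probes can reach.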
SNIP
> > > > diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> > > > index 5080619560d4..b40d33aae016 100644
> > > > --- a/include/linux/uprobes.h
> > > > +++ b/include/linux/uprobes.h
> > > > @@ -17,6 +17,7 @@
> > > > #include <linux/wait.h>
> > > > #include <linux/timer.h>
> > > > #include <linux/seqlock.h>
> > > > +#include <linux/mutex.h>
> > > >
> > > > struct uprobe;
> > > > struct vm_area_struct;
> > > > @@ -185,6 +186,9 @@ struct xol_area;
> > > >
> > > > struct uprobes_state {
> > > > struct xol_area *xol_area;
> > > > +#ifdef CONFIG_X86_64
> > >
> > > Maybe we can introduce struct arch_uprobe_state{} here?
> >
> > ok, on top of that Andrii also asked for [1]:
> > - alloc 'struct uprobes_state' for mm_struct only when needed
> >
> > this could be part of that follow up? I'd rather not complicate this
> > patchset any further
> >
> > [1] https://lore.kernel.org/bpf/CAEf4BzY2zKPM9JHgn_wa8yCr8q5KntE5w8g=AoT2MnrD2Dx6gA@mail.gmail.com/
>
> Hmm, OK. But if you need to avoid #ifdef CONFIG_<arch>,
> you can use include/asm-generic to override macros.
>
> struct uprobes_state {
> 	struct xol_area *xol_area;
> 	uprobe_arch_specific_data
> };
>
>
> --- include/asm-generic/uprobes.h
>
> #define uprobe_arch_specific_data
>
> --- arch/x86/include/asm/uprobes.h
>
> #undef uprobe_arch_specific_data
> #define uprobe_arch_specific_data \
> 	struct hlist_head head_tramps;
ok
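
For reference, the override pattern can be tried out in a single userspace file. This is just a sketch: DEMO_ARCH_X86 is a made-up switch standing in for building on x86 (not a real Kconfig symbol), and the two commented sections stand in for the two headers named above:

```c
#include <stddef.h>

/* Made-up demo switch standing in for "building for x86". */
#define DEMO_ARCH_X86 1

struct hlist_node;
struct hlist_head { struct hlist_node *first; };
struct xol_area;

/* --- stands in for include/asm-generic/uprobes.h --- */
#define uprobe_arch_specific_data

/* --- stands in for arch/x86/include/asm/uprobes.h --- */
#if DEMO_ARCH_X86
#undef uprobe_arch_specific_data
#define uprobe_arch_specific_data \
	struct hlist_head head_tramps;
#endif

/* Generic code: no #ifdef CONFIG_<arch> inside the struct. */
struct uprobes_state {
	struct xol_area *xol_area;
	uprobe_arch_specific_data
};
```

The macro carries its own trailing semicolon, so the generic (empty) definition leaves no stray ';' inside the struct.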
SNIP
> > > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > > index 1ee8eb11f38b..7108ca558518 100644
> > > > --- a/kernel/fork.c
> > > > +++ b/kernel/fork.c
> > > > @@ -1010,6 +1010,7 @@ static void mm_init_uprobes_state(struct mm_struct *mm)
> > > > {
> > > > #ifdef CONFIG_UPROBES
> > > > mm->uprobes_state.xol_area = NULL;
> > > > + arch_uprobe_init_state(mm);
> > > > #endif
> > >
> > > Can't we make this uprobe_init_state(mm)?
> >
> > hum, there are other mm_init_* functions around, I guess we should keep
> > the same pattern?
> >
> > unless you mean s/arch_uprobe_init_state/uprobe_init_state/ but that's
> > arch code.. so probably not sure what you mean ;-)
>
> Ah, I misunderstood. Yeah, this part is good to me.
ok, thanks
jirka