From: Jiri Olsa <olsajiri@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
Hao Luo <haoluo@google.com>, Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Alan Maguire <alan.maguire@oracle.com>,
linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH bpf-next 05/13] uprobes: Add mapping for optimized uprobe trampolines
Date: Fri, 13 Dec 2024 14:42:01 +0100 [thread overview]
Message-ID: <Z1w5qXERTJV9hQ9p@krava> (raw)
In-Reply-To: <CAEf4BzZEPdGxjHjPGr-4qKFju+roOiAVrMhTuviozmcP1-qojw@mail.gmail.com>
On Thu, Dec 12, 2024 at 05:01:52PM -0800, Andrii Nakryiko wrote:
SNIP
> > ---
> > include/linux/uprobes.h | 12 +++++
> > kernel/events/uprobes.c | 114 ++++++++++++++++++++++++++++++++++++++++
> > kernel/fork.c | 1 +
> > 3 files changed, 127 insertions(+)
> >
>
> Ran out of time for today, will continue tomorrow for the rest of
> patches. Some comments below.
thanks!
>
> The numbers are really encouraging, though!
>
> > diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
> > index 8843b7f99ed0..c4ee755ca2a1 100644
> > --- a/include/linux/uprobes.h
> > +++ b/include/linux/uprobes.h
> > @@ -16,6 +16,7 @@
> > #include <linux/types.h>
> > #include <linux/wait.h>
> > #include <linux/timer.h>
> > +#include <linux/mutex.h>
> >
> > struct uprobe;
> > struct vm_area_struct;
> > @@ -172,6 +173,13 @@ struct xol_area;
> >
> > struct uprobes_state {
> > struct xol_area *xol_area;
> > + struct hlist_head tramp_head;
> > +};
> > +
>
> should we make uprobe_state be linked by a pointer from mm_struct
> instead of increasing mm for each added field? right now it's
> embedded, I don't think it's problematic to allocate it on demand and
> keep it until mm_struct is freed
seems like good idea, I'll check on that
>
> > +struct uprobe_trampoline {
> > + struct hlist_node node;
> > + unsigned long vaddr;
> > + atomic64_t ref;
> > };
> >
> > extern void __init uprobes_init(void);
> > @@ -220,6 +228,10 @@ extern int arch_uprobe_verify_opcode(struct arch_uprobe *auprobe, struct page *p
> > unsigned long vaddr, uprobe_opcode_t *new_opcode,
> > int nbytes);
> > extern bool arch_uprobe_is_register(uprobe_opcode_t *insn, int nbytes);
> > +extern struct uprobe_trampoline *uprobe_trampoline_get(unsigned long vaddr);
> > +extern void uprobe_trampoline_put(struct uprobe_trampoline *area);
> > +extern bool arch_uprobe_is_callable(unsigned long vtramp, unsigned long vaddr);
> > +extern const struct vm_special_mapping *arch_uprobe_trampoline_mapping(void);
> > #else /* !CONFIG_UPROBES */
> > struct uprobes_state {
> > };
> > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> > index 8068f91de9e3..f57918c624da 100644
> > --- a/kernel/events/uprobes.c
> > +++ b/kernel/events/uprobes.c
> > @@ -615,6 +615,118 @@ set_orig_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long v
> > (uprobe_opcode_t *)&auprobe->insn, UPROBE_SWBP_INSN_SIZE);
> > }
> >
> > +bool __weak arch_uprobe_is_callable(unsigned long vtramp, unsigned long vaddr)
>
> bikeshedding some more, I still find "is_callable" confusing. How
> about "is_reachable_by_call"? slightly verbose, but probably more
> meaningful?
yep, more precise, will change
>
> > +{
> > + return false;
> > +}
> > +
> > +const struct vm_special_mapping * __weak arch_uprobe_trampoline_mapping(void)
> > +{
> > + return NULL;
> > +}
> > +
> > +static unsigned long find_nearest_page(unsigned long vaddr)
> > +{
> > + struct mm_struct *mm = current->mm;
> > + struct vm_area_struct *vma, *prev;
> > + VMA_ITERATOR(vmi, mm, 0);
> > +
> > + prev = vma_next(&vmi);
>
> minor: we are missing an opportunity to add something between
> [PAGE_SIZE, <first_vma_start>). Probably fine, but why not?
true, will add that check
>
> > + vma = vma_next(&vmi);
> > + while (vma) {
> > + if (vma->vm_start - prev->vm_end >= PAGE_SIZE) {
> > + if (arch_uprobe_is_callable(prev->vm_end, vaddr))
> > + return prev->vm_end;
> > + if (arch_uprobe_is_callable(vma->vm_start - PAGE_SIZE, vaddr))
> > + return vma->vm_start - PAGE_SIZE;
> > + }
> > +
> > + prev = vma;
> > + vma = vma_next(&vmi);
> > + }
> > +
> > + return 0;
> > +}
> > +
>
> [...]
>
> > +struct uprobe_trampoline *uprobe_trampoline_get(unsigned long vaddr)
> > +{
> > + struct uprobes_state *state = ¤t->mm->uprobes_state;
> > + struct uprobe_trampoline *tramp = NULL;
> > +
> > + hlist_for_each_entry(tramp, &state->tramp_head, node) {
> > + if (arch_uprobe_is_callable(tramp->vaddr, vaddr)) {
> > + atomic64_inc(&tramp->ref);
> > + return tramp;
> > + }
> > + }
> > +
> > + tramp = create_uprobe_trampoline(vaddr);
> > + if (!tramp)
> > + return NULL;
> > +
> > + hlist_add_head(&tramp->node, &state->tramp_head);
> > + return tramp;
> > +}
> > +
> > +static void destroy_uprobe_trampoline(struct uprobe_trampoline *tramp)
> > +{
> > + hlist_del(&tramp->node);
> > + kfree(tramp);
>
> hmm... shouldn't this be RCU-delayed (RCU Tasks Trace for uprobes),
> otherwise we might have some CPU executing code in that trampoline,
> no?
so we call destroy_uprobe_trampoline in 2 scenarios:
- from uprobe_trampoline_put (in __arch_uprobe_optimize) when we failed
to optimize the uprobe, so no task can execute it at that point
- from clear_tramp_head as part of the uprobe trampolines cleanup
(__mmput -> uprobe_clear_state) at which point the task should be dead
jirka
>
> > +}
> > +
>
> [...]
next prev parent reply other threads:[~2024-12-13 13:42 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-11 13:33 [PATCH bpf-next 00/13] uprobes: Add support to optimize usdt probes on x86_64 Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 01/13] uprobes: Rename arch_uretprobe_trampoline function Jiri Olsa
2024-12-13 0:42 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 02/13] uprobes: Make copy_from_page global Jiri Olsa
2024-12-13 0:43 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 03/13] uprobes: Add nbytes argument to uprobe_write_opcode Jiri Olsa
2024-12-13 0:45 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 04/13] uprobes: Add arch_uprobe_verify_opcode function Jiri Olsa
2024-12-13 0:48 ` Andrii Nakryiko
2024-12-13 13:21 ` Jiri Olsa
2024-12-13 21:11 ` Andrii Nakryiko
2024-12-13 21:52 ` Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 05/13] uprobes: Add mapping for optimized uprobe trampolines Jiri Olsa
2024-12-13 1:01 ` Andrii Nakryiko
2024-12-13 13:42 ` Jiri Olsa [this message]
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 06/13] uprobes/x86: Add uprobe syscall to speed up uprobe Jiri Olsa
2024-12-13 13:48 ` Thomas Weißschuh
2024-12-13 14:51 ` Jiri Olsa
2024-12-13 15:12 ` Thomas Weißschuh
2024-12-13 21:52 ` Jiri Olsa
2024-12-14 13:21 ` Thomas Weißschuh
2024-12-16 8:03 ` Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 07/13] uprobes/x86: Add support to emulate nop5 instruction Jiri Olsa
2024-12-13 10:45 ` Peter Zijlstra
2024-12-13 13:02 ` Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 08/13] uprobes/x86: Add support to optimize uprobes Jiri Olsa
2024-12-13 10:49 ` Peter Zijlstra
2024-12-13 13:06 ` Jiri Olsa
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-15 12:06 ` David Laight
2024-12-15 14:14 ` Oleg Nesterov
2024-12-16 8:08 ` Jiri Olsa
2024-12-16 9:18 ` David Laight
2024-12-16 10:12 ` Oleg Nesterov
2024-12-16 11:10 ` David Laight
2024-12-16 12:22 ` Oleg Nesterov
2024-12-16 12:50 ` Jiri Olsa
2024-12-16 15:08 ` David Laight
2024-12-16 16:06 ` Jiri Olsa
2024-12-11 13:33 ` [PATCH bpf-next 09/13] selftests/bpf: Use 5-byte nop for x86 usdt probes Jiri Olsa
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-16 8:32 ` Jiri Olsa
2024-12-16 23:06 ` Andrii Nakryiko
2024-12-11 13:33 ` [PATCH bpf-next 10/13] selftests/bpf: Add uprobe/usdt optimized test Jiri Olsa
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-16 7:58 ` Jiri Olsa
2024-12-11 13:34 ` [PATCH bpf-next 11/13] selftests/bpf: Add hit/attach/detach race optimized uprobe test Jiri Olsa
2024-12-13 21:58 ` Andrii Nakryiko
2024-12-16 7:59 ` Jiri Olsa
2024-12-11 13:34 ` [PATCH bpf-next 12/13] selftests/bpf: Add uprobe syscall sigill signal test Jiri Olsa
2024-12-11 13:34 ` [PATCH bpf-next 13/13] selftests/bpf: Add 5-byte nop uprobe trigger bench Jiri Olsa
2024-12-13 21:57 ` Andrii Nakryiko
2024-12-16 7:56 ` Jiri Olsa
2024-12-13 0:43 ` [PATCH bpf-next 00/13] uprobes: Add support to optimize usdt probes on x86_64 Andrii Nakryiko
2024-12-13 9:46 ` Jiri Olsa
2024-12-13 10:51 ` Peter Zijlstra
2024-12-13 13:07 ` Jiri Olsa
2024-12-13 13:54 ` Peter Zijlstra
2024-12-13 14:05 ` Jiri Olsa
2024-12-13 18:39 ` Peter Zijlstra
2024-12-13 21:52 ` Jiri Olsa
2024-12-13 21:59 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z1w5qXERTJV9hQ9p@krava \
--to=olsajiri@gmail.com \
--cc=alan.maguire@oracle.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=songliubraving@fb.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.