From: Jiri Olsa <olsajiri@gmail.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Jiri Olsa <olsajiri@gmail.com>, Tianyi Liu <i.pear@outlook.com>,
andrii.nakryiko@gmail.com, mhiramat@kernel.org, ajor@meta.com,
albancrequy@linux.microsoft.com, bpf@vger.kernel.org,
flaniel@linux.microsoft.com, linux-trace-kernel@vger.kernel.org,
linux@jordanrome.com, mathieu.desnoyers@efficios.com
Subject: Re: [PATCH v2] tracing/uprobe: Add missing PID filter for uretprobe
Date: Mon, 26 Aug 2024 15:48:30 +0200 [thread overview]
Message-ID: <ZsyHrhG9Q5BpZ1ae@krava> (raw)
In-Reply-To: <20240826115752.GA21268@redhat.com>
On Mon, Aug 26, 2024 at 01:57:52PM +0200, Oleg Nesterov wrote:
> On 08/26, Jiri Olsa wrote:
> >
> > On Mon, Aug 26, 2024 at 12:40:18AM +0200, Oleg Nesterov wrote:
> > > $ ./test &
> > > $ bpftrace -p $! -e 'uprobe:./test:func { printf("%d\n", pid); }'
> > >
> > > I hope that the syntax of the 2nd command is correct...
> > >
> > > I _think_ that it will print 2 pids too.
> >
> > yes.. but with CLONE_VM both processes share 'mm'
>
> Yes sure,
>
> > so they are threads,
>
> Well this depends on definition ;) but the CLONE_VM child is not a sub-thread,
> it has another TGID. See below.
>
> > and at least uprobe_multi filters by process [1] now.. ;-)
>
> OK, if you say that this behaviour is fine I won't argue, I simply do not know.
> But see below.
>
> > > But "perf-record -p" works as expected.
> >
> > I wonder it's because there's the perf layer that schedules each
> > uprobe event only when its process (PID1/2) is scheduled in and will
> > receive events only from that cpu while the process is running on it
>
> Not sure I understand... The task which hits the breakpoint is always
> current, it is always scheduled in.
hum, I might be missing something, but ;-)
assuming we have 2 tasks, each with perf uprobe event assigned
in perf path there's uprobe_perf_func which calls the uprobe_perf_filter
and if it returns true it then goes:
uprobe_perf_func
__uprobe_perf_func
perf_trace_buf_submit
perf_tp_event
{
hlist_for_each_entry_rcu(event, head, hlist_entry) {
if (perf_tp_event_match(event, &data, regs)) {
perf_swevent_event(event, count, &data, regs);
}
so it will store the event only to perf event which is added for
the task that is currently scheduled in, so it's not stored to the
other task event
in comparison with uprobe_multi path, where uprobe_multi_link_filter
will pass and then it goes:
handler_chain
uprobe_multi_link_handler
bpf_prog_run
- executes bpf program
the bpf program will get executed for uprobe from both tasks
>
> The main purpose of uprobe_perf_func()->uprobe_perf_filter() is NOT that
> we want to avoid __uprobe_perf_func() although this makes sense.
>
> The main purpose is that we want to remove the breakpoints in current->mm
> when uprobe_perf_filter() returns false, that is why UPROBE_HANDLER_REMOVE.
> IOW, the main purpose is not penalise user-space unnecessarily.
>
> IIUC (but I am not sure), perf-record -p will work "correctly" even if we
> remove uprobe_perf_filter() altogether. IIRC the perf layer does its own
> filtering but I forgot everything.
I think that's what I tried to describe above
>
> And this makes me think that perhaps BPF can't rely on uprobe_perf_filter()
> either, even we forget about ret-probes.
>
> > [1] 46ba0e49b642 bpf: fix multi-uprobe PID filtering logic
>
> Looks obviously wrong... get_pid_task(PIDTYPE_TGID) can return a zombie
> leader with ->mm == NULL while other threads and thus the whole process
> is still alive.
>
> And again, the changelog says "the intent for PID filtering it to filter by
> *process*", but clone(CLONE_VM) creates another process, not a thread.
>
> So perhaps we need
>
> - if (link->task && current->mm != link->task->mm)
> + if (link->task && !same_thread_group(current, link->task))
that seems correct, need to check
>
> in uprobe_prog_run() to make "filter by *process*" true, but this won't
> fix the problem with link->task->mm == NULL in uprobe_multi_link_filter().
>
>
> Does bpftrace use bpf_uprobe_multi_link_attach/etc ? I guess not...
> Then which userspace tool uses this code? ;)
yes, it will trigger if you attach to multiple uprobes, like with your
test example with:
# bpftrace -p xxx -e 'uprobe:./ex:func* { printf("%d\n", pid); }'
thanks,
jirka
next prev parent reply other threads:[~2024-08-26 13:48 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-23 13:53 [PATCH v2] tracing/uprobe: Add missing PID filter for uretprobe Tianyi Liu
2024-08-23 17:44 ` Masami Hiramatsu
2024-08-23 19:07 ` Andrii Nakryiko
2024-08-24 5:49 ` Tianyi Liu
2024-08-24 17:27 ` Masami Hiramatsu
2024-08-25 17:14 ` Oleg Nesterov
2024-08-25 18:43 ` Oleg Nesterov
2024-08-25 22:40 ` Oleg Nesterov
2024-08-26 10:05 ` Jiri Olsa
2024-08-26 11:57 ` Oleg Nesterov
2024-08-26 12:24 ` Oleg Nesterov
2024-08-26 13:48 ` Jiri Olsa [this message]
2024-08-26 18:56 ` Oleg Nesterov
2024-08-26 21:25 ` Oleg Nesterov
2024-08-26 22:01 ` Jiri Olsa
2024-08-26 22:08 ` Andrii Nakryiko
2024-08-26 22:29 ` Oleg Nesterov
2024-08-27 13:07 ` Jiri Olsa
2024-08-27 13:45 ` Jiri Olsa
2024-08-27 16:45 ` Oleg Nesterov
2024-08-28 11:40 ` Jiri Olsa
2024-08-27 20:19 ` Oleg Nesterov
2024-08-28 11:46 ` Jiri Olsa
2024-08-29 15:20 ` Oleg Nesterov
2024-08-29 19:46 ` Jiri Olsa
2024-08-29 21:12 ` Oleg Nesterov
2024-08-29 23:22 ` Jiri Olsa
2024-08-27 6:27 ` Tianyi Liu
2024-08-27 10:08 ` Jiri Olsa
2024-08-27 10:20 ` Jiri Olsa
2024-08-27 10:54 ` Oleg Nesterov
2024-08-27 10:40 ` Oleg Nesterov
2024-08-27 13:32 ` Jiri Olsa
2024-08-27 14:26 ` Oleg Nesterov
2024-08-27 14:41 ` Jiri Olsa
2024-08-26 14:52 ` Tianyi Liu
2024-08-25 17:00 ` Oleg Nesterov
2024-08-30 10:12 ` Oleg Nesterov
2024-08-30 12:23 ` Oleg Nesterov
2024-08-30 13:34 ` Jiri Olsa
2024-08-30 15:51 ` Andrii Nakryiko
2024-09-02 9:11 ` Jiri Olsa
2024-09-03 18:09 ` Andrii Nakryiko
2024-09-03 18:11 ` Andrii Nakryiko
2024-09-03 19:15 ` Jiri Olsa
2024-09-01 19:22 ` Tianyi Liu
2024-09-01 23:26 ` Oleg Nesterov
2024-09-02 17:17 ` Oleg Nesterov
2024-09-03 14:33 ` Jiri Olsa
2024-09-06 10:43 ` Jiri Olsa
2024-09-06 19:18 ` Oleg Nesterov
2024-09-09 10:41 ` Jiri Olsa
2024-09-09 18:34 ` Oleg Nesterov
2024-09-10 8:45 ` Jiri Olsa
2024-09-07 19:19 ` Tianyi Liu
2024-09-08 13:15 ` Oleg Nesterov
2024-09-09 1:16 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZsyHrhG9Q5BpZ1ae@krava \
--to=olsajiri@gmail.com \
--cc=ajor@meta.com \
--cc=albancrequy@linux.microsoft.com \
--cc=andrii.nakryiko@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=flaniel@linux.microsoft.com \
--cc=i.pear@outlook.com \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=linux@jordanrome.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=oleg@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).