public inbox for bpf@vger.kernel.org
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Tianyi Liu <i.pear@outlook.com>, Oleg Nesterov <oleg@redhat.com>
Cc: rostedt@goodmis.org, mathieu.desnoyers@efficios.com,
	flaniel@linux.microsoft.com, albancrequy@linux.microsoft.com,
	linux-trace-kernel@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH v2] tracing/uprobe: Add missing PID filter for uretprobe
Date: Sat, 24 Aug 2024 02:44:39 +0900
Message-ID: <20240824024439.a37c41bab87dbdf3d0486846@kernel.org>
In-Reply-To: <ME0P300MB0416034322B9915ECD3888649D882@ME0P300MB0416.AUSP300.PROD.OUTLOOK.COM>

On Fri, 23 Aug 2024 21:53:00 +0800
Tianyi Liu <i.pear@outlook.com> wrote:

> U(ret)probes are designed to be filterable by PID, which is the
> second parameter of the perf_event_open syscall. Currently, uprobe works
> well with this filtering, but uretprobe is not affected by it. As a
> result, users of uretprobe are often disturbed by events from processes
> they are not interested in.
> 
> We found that the filter function was not invoked when uretprobe was
> initially implemented, and this has been the case for ten years. We have
> tested the patch under our workload, binding eBPF programs to uretprobe
> tracepoints, and confirmed that it resolves our problem.

Is this an eBPF-related problem? It seems that perf record alone should
also be affected. Let me try.
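
For reference, the PID filter mentioned above is the second argument of
perf_event_open(2). Below is a minimal sketch of where it enters. The
software clock event and the sketch_* helper names are assumptions made
only for illustration; a real uretprobe event would use the uprobe PMU
type together with the binary path and offset instead.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Fill a minimal attr. The software CPU-clock event is an assumption
 * chosen to keep the sketch short; it is not what a uretprobe uses.
 */
static void sketch_fill_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_CPU_CLOCK;
	attr->disabled = 1;
}

/*
 * Open the event for one task. `pid` is the second argument:
 * pid > 0 with cpu == -1 means "this task, on any CPU".
 * Returns the event fd, or -1 with errno set (for example when
 * perf_event_paranoid forbids the call).
 */
static int sketch_open_for_pid(pid_t pid)
{
	struct perf_event_attr attr;

	sketch_fill_attr(&attr);
	return (int)syscall(SYS_perf_event_open, &attr, pid,
			    -1 /* cpu */, -1 /* group_fd */, 0UL /* flags */);
}
```

Passing pid == -1 with a valid cpu instead would ask for all tasks on
that CPU, which is the system-wide case where no PID filtering is
expected.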


> 
> Following are the steps to reproduce the issue:
> 
> Step 1. Compile the following reproducer program:
> ```
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> 
> int main() {
>     printf("pid: %d\n", getpid());
>     while (1) {
>         sleep(2);
>         void *ptr = malloc(1024);
>         free(ptr);
>     }
> }
> ```
> We will then use uretprobe to trace the `malloc` function.

OK, and run perf probe to add an event on malloc's return.

$ sudo ~/bin/perf probe -x ./malloc-run --add malloc%return  
Added new event:
  probe_malloc:malloc__return (on malloc%return in /home/mhiramat/ksrc/linux/malloc-run)

You can now use it in all perf tools, such as:

	perf record -e probe_malloc:malloc__return -aR sleep 1

> 
> Step 2. Run two instances of the reproducer program and record their PIDs.

$ ./malloc-run &  ./malloc-run &
[1] 93927
[2] 93928
pid: 93927
pid: 93928

Then trace one of them:

$ sudo ~/bin/perf trace record -e probe_malloc:malloc__return  -p 93928
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data (9 samples) ]

And dump the data:

$ sudo ~/bin/perf script
      malloc-run   93928 [004] 351736.730649:       raw_syscalls:sys_exit: NR 230 = 0
      malloc-run   93928 [004] 351736.730694: probe_malloc:malloc__return: (561cfdeb30c0 <- 561cfdeb3204)
      malloc-run   93928 [004] 351736.730696:      raw_syscalls:sys_enter: NR 230 (0, 0, 7ffc7a5c5380, 7ffc7a5c5380, 561d2940f6b0,
      malloc-run   93928 [004] 351738.730857:       raw_syscalls:sys_exit: NR 230 = 0
      malloc-run   93928 [004] 351738.730869: probe_malloc:malloc__return: (561cfdeb30c0 <- 561cfdeb3204)
      malloc-run   93928 [004] 351738.730883:      raw_syscalls:sys_enter: NR 230 (0, 0, 7ffc7a5c5380, 7ffc7a5c5380, 561d2940f6b0,
      malloc-run   93928 [004] 351740.731110:       raw_syscalls:sys_exit: NR 230 = 0
      malloc-run   93928 [004] 351740.731125: probe_malloc:malloc__return: (561cfdeb30c0 <- 561cfdeb3204)
      malloc-run   93928 [004] 351740.731127:      raw_syscalls:sys_enter: NR 230 (0, 0, 7ffc7a5c5380, 7ffc7a5c5380, 561d2940f6b0,

Hmm, it seems to trace only one PID's data (without this change).
If this changes eBPF behavior, I would like to involve the eBPF people and
ask whether this is OK. From the viewpoint of the perf tool, the current
code works.

But I agree that the current code is a bit strange. Oleg, do you know anything?

Thank you,

> 
> Step 3. Use uretprobe to trace each of the two running reproducers
> separately. We use bpftrace to make it easier to reproduce. Please run two
> instances of bpftrace simultaneously: the first instance filters events
> from PID1, and the second instance filters events from PID2.
> 
> The expected behavior is that each bpftrace instance would only print
> events matching its respective PID filter. However, in practice, both
> bpftrace instances receive events from both processes; the PID filter is
> currently ineffective:
> 
> Before:
> ```
> PID1=55256
> bpftrace -p $PID1 -e 'uretprobe:libc:malloc { printf("time=%llu pid=%d\n", elapsed / 1000000000, pid); }'
> Attaching 1 probe...
> time=0 pid=55256
> time=2 pid=55273
> time=2 pid=55256
> time=4 pid=55273
> time=4 pid=55256
> time=6 pid=55273
> time=6 pid=55256
> 
> PID2=55273
> bpftrace -p $PID2 -e 'uretprobe:libc:malloc { printf("time=%llu pid=%d\n", elapsed / 1000000000, pid); }'
> Attaching 1 probe...
> time=0 pid=55273
> time=0 pid=55256
> time=2 pid=55273
> time=2 pid=55256
> time=4 pid=55273
> time=4 pid=55256
> time=6 pid=55273
> time=6 pid=55256
> ```
> 
> After: Both bpftrace instances will show the expected behavior, only
> printing events from the PID specified by their respective filters:
> ```
> PID1=1621
> bpftrace -p $PID1 -e 'uretprobe:libc:malloc { printf("time=%llu pid=%d\n", elapsed / 1000000000, pid); }'
> Attaching 1 probe...
> time=0 pid=1621
> time=2 pid=1621
> time=4 pid=1621
> time=6 pid=1621
> 
> PID2=1633
> bpftrace -p $PID2 -e 'uretprobe:libc:malloc { printf("time=%llu pid=%d\n", elapsed / 1000000000, pid); }'
> Attaching 1 probe...
> time=0 pid=1633
> time=2 pid=1633
> time=4 pid=1633
> time=6 pid=1633
> ```
> 
> Fixes: c1ae5c75e103 ("uprobes/tracing: Introduce is_ret_probe() and uretprobe_dispatcher()")
> Cc: Alban Crequy <albancrequy@linux.microsoft.com>
> Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
> Signed-off-by: Tianyi Liu <i.pear@outlook.com>
> ---
> Changes in v2:
> - Drop cover letter and update commit message.
> - Link to v1: https://lore.kernel.org/linux-trace-kernel/ME0P300MB04166144CDF92A72B9E1BAEA9D8F2@ME0P300MB0416.AUSP300.PROD.OUTLOOK.COM/
> ---
>  kernel/trace/trace_uprobe.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index c98e3b3386ba..c7e2a0962928 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -1443,6 +1443,9 @@ static void uretprobe_perf_func(struct trace_uprobe *tu, unsigned long func,
>  				struct pt_regs *regs,
>  				struct uprobe_cpu_buffer **ucbp)
>  {
> +	if (!uprobe_perf_filter(&tu->consumer, 0, current->mm))
> +		return;
> +
>  	__uprobe_perf_func(tu, func, regs, ucbp);
>  }
>  
> -- 
> 2.34.1
> 
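
As an aside, the effect of the added check can be modelled in userspace.
This is only a sketch under stated assumptions: the model_* names are made
up for illustration and are not kernel APIs, and matching on an integer
PID stands in for the mm (current->mm) that the patch passes to
uprobe_perf_filter().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_filter {
	bool systemwide;	/* true: no PID was given, accept everything */
	const int *pids;	/* PIDs this consumer asked for */
	size_t nr_pids;
};

/* Return true if an event from `pid` should reach the consumer. */
static bool model_filter_match(const struct model_filter *f, int pid)
{
	if (f->systemwide)
		return true;
	for (size_t i = 0; i < f->nr_pids; i++)
		if (f->pids[i] == pid)
			return true;
	return false;
}

/*
 * Count how many of the `n` return hits in `hits` are delivered.
 * apply_filter == false models a handler without the early return,
 * apply_filter == true models a handler with it.
 */
static size_t model_deliver(const struct model_filter *f, const int *hits,
			    size_t n, bool apply_filter)
{
	size_t delivered = 0;

	for (size_t i = 0; i < n; i++) {
		if (apply_filter && !model_filter_match(f, hits[i]))
			continue;
		delivered++;
	}
	return delivered;
}
```

With a filter for PID 55256 only and hits alternating between 55256 and
55273, model_deliver() with apply_filter == false counts every hit, while
apply_filter == true counts only the hits from 55256. That mirrors the
"Before" and "After" bpftrace output quoted above.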


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>
