public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Jiri Olsa <olsajiri@gmail.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Tianyi Liu <i.pear@outlook.com>,
	andrii.nakryiko@gmail.com, ajor@meta.com,
	albancrequy@linux.microsoft.com, bpf@vger.kernel.org,
	flaniel@linux.microsoft.com, linux-trace-kernel@vger.kernel.org,
	linux@jordanrome.com, mathieu.desnoyers@efficios.com,
	mhiramat@kernel.org, rostedt@goodmis.org
Subject: Re: [PATCH v2] tracing/uprobe: Add missing PID filter for uretprobe
Date: Tue, 3 Sep 2024 17:33:50 +0300	[thread overview]
Message-ID: <ZtceTuu8E4hHZr2P@krava> (raw)
In-Reply-To: <20240902171745.GC26532@redhat.com>

On Mon, Sep 02, 2024 at 07:17:45PM +0200, Oleg Nesterov wrote:
> On 09/02, Oleg Nesterov wrote:
> >
> > And... I think that BPF has even more problems with filtering. Not sure,
> > I'll try to write another test-case tomorrow.
> 
> See below. This test-case needs a one-liner patch at the end, but this is only
> because I have no idea how to add BPF_EMIT_CALL(BPF_FUNC_trace_printk) into
> "struct bpf_insn insns[]" correctly. Is there a simple-to-use user-space tool
> which can translate 'bpf_trace_printk("Hello world\n", 13)' into bpf_insn[] ???
> 
> So. The CONFIG_BPF_EVENTS code in __uprobe_perf_func() assumes that it "owns"
> tu->consumer and uprobe_perf_filter(), but this is not true in general.
> 
> test.c:
> 	#include <unistd.h>
> 
> 	int func(int i)
> 	{
> 		return i;
> 	}
> 
> 	int main(void)
> 	{
> 		int i;
> 		for (i = 0;; ++i) {
> 			sleep(1);
> 			func(i);
> 		}
> 		return 0;
> 	}
> 
> run_prog.c
> 	// cc -I./tools/include -I./tools/include/uapi -Wall
> 	#include "./include/generated/uapi/linux/version.h"
> 	#include <linux/perf_event.h>
> 	#include <linux/filter.h>
> 	#define _GNU_SOURCE
> 	#include <sys/syscall.h>
> 	#include <sys/ioctl.h>
> 	#include <assert.h>
> 	#include <unistd.h>
> 	#include <stdlib.h>
> 
> 	int prog_load(void)
> 	{
> 		struct bpf_insn insns[] = {
> 			BPF_MOV64_IMM(BPF_REG_0, 0),
> 			BPF_EXIT_INSN(),
> 		};
> 
> 		union bpf_attr attr = {
> 			.prog_type	= BPF_PROG_TYPE_KPROBE,
> 			.insns		= (unsigned long)insns,
> 			.insn_cnt	= sizeof(insns) / sizeof(insns[0]),
> 			.license	= (unsigned long)"GPL",
> 			.kern_version	= LINUX_VERSION_CODE, // unneeded
> 		};
> 
> 		return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
> 	}
> 
> 	void run_probe(int eid, int pid)
> 	{
> 		struct perf_event_attr attr = {
> 			.type	= PERF_TYPE_TRACEPOINT,
> 			.config	= eid,
> 			.size	= sizeof(struct perf_event_attr),
> 		};
> 
> 		int fd = syscall(__NR_perf_event_open, &attr, pid, 0, -1, 0);
> 		assert(fd >= 0);
> 
> 		int pfd = prog_load();
> 		assert(pfd >= 0);
> 
> 		assert(ioctl(fd, PERF_EVENT_IOC_SET_BPF, pfd) == 0);
> 		assert(ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) == 0);
> 
> 		for (;;)
> 			pause();
> 	}
> 
> 	int main(int argc, const char *argv[])
> 	{
> 		int eid = atoi(argv[1]);
> 		int pid = atoi(argv[2]);
> 		run_probe(eid, pid);
> 		return 0;
> 	}
> 
> Now,
> 
> 	$ ./test &
> 	$ PID1=$!
> 	$ ./test &
> 	$ PID2=$!
> 	$ perf probe -x ./test -a func
> 	$ ./run_prog `cat /sys/kernel/debug/tracing/events/probe_test/func/id` $PID1 &
> 
> dmesg -c:
> 	trace_uprobe: BPF_FUNC: pid=50 ret=0
> 	trace_uprobe: BPF_FUNC: pid=50 ret=0
> 	trace_uprobe: BPF_FUNC: pid=50 ret=0
> 	trace_uprobe: BPF_FUNC: pid=50 ret=0
> 	...
> 
> So far so good. Now,
> 
> 	$ perf record -e probe_test:func -p $PID2 -- sleep 10 &
> 
> This creates another PERF_TYPE_TRACEPOINT perf_event which shares
> trace_uprobe/consumer/filter with the perf_event created by run_prog.
> 
> dmesg -c:
> 	trace_uprobe: BPF_FUNC: pid=51 ret=0
> 	trace_uprobe: BPF_FUNC: pid=50 ret=0
> 	trace_uprobe: BPF_FUNC: pid=51 ret=0
> 	trace_uprobe: BPF_FUNC: pid=50 ret=0
> 	...
> 
> until perf-record exits. and after that
> 
> 	$ perf script
> 
> reports nothing.
> 
> So, in this case:
> 
> 	- run_prog's bpf program is called when current->pid == $PID2, this patch
> 	  (or any other change in trace_uprobe.c) can't help.
> 
> 	- run_prog's bpf program "steals" __uprobe_perf_func() from /usr/bin/perf

ok, there's just one call instance (struct trace_event_call) shared
with all the uprobe tracepoints, so if there's bpf program attached
to any uprobe tracepoint, it will not continue and send the perf event
for any other uprobe tracepoint (without the bpf program attached)

I think there might be similar issue with tracepoints and kprobes

> 
> and to me this is yet another indication that we need some changes in the
> bpf_prog_run_array_uprobe() paths, or even in the user-space bpftrace/whatever's
> code.
> 
> And. Why the "if (bpf_prog_array_valid(call))" block in __uprobe_perf_func() returns?
> Why this depends on "ret == 0" ??? I fail to understand this logic.

I'd think the intention was that if there's bpf program we don't emit
the perf event.. and we never thought about having them (event with bpf
program and standard perf event) co-existing together

problem is that the perf event pid filtering is implemented through the
call->perf_events list 

jirka

  reply	other threads:[~2024-09-03 14:33 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <ME0P300MB0416034322B9915ECD3888649D882@ME0P300MB0416.AUSP300.PROD.OUTLOOK.COM>
2024-08-23 17:44 ` [PATCH v2] tracing/uprobe: Add missing PID filter for uretprobe Masami Hiramatsu
2024-08-23 19:07   ` Andrii Nakryiko
2024-08-24  5:49     ` Tianyi Liu
2024-08-24 17:27       ` Masami Hiramatsu
2024-08-25 17:14       ` Oleg Nesterov
2024-08-25 18:43         ` Oleg Nesterov
2024-08-25 22:40         ` Oleg Nesterov
2024-08-26 10:05           ` Jiri Olsa
2024-08-26 11:57             ` Oleg Nesterov
2024-08-26 12:24               ` Oleg Nesterov
2024-08-26 13:48               ` Jiri Olsa
2024-08-26 18:56                 ` Oleg Nesterov
2024-08-26 21:25                 ` Oleg Nesterov
2024-08-26 22:01                   ` Jiri Olsa
2024-08-26 22:08                     ` Andrii Nakryiko
2024-08-26 22:29                     ` Oleg Nesterov
2024-08-27 13:07                       ` Jiri Olsa
2024-08-27 13:45                         ` Jiri Olsa
2024-08-27 16:45                         ` Oleg Nesterov
2024-08-28 11:40                           ` Jiri Olsa
2024-08-27 20:19                         ` Oleg Nesterov
2024-08-28 11:46                           ` Jiri Olsa
2024-08-29 15:20                             ` Oleg Nesterov
2024-08-29 19:46                               ` Jiri Olsa
2024-08-29 21:12                                 ` Oleg Nesterov
2024-08-29 23:22                                   ` Jiri Olsa
2024-08-27  6:27                   ` Tianyi Liu
2024-08-27 10:08               ` Jiri Olsa
2024-08-27 10:20                 ` Jiri Olsa
2024-08-27 10:54                   ` Oleg Nesterov
2024-08-27 10:40                 ` Oleg Nesterov
2024-08-27 13:32                   ` Jiri Olsa
2024-08-27 14:26                     ` Oleg Nesterov
2024-08-27 14:41                       ` Jiri Olsa
2024-08-26 14:52           ` Tianyi Liu
2024-08-25 17:00     ` Oleg Nesterov
2024-08-30 10:12 ` Oleg Nesterov
2024-08-30 12:23   ` Oleg Nesterov
2024-08-30 13:34   ` Jiri Olsa
2024-08-30 15:51     ` Andrii Nakryiko
2024-09-02  9:11       ` Jiri Olsa
2024-09-03 18:09         ` Andrii Nakryiko
2024-09-03 18:11           ` Andrii Nakryiko
2024-09-03 19:15             ` Jiri Olsa
2024-09-01 19:22   ` Tianyi Liu
2024-09-01 23:26     ` Oleg Nesterov
2024-09-02 17:17       ` Oleg Nesterov
2024-09-03 14:33         ` Jiri Olsa [this message]
2024-09-06 10:43     ` Jiri Olsa
2024-09-06 19:18       ` Oleg Nesterov
2024-09-09 10:41         ` Jiri Olsa
2024-09-09 18:34           ` Oleg Nesterov
2024-09-10  8:45             ` Jiri Olsa
2024-09-07 19:19       ` Tianyi Liu
2024-09-08 13:15         ` Oleg Nesterov
2024-09-09  1:16           ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZtceTuu8E4hHZr2P@krava \
    --to=olsajiri@gmail.com \
    --cc=ajor@meta.com \
    --cc=albancrequy@linux.microsoft.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=flaniel@linux.microsoft.com \
    --cc=i.pear@outlook.com \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=linux@jordanrome.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=oleg@redhat.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox