public inbox for llvm@lists.linux.dev
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ian Rogers <irogers@google.com>, Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Athira Jajeev <atrajeev@linux.vnet.ibm.com>,
	bpf@vger.kernel.org, Brendan Gregg <brendan.d.gregg@gmail.com>,
	Carsten Haitzler <carsten.haitzler@arm.com>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Fangrui Song <maskray@google.com>, He Kuang <hekuang@huawei.com>,
	Ingo Molnar <mingo@redhat.com>, James Clark <james.clark@arm.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Kan Liang <kan.liang@linux.intel.com>,
	Leo Yan <leo.yan@linaro.org>,
	llvm@lists.linux.dev, Madhavan Srinivasan <maddy@linux.ibm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Nathan Chancellor <nathan@kernel.org>,
	"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ravi Bangoria <ravi.bangoria@amd.com>,
	Rob Herring <robh@kernel.org>,
	Tiezhu Yang <yangtiezhu@loongson.cn>, Tom Rix <trix@redhat.com>,
	Wang Nan <wangnan0@huawei.com>,
	Wang ShaoBo <bobo.shaobowang@huawei.com>,
	Yang Jihong <yangjihong1@huawei.com>, Yonghong Song <yhs@fb.com>,
	YueHaibing <yuehaibing@huawei.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: [PATCH 1/1] perf trace: Use the augmented_raw_syscall BPF skel only for tracing syscalls
Date: Thu, 17 Aug 2023 12:37:23 -0300	[thread overview]
Message-ID: <ZN4+s2Wl+zYmXTDj@kernel.org> (raw)

It is possible to use 'perf trace' with tracepoints and in that case we
can't initialize/use the augmented_raw_syscalls BPF skel.

For instance, this usecase:

  # perf trace -e sched:*exec --max-events=5
         ? (         ): NetworkManager/1183  ... [continued]: poll())                                             = 1
     0.043 ( 0.007 ms): NetworkManager/1183 epoll_wait(epfd: 17<anon_inode:[eventpoll]>, events: 0x55555f90e920, maxevents: 6) = 0
     0.060 ( 0.007 ms): NetworkManager/1183 write(fd: 3<anon_inode:[eventfd]>, buf: 0x7ffc5a27cd30, count: 8)     = 8
     0.073 ( 0.005 ms): NetworkManager/1183 epoll_wait(epfd: 24<anon_inode:[eventpoll]>, events: 0x7ffc5a27cd20, maxevents: 2) = 1
     0.082 ( 0.010 ms): NetworkManager/1183 recvmmsg(fd: 26<socket:[30298]>, mmsg: 0x7ffc5a27caa0, vlen: 8)       = 1
  #

Where we want to trace just some sched tracepoints ending in 'exec' ends
up tracing all syscalls.

Fix it by checking existing trace->trace_syscalls boolean to see if we
need the augmenter.

A followup patch will move those sections of code used only with the
augmenter to separate functions, to get it cleaner and remove the goto,
done just for reviewing purposes.

With this patch in place the previous behaviour is restored: no syscalls
when we have other events and no syscall names:

  [root@quaco ~]# perf probe do_filp_open "filename=pathname->name:string"
  Added new event:
    probe:do_filp_open   (on do_filp_open with filename=pathname->name:string)

  You can now use it in all perf tools, such as:

	  perf record -e probe:do_filp_open -aR sleep 1

  [root@quaco ~]# perf trace --max-events=10 -e probe:do_filp_open sleep 1
     0.000 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/etc/ld.so.cache")
     0.056 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/lib64/libc.so.6")
     0.481 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/locale-archive")
     0.501 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/share/locale/locale.alias")
     0.572 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.UTF-8/LC_IDENTIFICATION")
     0.581 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.utf8/LC_IDENTIFICATION")
     0.616 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib64/gconv/gconv-modules.cache")
     0.656 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.UTF-8/LC_MEASUREMENT")
     0.664 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.utf8/LC_MEASUREMENT")
     0.696 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.UTF-8/LC_TELEPHONE")
  [root@quaco ~]#

As well as mixing syscalls with tracepoints, getting the syscall
tracepoints used augmented using the BPF skel:

  [root@quaco ~]# perf trace --max-events=10 -e open*,probe:do_filp_open sleep 1
     0.000 (         ): sleep/455124 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) ...
     0.005 (         ): sleep/455124 probe:do_filp_open(__probe_ip: -1186560412, filename: "/etc/ld.so.cache")
     0.000 ( 0.011 ms): sleep/455124  ... [continued]: openat())                                           = 3
     0.031 (         ): sleep/455124 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY|CLOEXEC) ...
     0.033 (         ): sleep/455124 probe:do_filp_open(__probe_ip: -1186560412, filename: "/lib64/libc.so.6")
     0.031 ( 0.006 ms): sleep/455124  ... [continued]: openat())                                           = 3
     0.258 (         ): sleep/455124 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY|CLOEXEC) ...
     0.261 (         ): sleep/455124 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/locale-archive")
     0.258 ( 0.006 ms): sleep/455124  ... [continued]: openat())                                           = -1 ENOENT (No such file or directory)
     0.272 (         ): sleep/455124 openat(dfd: CWD, filename: "/usr/share/locale/locale.alias", flags: RDONLY|CLOEXEC) ...
     0.273  (        ): sleep/455124 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/share/locale/locale.alias")

A final note: the probe:do_filp_open uses a kprobe (probably optimized
as its in the start of a function) that uses the kprobe_tracer mechanism
in the kernel to collect the pathname->name string and stash it into the
tracepoint created by 'perf probe' for that:

  [root@quaco ~]# cat /sys/kernel/debug/tracing/kprobe_events
  p:probe/do_filp_open _text+4621920 filename=+0(+0(%si)):string
  [root@quaco ~]#

While the syscalls:sys_enter_openat tracepoint gets its string from a
BPF program attached to raw_syscalls:sys_enter that tail calls into
another BPF program that knows the types for the openat syscall args and
thus can bpf_probe_read it right after the normal
sys_enter/sys_enter_openat tracepoint payload that comes prefixed with
whatever perf_event_open asked for (CPU, timestamp, etc):

  [root@quaco ~]# bpftool prog | grep -E "sys_enter |sys_enter_opena" -A3
  3176: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
	loaded_at 2023-08-17T12:32:20-0300  uid 0
	xlated 272B  jited 257B  memlock 4096B  map_ids 2462,2466,2463
	btf_id 2976
  --
  3180: tracepoint  name sys_enter_opena  tag 19dd077f00ec2f58  gpl
	  loaded_at 2023-08-17T12:32:20-0300  uid 0
	  xlated 328B  jited 206B  memlock 4096B  map_ids 2466,2465
	  btf_id 2976
  [root@quaco ~]#

Fixes: 42963c8bedeb864b ("perf trace: Migrate BPF augmentation to use a skeleton")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: bpf@vger.kernel.org
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: llvm@lists.linux.dev
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Wang ShaoBo <bobo.shaobowang@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 0ebfa95895e0bf4d..3964cf44cdbcb3e8 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3895,7 +3895,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	if (err < 0)
 		goto out_error_open;
 #ifdef HAVE_BPF_SKEL
-	{
+	if (trace->syscalls.events.bpf_output) {
 		struct perf_cpu cpu;
 
 		/*
@@ -3916,7 +3916,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		goto out_error_mem;
 
 #ifdef HAVE_BPF_SKEL
-	if (trace->skel->progs.sys_enter)
+	if (trace->skel && trace->skel->progs.sys_enter)
 		trace__init_syscalls_bpf_prog_array_maps(trace);
 #endif
 
@@ -4850,6 +4850,9 @@ int cmd_trace(int argc, const char **argv)
 	}
 
 #ifdef HAVE_BPF_SKEL
+	if (!trace.trace_syscalls)
+		goto skip_augmentation;
+
 	trace.skel = augmented_raw_syscalls_bpf__open();
 	if (!trace.skel) {
 		pr_debug("Failed to open augmented syscalls BPF skeleton");
@@ -4884,6 +4887,7 @@ int cmd_trace(int argc, const char **argv)
 	}
 	trace.syscalls.events.bpf_output = evlist__last(trace.evlist);
 	assert(!strcmp(evsel__name(trace.syscalls.events.bpf_output), "__augmented_syscalls__"));
+skip_augmentation:
 #endif
 	err = -1;
 
-- 
2.41.0


             reply	other threads:[~2023-08-17 15:37 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-17 15:37 Arnaldo Carvalho de Melo [this message]
2023-08-17 15:44 ` [PATCH 1/1] perf trace: Use the augmented_raw_syscall BPF skel only for tracing syscalls Ian Rogers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZN4+s2Wl+zYmXTDj@kernel.org \
    --to=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=andrii@kernel.org \
    --cc=anshuman.khandual@arm.com \
    --cc=atrajeev@linux.vnet.ibm.com \
    --cc=bobo.shaobowang@huawei.com \
    --cc=bpf@vger.kernel.org \
    --cc=brendan.d.gregg@gmail.com \
    --cc=carsten.haitzler@arm.com \
    --cc=eddyz87@gmail.com \
    --cc=hekuang@huawei.com \
    --cc=irogers@google.com \
    --cc=james.clark@arm.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=leo.yan@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llvm@lists.linux.dev \
    --cc=maddy@linux.ibm.com \
    --cc=mark.rutland@arm.com \
    --cc=maskray@google.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=nathan@kernel.org \
    --cc=naveen.n.rao@linux.vnet.ibm.com \
    --cc=ndesaulniers@google.com \
    --cc=peterz@infradead.org \
    --cc=ravi.bangoria@amd.com \
    --cc=robh@kernel.org \
    --cc=trix@redhat.com \
    --cc=wangnan0@huawei.com \
    --cc=yangjihong1@huawei.com \
    --cc=yangtiezhu@loongson.cn \
    --cc=yhs@fb.com \
    --cc=yuehaibing@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox