From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: "Jiri Olsa" <jolsa@kernel.org>,
"Namhyung Kim" <namhyung@kernel.org>,
"Clark Williams" <williams@redhat.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
"Arnaldo Carvalho de Melo" <acme@redhat.com>,
"Adrian Hunter" <adrian.hunter@intel.com>,
"Alexei Starovoitov" <ast@fb.com>,
"Andrii Nakryiko" <andrii.nakryiko@gmail.com>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Luis Cláudio Gonçalves" <lclaudio@redhat.com>,
"Martin KaFai Lau" <kafai@fb.com>,
"Song Liu" <songliubraving@fb.com>,
"Wang Nan" <wangnan0@huawei.com>, "Yonghong Song" <yhs@fb.com>
Subject: [PATCH 17/35] perf bpf: Automatically add BTF ELF markers
Date: Thu, 7 Mar 2019 14:44:15 -0300 [thread overview]
Message-ID: <20190307174433.28819-18-acme@kernel.org> (raw)
In-Reply-To: <20190307174433.28819-1-acme@kernel.org>
From: Arnaldo Carvalho de Melo <acme@redhat.com>
The libbpf loader expects that some __btf_map_<MAP_NAME> structs be in
place with the keys and values types of maps so that one can store the
struct definitions and have them sent to the kernel via sys_bpf(fd, cmd
= BTF_LOAD) and then later be retrievable via sys_bpf(fd, cmd =
BPF_OBJ_GET_INFO_BY_FD) for use by tools such as 'bpftool map dump id
MAP_ID'.
Since we already have this for defining maps in 'perf trace' BPF events:
bpf_map(name, _type, type_key, type_val, _max_entries)
As used in the tools/perf/examples/bpf/augmented_raw_syscalls.c:
--- 8< ---
struct syscall {
bool enabled;
};
bpf_map(syscalls, ARRAY, int, struct syscall, 512);
--- 8< ---
All we need is to get all that already available info, piggyback on the
'bpf_map' define in tools/perf/include/bpf/bpf.h, that is included by
'perf trace' BPF programs and do that without requiring changes to the
BPF programs already defining maps using 'bpf_map()'.
So this is what we have before this patch:
1) With this in ~/.perfconfig to dump .c events as .o, aka save a copy
so that we can use the .o later as a pre-compiled BPF bytecode:
# grep '\[llvm\]' -A2 ~/.perfconfig
[llvm]
dump-obj = true
clang-opt = -g
#
# clang --version
clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm/bin
2) Note the -g there so that we get clang to generate debuginfo, and
since the target is 'bpf' it will generate the BTF info in this
clang version (9.0).
3) Run a simple 'perf record' specifiying as an event the augmented_raw_syscalls.c
source code:
# perf record -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data ]
# file /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped
4) Look at the BTF structs encoded in it:
# pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
syscall_enter_args 64 0
augmented_filename 264 0
syscall 1 0
syscall_exit_args 24 0
bpf_map 28 0
#
# pahole -F btf -C syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
# pahole -F btf -C syscall /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
struct syscall {
bool enabled; /* 0 1 */
/* size: 1, cachelines: 1, members: 1 */
/* last cacheline: 1 bytes */
};
#
5) Ok, with just this we don't have the markers expected by the libbpf
loader and when we run with this BPF bytecode, because we have:
# grep '\[trace\]' -A1 ~/.perfconfig
[trace]
add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
#
6) Lets do a 'perf trace' system wide session using this BPF program:
# perf trace -e *mmsg,open*
Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR) = 106
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
DNS Res~ver #3/23340 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 106
DNS Res~ver #3/23340 sendmmsg(106<socket:[3482690]>, 0x7f252f1fcaf0, 2, MSG_NOSIGNAL) = 2
Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR) = 106
lighttpd/18915 openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 12
7) While it runs lets see the maps that 'perf trace' + libbpf's BPF
loader loaded into the kernel via sys_bpf(fd, BPF_BTF_LOAD, ...):
# bpftool map list | tail -6
149: perf_event_array name __augmented_sys flags 0x0
key 4B value 4B max_entries 8 memlock 4096B
150: array name syscalls flags 0x0
key 4B value 1B max_entries 512 memlock 8192B
151: hash name pids_filtered flags 0x0
key 4B value 1B max_entries 64 memlock 8192B
#
8) Dump the "pids_filtered", map, that will have one entry per PID that
'perf trace' wants filtered, which includes its own, to avoid a
tracing feedback loop (perf trace shows the syscalls it does which
generates more syscalls that it has to show that...), it also
auto-filters the 'gnome-terminal' and 'sshd' parent PIDs, for the
same reason:
# bpftool map dump id 151
key: a5 0c 00 00 value: 01
key: 14 63 00 00 value: 01
Found 2 elements
#
9) Since there is no BTF info available, it does a generic hex dump :-\
10) Now, with this patch applied, we'll do steps 3 to 6 again and look
with pahole if there are extra structs encoded in BTF:
# pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
syscall_enter_args 64 0
augmented_filename 264 0
syscall 1 0
syscall_exit_args 24 0
bpf_map 28 0
____btf_map___augmented_syscalls__ 8 0
____btf_map_syscalls 8 0
____btf_map_pids_filtered 8 0
#
11) Yes, those __btf_map_ + the map names, lets see how they look like:
# pahole -F btf -C ____btf_map_syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
struct ____btf_map_syscalls {
int key; /* 0 4 */
struct syscall value; /* 4 1 */
/* size: 8, cachelines: 1, members: 2 */
/* padding: 3 */
/* last cacheline: 8 bytes */
};
#
12) Lets repeat step 7 to get the new map ids:
# bpftool map list | tail -6
155: perf_event_array name __augmented_sys flags 0x0
key 4B value 4B max_entries 8 memlock 4096B
156: array name syscalls flags 0x0
key 4B value 1B max_entries 512 memlock 8192B
157: hash name pids_filtered flags 0x0
key 4B value 1B max_entries 64 memlock 8192B
#
13) And finally lets dump the 'pids_filtered':
# bpftool map dump id 157
[{
"key": 3237,
"value": true
},{
"key": 26435,
"value": true
}
]
#
Looks much better! BTF info was used to interpret the key as an integer
and the value as a struct with just one boolean member, so to make it
more compact, show just the 'true' value where we saw '01'.
Now to make 'perf trace --dump-map' to use BTF!
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-ybuf9wpkm30xk28iq7jbwb40@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/include/bpf/bpf.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/perf/include/bpf/bpf.h b/tools/perf/include/bpf/bpf.h
index 5df7ed9d9020..2eac6d804b2d 100644
--- a/tools/perf/include/bpf/bpf.h
+++ b/tools/perf/include/bpf/bpf.h
@@ -24,7 +24,13 @@ struct bpf_map SEC("maps") name = { \
.key_size = sizeof(type_key), \
.value_size = sizeof(type_val), \
.max_entries = _max_entries, \
-}
+}; \
+struct ____btf_map_##name { \
+ type_key key; \
+ type_val value; \
+}; \
+struct ____btf_map_##name __attribute__((section(".maps." #name), used)) \
+ ____btf_map_##name = { }
/*
* FIXME: this should receive .max_entries as a parameter, as careful
--
2.20.1
next prev parent reply other threads:[~2019-03-07 17:44 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-07 17:43 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
2019-03-07 17:43 ` Arnaldo Carvalho de Melo
2019-03-07 17:43 ` [PATCH 01/35] perf, bpf: Consider events with attr.bpf_event as side-band events Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 02/35] perf probe: Clarify error message about not finding kernel modules debuginfo Arnaldo Carvalho de Melo
2019-03-07 23:30 ` Masami Hiramatsu
2019-03-07 23:58 ` Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 03/35] tools lib traceevent: Fix buffer overflow in arg_eval Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 04/35] perf: Mark expected switch fall-through Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 05/35] perf time-utils: Refactor time range parsing code Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 06/35] perf auxtrace: Improve address filter error message when there is no DSO Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 07/35] perf intel-pt: Fix divide by zero when TSC is not available Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 08/35] perf db-export: Add calls parent_id to enable creation of call trees Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 09/35] perf scripts python: export-to-sqlite.py: Export calls parent_id Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 10/35] perf scripts python: export-to-postgresql.py: Fix invalid input syntax for integer error Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 11/35] perf scripts python: export-to-postgresql.py: Export calls parent_id Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 12/35] perf scripts python: exported-sql-viewer.py: Factor out TreeWindowBase Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 13/35] perf scripts python: exported-sql-viewer.py: Improve TreeModel abstraction Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 14/35] perf scripts python: exported-sql-viewer.py: Factor out CallGraphModelBase Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 15/35] perf scripts python: exported-sql-viewer.py: Add call tree Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 16/35] perf beauty msg_flags: Add missing %s lost when adding prefix suppression logic Arnaldo Carvalho de Melo
2019-03-07 17:44 ` Arnaldo Carvalho de Melo [this message]
2019-03-07 17:44 ` [PATCH 18/35] perf clang: Remove needless extra semicolon Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 19/35] perf annotate: Calculate the max instruction name, align column to that Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 20/35] perf thread: Generalize function to copy from thread addr space from intel-bts code Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 21/35] perf diff: Support --time filter option Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 22/35] perf diff: Support --cpu " Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 23/35] perf diff: Support --pid/--tid filter options Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 24/35] perf script python: Remove mixed indentation Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 25/35] perf script python: Add Python3 support to futex-contention.py Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 26/35] perf script python: add Python3 support to check-perf-trace.py Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 27/35] perf script python: Add Python3 support to event_analyzing_sample.py Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 28/35] perf script python: Add Python3 support to intel-pt-events.py Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 29/35] perf c2c: Fix c2c report for empty numa node Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 30/35] perf hist: Add error path into hist_entry__init Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 31/35] perf hist: Fix memory leak of srcline Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 32/35] perf tools: Read and store caps/max_precise in perf_pmu Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 33/35] perf evsel: Probe for precise_ip with simple attr Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 34/35] perf session: Fix double free in perf_data__close Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 35/35] perf data: Force perf_data__open|close zero data->file.path Arnaldo Carvalho de Melo
2019-03-09 16:02 ` [GIT PULL 00/35] perf/core improvements and fixes Ingo Molnar
2019-03-09 16:02 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190307174433.28819-18-acme@kernel.org \
--to=acme@kernel.org \
--cc=acme@redhat.com \
--cc=adrian.hunter@intel.com \
--cc=andrii.nakryiko@gmail.com \
--cc=ast@fb.com \
--cc=daniel@iogearbox.net \
--cc=jolsa@kernel.org \
--cc=kafai@fb.com \
--cc=lclaudio@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=namhyung@kernel.org \
--cc=songliubraving@fb.com \
--cc=wangnan0@huawei.com \
--cc=williams@redhat.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.