From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Song Liu <songliubraving@fb.com>
Cc: open list <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>, Jiri Olsa <jolsa@redhat.com>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v7 3/3] perf-stat: enable counting events for BPF programs
Date: Wed, 20 Jan 2021 09:37:27 -0300
Message-ID: <20210120123727.GR12699@kernel.org>
In-Reply-To: <20210119223021.GO12699@kernel.org>

On Tue, Jan 19, 2021 at 07:30:21PM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Jan 19, 2021 at 09:54:50PM +0000, Song Liu wrote:
> >
> >
> > > On Jan 19, 2021, at 8:31 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > >
> > > On Tue, Jan 19, 2021 at 11:42:49AM -0300, Arnaldo Carvalho de Melo wrote:
> > >> On Tue, Jan 19, 2021 at 11:31:44AM -0300, Arnaldo Carvalho de Melo wrote:
> > >>> On Tue, Jan 19, 2021 at 12:48:19AM +0000, Song Liu wrote:
> > >>>>> On Jan 18, 2021, at 11:38 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > >>>> We are looking at two issues:
> > >>>> 1. Cannot recursively attach;
> > >>>> 2. prog FD 3 doesn't have valid btf.
> > >>
> > >>>> #1 was caused by the verifier disallowing attaching fentry/fexit program
> > >>>> to program with type BPF_PROG_TYPE_TRACING (in bpf_check_attach_target).
> > >>>> This constraint was added when we only had fentry/fexit in the TRACING
> > >>>> type. We have extended the TRACING type to many other use cases, like
> > >>>> "tp_btf/", "fmod_ret" and "iter/". Therefore, it is a good time to revisit
> > >>>> this constraint. I will work on this.
> > >> 
> > >>>> For #2, we require the target program to have BTF. I guess we won't remove
> > >>>> this requirement.
> > >> 
> > >>>> While I work on improving #1, could you please test with some kprobe 
> > >>>> programs? For example, we can use fileslower.py from bcc. 
> > >> 
> > >>> Sure, and please consider improving the error messages to state what you
> > >>> described above.
> > >> 
> > >> Terminal 1:
> > >> 
> > >> [root@five perf]# perf trace -e 5sec.c
> > >> ^C
> > >> [root@five perf]# cat 5sec.c
> > >> #include <bpf.h>
> > >> 
> > >> #define NSEC_PER_SEC	1000000000L
> > >> 
> > >> int probe(hrtimer_nanosleep, rqtp)(void *ctx, int err, long long sec)
> > >> {
> > >> 	return sec / NSEC_PER_SEC == 5;
> > >> }
> > >> 
> > >> license(GPL);
> > >> [root@five perf]# perf trace -e 5sec.c/max-stack=16/
> > >>     0.000 sleep/3739435 perf_bpf_probe:hrtimer_nanosleep(__probe_ip: -1743337312, rqtp: 5000000000)
> > >>                                       hrtimer_nanosleep ([kernel.kallsyms])
> > >>                                       common_nsleep ([kernel.kallsyms])
> > >>                                       __x64_sys_clock_nanosleep ([kernel.kallsyms])
> > >>                                       do_syscall_64 ([kernel.kallsyms])
> > >>                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
> > >>                                       __clock_nanosleep_2 (/usr/lib64/libc-2.32.so)
> > >> 
> > >> 
> > >> Terminal 2:
> > >> 
> > >> [root@five ~]# perf stat -e cycles -b 180 -I 1000
> > >> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> > >> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> > >> perf: util/bpf_counter.c:227: bpf_program_profiler__read: Assertion `skel != NULL' failed.
> > >> Aborted (core dumped)
> > > 
> > > Out to lunch, will continue later, but this may help you figure this out
> > > till then :)
> > > 
> > > Starting program: /root/bin/perf stat -e cycles -b 244 -I 1000
> > > [Thread debugging using libthread_db enabled]
> > > Using host libthread_db library "/lib64/libthread_db.so.1".
> > > 
> > > Breakpoint 1, bpf_program_profiler_load_one (evsel=0xce02c0, prog_id=244) at util/bpf_counter.c:96
> > > 96	{
> > > (gdb) n
> > > 104		prog_fd = bpf_prog_get_fd_by_id(prog_id);
> > > (gdb) 
> > > 105		if (prog_fd < 0) {
> > > (gdb) 
> > > 109		counter = bpf_counter_alloc();
> > > (gdb) 
> > > 110		if (!counter) {
> > > (gdb) n
> > > 115		skel = bpf_prog_profiler_bpf__open();
> > > (gdb) p counter
> > > $9 = (struct bpf_counter *) 0xce09e0
> > > (gdb) p *counter
> > > $10 = {skel = 0x0, list = {next = 0xce09e8, prev = 0xce09e8}}
> > > (gdb) p *counter
> > > $11 = {skel = 0x0, list = {next = 0xce09e8, prev = 0xce09e8}}
> > > (gdb) n
> > > libbpf: elf: skipping unrecognized data section(9) .eh_frame
> > > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> > > 116		if (!skel) {
> > > (gdb) 
> > > 121		skel->rodata->num_cpu = evsel__nr_cpus(evsel);
> > > (gdb) 
> > > 123		bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
> > > (gdb) 
> > > 124		bpf_map__resize(skel->maps.fentry_readings, 1);
> > > (gdb) 
> > > 125		bpf_map__resize(skel->maps.accum_readings, 1);
> > > (gdb) 
> > > 127		prog_name = bpf_target_prog_name(prog_fd);
> > > (gdb) 
> > > 128		if (!prog_name) {
> > > (gdb) 
> > > 133		bpf_object__for_each_program(prog, skel->obj) {
> > > (gdb) 
> > > 134			err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
> > > (gdb) 
> > > 135			if (err) {
> > > (gdb) 
> > > 133		bpf_object__for_each_program(prog, skel->obj) {
> > > (gdb) p evsel
> > > $12 = (struct evsel *) 0xce02c0
> > > (gdb) p evsel->name
> > > $13 = 0xce04e0 "cycles"
> > > (gdb) n
> > > 134			err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
> > > (gdb) 
> > > 135			if (err) {
> > > (gdb) 
> > > 133		bpf_object__for_each_program(prog, skel->obj) {
> > > (gdb) 
> > > 141		set_max_rlimit();
> > > (gdb) 
> > > 142		err = bpf_prog_profiler_bpf__load(skel);
> > > (gdb) 
> > > 143		if (err) {
> > > (gdb) 
> > > 148		assert(skel != NULL);
> > > (gdb) 
> > > 149		counter->skel = skel;
> > > (gdb) 
> > > 150		list_add(&counter->list, &evsel->bpf_counter_list);
> > > (gdb) c
> > > Continuing.
> > > 
> > > Breakpoint 4, bpf_program_profiler__install_pe (evsel=0xce02c0, cpu=0, fd=3) at util/bpf_counter.c:247
> > > 247	{
> > > (gdb) n
> > > 252		list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> > > (gdb) 
> > > 253			skel = counter->skel;
> > > (gdb) watch counter->skel
> > > Hardware watchpoint 6: counter->skel
> > > (gdb) p counter->skel
> > > $14 = (void *) 0xce0a00
> > > (gdb) n
> > > 254			assert(skel != NULL);
> > > (gdb) p skel
> > > $15 = (struct bpf_prog_profiler_bpf *) 0xce0a00
> > > (gdb) c
> > > Continuing.
> > > 
> > > Hardware watchpoint 6: counter->skel
> > > 
> > > Old value = (void *) 0xce0a00
> > > New value = (void *) 0x0
> > > 0x00000000005cf45e in bpf_program_profiler__install_pe (evsel=0xce02c0, cpu=0, fd=3) at util/bpf_counter.c:252
> > > 252		list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> > 
> > So it was the list operation that set counter->skel to NULL? I am really confused...
> 
> Yep, I'm confused as well and trying to reproduce this, but got
> sidetracked...

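One possible reading of the watchpoint hit quoted above, offered as an
assumption and not as a diagnosis: `watch counter->skel` watches the whole
expression, so it can also fire when `counter` itself changes. When
list_for_each_entry() advances past the last entry, the cursor is computed
from the list head via container_of(), and `counter->skel` then aliases
whatever precedes bpf_counter_list in the owning struct. The sketch below
uses simplified stand-ins for the list helpers; `struct owner`,
`list_init()`, `count_entries()` and `cursor_at_head()` are hypothetical
names for illustration, not perf code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel-style list helpers (illustrative only). */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

/* Same field order as the gdb dump of *counter: skel first, then list. */
struct counter { void *skel; struct list_head list; };

/* Hypothetical owner: an unrelated field sits right before the list head. */
struct owner { void *before_head; struct list_head counters; };

/* Count the real entries, walking the ring until it wraps back to the head. */
static int count_entries(struct list_head *head)
{
	int n = 0;

	for (struct list_head *pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}

/*
 * The "entry" a list_for_each_entry()-style cursor names once the walk has
 * wrapped around to the head: not a real struct counter, just the head
 * address minus the offset of the list member.
 */
static struct counter *cursor_at_head(struct list_head *head)
{
	return container_of(head, struct counter, list);
}
```

With this layout, `&cursor_at_head(&o.counters)->skel` is the address of
`o.before_head`, so a watched `counter->skel` would appear to change as soon
as the loop steps onto the head. If that is what happened here,
`watch -location counter->skel` would pin the watchpoint to the address of
the real entry's field instead.
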
Coming back to this, it now explodes later, in bpf_program_profiler__read():

Breakpoint 8, bpf_program_profiler__install_pe (evsel=0xce02c0, cpu=22, fd=32) at util/bpf_counter.c:254
254			assert(skel != NULL);
(gdb) p skel
$52 = (struct bpf_prog_profiler_bpf *) 0xce0a00
(gdb) c
Continuing.

Breakpoint 4, bpf_program_profiler__install_pe (evsel=0xce02c0, cpu=23, fd=33) at util/bpf_counter.c:247
247	{
(gdb) p skel
$53 = (struct bpf_prog_profiler_bpf *) 0xce04c0
(gdb) c
Continuing.

Breakpoint 8, bpf_program_profiler__install_pe (evsel=0xce02c0, cpu=23, fd=33) at util/bpf_counter.c:254
254			assert(skel != NULL);
(gdb) p skel
$54 = (struct bpf_prog_profiler_bpf *) 0xce0a00
(gdb) c
Continuing.

Breakpoint 2, bpf_program_profiler__enable (evsel=0xce02c0) at util/bpf_counter.c:192
192	{
(gdb) n
196		list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
(gdb)
197			assert(counter->skel != NULL);
(gdb)
198			ret = bpf_prog_profiler_bpf__attach(counter->skel);
(gdb) c
Continuing.

Breakpoint 3, bpf_program_profiler__read (evsel=0xce02c0) at util/bpf_counter.c:208
208	{
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00000000005cf34b in bpf_program_profiler__read (evsel=0x0) at util/bpf_counter.c:224
224		list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
(gdb) p evsel
$55 = (struct evsel *) 0x0
(gdb) bt
#0  0x00000000005cf34b in bpf_program_profiler__read (evsel=0x0) at util/bpf_counter.c:224
#1  0x0000000000000000 in ?? ()
(gdb) bt
#0  0x00000000005cf34b in bpf_program_profiler__read (evsel=0x0) at util/bpf_counter.c:224
#1  0x0000000000000000 in ?? ()
(gdb)
(gdb) info threads
  Id   Target Id                                  Frame
* 1    Thread 0x7ffff647f900 (LWP 1725711) "perf" 0x00000000005cf34b in bpf_program_profiler__read (evsel=0x0) at util/bpf_counter.c:224
(gdb)
