From: Jiri Olsa <olsajiri@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
Jiri Olsa <jolsa@redhat.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Ingo Molnar <mingo@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Namhyung Kim <namhyung@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Michael Petlan <mpetlan@redhat.com>,
Ian Rogers <irogers@google.com>,
"linux-perf-use." <linux-perf-users@vger.kernel.org>,
Christy Lee <christylee@fb.com>,
Andrii Nakryiko <andrii@kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH 1/3] perf/bpf: Remove prologue generation
Date: Sun, 13 Feb 2022 16:02:49 +0100 [thread overview]
Message-ID: <YgkdmQVN1XSuFDWM@krava> (raw)
In-Reply-To: <CAEf4Bza9a5_U2U9ZK_su4VGSA91EMNouipk=PFudGqkN_iGsPg@mail.gmail.com>
On Thu, Feb 10, 2022 at 09:28:51PM -0800, Andrii Nakryiko wrote:
> On Thu, Feb 10, 2022 at 1:31 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Thu, Feb 10, 2022 at 04:18:10PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Sun, Jan 23, 2022 at 11:19:30PM +0100, Jiri Olsa escreveu:
> > > > Removing code for ebpf program prologue generation.
> > > >
> > > > The prologue code was used to get data for extra arguments specified
> > > > in program section name, like:
> > > >
> > > > SEC("lock_page=__lock_page page->flags")
> > > > int lock_page(struct pt_regs *ctx, int err, unsigned long flags)
> > > > {
> > > > return 1;
> > > > }
> > > >
> > > > This code is using deprecated libbpf API and blocks its removal.
> > > >
> > > > This feature was not documented and broken for some time without
> > > > anyone complaining, also original authors are not responding,
> > > > so I'm removing it.
> > >
> > > So, the example below breaks, how hard would be to move the deprecated
> > > APIs to perf like was done in some other cases?
> > >
>
> Just copy/pasting libbpf code won't work. But there are three parts:
>
> 1. bpf_(program|map|object)__set_priv(). There is no equivalent API,
> but perf can maintain this private data by building hashmap where the
> key is bpf_object/bpf_map/bpf_program pointer itself. Annoying but
> very straightforward to replace.
>
> 2. For prologue generation, bpf_program__set_prep() doesn't have a
> direct equivalent. But program cloning and adjustment of the code can
> be achieved through bpf_program__insns()/bpf_program__insn_cnt() API
> to load one "prototype" program, gets its underlying insns and clone
> programs as necessary. After that, bpf_prog_load() would be used to
> load those cloned programs into kernel.
hm, I can't see how to clone a program.. so we need to end up with
several copies of the single program defined in the object.. I can
get its intructions with bpf_program__insns, but how do I add more
programs with these instructions customized/prefixed?
thanks,
jirka
>
> 3. Those *very* custom SEC() definitions will be possible for perf to
> handle once [0] lands (I'll send new revision tomorrow, probably).
> You'll be able to register your own "fallback" handler with
> libbpf_register_prog_handler(NULL, ...).
>
> [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=611491&state=*
>
> Sorry, it's not a straightforward copy/paste, but I hope this helps a bit.
>
> > > - Arnaldo
> > >
> > > Before:
> > >
> > > [root@quaco perf]# cat tools/perf/examples/bpf/5sec.c
> > > // SPDX-License-Identifier: GPL-2.0
> > > /*
> > > Description:
> > >
> > > . Disable strace like syscall tracing (--no-syscalls), or try tracing
> > > just some (-e *sleep).
> > >
> > > . Attach a filter function to a kernel function, returning when it should
> > > be considered, i.e. appear on the output.
> > >
> > > . Run it system wide, so that any sleep of >= 5 seconds and < than 6
> > > seconds gets caught.
> > >
> > > . Ask for callgraphs using DWARF info, so that userspace can be unwound
> > >
> > > . While this is running, run something like "sleep 5s".
> > >
> > > . If we decide to add tv_nsec as well, then it becomes:
> > >
> > > int probe(hrtimer_nanosleep, rqtp->tv_sec rqtp->tv_nsec)(void *ctx, int err, long sec, long nsec)
> > >
> > > I.e. add where it comes from (rqtp->tv_nsec) and where it will be
> > > accessible in the function body (nsec)
> > >
> > > # perf trace --no-syscalls -e tools/perf/examples/bpf/5sec.c/call-graph=dwarf/
> > > 0.000 perf_bpf_probe:func:(ffffffff9811b5f0) tv_sec=5
> > > hrtimer_nanosleep ([kernel.kallsyms])
> > > __x64_sys_nanosleep ([kernel.kallsyms])
> > > do_syscall_64 ([kernel.kallsyms])
> > > entry_SYSCALL_64 ([kernel.kallsyms])
> > > __GI___nanosleep (/usr/lib64/libc-2.26.so)
> > > rpl_nanosleep (/usr/bin/sleep)
> > > xnanosleep (/usr/bin/sleep)
> > > main (/usr/bin/sleep)
> > > __libc_start_main (/usr/lib64/libc-2.26.so)
> > > _start (/usr/bin/sleep)
> > > ^C#
> > >
> > > Copyright (C) 2018 Red Hat, Inc., Arnaldo Carvalho de Melo <acme@redhat.com>
> > > */
> > >
> > > #include <bpf.h>
> > >
> > > #define NSEC_PER_SEC 1000000000L
> > >
> > > int probe(hrtimer_nanosleep, rqtp)(void *ctx, int err, long long sec)
> > > {
> > > return sec / NSEC_PER_SEC == 5ULL;
> > > }
> >
> > that sucks ;-) I'll check if we can re-implement as we discussed earlier,
> > however below is workaround how to do it without the prologue support
> >
> > jirka
> >
> >
> > ---
> > diff --git a/tools/perf/examples/bpf/5sec.c b/tools/perf/examples/bpf/5sec.c
> > index e6b6181c6dc6..734d39debdb8 100644
> > --- a/tools/perf/examples/bpf/5sec.c
> > +++ b/tools/perf/examples/bpf/5sec.c
> > @@ -43,9 +43,17 @@
> >
> > #define NSEC_PER_SEC 1000000000L
> >
> > -int probe(hrtimer_nanosleep, rqtp)(void *ctx, int err, long long sec)
> > +struct pt_regs {
> > + long di;
> > +};
> > +
> > +SEC("function=hrtimer_nanosleep")
> > +int krava(struct pt_regs *ctx)
> > {
> > - return sec / NSEC_PER_SEC == 5ULL;
> > + unsigned long arg;
> > +
> > + probe_read_kernel(&arg, sizeof(arg), __builtin_preserve_access_index(&ctx->di));
> > + return arg / NSEC_PER_SEC == 5ULL;
> > }
> >
> > license(GPL);
> > diff --git a/tools/perf/include/bpf/bpf.h b/tools/perf/include/bpf/bpf.h
> > index b422aeef5339..b7d6d2fc8342 100644
> > --- a/tools/perf/include/bpf/bpf.h
> > +++ b/tools/perf/include/bpf/bpf.h
> > @@ -64,6 +64,7 @@ int _version SEC("version") = LINUX_VERSION_CODE;
> >
> > static int (*probe_read)(void *dst, int size, const void *unsafe_addr) = (void *)BPF_FUNC_probe_read;
> > static int (*probe_read_str)(void *dst, int size, const void *unsafe_addr) = (void *)BPF_FUNC_probe_read_str;
> > +static long (*probe_read_kernel)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) BPF_FUNC_probe_read_kernel;
> >
> > static int (*perf_event_output)(void *, struct bpf_map *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output;
> >
> > diff --git a/tools/perf/util/llvm-utils.c b/tools/perf/util/llvm-utils.c
> > index 96c8ef60f4f8..9274a3373847 100644
> > --- a/tools/perf/util/llvm-utils.c
> > +++ b/tools/perf/util/llvm-utils.c
> > @@ -25,7 +25,7 @@
> > "$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
> > "-Wno-unused-value -Wno-pointer-sign " \
> > "-working-directory $WORKING_DIR " \
> > - "-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
> > + "-g -c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
> >
> > struct llvm_param llvm_param = {
> > .clang_path = "clang",
next prev parent reply other threads:[~2022-02-13 15:02 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-23 22:19 [PATCH 1/3] perf/bpf: Remove prologue generation Jiri Olsa
2022-01-23 22:19 ` [PATCH 2/3] perf/bpf: Remove special bpf config support Jiri Olsa
2022-01-23 22:19 ` [PATCH 3/3] perf tests: Remove bpf prologue generation test Jiri Olsa
2022-01-24 20:24 ` [PATCH 1/3] perf/bpf: Remove prologue generation Andrii Nakryiko
2022-02-02 1:01 ` Andrii Nakryiko
2022-02-02 10:08 ` Arnaldo Carvalho de Melo
2022-02-02 17:21 ` Andrii Nakryiko
2022-02-10 19:18 ` Arnaldo Carvalho de Melo
2022-02-10 21:30 ` Jiri Olsa
2022-02-11 5:28 ` Andrii Nakryiko
2022-02-13 15:02 ` Jiri Olsa [this message]
2022-02-14 5:57 ` Andrii Nakryiko
2022-02-14 8:33 ` Jiri Olsa
2022-02-14 6:02 ` Andrii Nakryiko
2022-02-14 8:16 ` Jiri Olsa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YgkdmQVN1XSuFDWM@krava \
--to=olsajiri@gmail.com \
--cc=a.p.zijlstra@chello.nl \
--cc=acme@kernel.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=christylee@fb.com \
--cc=irogers@google.com \
--cc=jolsa@redhat.com \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@kernel.org \
--cc=mpetlan@redhat.com \
--cc=namhyung@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.