From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>, bpf <bpf@vger.kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Kernel Team <kernel-team@fb.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v3 bpf-next 05/14] bpf: allow to specify user-provided bpf_cookie for BPF perf links
Date: Mon, 9 Aug 2021 17:52:24 -0700
Message-ID: <CAEf4BzbOy1Ct4xWBNFXsQq2i3VQ4_T0xecAdSFXGiu7vtuLMqA@mail.gmail.com>
In-Reply-To: <5e026f3e-d94a-b8b0-8564-e16b73d6bbcc@iogearbox.net>
On Mon, Aug 9, 2021 at 4:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/30/21 7:34 AM, Andrii Nakryiko wrote:
> > Add the ability for users to specify a custom u64 value (bpf_cookie) when
> > creating a BPF link for perf_event-backed BPF programs (kprobe/uprobe,
> > perf_event, tracepoint).
> >
> > This is useful for cases when the same BPF program is used to attach to and
> > process invocations of different tracepoints/kprobes/uprobes in a generic
> > fashion, while still distinguishing each invocation from the others (e.g., the
> > BPF program can look up additional information associated with a specific
> > kernel function without having to rely on function IP lookups). This allows
> > new use cases to be implemented simply and efficiently that previously were
> > possible only through code generation (and thus multiple instances of an
> > almost identical BPF program) or compilation at runtime (BCC-style) on target
> > hosts (even more expensive resource-wise). For uprobes it is sometimes not
> > even possible to know the function IP beforehand (e.g., when attaching to a
> > shared library without PID filtering, in which case the library's base load
> > address is not known).
> >
> > This is done by storing a u64 bpf_cookie in struct bpf_prog_array_item,
> > corresponding to each attached and run BPF program. Given that cgroup BPF
> > programs already use two 8-byte pointers for their needs and don't (yet?)
> > support bpf_cookie, reuse that space through a union of cgroup_storage and
> > the new bpf_cookie field.
> >
> > Make it available to kprobe/tracepoint BPF programs through bpf_trace_run_ctx.
> > It is set by BPF_PROG_RUN_ARRAY, used by the kprobe/uprobe/tracepoint BPF
> > program execution code, which luckily is now also split from
> > BPF_PROG_RUN_ARRAY_CG. This run context will be utilized by a new BPF helper
> > giving access to this user-provided cookie value from inside a BPF program.
> > Generic perf_event BPF programs will access this value from perf_event itself
> > through the passed-in BPF program context.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> [...]
> >
> > +struct bpf_trace_run_ctx {
> > +	struct bpf_run_ctx run_ctx;
> > +	u64 bpf_cookie;
> > +};
> > +
> >  #ifdef CONFIG_BPF_SYSCALL
> >  static inline struct bpf_run_ctx *bpf_set_run_ctx(struct bpf_run_ctx *new_ctx)
> >  {
> > @@ -1247,6 +1256,8 @@ BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
> >  	const struct bpf_prog_array_item *item;
> >  	const struct bpf_prog *prog;
> >  	const struct bpf_prog_array *array;
> > +	struct bpf_run_ctx *old_run_ctx;
> > +	struct bpf_trace_run_ctx run_ctx;
> >  	u32 ret = 1;
> >
> >  	migrate_disable();
> > @@ -1254,11 +1265,14 @@ BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
> >  	array = rcu_dereference(array_rcu);
> >  	if (unlikely(!array))
> >  		goto out;
> > +	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
> >  	item = &array->items[0];
> >  	while ((prog = READ_ONCE(item->prog))) {
> > +		run_ctx.bpf_cookie = item->bpf_cookie;
> >  		ret &= run_prog(prog, ctx);
> >  		item++;
> >  	}
> > +	bpf_reset_run_ctx(old_run_ctx);
> >  out:
> >  	rcu_read_unlock();
> >  	migrate_enable();
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 2d510ad750ed..fe156a8170aa 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -762,6 +762,7 @@ struct perf_event {
> >  #ifdef CONFIG_BPF_SYSCALL
> >  	perf_overflow_handler_t		orig_overflow_handler;
> >  	struct bpf_prog			*prog;
> > +	u64				bpf_cookie;
> >  #endif
> >
> > #ifdef CONFIG_EVENT_TRACING
> > diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
> > index 8ac92560d3a3..8e0631a4b046 100644
> > --- a/include/linux/trace_events.h
> > +++ b/include/linux/trace_events.h
> > @@ -675,7 +675,7 @@ trace_trigger_soft_disabled(struct trace_event_file *file)
> >
> > #ifdef CONFIG_BPF_EVENTS
> > unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx);
> > -int perf_event_attach_bpf_prog(struct perf_event *event, struct bpf_prog *prog);
> > +int perf_event_attach_bpf_prog(struct perf_event *event, struct bpf_prog *prog, u64 bpf_cookie);
> > void perf_event_detach_bpf_prog(struct perf_event *event);
> > int perf_event_query_prog_array(struct perf_event *event, void __user *info);
> > int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
> > @@ -692,7 +692,7 @@ static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *c
> > }
> >
> > static inline int
> > -perf_event_attach_bpf_prog(struct perf_event *event, struct bpf_prog *prog)
> > +perf_event_attach_bpf_prog(struct perf_event *event, struct bpf_prog *prog, u64 bpf_cookie)
> > {
> > return -EOPNOTSUPP;
> > }
> > @@ -803,7 +803,7 @@ extern void ftrace_profile_free_filter(struct perf_event *event);
> > void perf_trace_buf_update(void *record, u16 type);
> > void *perf_trace_buf_alloc(int size, struct pt_regs **regs, int *rctxp);
> >
> > -int perf_event_set_bpf_prog(struct perf_event *event, struct bpf_prog *prog);
> > +int perf_event_set_bpf_prog(struct perf_event *event, struct bpf_prog *prog, u64 bpf_cookie);
> > void perf_event_free_bpf_prog(struct perf_event *event);
> >
> > void bpf_trace_run1(struct bpf_prog *prog, u64 arg1);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 94fe8329b28f..63ee482d50e1 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1448,6 +1448,13 @@ union bpf_attr {
> >  				__aligned_u64	iter_info;	/* extra bpf_iter_link_info */
> >  				__u32		iter_info_len;	/* iter_info length */
> >  			};
> > +			struct {
> > +				/* black box user-provided value passed through
> > +				 * to BPF program at the execution time and
> > +				 * accessible through bpf_get_attach_cookie() BPF helper
> > +				 */
> > +				__u64		bpf_cookie;
>
> From API PoV, should we just name this link_id to avoid confusion around gen_cookie_next()
> users?

I'd like to avoid the "id" connotations, as it doesn't have to be
ID-like at all. It can be duplicated across multiple
links/attachments, it could be random, sequential, or a hash,
whatever. I originally started with the name "user value" for the
concept, switching to "user context" before submitting v1. Then Peter
proposed the "cookie" terminology and it clicked, because that's what
a custom user-provided value is often called when it is passed back
to a user-provided callback. I agree about the possible confusion
with the socket and netns cookies, but ultimately there are only so
many different words we can use :) I think link_id itself would be
super confusing as well, as there is already a BPF link ID (similar
to prog ID and map ID), which means a completely different thing, yet
applies to the same BPF link object.

> Do we expect other link types to implement similar mechanism? I'd think probably yes
> if the prog would be common and e.g. do htab lookups based on that opaque value.

Yes, you are right. I intend to add it at least to fentry/fexit
programs, but ultimately any type of program and its attachment might
benefit (e.g., BPF iterator is a pretty good candidate as well). In
practice, I'd try to use an array to pass extra info, but a hashmap
is also an option, of course.
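
Just to make that pattern concrete, here is a rough user-space sketch
of what I mean. It is not part of this patch set; it assumes the
bpf_cookie attach opts added in the libbpf patches later in this
series, and the kprobe targets and helper name are arbitrary
examples:

/* hypothetical helper: attach the same program to two kprobes, each
 * with its own cookie that doubles as an index into a BPF-side
 * config array
 */
#include <bpf/libbpf.h>

static int attach_with_cookies(struct bpf_program *prog)
{
	const char *funcs[] = { "vfs_read", "vfs_write" };
	struct bpf_link *link;
	int i;

	for (i = 0; i < 2; i++) {
		DECLARE_LIBBPF_OPTS(bpf_kprobe_opts, opts,
			/* passed back to the prog via bpf_get_attach_cookie() */
			.bpf_cookie = i,
		);

		link = bpf_program__attach_kprobe_opts(prog, funcs[i], &opts);
		if (libbpf_get_error(link))
			return -1;
		/* in real code, store the links so they can be destroyed */
	}
	return 0;
}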
>
> Is the 8b chosen given function IP fits, or is there a different rationale size-wise? Should
> this be of dynamic size to be more future proof, e.g. hidden map like in prog's global sections
> that libbpf sets up / prepopulates internally, but tied to link object instead?

See my previous reply to Yonghong about this ([0]). 4 bytes would
probably be workable in most cases, but that felt like unnecessary
economy, given we are still going to use the full 8 bytes in
prog_array_item and the like. But I also explicitly don't want an
arbitrarily-sized "storage"; that's not necessary. Users are free to
handle extra storage on their own and use this cookie value as an
index/key into that extra storage. So far I've always used an ARRAY
map for this; it's actually very easy for user-space to do this
efficiently, based on the specific problem it is solving. I feel like
adding libbpf conventions with some hidden map just adds lots of
complexity without adding much value. Sometimes it has to be a
dynamically-sized BPF_MAP_TYPE_ARRAY, but often it's enough to have a
global variable (struct my_config configs[MAX_CONFIG_CNT]), and in
some other cases we don't even need any storage at all.
[0] https://lore.kernel.org/bpf/CAEf4BzY2fVJN5CEdWDDNkWQ9En4N6Rynnnzj7hTnWG65BqdusQ@mail.gmail.com/
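
And the BPF-side counterpart, as a minimal sketch of the
global-variable flavor mentioned above. struct my_config,
MAX_CONFIG_CNT, and the cookie-as-index scheme are made up for
illustration, relying on the bpf_get_attach_cookie() helper added
later in this series:

/* minimal BPF-side sketch: cookie is used as an index into a global
 * config array prepopulated from user space before attaching
 */
#include <linux/bpf.h>
#include <linux/ptrace.h>
#include <bpf/bpf_helpers.h>

#define MAX_CONFIG_CNT 16

struct my_config {
	__u64 flags;		/* hypothetical per-attachment settings */
	__u64 sample_rate;
};

struct my_config configs[MAX_CONFIG_CNT];

SEC("kprobe/vfs_read")
int handle_kprobe(struct pt_regs *ctx)
{
	/* cookie value supplied at attach time through LINK_CREATE */
	__u64 cookie = bpf_get_attach_cookie(ctx);
	struct my_config *cfg;

	if (cookie >= MAX_CONFIG_CNT)
		return 0;

	cfg = &configs[cookie];
	if (cfg->flags & 1)
		bpf_printk("hit attachment %llu\n", cookie);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";

With the attach sketch above, cookie 0 would select configs[0] for
vfs_read and cookie 1 would select configs[1] for vfs_write, all with
a single shared BPF program.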
>
> > +			} perf_event;
> >  		};
> >  	} link_create;
> >
> Thanks,
> Daniel