From: Jiri Olsa <olsajiri@gmail.com>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <olsajiri@gmail.com>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Hao Sun <sunhao.th@gmail.com>, bpf <bpf@vger.kernel.org>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@chromium.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>
Subject: Re: [PATCH bpf-next] bpf: Restrict attachment of bpf program to some tracepoints
Date: Tue, 6 Dec 2022 09:14:10 +0100
Message-ID: <Y4750mbd7XEzue0r@krava>
In-Reply-To: <CAM9d7cj2QGH2x=J=7LVEEOfcDUYLU0Cmd_O7KEHZM-9FRmX3OA@mail.gmail.com>

On Mon, Dec 05, 2022 at 08:00:16PM -0800, Namhyung Kim wrote:
> On Mon, Dec 5, 2022 at 4:28 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Sat, Dec 03, 2022 at 09:58:34AM -0800, Namhyung Kim wrote:
> > > On Wed, Nov 30, 2022 at 03:29:39PM -0800, Andrii Nakryiko wrote:
> > > > On Fri, Nov 25, 2022 at 1:35 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > >
> > > > > On Thu, Nov 24, 2022 at 09:17:22AM -0800, Alexei Starovoitov wrote:
> > > > > > On Thu, Nov 24, 2022 at 1:42 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > > > >
> > > > > > > On Thu, Nov 24, 2022 at 01:41:23AM +0100, Daniel Borkmann wrote:
> > > > > > > > On 11/21/22 10:31 PM, Jiri Olsa wrote:
> > > > > > > > > We hit the following issues [1] [2] when attaching a bpf program that
> > > > > > > > > calls the bpf_trace_printk helper to the contention_begin tracepoint.
> > > > > > > > >
> > > > > > > > > As described in [3], multiple bpf programs that call the bpf_trace_printk
> > > > > > > > > helper and are attached to contention_begin might exhaust the printk
> > > > > > > > > buffer or cause a deadlock [2].
> > > > > > > > >
> > > > > > > > > There's also another possible deadlock when multiple bpf programs attach
> > > > > > > > > to bpf_trace_printk tracepoint and call one of the printk bpf helpers.
> > > > > > > > >
> > > > > > > > > This change denies the attachment of bpf program to contention_begin
> > > > > > > > > and bpf_trace_printk tracepoints if the bpf program calls one of the
> > > > > > > > > printk bpf helpers.
> > > > > > > > >
> > > > > > > > > Also adding a verifier check for tp_btf programs, so this can be caught
> > > > > > > > > at program loading time with an error message like:
> > > > > > > > >
> > > > > > > > > Can't attach program with bpf_trace_printk#6 helper to contention_begin tracepoint.
> > > > > > > > >
> > > > > > > > > [1] https://lore.kernel.org/bpf/CACkBjsakT_yWxnSWr4r-0TpPvbKm9-OBmVUhJb7hV3hY8fdCkw@mail.gmail.com/
> > > > > > > > > [2] https://lore.kernel.org/bpf/CACkBjsaCsTovQHFfkqJKto6S4Z8d02ud1D7MPESrHa1cVNNTrw@mail.gmail.com/
> > > > > > > > > [3] https://lore.kernel.org/bpf/Y2j6ivTwFmA0FtvY@krava/
> > > > > > > > >
> > > > > > > > > Reported-by: Hao Sun <sunhao.th@gmail.com>
> > > > > > > > > Suggested-by: Alexei Starovoitov <ast@kernel.org>
> > > > > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > > > > > ---
> > > > > > > > > include/linux/bpf.h | 1 +
> > > > > > > > > include/linux/bpf_verifier.h | 2 ++
> > > > > > > > > kernel/bpf/syscall.c | 3 +++
> > > > > > > > > kernel/bpf/verifier.c | 46 ++++++++++++++++++++++++++++++++++++
> > > > > > > > > 4 files changed, 52 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > > > > > index c9eafa67f2a2..3ccabede0f50 100644
> > > > > > > > > --- a/include/linux/bpf.h
> > > > > > > > > +++ b/include/linux/bpf.h
> > > > > > > > > @@ -1319,6 +1319,7 @@ struct bpf_prog {
> > > > > > > > > enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
> > > > > > > > > call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
> > > > > > > > > call_get_func_ip:1, /* Do we call get_func_ip() */
> > > > > > > > > + call_printk:1, /* Do we call trace_printk/trace_vprintk */
> > > > > > > > > tstamp_type_access:1; /* Accessed __sk_buff->tstamp_type */
> > > > > > > > > enum bpf_prog_type type; /* Type of BPF program */
> > > > > > > > > enum bpf_attach_type expected_attach_type; /* For some prog types */
> > > > > > > > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > > > > > > > index 545152ac136c..7118c2fda59d 100644
> > > > > > > > > --- a/include/linux/bpf_verifier.h
> > > > > > > > > +++ b/include/linux/bpf_verifier.h
> > > > > > > > > @@ -618,6 +618,8 @@ bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> > > > > > > > > struct bpf_reg_state *reg,
> > > > > > > > > enum bpf_arg_type arg_type);
> > > > > > > > > +int bpf_check_tp_printk_denylist(const char *name, struct bpf_prog *prog);
> > > > > > > > > +
> > > > > > > > > /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
> > > > > > > > > static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
> > > > > > > > > struct btf *btf, u32 btf_id)
> > > > > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > > > > > index 35972afb6850..9a69bda7d62b 100644
> > > > > > > > > --- a/kernel/bpf/syscall.c
> > > > > > > > > +++ b/kernel/bpf/syscall.c
> > > > > > > > > @@ -3329,6 +3329,9 @@ static int bpf_raw_tp_link_attach(struct bpf_prog *prog,
> > > > > > > > > return -EINVAL;
> > > > > > > > > }
> > > > > > > > > + if (bpf_check_tp_printk_denylist(tp_name, prog))
> > > > > > > > > + return -EACCES;
> > > > > > > > > +
> > > > > > > > > btp = bpf_get_raw_tracepoint(tp_name);
> > > > > > > > > if (!btp)
> > > > > > > > > return -ENOENT;
> > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > > > > index f07bec227fef..b662bc851e1c 100644
> > > > > > > > > --- a/kernel/bpf/verifier.c
> > > > > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > > > > @@ -7472,6 +7472,47 @@ static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno
> > > > > > > > > state->callback_subprogno == subprogno);
> > > > > > > > > }
> > > > > > > > > +int bpf_check_tp_printk_denylist(const char *name, struct bpf_prog *prog)
> > > > > > > > > +{
> > > > > > > > > +	static const char * const denylist[] = {
> > > > > > > > > +		"contention_begin",
> > > > > > > > > +		"bpf_trace_printk",
> > > > > > > > > +	};
> > > > > > > > > +	int i;
> > > > > > > > > +
> > > > > > > > > +	/* Do not allow attachment to denylist[] tracepoints,
> > > > > > > > > +	 * if the program calls some of the printk helpers,
> > > > > > > > > +	 * because there's possibility of deadlock.
> > > > > > > > > +	 */
> > > > > > > >
> > > > > > > > What if that prog doesn't but tail calls into another one which calls printk helpers?
> > > > > > >
> > > > > > > right, I'll deny that for all BPF_PROG_TYPE_RAW_TRACEPOINT* programs,
> > > > > > > because I don't see an easy way to check that
> > > > > > >
> > > > > > > we can leave the printk check for tracing BPF_TRACE_RAW_TP programs,
> > > > > > > because the verifier knows the exact tracepoint already
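
For readers following the quoted patch, here is a rough user-space sketch of the kind of check being discussed. It is not the kernel code: `struct sketch_prog` and `sketch_check_tp_printk_denylist()` are hypothetical names, and only the `call_printk` flag and the two tracepoint names come from the patch. The flag is per program, which is why a program that tail calls into another one using printk helpers would slip through.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_prog; only the flag matters here. */
struct sketch_prog {
	bool call_printk;	/* set when the program calls a printk helper */
};

/*
 * Return 1 if @prog must not attach to tracepoint @name, 0 otherwise.
 * Only programs that call printk helpers are restricted.
 */
static int sketch_check_tp_printk_denylist(const char *name,
					   const struct sketch_prog *prog)
{
	static const char * const denylist[] = {
		"contention_begin",
		"bpf_trace_printk",
	};
	size_t i;

	if (!prog->call_printk)
		return 0;

	for (i = 0; i < sizeof(denylist) / sizeof(denylist[0]); i++) {
		if (!strcmp(name, denylist[i]))
			return 1;
	}
	return 0;
}
```
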
> > > > > >
> > > > > > This is all fragile and merely a stop gap.
> > > > > > Doesn't sound that the issue is limited to bpf_trace_printk
> > > > >
> > > > > hm, I don't have a better idea how to fix that.. I can't deny
> > > > > contention_begin completely, because we use it in perf via
> > > > > tp_btf/contention_begin (perf lock contention) and I don't
> > > > > think there's another way for perf to do that
> > > > >
> > > > > fwiw the last version below denies BPF_PROG_TYPE_RAW_TRACEPOINT
> > > > > programs completely and tracing BPF_TRACE_RAW_TP with printks
> > > > >
> > > >
> > > > I think disabling bpf_trace_printk() tracepoint for any BPF program is
> > > > totally fine. This tracepoint was never intended to be attached to.
> > > >
> > > > But as for the general bpf_trace_printk() deadlocking. Should we
> > > > discuss how to make it not deadlock instead of starting to denylist
> > > > things left and right?
> > > >
> > > > Do I understand that we take trace_printk_lock only to protect that
> > > > static char buf[]? Can we just make this buf per-CPU and do a trylock
> > > > instead? We'll only fail to bpf_trace_printk() something if we have
> > > > nested BPF programs (rare) or NMI (also rare).
> > > >
> > > > And it's a printk(), it's never mission-critical, so if we drop some
> > > > message in rare case it's totally fine.
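
To illustrate the per-CPU buffer plus trylock idea quoted above, here is a minimal user-space sketch. It is only an illustration under stated assumptions, not the kernel change: `NR_SLOTS`, `struct printk_slot`, `slot_trylock()`, `slot_unlock()` and `sketch_trace_printk()` are made-up names, per-CPU data is modelled as an array indexed by a CPU number the caller passes in, and the trylock is a C11 atomic flag rather than a kernel lock.

```c
#include <stdarg.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_SLOTS 4	/* assumption: small fixed CPU count for the sketch */
#define BUF_SIZE 128

/* One buffer and one busy flag per CPU, standing in for per-CPU data. */
struct printk_slot {
	atomic_bool busy;
	char buf[BUF_SIZE];
};

static struct printk_slot slots[NR_SLOTS];

/* Try to claim the slot of @cpu; never waits. */
static bool slot_trylock(int cpu)
{
	return !atomic_exchange(&slots[cpu].busy, true);
}

static void slot_unlock(int cpu)
{
	atomic_store(&slots[cpu].busy, false);
}

/*
 * Format into the buffer that belongs to @cpu. Returns the length reported
 * by vsnprintf(), or -1 if the buffer is already in use on this CPU (nested
 * program or NMI), in which case the message is dropped instead of waiting.
 */
static int sketch_trace_printk(int cpu, const char *fmt, ...)
{
	va_list ap;
	int len;

	if (!slot_trylock(cpu))
		return -1;

	va_start(ap, fmt);
	len = vsnprintf(slots[cpu].buf, BUF_SIZE, fmt, ap);
	va_end(ap);

	/* a real implementation would hand the buffer to the trace output here */

	slot_unlock(cpu);
	return len;
}
```

Since a nested caller gets -1 back immediately, nothing ever waits on the buffer, so the nesting can cost a message but cannot deadlock.
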
> > >
> > > What about contention_begin? I wonder if we can disallow recursions
> > > for those in the deny list like using bpf_prog_active..
> >
> > I was testing change below which allows to check recursion just
> > for contention_begin tracepoint
> >
> > for the reported issue we might be ok with the change that Andrii
> > suggested, but we could have the change below as extra precaution
>
> Looks ok to me. But it seems it'd add the recursion check to every

hm, it should allocate the recursion variable just for the contention_begin
tracepoint, the rest should see a NULL pointer

> tracepoint. Can we just change the affected tracepoints only by
> using a kind of wrapped btp->bpf_func with some macro magic? ;-)

I tried that and the only other ways I found are:

  - add something like a TRACE_EVENT_FLAGS macro and have an __init call
    for the specific tracepoint that sets the flag
  - add an extra new 'bpf_func' that checks the re-entry, but that'd mean
    around 1000 extra mostly unused small functions
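
The optional recursion variable described above can be sketched in user space roughly as follows. This is an assumption-laden illustration, not the patch itself: `struct raw_event_data`, `sketch_trace_run()`, `struct demo_ctx` and `demo_prog()` are hypothetical names, and a plain `int` stands in for what would be a per-CPU counter in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-tracepoint data discussed above. */
struct raw_event_data {
	void (*prog)(void *ctx);	/* the attached program */
	int *recursion;			/* NULL unless the tracepoint needs a guard */
};

/*
 * Run the attached program. Returns false when the call was skipped
 * because a guarded tracepoint was re-entered.
 */
static bool sketch_trace_run(struct raw_event_data *data, void *ctx)
{
	bool ran = false;

	if (!data->recursion) {
		/* common case: no guard allocated for this tracepoint */
		data->prog(ctx);
		return true;
	}

	if (++(*data->recursion) == 1) {
		data->prog(ctx);
		ran = true;
	}
	--(*data->recursion);
	return ran;
}

/* Demo program: fires the same tracepoint again from inside itself. */
struct demo_ctx {
	struct raw_event_data *data;
	int runs;	/* how many times the program body executed */
	int skipped;	/* how many nested calls were rejected */
};

static void demo_prog(void *ctx)
{
	struct demo_ctx *c = ctx;

	c->runs++;
	if (c->runs < 3 && !sketch_trace_run(c->data, c))
		c->skipped++;
}
```

With a guard attached, the nested call from `demo_prog()` is rejected; with `recursion == NULL` the program simply re-enters, which is the behaviour every other tracepoint would keep.
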
>
> >
> > ---
>
> [SNIP]
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 3bbd3f0c810c..d27b7dc77894 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2252,9 +2252,8 @@ void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
> > }
> >
> >  static __always_inline
> > -void __bpf_trace_run(struct bpf_prog *prog, u64 *args)
> > +void __bpf_trace_prog_run(struct bpf_prog *prog, u64 *args)
> >  {
> > -	cant_sleep();
> >  	if (unlikely(this_cpu_inc_return(*(prog->active)) != 1)) {
> >  		bpf_prog_inc_misses_counter(prog);
> >  		goto out;
> > @@ -2266,6 +2265,22 @@ void __bpf_trace_run(struct bpf_prog *prog, u64 *args)
> >  	this_cpu_dec(*(prog->active));
> >  }
> >
> > +static __always_inline
> > +void __bpf_trace_run(struct bpf_raw_event_data *data, u64 *args)
> > +{
> > +	struct bpf_prog *prog = data->prog;
> > +
> > +	cant_sleep();
> > +	if (unlikely(!data->recursion))
>
> likely ?

right, thanks

jirka