From: Jiri Olsa <olsajiri@gmail.com>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <olsajiri@gmail.com>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Hao Sun <sunhao.th@gmail.com>, bpf <bpf@vger.kernel.org>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@chromium.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>
Subject: Re: [PATCH bpf-next] bpf: Restrict attachment of bpf program to some tracepoints
Date: Tue, 6 Dec 2022 09:14:10 +0100 [thread overview]
Message-ID: <Y4750mbd7XEzue0r@krava> (raw)
In-Reply-To: <CAM9d7cj2QGH2x=J=7LVEEOfcDUYLU0Cmd_O7KEHZM-9FRmX3OA@mail.gmail.com>
On Mon, Dec 05, 2022 at 08:00:16PM -0800, Namhyung Kim wrote:
> On Mon, Dec 5, 2022 at 4:28 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Sat, Dec 03, 2022 at 09:58:34AM -0800, Namhyung Kim wrote:
> > > On Wed, Nov 30, 2022 at 03:29:39PM -0800, Andrii Nakryiko wrote:
> > > > On Fri, Nov 25, 2022 at 1:35 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > >
> > > > > On Thu, Nov 24, 2022 at 09:17:22AM -0800, Alexei Starovoitov wrote:
> > > > > > On Thu, Nov 24, 2022 at 1:42 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > > > > >
> > > > > > > On Thu, Nov 24, 2022 at 01:41:23AM +0100, Daniel Borkmann wrote:
> > > > > > > > On 11/21/22 10:31 PM, Jiri Olsa wrote:
> > > > > > > > > We hit following issues [1] [2] when we attach bpf program that calls
> > > > > > > > > bpf_trace_printk helper to the contention_begin tracepoint.
> > > > > > > > >
> > > > > > > > > As described in [3] with multiple bpf programs that call bpf_trace_printk
> > > > > > > > > helper attached to the contention_begin might result in exhaustion of
> > > > > > > > > printk buffer or cause a deadlock [2].
> > > > > > > > >
> > > > > > > > > There's also another possible deadlock when multiple bpf programs attach
> > > > > > > > > to bpf_trace_printk tracepoint and call one of the printk bpf helpers.
> > > > > > > > >
> > > > > > > > > This change denies the attachment of bpf program to contention_begin
> > > > > > > > > and bpf_trace_printk tracepoints if the bpf program calls one of the
> > > > > > > > > printk bpf helpers.
> > > > > > > > >
> > > > > > > > > Adding also verifier check for tb_btf programs, so this can be cought
> > > > > > > > > in program loading time with error message like:
> > > > > > > > >
> > > > > > > > > Can't attach program with bpf_trace_printk#6 helper to contention_begin tracepoint.
> > > > > > > > >
> > > > > > > > > [1] https://lore.kernel.org/bpf/CACkBjsakT_yWxnSWr4r-0TpPvbKm9-OBmVUhJb7hV3hY8fdCkw@mail.gmail.com/
> > > > > > > > > [2] https://lore.kernel.org/bpf/CACkBjsaCsTovQHFfkqJKto6S4Z8d02ud1D7MPESrHa1cVNNTrw@mail.gmail.com/
> > > > > > > > > [3] https://lore.kernel.org/bpf/Y2j6ivTwFmA0FtvY@krava/
> > > > > > > > >
> > > > > > > > > Reported-by: Hao Sun <sunhao.th@gmail.com>
> > > > > > > > > Suggested-by: Alexei Starovoitov <ast@kernel.org>
> > > > > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > > > > > ---
> > > > > > > > > include/linux/bpf.h | 1 +
> > > > > > > > > include/linux/bpf_verifier.h | 2 ++
> > > > > > > > > kernel/bpf/syscall.c | 3 +++
> > > > > > > > > kernel/bpf/verifier.c | 46 ++++++++++++++++++++++++++++++++++++
> > > > > > > > > 4 files changed, 52 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > > > > > index c9eafa67f2a2..3ccabede0f50 100644
> > > > > > > > > --- a/include/linux/bpf.h
> > > > > > > > > +++ b/include/linux/bpf.h
> > > > > > > > > @@ -1319,6 +1319,7 @@ struct bpf_prog {
> > > > > > > > > enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
> > > > > > > > > call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
> > > > > > > > > call_get_func_ip:1, /* Do we call get_func_ip() */
> > > > > > > > > + call_printk:1, /* Do we call trace_printk/trace_vprintk */
> > > > > > > > > tstamp_type_access:1; /* Accessed __sk_buff->tstamp_type */
> > > > > > > > > enum bpf_prog_type type; /* Type of BPF program */
> > > > > > > > > enum bpf_attach_type expected_attach_type; /* For some prog types */
> > > > > > > > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > > > > > > > index 545152ac136c..7118c2fda59d 100644
> > > > > > > > > --- a/include/linux/bpf_verifier.h
> > > > > > > > > +++ b/include/linux/bpf_verifier.h
> > > > > > > > > @@ -618,6 +618,8 @@ bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> > > > > > > > > struct bpf_reg_state *reg,
> > > > > > > > > enum bpf_arg_type arg_type);
> > > > > > > > > +int bpf_check_tp_printk_denylist(const char *name, struct bpf_prog *prog);
> > > > > > > > > +
> > > > > > > > > /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
> > > > > > > > > static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
> > > > > > > > > struct btf *btf, u32 btf_id)
> > > > > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > > > > > index 35972afb6850..9a69bda7d62b 100644
> > > > > > > > > --- a/kernel/bpf/syscall.c
> > > > > > > > > +++ b/kernel/bpf/syscall.c
> > > > > > > > > @@ -3329,6 +3329,9 @@ static int bpf_raw_tp_link_attach(struct bpf_prog *prog,
> > > > > > > > > return -EINVAL;
> > > > > > > > > }
> > > > > > > > > + if (bpf_check_tp_printk_denylist(tp_name, prog))
> > > > > > > > > + return -EACCES;
> > > > > > > > > +
> > > > > > > > > btp = bpf_get_raw_tracepoint(tp_name);
> > > > > > > > > if (!btp)
> > > > > > > > > return -ENOENT;
> > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > > > > index f07bec227fef..b662bc851e1c 100644
> > > > > > > > > --- a/kernel/bpf/verifier.c
> > > > > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > > > > @@ -7472,6 +7472,47 @@ static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno
> > > > > > > > > state->callback_subprogno == subprogno);
> > > > > > > > > }
> > > > > > > > > +int bpf_check_tp_printk_denylist(const char *name, struct bpf_prog *prog)
> > > > > > > > > +{
> > > > > > > > > + static const char * const denylist[] = {
> > > > > > > > > + "contention_begin",
> > > > > > > > > + "bpf_trace_printk",
> > > > > > > > > + };
> > > > > > > > > + int i;
> > > > > > > > > +
> > > > > > > > > + /* Do not allow attachment to denylist[] tracepoints,
> > > > > > > > > + * if the program calls some of the printk helpers,
> > > > > > > > > + * because there's possibility of deadlock.
> > > > > > > > > + */
> > > > > > > >
> > > > > > > > What if that prog doesn't but tail calls into another one which calls printk helpers?
> > > > > > >
> > > > > > > right, I'll deny that for all BPF_PROG_TYPE_RAW_TRACEPOINT* programs,
> > > > > > > because I don't see easy way to check on that
> > > > > > >
> > > > > > > we can leave printk check for tracing BPF_TRACE_RAW_TP programs,
> > > > > > > because verifier known the exact tracepoint already
> > > > > >
> > > > > > This is all fragile and merely a stop gap.
> > > > > > Doesn't sound that the issue is limited to bpf_trace_printk
> > > > >
> > > > > hm, I don't have a better idea how to fix that.. I can't deny
> > > > > contention_begin completely, because we use it in perf via
> > > > > tp_btf/contention_begin (perf lock contention) and I don't
> > > > > think there's another way for perf to do that
> > > > >
> > > > > fwiw the last version below denies BPF_PROG_TYPE_RAW_TRACEPOINT
> > > > > programs completely and tracing BPF_TRACE_RAW_TP with printks
> > > > >
> > > >
> > > > I think disabling bpf_trace_printk() tracepoint for any BPF program is
> > > > totally fine. This tracepoint was never intended to be attached to.
> > > >
> > > > But as for the general bpf_trace_printk() deadlocking. Should we
> > > > discuss how to make it not deadlock instead of starting to denylist
> > > > things left and right?
> > > >
> > > > Do I understand that we take trace_printk_lock only to protect that
> > > > static char buf[]? Can we just make this buf per-CPU and do a trylock
> > > > instead? We'll only fail to bpf_trace_printk() something if we have
> > > > nested BPF programs (rare) or NMI (also rare).
> > > >
> > > > And it's a printk(), it's never mission-critical, so if we drop some
> > > > message in rare case it's totally fine.
> > >
> > > What about contention_begin? I wonder if we can disallow recursions
> > > for those in the deny list like using bpf_prog_active..
> >
> > I was testing change below which allows to check recursion just
> > for contention_begin tracepoint
> >
> > for the reported issue we might be ok with the change that Andrii
> > suggested, but we could have the change below as extra precaution
>
> Looks ok to me. But it seems it'd add the recursion check to every
hm, it should allocate recursion variable just for the contention_begin
tracepoint, rest should see NULL pointer
> tracepoint. Can we just change the affected tracepoints only by
> using a kind of wrapped btp->bpf_func with some macro magic? ;-)
I tried that and the only other ways I found are:
- add something like TRACE_EVENT_FLAGS macro and have __init call
for specific tracepoint that sets the flag
- add extra new 'bpf_func' that checks the re-entry, but that'd mean
around 1000 extra mostly unused small functions
>
> >
> > ---
>
> [SNIP]
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 3bbd3f0c810c..d27b7dc77894 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2252,9 +2252,8 @@ void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
> > }
> >
> > static __always_inline
> > -void __bpf_trace_run(struct bpf_prog *prog, u64 *args)
> > +void __bpf_trace_prog_run(struct bpf_prog *prog, u64 *args)
> > {
> > - cant_sleep();
> > if (unlikely(this_cpu_inc_return(*(prog->active)) != 1)) {
> > bpf_prog_inc_misses_counter(prog);
> > goto out;
> > @@ -2266,6 +2265,22 @@ void __bpf_trace_run(struct bpf_prog *prog, u64 *args)
> > this_cpu_dec(*(prog->active));
> > }
> >
> > +static __always_inline
> > +void __bpf_trace_run(struct bpf_raw_event_data *data, u64 *args)
> > +{
> > + struct bpf_prog *prog = data->prog;
> > +
> > + cant_sleep();
> > + if (unlikely(!data->recursion))
>
> likely ?
right, thanks
jirka
next prev parent reply other threads:[~2022-12-06 8:14 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-21 21:31 [PATCH bpf-next] bpf: Restrict attachment of bpf program to some tracepoints Jiri Olsa
2022-11-24 0:41 ` Daniel Borkmann
2022-11-24 9:42 ` Jiri Olsa
2022-11-24 17:17 ` Alexei Starovoitov
2022-11-25 9:35 ` Jiri Olsa
2022-11-30 23:29 ` Andrii Nakryiko
2022-12-03 17:58 ` Namhyung Kim
2022-12-05 12:28 ` Jiri Olsa
2022-12-06 4:00 ` Namhyung Kim
2022-12-06 8:14 ` Jiri Olsa [this message]
2022-12-06 18:20 ` Namhyung Kim
2022-12-06 20:09 ` Alexei Starovoitov
2022-12-07 2:14 ` Namhyung Kim
2022-12-07 5:23 ` Hao Sun
2022-12-07 22:58 ` Namhyung Kim
2022-12-07 8:18 ` Jiri Olsa
2022-12-07 19:08 ` Namhyung Kim
2022-12-08 6:15 ` Namhyung Kim
2022-12-08 12:04 ` Jiri Olsa
2022-12-04 21:44 ` Jiri Olsa
2022-12-07 13:39 ` Jiri Olsa
2022-12-07 19:10 ` Alexei Starovoitov
2022-12-08 2:47 ` Hao Sun
2022-12-03 17:42 ` Namhyung Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y4750mbd7XEzue0r@krava \
--to=olsajiri@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=kpsingh@chromium.org \
--cc=namhyung@kernel.org \
--cc=sdf@google.com \
--cc=songliubraving@fb.com \
--cc=sunhao.th@gmail.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).