Re: [PATCH bpf-next] bpf: Restrict attachment of bpf program to some tracepoints

BPF List
 help / color / mirror / Atom feed

From: Jiri Olsa <olsajiri@gmail.com>
To: Namhyung Kim <namhyung@gmail.com>
Cc: Jiri Olsa <olsajiri@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Hao Sun <sunhao.th@gmail.com>, bpf <bpf@vger.kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@chromium.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>
Subject: Re: [PATCH bpf-next] bpf: Restrict attachment of bpf program to some tracepoints
Date: Thu, 8 Dec 2022 13:04:46 +0100	[thread overview]
Message-ID: <Y5HS3nmjARpe+Mef@krava> (raw)
In-Reply-To: <Y5GA8LjlB1BDQ/TO@google.com>

On Wed, Dec 07, 2022 at 10:15:12PM -0800, Namhyung Kim wrote:
> On Wed, Dec 07, 2022 at 11:08:40AM -0800, Namhyung Kim wrote:
> > On Wed, Dec 7, 2022 at 12:18 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Tue, Dec 06, 2022 at 06:14:06PM -0800, Namhyung Kim wrote:
> > >
> > > SNIP
> > >
> > > > -static int __bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
> > > > +static void *bpf_trace_norecurse_funcs[12] = {
> > > > +     (void *)bpf_trace_run_norecurse1,
> > > > +     (void *)bpf_trace_run_norecurse2,
> > > > +     (void *)bpf_trace_run_norecurse3,
> > > > +     (void *)bpf_trace_run_norecurse4,
> > > > +     (void *)bpf_trace_run_norecurse5,
> > > > +     (void *)bpf_trace_run_norecurse6,
> > > > +     (void *)bpf_trace_run_norecurse7,
> > > > +     (void *)bpf_trace_run_norecurse8,
> > > > +     (void *)bpf_trace_run_norecurse9,
> > > > +     (void *)bpf_trace_run_norecurse10,
> > > > +     (void *)bpf_trace_run_norecurse11,
> > > > +     (void *)bpf_trace_run_norecurse12,
> > > > +};
> > > > +
> > > > +static int __bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog,
> > > > +                             void *func, void *data)
> > > >  {
> > > >       struct tracepoint *tp = btp->tp;
> > > >
> > > > @@ -2325,13 +2354,12 @@ static int __bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *
> > > >       if (prog->aux->max_tp_access > btp->writable_size)
> > > >               return -EINVAL;
> > > >
> > > > -     return tracepoint_probe_register_may_exist(tp, (void *)btp->bpf_func,
> > > > -                                                prog);
> > > > +     return tracepoint_probe_register_may_exist(tp, func, data);
> > > >  }
> > > >
> > > >  int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
> > > >  {
> > > > -     return __bpf_probe_register(btp, prog);
> > > > +     return __bpf_probe_register(btp, prog, btp->bpf_func, prog);
> > > >  }
> > > >
> > > >  int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
> > > > @@ -2339,6 +2367,33 @@ int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
> > > >       return tracepoint_probe_unregister(btp->tp, (void *)btp->bpf_func, prog);
> > > >  }
> > > >
> > > > +int bpf_probe_register_norecurse(struct bpf_raw_event_map *btp, struct bpf_prog *prog,
> > > > +                              struct bpf_raw_event_data *data)
> > > > +{
> > > > +     void *bpf_func;
> > > > +
> > > > +     data->active = alloc_percpu_gfp(int, GFP_KERNEL);
> > > > +     if (!data->active)
> > > > +             return -ENOMEM;
> > > > +
> > > > +     data->prog = prog;
> > > > +     bpf_func = bpf_trace_norecurse_funcs[btp->num_args];
> > > > +     return __bpf_probe_register(btp, prog, bpf_func, data);
> > >
> > > I don't think we can do that, because it won't do the arg -> u64 conversion
> > > that __bpf_trace_##call functions are doing:
> > >
> > >         __bpf_trace_##call(void *__data, proto)                                 \
> > >         {                                                                       \
> > >                 struct bpf_prog *prog = __data;                                 \
> > >                 CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args));  \
> > >         }
> > >
> > > like for 'old_pid' arg in sched_process_exec tracepoint:
> > >
> > >         ffffffff811959e0 <__bpf_trace_sched_process_exec>:
> > >         ffffffff811959e0:       89 d2                   mov    %edx,%edx
> > >         ffffffff811959e2:       e9 a9 07 14 00          jmp    ffffffff812d6190 <bpf_trace_run3>
> > >         ffffffff811959e7:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
> > >         ffffffff811959ee:       00 00
> > >
> > > bpf program could see some trash in args < u64
> > >
> > > we'd need to add 'recursion' variant for all __bpf_trace_##call functions
> > 
> > Ah, ok.  So 'contention_begin' tracepoint has unsigned int flags.
> > perf lock contention BPF program properly uses the lower 4 bytes of flags,
> > but others might access the whole 8 bytes then they will see the garbage.
> > Is that your concern?
> > 
> > Hmm.. I think we can use BTF to get the size of each argument then do
> > the conversion.  Let me see..
> 
> Maybe something like this?  But I'm not sure if I did cast_to_u64() right.

I guess that would work, but now I like the idea of fixing the original
issue by removing the spinlock from printk helpers completely

we might need to come back to something like this in future if we hit
similar issue and won't have better way to fix it

jirka

next prev parent reply	other threads:[~2022-12-08 12:04 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-21 21:31 [PATCH bpf-next] bpf: Restrict attachment of bpf program to some tracepoints Jiri Olsa
2022-11-24  0:41 ` Daniel Borkmann
2022-11-24  9:42   ` Jiri Olsa
2022-11-24 17:17     ` Alexei Starovoitov
2022-11-25  9:35       ` Jiri Olsa
2022-11-30 23:29         ` Andrii Nakryiko
2022-12-03 17:58           ` Namhyung Kim
2022-12-05 12:28             ` Jiri Olsa
2022-12-06  4:00               ` Namhyung Kim
2022-12-06  8:14                 ` Jiri Olsa
2022-12-06 18:20                   ` Namhyung Kim
2022-12-06 20:09               ` Alexei Starovoitov
2022-12-07  2:14                 ` Namhyung Kim
2022-12-07  5:23                   ` Hao Sun
2022-12-07 22:58                     ` Namhyung Kim
2022-12-07  8:18                   ` Jiri Olsa
2022-12-07 19:08                     ` Namhyung Kim
2022-12-08  6:15                       ` Namhyung Kim
2022-12-08 12:04                         ` Jiri Olsa [this message]
2022-12-04 21:44           ` Jiri Olsa
2022-12-07 13:39             ` Jiri Olsa
2022-12-07 19:10               ` Alexei Starovoitov
2022-12-08  2:47               ` Hao Sun
2022-12-03 17:42       ` Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y5HS3nmjARpe+Mef@krava \
    --to=olsajiri@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@chromium.org \
    --cc=namhyung@gmail.com \
    --cc=sdf@google.com \
    --cc=songliubraving@fb.com \
    --cc=sunhao.th@gmail.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox