All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stanislav Fomichev <sdf@fomichev.me>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Stanislav Fomichev <sdf@google.com>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Yonghong Song <yhs@fb.com>
Subject: Re: debug annotations for bpf progs. Was: [PATCH bpf-next 1/3] bpf: preserve command of the process that loaded the program
Date: Tue, 15 Oct 2019 15:26:12 -0700	[thread overview]
Message-ID: <20191015222612.GB1897241@mini-arch> (raw)
In-Reply-To: <CAADnVQKuysEvFAX54+f0YPJ1+cgcRJbhrpVE7xmvLqu-ADrk+Q@mail.gmail.com>

On 10/15, Alexei Starovoitov wrote:
> On Fri, Oct 11, 2019 at 5:38 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > On 10/11, Alexei Starovoitov wrote:
> > > On Fri, Oct 11, 2019 at 9:21 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > Even though we have the pointer to user_struct and can recover
> > > > uid of the user who has created the program, it usually contains
> > > > 0 (root) which is not very informative. Let's store the comm of the
> > > > calling process and export it via bpf_prog_info. This should help
> > > > answer the question "which process loaded this particular program".
> > > >
> > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > ---
> > > >  include/linux/bpf.h      | 1 +
> > > >  include/uapi/linux/bpf.h | 2 ++
> > > >  kernel/bpf/syscall.c     | 4 ++++
> > > >  3 files changed, 7 insertions(+)
> > > >
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index 5b9d22338606..b03ea396afe5 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -421,6 +421,7 @@ struct bpf_prog_aux {
> > > >                 struct work_struct work;
> > > >                 struct rcu_head rcu;
> > > >         };
> > > > +       char created_by_comm[BPF_CREATED_COMM_LEN];
> > > >  };
> > > >
> > > >  struct bpf_array {
> > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > index a65c3b0c6935..4e883ecbba1e 100644
> > > > --- a/include/uapi/linux/bpf.h
> > > > +++ b/include/uapi/linux/bpf.h
> > > > @@ -326,6 +326,7 @@ enum bpf_attach_type {
> > > >  #define BPF_F_NUMA_NODE                (1U << 2)
> > > >
> > > >  #define BPF_OBJ_NAME_LEN 16U
> > > > +#define BPF_CREATED_COMM_LEN   16U
> > >
> > > Nack.
> > > 16 bytes is going to be useless.
> > > We found it the hard way with prog_name.
> > > If you want to embed additional debug information
> > > please use BTF for that.
> > BTF was my natural choice initially, but then I saw created_by_uid and
> > thought created_by_comm might have a chance :-)
> >
> > To clarify, by BTF you mean creating some unused global variable
> > and use its name as the debugging info? Or there is some better way?
> 
> I was thinking about adding new section to .btf.ext with this extra data,
> but global variable is a better idea indeed.
> We'd need to standardize such variables names, so that
> bpftool can parse and print it while doing 'bpftool prog show'.
> We see more and more cases where services use more than
> one program in single .c file to accomplish their goals.
> Tying such debug info (like 'created_by_comm') to each program
> individually isn't quite right.
> In that sense global variables are better, since they cover the
> whole .c file.
> Beyond 'created_by_comm' there are others things that people
> will likely want to know.
> Like which version of llvm was used to compile this .o file.
> Which unix user name compiled it.
> The name of service/daemon that will be using this .o
> and so on.
> May be some standard prefix to such global variables will do?
> Like "bpftool prog show" can scan global data for
> "__annotate_#name" and print both name and string contents ?
> For folks who regularly ssh into servers to debug bpf progs
> that will help a lot.
> May be some annotations llvm can automatically add to .o.
> Thoughts?
We started some proof-of-concept prototyping yesterday; the idea, roughly:
* build system generates build_info.h header which contains:
  char __attribute__((section("aux_timestamp"))) *__aux_<build timestamp> = "";
  char __attribute__((section("aux_version"))) *__aux_<version> = "";
  ...
* clang has -include flag which includes this auto-generated file,
  so we don't rely on users including it
* 'bpftool show btf | grep aux_' can be used for low-level debugging

It's not pretty, but it gets the job done. I agree that having some sort of
convention is nice to make it more usable. If we can agree on a pre-defined
section (aux in my case) so that bpftool can take "variable" from
aux_<variable> section name and print "value" from __aux_<value>,
that would be nice.

One thing I still have no idea how to implement with this scheme
is some alternative to created_by_comm. There is no easy way to
add some BTF at runtime (load) time. Ideas?

> "__annotate_#name" and print both name and string contents ?
As Andrii just pointed out, this requires knowing where to look
for the obj files to print the contents of the vars :-(

  parent reply	other threads:[~2019-10-15 22:26 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-11 16:21 [PATCH bpf-next 1/3] bpf: preserve command of the process that loaded the program Stanislav Fomichev
2019-10-11 16:21 ` [PATCH bpf-next 2/3] tools/bpf: sync bpf.h Stanislav Fomichev
2019-10-11 16:21 ` [PATCH bpf-next 3/3] bpftool: print the comm of the process that loaded the program Stanislav Fomichev
2019-10-11 20:19   ` Martin Lau
2019-10-11 20:37     ` Stanislav Fomichev
2019-10-11 21:11       ` Arnaldo Carvalho de Melo
2019-10-11 21:30         ` Stanislav Fomichev
2019-10-12  0:10 ` [PATCH bpf-next 1/3] bpf: preserve command " Alexei Starovoitov
2019-10-12  0:38   ` Stanislav Fomichev
2019-10-15 21:21     ` debug annotations for bpf progs. Was: " Alexei Starovoitov
2019-10-15 22:14       ` Andrii Nakryiko
2019-10-15 22:24         ` Alexei Starovoitov
2019-10-15 22:33           ` Andrii Nakryiko
2019-10-15 22:48             ` Alexei Starovoitov
2019-10-15 22:26       ` Stanislav Fomichev [this message]
2019-10-16 14:01       ` Daniel Borkmann
2019-10-17 16:28         ` Stanislav Fomichev
2019-10-18  6:19           ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191015222612.GB1897241@mini-arch \
    --to=sdf@fomichev.me \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@google.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.