From: Stanislav Fomichev <sdf@fomichev.me>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Stanislav Fomichev <sdf@google.com>,
Network Development <netdev@vger.kernel.org>,
bpf <bpf@vger.kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Yonghong Song <yhs@fb.com>
Subject: Re: debug annotations for bpf progs. Was: [PATCH bpf-next 1/3] bpf: preserve command of the process that loaded the program
Date: Thu, 17 Oct 2019 09:28:43 -0700
Message-ID: <20191017162843.GB2090@mini-arch>
In-Reply-To: <20191016140112.GF21367@pc-63.home>
On 10/16, Daniel Borkmann wrote:
> On Tue, Oct 15, 2019 at 02:21:50PM -0700, Alexei Starovoitov wrote:
> > On Fri, Oct 11, 2019 at 5:38 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > > On 10/11, Alexei Starovoitov wrote:
> > > > On Fri, Oct 11, 2019 at 9:21 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > > >
> > > > > Even though we have the pointer to user_struct and can recover
> > > > > uid of the user who has created the program, it usually contains
> > > > > 0 (root) which is not very informative. Let's store the comm of the
> > > > > calling process and export it via bpf_prog_info. This should help
> > > > > answer the question "which process loaded this particular program".
> > > > >
> > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > ---
> > > > > include/linux/bpf.h | 1 +
> > > > > include/uapi/linux/bpf.h | 2 ++
> > > > > kernel/bpf/syscall.c | 4 ++++
> > > > > 3 files changed, 7 insertions(+)
> > > > >
> > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > index 5b9d22338606..b03ea396afe5 100644
> > > > > --- a/include/linux/bpf.h
> > > > > +++ b/include/linux/bpf.h
> > > > > @@ -421,6 +421,7 @@ struct bpf_prog_aux {
> > > > > struct work_struct work;
> > > > > struct rcu_head rcu;
> > > > > };
> > > > > + char created_by_comm[BPF_CREATED_COMM_LEN];
> > > > > };
> > > > >
> > > > > struct bpf_array {
> > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > index a65c3b0c6935..4e883ecbba1e 100644
> > > > > --- a/include/uapi/linux/bpf.h
> > > > > +++ b/include/uapi/linux/bpf.h
> > > > > @@ -326,6 +326,7 @@ enum bpf_attach_type {
> > > > > #define BPF_F_NUMA_NODE (1U << 2)
> > > > >
> > > > > #define BPF_OBJ_NAME_LEN 16U
> > > > > +#define BPF_CREATED_COMM_LEN 16U
> > > >
> > > > Nack.
> > > > 16 bytes is going to be useless.
> > > > We found it the hard way with prog_name.
> > > > If you want to embed additional debug information
> > > > please use BTF for that.
> > > BTF was my natural choice initially, but then I saw created_by_uid and
> > > thought created_by_comm might have a chance :-)
> > >
> > > To clarify, by BTF you mean creating some unused global variable
> > > and use its name as the debugging info? Or there is some better way?
> >
> > I was thinking about adding new section to .btf.ext with this extra data,
> > but global variable is a better idea indeed.
> > We'd need to standardize such variables names, so that
> > bpftool can parse and print it while doing 'bpftool prog show'.
>
> +1, much better indeed.
>
> > We see more and more cases where services use more than
> > one program in a single .c file to accomplish their goals.
> > Tying such debug info (like 'created_by_comm') to each program
> > individually isn't quite right.
> > In that sense global variables are better, since they cover the
> > whole .c file.
> > Beyond 'created_by_comm' there are others things that people
> > will likely want to know.
> > Like which version of llvm was used to compile this .o file.
> > Which unix user name compiled it.
> > The name of service/daemon that will be using this .o
> > and so on.
>
> Also latest git sha of the source repo, for example.
>
> > May be some standard prefix to such global variables will do?
> > Like "bpftool prog show" can scan global data for
> > "__annotate_#name" and print both name and string contents ?
> > For folks who regularly ssh into servers to debug bpf progs
> > that will help a lot.
> > May be some annotations llvm can automatically add to .o.
> > Thoughts?
>
> One thing that might be less clear is how information such as comm
> or comm args would be stuffed into BTF here, but perhaps these two
> wouldn't necessarily need to be part of it since these can be retrieved
> today (as in: "which program is currently holding a reference via fd
> to a certain prog/map"). For that bpftool could simply walk procfs
> once and correlate via fdinfo on unique prog/map id, so we could list
> comms in the dump which should be trivial to add:
>
> # ls -la /proc/30651/fd/10
> lrwx------ 1 root root 64 Oct 16 15:53 /proc/30651/fd/10 -> anon_inode:bpf-map
> # cat /proc/30651/fdinfo/10
> pos: 0
> flags: 02000002
> mnt_id: 15
> map_type: 1
> key_size: 24
> value_size: 12
> max_entries: 65536
> map_flags: 0x0
> memlock: 6819840
> map_id: 384 <---
> frozen: 0
> # cat /proc/30651/comm
> cilium-agent
> # cat /proc/30651/cmdline
> ./daemon/cilium-agent--ipv4-range10.11.0.0/16[...]--enable-node-port=true
>
> ... and similar for progs. Getting the cmdline from kernel side seems
> rather annoying from looking into what detour procfs needs to perform.
>
> But aside from these, such annotations via BTF would be really useful.
Tried to do the following:
1. Add: static volatile const char __annotate_source1[] = __FILE__;
to test_rdonly_maps.c and I think it got optimized away :-/
At least I don't see it in the 'bpftool btf dump' output.
2. Add: char __annotate_source2[] SEC(".meta") = __FILE__;
to test_rdonly_maps.c and do all the required plumbing in libbpf
to treat .meta like .rodata. I think it works, but the map
disappears after bpftool exits because this data is not referenced
in the prog and the refcount drops to zero :-(
Am I missing something?