public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
	Martin KaFai Lau <kafai@fb.com>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [RFC PATCH bpf-next 05/16] bpf: create file or anonymous dumpers
Date: Tue, 14 Apr 2020 16:59:12 -0700	[thread overview]
Message-ID: <4bf72b3c-5fee-269f-1d71-7f808f436db9@fb.com> (raw)
In-Reply-To: <CAEf4Bzawu2dFXL7nvYhq1tKv9P7Bb9=6ksDpui5nBjxRrx=3_w@mail.gmail.com>



On 4/13/20 10:56 PM, Andrii Nakryiko wrote:
> On Wed, Apr 8, 2020 at 4:26 PM Yonghong Song <yhs@fb.com> wrote:
>>
>> Given a loaded dumper bpf program, which already
>> knows which target it should bind to, there
>> two ways to create a dumper:
>>    - a file based dumper under hierarchy of
>>      /sys/kernel/bpfdump/ which uses can
>>      "cat" to print out the output.
>>    - an anonymous dumper which user application
>>      can "read" the dumping output.
>>
>> For file based dumper, BPF_OBJ_PIN syscall interface
>> is used. For anonymous dumper, BPF_PROG_ATTACH
>> syscall interface is used.
> 
> We discussed this offline with Yonghong a bit, but I thought I'd put
> my thoughts about this in writing for completeness. To me, it seems
> like the most consistent way to do both anonymous and named dumpers is
> through the following steps:

The main motivation for me to use bpf_link is to enumerate
anonymous bpf dumpers by using idr based link_query mechanism in one
of previous Andrii's RFC patch so I do not need to re-invent the wheel.

But looks like there are some difficulties:

> 
> 1. BPF_PROG_LOAD to load/verify program, that created program FD.
> 2. LINK_CREATE using that program FD and direntry FD. This creates
> dumper bpf_link (bpf_dumper_link), returns anonymous link FD. If link

bpf dump program already have the target information as part of
verification propose, so it does not need directory FD.
LINK_CREATE probably not a good fit here.

bpf dump program is kind similar to fentry/fexit program,
where after successful program loading, the program will know
where to attach trampoline.

Looking at kernel code, for fentry/fexit program, at raw_tracepoint_open
syscall, the trampoline will be installed and actually bpf program will
be called.

So, ideally, if we want to use kernel bpf_link, we want to
return a cat-able bpf_link because ultimately we want to query
file descriptors which actually 'read' bpf program outputs.

Current bpf_link is not cat-able.
I try to hack by manipulating fops and other stuff, it may work,
but looks ugly. Or we create a bpf_catable_link and build an 
infrastructure around that? Not sure whether it is worthwhile for this 
one-off thing (bpfdump)?

Or to query anonymous bpf dumpers, I can just write a bpf dump program
to go through all fd's to find out.

BTW, my current approach (in my private branch),
anonymous dumper:
    bpf_raw_tracepoint_open(NULL, prog) -> cat-able fd
file dumper:
    bpf_obj_pin(prog, path)  -> a cat-able file

If you consider program itself is a link, this is like what
described below in 3 and 4.


> FD is closed, dumper program is detached and dumper is destroyed
> (unless pinned in bpffs, just like with any other bpf_link.
> 3. At this point bpf_dumper_link can be treated like a factory of
> seq_files. We can add a new BPF_DUMPER_OPEN_FILE (all names are for
> illustration purposes) command, that accepts dumper link FD and
> returns a new seq_file FD, which can be read() normally (or, e.g.,
> cat'ed from shell).

In this case, link_query may not be accurate if a bpf_dumper_link
is created but no corresponding bpf_dumper_open_file. What we really
need to iterate through all dumper seq_file FDs.

> 4. Additionally, this anonymous bpf_link can be pinned/mounted in
> bpfdumpfs. We can do it as BPF_OBJ_PIN or as a separate command. Once
> pinned at, e.g., /sys/fs/bpfdump/task/my_dumper, just opening that
> file is equivalent to BPF_DUMPER_OPEN_FILE and will create a new
> seq_file that can be read() independently from other seq_files opened
> against the same dumper. Pinning bpfdumpfs entry also bumps refcnt of
> bpf_link itself, so even if process that created link dies, bpf dumper
> stays attached until its bpfdumpfs entry is deleted.
> 
> Apart from BPF_DUMPER_OPEN_FILE and open()'ing bpfdumpfs file duality,
> it seems pretty consistent and follows safe-by-default auto-cleanup of
> anonymous link, unless pinned in bpfdumpfs (or one can still pin
> bpf_link in bpffs, but it can't be open()'ed the same way, it just
> preserves BPF program from being cleaned up).
> 
> Out of all schemes I could come up with, this one seems most unified
> and nicely fits into bpf_link infra. Thoughts?
> 
>>
>> To facilitate target seq_ops->show() to get the
>> bpf program easily, dumper creation increased
>> the target-provided seq_file private data size
>> so bpf program pointer is also stored in seq_file
>> private data.
>>
>> Further, a seq_num which represents how many
>> bpf_dump_get_prog() has been called is also
>> available to the target seq_ops->show().
>> Such information can be used to e.g., print
>> banner before printing out actual data.
>>
>> Note the seq_num does not represent the num
>> of unique kernel objects the bpf program has
>> seen. But it should be a good approximate.
>>
>> A target feature BPF_DUMP_SEQ_NET_PRIVATE
>> is implemented specifically useful for
>> net based dumpers. It sets net namespace
>> as the current process net namespace.
>> This avoids changing existing net seq_ops
>> in order to retrieve net namespace from
>> the seq_file pointer.
>>
>> For open dumper files, anonymous or not, the
>> fdinfo will show the target and prog_id associated
>> with that file descriptor. For dumper file itself,
>> a kernel interface will be provided to retrieve the
>> prog_id in one of the later patches.
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   include/linux/bpf.h            |   5 +
>>   include/uapi/linux/bpf.h       |   6 +-
>>   kernel/bpf/dump.c              | 338 ++++++++++++++++++++++++++++++++-
>>   kernel/bpf/syscall.c           |  11 +-
>>   tools/include/uapi/linux/bpf.h |   6 +-
>>   5 files changed, 362 insertions(+), 4 deletions(-)
>>
> 
> [...]
> 

  reply	other threads:[~2020-04-14 23:59 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-08 23:25 [RFC PATCH bpf-next 00/16] bpf: implement bpf based dumping of kernel data structures Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 01/16] net: refactor net assignment for seq_net_private structure Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 02/16] bpf: create /sys/kernel/bpfdump mount file system Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 03/16] bpf: provide a way for targets to register themselves Yonghong Song
2020-04-10 22:18   ` Andrii Nakryiko
2020-04-10 23:24     ` Yonghong Song
2020-04-13 19:31       ` Andrii Nakryiko
2020-04-15 22:57     ` Yonghong Song
2020-04-10 22:25   ` Andrii Nakryiko
2020-04-10 23:25     ` Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 04/16] bpf: allow loading of a dumper program Yonghong Song
2020-04-10 22:36   ` Andrii Nakryiko
2020-04-10 23:28     ` Yonghong Song
2020-04-13 19:33       ` Andrii Nakryiko
2020-04-08 23:25 ` [RFC PATCH bpf-next 05/16] bpf: create file or anonymous dumpers Yonghong Song
2020-04-10  3:00   ` Alexei Starovoitov
2020-04-10  6:09     ` Yonghong Song
2020-04-10 22:42     ` Yonghong Song
2020-04-10 22:53       ` Andrii Nakryiko
2020-04-10 23:47         ` Yonghong Song
2020-04-11 23:11           ` Alexei Starovoitov
2020-04-12  6:51             ` Yonghong Song
2020-04-13 20:48             ` Andrii Nakryiko
2020-04-10 22:51   ` Andrii Nakryiko
2020-04-10 23:41     ` Yonghong Song
2020-04-13 19:45       ` Andrii Nakryiko
2020-04-10 23:25   ` Andrii Nakryiko
2020-04-11  0:23     ` Yonghong Song
2020-04-11 23:17       ` Alexei Starovoitov
2020-04-13 21:04         ` Andrii Nakryiko
2020-04-13 19:59       ` Andrii Nakryiko
2020-04-14  5:56   ` Andrii Nakryiko
2020-04-14 23:59     ` Yonghong Song [this message]
2020-04-15  4:45       ` Andrii Nakryiko
2020-04-15 16:46         ` Alexei Starovoitov
2020-04-16  1:48           ` Andrii Nakryiko
2020-04-16  7:15             ` Yonghong Song
2020-04-16 17:04             ` Alexei Starovoitov
2020-04-16 19:35               ` Andrii Nakryiko
2020-04-16 23:18                 ` Alexei Starovoitov
2020-04-17  5:11                   ` Andrii Nakryiko
2020-04-19  6:11                     ` Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 06/16] bpf: add netlink and ipv6_route targets Yonghong Song
2020-04-10 23:13   ` Andrii Nakryiko
2020-04-10 23:52     ` Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 07/16] bpf: add bpf_map target Yonghong Song
2020-04-13 22:18   ` Andrii Nakryiko
2020-04-13 22:47     ` Andrii Nakryiko
2020-04-08 23:25 ` [RFC PATCH bpf-next 08/16] bpf: add task and task/file targets Yonghong Song
2020-04-10  3:22   ` Alexei Starovoitov
2020-04-10  6:19     ` Yonghong Song
2020-04-10 21:31       ` Alexei Starovoitov
2020-04-10 21:33         ` Alexei Starovoitov
2020-04-13 23:00   ` Andrii Nakryiko
2020-04-08 23:25 ` [RFC PATCH bpf-next 09/16] bpf: add bpf_seq_printf and bpf_seq_write helpers Yonghong Song
2020-04-10  3:26   ` Alexei Starovoitov
2020-04-10  6:12     ` Yonghong Song
2020-04-14  5:28   ` Andrii Nakryiko
2020-04-08 23:25 ` [RFC PATCH bpf-next 10/16] bpf: support variable length array in tracing programs Yonghong Song
2020-04-14  0:13   ` Andrii Nakryiko
2020-04-08 23:25 ` [RFC PATCH bpf-next 11/16] bpf: implement query for target_proto and file dumper prog_id Yonghong Song
2020-04-10  3:10   ` Alexei Starovoitov
2020-04-10  6:11     ` Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 12/16] tools/libbpf: libbpf support for bpfdump Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 13/16] tools/bpftool: add bpf dumper support Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 14/16] tools/bpf: selftests: add dumper programs for ipv6_route and netlink Yonghong Song
2020-04-14  5:39   ` Andrii Nakryiko
2020-04-08 23:25 ` [RFC PATCH bpf-next 15/16] tools/bpf: selftests: add dumper progs for bpf_map/task/task_file Yonghong Song
2020-04-10  3:33   ` Alexei Starovoitov
2020-04-10  6:41     ` Yonghong Song
2020-04-08 23:25 ` [RFC PATCH bpf-next 16/16] tools/bpf: selftests: add a selftest for anonymous dumper Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4bf72b3c-5fee-269f-1d71-7f808f436db9@fb.com \
    --to=yhs@fb.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=ast@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kafai@fb.com \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox