All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yonghong.song@linux.dev>
To: Juntong Deng <juntong.deng@outlook.com>,
	ast@kernel.org, daniel@iogearbox.net, john.fastabend@gmail.com,
	andrii@kernel.org, martin.lau@linux.dev, eddyz87@gmail.com,
	song@kernel.org, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, memxor@gmail.com,
	snorcht@gmail.com, brauner@kernel.org
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH bpf-next v6 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc
Date: Thu, 19 Dec 2024 08:11:12 -0800	[thread overview]
Message-ID: <09256acb-9b23-4a25-a260-a4063d219899@linux.dev> (raw)
In-Reply-To: <AM6PR03MB5080DC63013560E26507079E99042@AM6PR03MB5080.eurprd03.prod.outlook.com>




On 12/17/24 3:34 PM, Juntong Deng wrote:
> This patch series adds open-coded style process file iterator
> bpf_iter_task_file and bpf_fget_task() kfunc, and corresponding
> selftests test cases.
>
> In addition, since fs kfuncs is generic and useful for scenarios
> other than LSM, this patch makes fs kfuncs available for SYSCALL
> program type.
>
> Although iter/task_file already exists, for CRIB we still need the
> open-coded iterator style process file iterator, and the same is true
> for other bpf iterators such as iter/tcp, iter/udp, etc.
>
> The traditional bpf iterator is more like a bpf version of procfs, but
> similar to procfs, it is not suitable for CRIB scenarios that need to
> obtain large amounts of complex, multi-level in-kernel information.
>
> The following is from previous discussions [1].
>
> [1]: https://lore.kernel.org/bpf/AM6PR03MB5848CA34B5B68C90F210285E99B12@AM6PR03MB5848.eurprd03.prod.outlook.com/
>
> This is because the context of bpf iterators is fixed and bpf iterators
> cannot be nested. This means that a bpf iterator program can only
> complete a specific small iterative dump task, and cannot dump
> multi-level data.
>
> An example, when we need to dump all the sockets of a process, we need
> to iterate over all the files (sockets) of the process, and iterate over
> the all packets in the queue of each socket, and iterate over all data
> in each packet.
>
> If we use bpf iterator, since the iterator can not be nested, we need to
> use socket iterator program to get all the basic information of all
> sockets (pass pid as filter), and then use packet iterator program to
> get the basic information of all packets of a specific socket (pass pid,
> fd as filter), and then use packet data iterator program to get all the
> data of a specific packet (pass pid, fd, packet index as filter).
>
> This would be complicated and require a lot of (each iteration)
> bpf program startup and exit (leading to poor performance).
>
> By comparison, open coded iterator is much more flexible, we can iterate
> in any context, at any time, and iteration can be nested, so we can
> achieve more flexible and more elegant dumping through open coded
> iterators.
>
> With open coded iterators, all of the above can be done in a single
> bpf program, and with nested iterators, everything becomes compact
> and simple.
>
> Also, bpf iterators transmit data to user space through seq_file,
> which involves a lot of open (bpf_iter_create), read, close syscalls,
> context switching, memory copying, and cannot achieve the performance
> of using ringbuf.
>
> Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
> ---
> v5 -> v6:
> * Remove local variable in bpf_fget_task.
>
> * Remove KF_RCU_PROTECTED from bpf_iter_task_file_new.
>
> * Remove bpf_fs_kfunc_set from being available for TRACING.
>
> * Use get_task_struct in bpf_iter_task_file_new.
>
> * Use put_task_struct in bpf_iter_task_file_destroy.
>
> v4 -> v5:
> * Add file type checks in test cases for process file iterator
>    and bpf_fget_task().
>
> * Use fentry to synchronize tests instead of waiting in a loop.
>
> * Remove path_d_path_kfunc_non_lsm test case.
>
> * Replace task_lookup_next_fdget_rcu() with fget_task_next().
>
> * Remove future merge conflict section in cover letter (resolved).
>
> v3 -> v4:
> * Make all kfuncs generic, not CRIB specific.
>
> * Move bpf_fget_task to fs/bpf_fs_kfuncs.c.
>
> * Remove bpf_iter_task_file_get_fd and bpf_get_file_ops_type.
>
> * Use struct bpf_iter_task_file_item * as the return value of
>    bpf_iter_task_file_next.
>
> * Change fd to unsigned int type and add next_fd.
>
> * Add KF_RCU_PROTECTED to bpf_iter_task_file_new.
>
> * Make fs kfuncs available to SYSCALL and TRACING program types.
>
> * Update all relevant test cases.
>
> * Remove the discussion section from cover letter.
>
> v2 -> v3:
> * Move task_file open-coded iterator to kernel/bpf/helpers.c.
>
> * Fix duplicate error code 7 in test_bpf_iter_task_file().
>
> * Add comment for case when bpf_iter_task_file_get_fd() returns -1.
>
> * Add future plans in commit message of "Add struct file related
>    CRIB kfuncs".
>
> * Add Discussion section to cover letter.
>
> v1 -> v2:
> * Fix a type definition error in the fd parameter of
>    bpf_fget_task() at crib_common.h.
>
> Juntong Deng (5):
>    bpf: Introduce task_file open-coded iterator kfuncs
>    selftests/bpf: Add tests for open-coded style process file iterator
>    bpf: Add bpf_fget_task() kfunc
>    bpf: Make fs kfuncs available for SYSCALL program type
>    selftests/bpf: Add tests for bpf_fget_task() kfunc
>
>   fs/bpf_fs_kfuncs.c                            | 38 ++++----
>   kernel/bpf/helpers.c                          |  3 +
>   kernel/bpf/task_iter.c                        | 91 +++++++++++++++++++
>   .../testing/selftests/bpf/bpf_experimental.h  | 15 +++
>   .../selftests/bpf/prog_tests/fs_kfuncs.c      | 46 ++++++++++
>   .../testing/selftests/bpf/prog_tests/iters.c  | 79 ++++++++++++++++
>   .../selftests/bpf/progs/fs_kfuncs_failure.c   | 33 +++++++
>   .../selftests/bpf/progs/iters_task_file.c     | 86 ++++++++++++++++++
>   .../bpf/progs/iters_task_file_failure.c       | 91 +++++++++++++++++++
>   .../selftests/bpf/progs/test_fget_task.c      | 63 +++++++++++++
>   .../selftests/bpf/progs/verifier_vfs_reject.c | 10 --
>   11 files changed, 529 insertions(+), 26 deletions(-)
>   create mode 100644 tools/testing/selftests/bpf/progs/fs_kfuncs_failure.c
>   create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file.c
>   create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file_failure.c
>   create mode 100644 tools/testing/selftests/bpf/progs/test_fget_task.c

There are quite some CI failures.

    https://github.com/kernel-patches/bpf/actions/runs/12403224240/job/34626610882?pr=8266

Please investigate.



  parent reply	other threads:[~2024-12-19 16:11 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-17 23:34 [PATCH bpf-next v6 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc Juntong Deng
2024-12-17 23:37 ` [PATCH bpf-next v6 1/5] bpf: Introduce task_file open-coded iterator kfuncs Juntong Deng
2024-12-19 16:19   ` Yonghong Song
2024-12-17 23:37 ` [PATCH bpf-next v6 2/5] selftests/bpf: Add tests for open-coded style process file iterator Juntong Deng
2024-12-19 16:35   ` Yonghong Song
2024-12-24  0:43     ` Juntong Deng
2024-12-17 23:37 ` [PATCH bpf-next v6 3/5] bpf: Add bpf_fget_task() kfunc Juntong Deng
2024-12-17 23:37 ` [PATCH bpf-next v6 4/5] bpf: Make fs kfuncs available for SYSCALL program type Juntong Deng
2024-12-19 16:41   ` Alexei Starovoitov
2024-12-24  0:51     ` Juntong Deng
2024-12-24 12:10       ` Juntong Deng
2025-01-09 19:23       ` per st_ops kfunc allow/deny mask. Was: " Alexei Starovoitov
2025-01-09 20:49         ` Song Liu
2025-01-09 21:24           ` Alexei Starovoitov
2025-01-10 20:19           ` Tejun Heo
2025-01-10 20:50             ` Juntong Deng
2025-01-10 20:42         ` Juntong Deng
2025-01-16 19:52           ` Juntong Deng
2024-12-17 23:37 ` [PATCH bpf-next v6 5/5] selftests/bpf: Add tests for bpf_fget_task() kfunc Juntong Deng
2024-12-19 16:11 ` Yonghong Song [this message]
2024-12-24  0:42   ` [PATCH bpf-next v6 0/5] bpf: Add open-coded style process file iterator and " Juntong Deng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=09256acb-9b23-4a25-a260-a4063d219899@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=juntong.deng@outlook.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=memxor@gmail.com \
    --cc=sdf@fomichev.me \
    --cc=snorcht@gmail.com \
    --cc=song@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.