public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v9 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc
@ 2025-01-27 23:44 Juntong Deng
  2025-01-27 23:46 ` [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs Juntong Deng
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Juntong Deng @ 2025-01-27 23:44 UTC (permalink / raw)
  To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, memxor, snorcht,
	brauner
  Cc: bpf, linux-kernel, linux-fsdevel

This patch series adds open-coded style process file iterator
bpf_iter_task_file and bpf_fget_task() kfunc, and corresponding
selftests test cases.

In addition, since fs kfuncs is general and useful for scenarios
other than LSM, this patch makes fs kfuncs available for SYSCALL
program type.

(In this version I did not remove the declarations in
bpf_experimental.h as I guess these might be useful to others?)

Please do not ignore the previous version. The iters/task_file test case
revealed inconsistent behavior between gcc-compiled kernel and
llvm-compiled kernel [0].

[0]: https://lore.kernel.org/bpf/AM6PR03MB5080DFE11B733C9DCD6547E199EC2@AM6PR03MB5080.eurprd03.prod.outlook.com/

Although iter/task_file already exists, for CRIB we still need the
open-coded iterator style process file iterator, and the same is true
for other bpf iterators such as iter/tcp, iter/udp, etc.

The traditional bpf iterator is more like a bpf version of procfs, but
similar to procfs, it is not suitable for CRIB scenarios that need to
obtain large amounts of complex, multi-level in-kernel information.

The following is from previous discussions [1].

[1]: https://lore.kernel.org/bpf/AM6PR03MB5848CA34B5B68C90F210285E99B12@AM6PR03MB5848.eurprd03.prod.outlook.com/

This is because the context of bpf iterators is fixed and bpf iterators
cannot be nested. This means that a bpf iterator program can only
complete a specific small iterative dump task, and cannot dump
multi-level data.

An example, when we need to dump all the sockets of a process, we need
to iterate over all the files (sockets) of the process, and iterate over
the all packets in the queue of each socket, and iterate over all data
in each packet.

If we use bpf iterator, since the iterator can not be nested, we need to
use socket iterator program to get all the basic information of all
sockets (pass pid as filter), and then use packet iterator program to
get the basic information of all packets of a specific socket (pass pid,
fd as filter), and then use packet data iterator program to get all the
data of a specific packet (pass pid, fd, packet index as filter).

This would be complicated and require a lot of (each iteration)
bpf program startup and exit (leading to poor performance).

By comparison, open coded iterator is much more flexible, we can iterate
in any context, at any time, and iteration can be nested, so we can
achieve more flexible and more elegant dumping through open coded
iterators.

With open coded iterators, all of the above can be done in a single
bpf program, and with nested iterators, everything becomes compact
and simple.

Also, bpf iterators transmit data to user space through seq_file,
which involves a lot of open (bpf_iter_create), read, close syscalls,
context switching, memory copying, and cannot achieve the performance
of using ringbuf.

Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
---
v8 -> v9:
* Replace task->parent in test_bpf_iter_task_file with
  task->real_parent.

v7 -> v8:
* Keep path_d_path_kfunc_non_lsm

* Add back the const following extern

v6 -> v7:
* Fix argument index mistake

* Remove __aligned(8) at bpf_iter_task_file_kern

* Make the if statement that checks item->file closer to
  fget_task_next

* Remove the const following extern

* Keep bpf_fs_kfuncs_filter

v5 -> v6:
* Remove local variable in bpf_fget_task.

* Remove KF_RCU_PROTECTED from bpf_iter_task_file_new.

* Remove bpf_fs_kfunc_set from being available for TRACING.

* Use get_task_struct in bpf_iter_task_file_new.

* Use put_task_struct in bpf_iter_task_file_destroy.

v4 -> v5:
* Add file type checks in test cases for process file iterator
  and bpf_fget_task().

* Use fentry to synchronize tests instead of waiting in a loop.

* Remove path_d_path_kfunc_non_lsm test case.

* Replace task_lookup_next_fdget_rcu() with fget_task_next().

* Remove future merge conflict section in cover letter (resolved).

v3 -> v4:
* Make all kfuncs generic, not CRIB specific.

* Move bpf_fget_task to fs/bpf_fs_kfuncs.c.

* Remove bpf_iter_task_file_get_fd and bpf_get_file_ops_type.

* Use struct bpf_iter_task_file_item * as the return value of
  bpf_iter_task_file_next.

* Change fd to unsigned int type and add next_fd.

* Add KF_RCU_PROTECTED to bpf_iter_task_file_new.

* Make fs kfuncs available to SYSCALL and TRACING program types.

* Update all relevant test cases.

* Remove the discussion section from cover letter.

v2 -> v3:
* Move task_file open-coded iterator to kernel/bpf/helpers.c.

* Fix duplicate error code 7 in test_bpf_iter_task_file().

* Add comment for case when bpf_iter_task_file_get_fd() returns -1.

* Add future plans in commit message of "Add struct file related
  CRIB kfuncs".

* Add Discussion section to cover letter.

v1 -> v2:
* Fix a type definition error in the fd parameter of
  bpf_fget_task() at crib_common.h.

Juntong Deng (5):
  bpf: Introduce task_file open-coded iterator kfuncs
  selftests/bpf: Add tests for open-coded style process file iterator
  bpf: Add bpf_fget_task() kfunc
  bpf: Make fs kfuncs available for SYSCALL program type
  selftests/bpf: Add tests for bpf_fget_task() kfunc

 fs/bpf_fs_kfuncs.c                            | 32 +++++--
 kernel/bpf/helpers.c                          |  3 +
 kernel/bpf/task_iter.c                        | 90 ++++++++++++++++++
 .../testing/selftests/bpf/bpf_experimental.h  | 15 +++
 .../selftests/bpf/prog_tests/fs_kfuncs.c      | 46 ++++++++++
 .../testing/selftests/bpf/prog_tests/iters.c  | 78 ++++++++++++++++
 .../selftests/bpf/progs/fs_kfuncs_failure.c   | 33 +++++++
 .../selftests/bpf/progs/iters_task_file.c     | 87 ++++++++++++++++++
 .../bpf/progs/iters_task_file_failure.c       | 91 +++++++++++++++++++
 .../selftests/bpf/progs/test_fget_task.c      | 63 +++++++++++++
 10 files changed, 530 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/fs_kfuncs_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file.c
 create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_fget_task.c

-- 
2.39.5


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs
  2025-01-27 23:44 [PATCH bpf-next v9 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc Juntong Deng
@ 2025-01-27 23:46 ` Juntong Deng
  2025-01-30  2:35   ` Alexei Starovoitov
  2025-01-30 16:04   ` Christian Brauner
  2025-01-27 23:46 ` [PATCH bpf-next v9 2/5] selftests/bpf: Add tests for open-coded style process file iterator Juntong Deng
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 11+ messages in thread
From: Juntong Deng @ 2025-01-27 23:46 UTC (permalink / raw)
  To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, memxor, snorcht,
	brauner
  Cc: bpf, linux-kernel, linux-fsdevel

This patch adds the open-coded iterator style process file iterator
kfuncs bpf_iter_task_file_{new,next,destroy} that iterates over all
files opened by the specified process.

bpf_iter_task_file_next returns a pointer to bpf_iter_task_file_item,
which currently contains *task, *file, fd. This is an extensible
structure that enables compatibility with different versions
through CO-RE.

The reference to struct file acquired by the previous
bpf_iter_task_file_next() is released in the next
bpf_iter_task_file_next(), and the last reference is released in the
last bpf_iter_task_file_next() that returns NULL.

In the bpf_iter_task_file_destroy(), if the iterator does not iterate to
the end, then the last struct file reference is released at this time.

Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
---
 kernel/bpf/helpers.c   |  3 ++
 kernel/bpf/task_iter.c | 90 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index f27ce162427a..359c5bbf4814 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -3157,6 +3157,9 @@ BTF_ID_FLAGS(func, bpf_iter_css_destroy, KF_ITER_DESTROY)
 BTF_ID_FLAGS(func, bpf_iter_task_new, KF_ITER_NEW | KF_TRUSTED_ARGS | KF_RCU_PROTECTED)
 BTF_ID_FLAGS(func, bpf_iter_task_next, KF_ITER_NEXT | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_iter_task_destroy, KF_ITER_DESTROY)
+BTF_ID_FLAGS(func, bpf_iter_task_file_new, KF_ITER_NEW | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_iter_task_file_next, KF_ITER_NEXT | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_task_file_destroy, KF_ITER_DESTROY)
 BTF_ID_FLAGS(func, bpf_dynptr_adjust)
 BTF_ID_FLAGS(func, bpf_dynptr_is_null)
 BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index 98d9b4c0daff..24a5af67e6c8 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -1027,6 +1027,96 @@ __bpf_kfunc void bpf_iter_task_destroy(struct bpf_iter_task *it)
 {
 }
 
+struct bpf_iter_task_file_item {
+	struct task_struct *task;
+	struct file *file;
+	unsigned int fd;
+};
+
+struct bpf_iter_task_file {
+	__u64 __opaque[4];
+} __aligned(8);
+
+struct bpf_iter_task_file_kern {
+	struct bpf_iter_task_file_item item;
+	unsigned int next_fd;
+} __aligned(8);
+
+/**
+ * bpf_iter_task_file_new() - Initialize a new task file iterator for a task,
+ * used to iterate over all files opened by a specified task
+ *
+ * @it: the new bpf_iter_task_file to be created
+ * @task: a pointer pointing to the task to be iterated over
+ */
+__bpf_kfunc int bpf_iter_task_file_new(struct bpf_iter_task_file *it, struct task_struct *task)
+{
+	struct bpf_iter_task_file_kern *kit = (void *)it;
+	struct bpf_iter_task_file_item *item = &kit->item;
+
+	BUILD_BUG_ON(sizeof(struct bpf_iter_task_file_kern) > sizeof(struct bpf_iter_task_file));
+	BUILD_BUG_ON(__alignof__(struct bpf_iter_task_file_kern) !=
+		     __alignof__(struct bpf_iter_task_file));
+
+	item->task = get_task_struct(task);
+	item->file = NULL;
+	item->fd = 0;
+	kit->next_fd = 0;
+
+	return 0;
+}
+
+/**
+ * bpf_iter_task_file_next() - Get the next file in bpf_iter_task_file
+ *
+ * bpf_iter_task_file_next acquires a reference to the struct file.
+ *
+ * The reference to struct file acquired by the previous
+ * bpf_iter_task_file_next() is released in the next bpf_iter_task_file_next(),
+ * and the last reference is released in the last bpf_iter_task_file_next()
+ * that returns NULL.
+ *
+ * @it: the bpf_iter_task_file to be checked
+ *
+ * @returns a pointer to bpf_iter_task_file_item
+ */
+__bpf_kfunc struct bpf_iter_task_file_item *bpf_iter_task_file_next(struct bpf_iter_task_file *it)
+{
+	struct bpf_iter_task_file_kern *kit = (void *)it;
+	struct bpf_iter_task_file_item *item = &kit->item;
+
+	if (item->file)
+		fput(item->file);
+
+	item->file = fget_task_next(item->task, &kit->next_fd);
+	if (!item->file)
+		return NULL;
+
+	item->fd = kit->next_fd;
+	kit->next_fd++;
+
+	return item;
+}
+
+/**
+ * bpf_iter_task_file_destroy() - Destroy a bpf_iter_task_file
+ *
+ * If the iterator does not iterate to the end, then the last
+ * struct file reference is released at this time.
+ *
+ * @it: the bpf_iter_task_file to be destroyed
+ */
+__bpf_kfunc void bpf_iter_task_file_destroy(struct bpf_iter_task_file *it)
+{
+	struct bpf_iter_task_file_kern *kit = (void *)it;
+	struct bpf_iter_task_file_item *item = &kit->item;
+
+	if (item->file)
+		fput(item->file);
+
+	put_task_struct(item->task);
+}
+
 __bpf_kfunc_end_defs();
 
 DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v9 2/5] selftests/bpf: Add tests for open-coded style process file iterator
  2025-01-27 23:44 [PATCH bpf-next v9 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc Juntong Deng
  2025-01-27 23:46 ` [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs Juntong Deng
@ 2025-01-27 23:46 ` Juntong Deng
  2025-01-27 23:46 ` [PATCH bpf-next v9 3/5] bpf: Add bpf_fget_task() kfunc Juntong Deng
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Juntong Deng @ 2025-01-27 23:46 UTC (permalink / raw)
  To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, memxor, snorcht,
	brauner
  Cc: bpf, linux-kernel, linux-fsdevel

This patch adds test cases for open-coded style process file iterator.

Test cases related to process files are run in the newly created child
process. Close all opened files inherited from the parent process in
the child process to avoid the files opened by the parent process
affecting the test results.

In addition, this patch adds failure test cases where bpf programs
cannot pass the verifier due to uninitialized or untrusted
arguments, etc.

Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
---
 .../testing/selftests/bpf/bpf_experimental.h  |  7 ++
 .../testing/selftests/bpf/prog_tests/iters.c  | 78 ++++++++++++++++
 .../selftests/bpf/progs/iters_task_file.c     | 87 ++++++++++++++++++
 .../bpf/progs/iters_task_file_failure.c       | 91 +++++++++++++++++++
 4 files changed, 263 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file.c
 create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file_failure.c

diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index cd8ecd39c3f3..ce1520c56b55 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -588,4 +588,11 @@ extern int bpf_iter_kmem_cache_new(struct bpf_iter_kmem_cache *it) __weak __ksym
 extern struct kmem_cache *bpf_iter_kmem_cache_next(struct bpf_iter_kmem_cache *it) __weak __ksym;
 extern void bpf_iter_kmem_cache_destroy(struct bpf_iter_kmem_cache *it) __weak __ksym;
 
+struct bpf_iter_task_file;
+struct bpf_iter_task_file_item;
+extern int bpf_iter_task_file_new(struct bpf_iter_task_file *it, struct task_struct *task) __ksym;
+extern struct bpf_iter_task_file_item *
+bpf_iter_task_file_next(struct bpf_iter_task_file *it) __ksym;
+extern void bpf_iter_task_file_destroy(struct bpf_iter_task_file *it) __ksym;
+
 #endif
diff --git a/tools/testing/selftests/bpf/prog_tests/iters.c b/tools/testing/selftests/bpf/prog_tests/iters.c
index 3cea71f9c500..f400aeb91cba 100644
--- a/tools/testing/selftests/bpf/prog_tests/iters.c
+++ b/tools/testing/selftests/bpf/prog_tests/iters.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
 
+#define _GNU_SOURCE
+#include <sys/socket.h>
 #include <sys/syscall.h>
 #include <sys/mman.h>
 #include <sys/wait.h>
@@ -16,11 +18,13 @@
 #include "iters_num.skel.h"
 #include "iters_testmod.skel.h"
 #include "iters_testmod_seq.skel.h"
+#include "iters_task_file.skel.h"
 #include "iters_task_vma.skel.h"
 #include "iters_task.skel.h"
 #include "iters_css_task.skel.h"
 #include "iters_css.skel.h"
 #include "iters_task_failure.skel.h"
+#include "iters_task_file_failure.skel.h"
 
 static void subtest_num_iters(void)
 {
@@ -291,6 +295,77 @@ static void subtest_css_iters(void)
 	iters_css__destroy(skel);
 }
 
+static int task_file_test_process(void *args)
+{
+	int pipefd[2], sockfd, err = 0;
+
+	/* Create a clean file descriptor table for the test process */
+	close_range(0, ~0U, 0);
+
+	if (pipe(pipefd) < 0)
+		return 1;
+
+	sockfd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
+	if (sockfd < 0) {
+		err = 2;
+		goto cleanup_pipe;
+	}
+
+	usleep(1);
+
+	close(sockfd);
+cleanup_pipe:
+	close(pipefd[0]);
+	close(pipefd[1]);
+	return err;
+}
+
+static void subtest_task_file_iters(void)
+{
+	const int stack_size = 1024 * 1024;
+	struct iters_task_file *skel;
+	int child_pid, wstatus, err;
+	char *stack;
+
+	skel = iters_task_file__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		return;
+
+	if (!ASSERT_OK(skel->bss->err, "pre_test_err"))
+		goto cleanup_skel;
+
+	skel->bss->parent_pid = getpid();
+	skel->bss->count = 0;
+
+	err = iters_task_file__attach(skel);
+	if (!ASSERT_OK(err, "skel_attach"))
+		goto cleanup_skel;
+
+	stack = (char *)malloc(stack_size);
+	if (!ASSERT_OK_PTR(stack, "clone_stack"))
+		goto cleanup_attach;
+
+	/* Note that there is no CLONE_FILES */
+	child_pid = clone(task_file_test_process, stack + stack_size, CLONE_VM | SIGCHLD, NULL);
+	if (!ASSERT_GT(child_pid, -1, "child_pid"))
+		goto cleanup_stack;
+
+	if (!ASSERT_GT(waitpid(child_pid, &wstatus, 0), -1, "waitpid"))
+		goto cleanup_stack;
+
+	if (!ASSERT_OK(WEXITSTATUS(wstatus), "run_task_file_iters_test_err"))
+		goto cleanup_stack;
+
+	ASSERT_EQ(skel->bss->count, 1, "run_task_file_iters_test_count_err");
+	ASSERT_OK(skel->bss->err, "run_task_file_iters_test_failure");
+cleanup_stack:
+	free(stack);
+cleanup_attach:
+	iters_task_file__detach(skel);
+cleanup_skel:
+	iters_task_file__destroy(skel);
+}
+
 void test_iters(void)
 {
 	RUN_TESTS(iters_state_safety);
@@ -315,5 +390,8 @@ void test_iters(void)
 		subtest_css_task_iters();
 	if (test__start_subtest("css"))
 		subtest_css_iters();
+	if (test__start_subtest("task_file"))
+		subtest_task_file_iters();
 	RUN_TESTS(iters_task_failure);
+	RUN_TESTS(iters_task_file_failure);
 }
diff --git a/tools/testing/selftests/bpf/progs/iters_task_file.c b/tools/testing/selftests/bpf/progs/iters_task_file.c
new file mode 100644
index 000000000000..cdf81ae022a0
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/iters_task_file.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+#include "task_kfunc_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+int err, parent_pid, count;
+
+/* 'const' can eliminate the "taking address of expression of type 'void'" warning */
+extern const void pipefifo_fops __ksym;
+extern const void socket_file_ops __ksym;
+
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int test_bpf_iter_task_file(void *ctx)
+{
+	struct bpf_iter_task_file task_file_it;
+	struct bpf_iter_task_file_item *item;
+	struct task_struct *task;
+
+	task = bpf_get_current_task_btf();
+	if (task->real_parent->pid != parent_pid)
+		return 0;
+
+	count++;
+
+	bpf_iter_task_file_new(&task_file_it, task);
+
+	item = bpf_iter_task_file_next(&task_file_it);
+	if (item == NULL) {
+		err = 1;
+		goto cleanup;
+	}
+
+	if (item->fd != 0) {
+		err = 2;
+		goto cleanup;
+	}
+
+	if (item->file->f_op != &pipefifo_fops) {
+		err = 3;
+		goto cleanup;
+	}
+
+	item = bpf_iter_task_file_next(&task_file_it);
+	if (item == NULL) {
+		err = 4;
+		goto cleanup;
+	}
+
+	if (item->fd != 1) {
+		err = 5;
+		goto cleanup;
+	}
+
+	if (item->file->f_op != &pipefifo_fops) {
+		err = 6;
+		goto cleanup;
+	}
+
+	item = bpf_iter_task_file_next(&task_file_it);
+	if (item == NULL) {
+		err = 7;
+		goto cleanup;
+	}
+
+	if (item->fd != 2) {
+		err = 8;
+		goto cleanup;
+	}
+
+	if (item->file->f_op != &socket_file_ops) {
+		err = 9;
+		goto cleanup;
+	}
+
+	item = bpf_iter_task_file_next(&task_file_it);
+	if (item != NULL)
+		err = 10;
+cleanup:
+	bpf_iter_task_file_destroy(&task_file_it);
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/iters_task_file_failure.c b/tools/testing/selftests/bpf/progs/iters_task_file_failure.c
new file mode 100644
index 000000000000..f7c26c49d8b4
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/iters_task_file_failure.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+#include "task_kfunc_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+SEC("syscall")
+__failure __msg("expected uninitialized iter_task_file as arg #0")
+int bpf_iter_task_file_new_inited_iter(void *ctx)
+{
+	struct bpf_iter_task_file task_file_it;
+	struct task_struct *task;
+
+	task = bpf_get_current_task_btf();
+
+	bpf_iter_task_file_new(&task_file_it, task);
+
+	bpf_iter_task_file_new(&task_file_it, task);
+
+	bpf_iter_task_file_destroy(&task_file_it);
+	return 0;
+}
+
+SEC("syscall")
+__failure __msg("Possibly NULL pointer passed to trusted arg1")
+int bpf_iter_task_file_new_null_task(void *ctx)
+{
+	struct bpf_iter_task_file task_file_it;
+	struct task_struct *task = NULL;
+
+	bpf_iter_task_file_new(&task_file_it, task);
+
+	bpf_iter_task_file_destroy(&task_file_it);
+	return 0;
+}
+
+SEC("syscall")
+__failure __msg("R2 must be referenced or trusted")
+int bpf_iter_task_file_new_untrusted_task(void *ctx)
+{
+	struct bpf_iter_task_file task_file_it;
+	struct task_struct *task;
+
+	task = bpf_get_current_task_btf()->parent;
+
+	bpf_iter_task_file_new(&task_file_it, task);
+
+	bpf_iter_task_file_destroy(&task_file_it);
+	return 0;
+}
+
+SEC("syscall")
+__failure __msg("Unreleased reference")
+int bpf_iter_task_file_no_destory(void *ctx)
+{
+	struct bpf_iter_task_file task_file_it;
+	struct task_struct *task;
+
+	task = bpf_get_current_task_btf();
+
+	bpf_iter_task_file_new(&task_file_it, task);
+
+	return 0;
+}
+
+SEC("syscall")
+__failure __msg("expected an initialized iter_task_file as arg #0")
+int bpf_iter_task_file_next_uninit_iter(void *ctx)
+{
+	struct bpf_iter_task_file task_file_it;
+
+	bpf_iter_task_file_next(&task_file_it);
+
+	return 0;
+}
+
+SEC("syscall")
+__failure __msg("expected an initialized iter_task_file as arg #0")
+int bpf_iter_task_file_destroy_uninit_iter(void *ctx)
+{
+	struct bpf_iter_task_file task_file_it;
+
+	bpf_iter_task_file_destroy(&task_file_it);
+
+	return 0;
+}
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v9 3/5] bpf: Add bpf_fget_task() kfunc
  2025-01-27 23:44 [PATCH bpf-next v9 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc Juntong Deng
  2025-01-27 23:46 ` [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs Juntong Deng
  2025-01-27 23:46 ` [PATCH bpf-next v9 2/5] selftests/bpf: Add tests for open-coded style process file iterator Juntong Deng
@ 2025-01-27 23:46 ` Juntong Deng
  2025-01-27 23:46 ` [PATCH bpf-next v9 4/5] bpf: Make fs kfuncs available for SYSCALL program type Juntong Deng
  2025-01-27 23:46 ` [PATCH bpf-next v9 5/5] selftests/bpf: Add tests for bpf_fget_task() kfunc Juntong Deng
  4 siblings, 0 replies; 11+ messages in thread
From: Juntong Deng @ 2025-01-27 23:46 UTC (permalink / raw)
  To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, memxor, snorcht,
	brauner
  Cc: bpf, linux-kernel, linux-fsdevel

This patch adds bpf_fget_task() kfunc.

bpf_fget_task() is used to get a pointer to the struct file
corresponding to the task file descriptor. Note that this function
acquires a reference to struct file.

Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
---
 fs/bpf_fs_kfuncs.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
index 3fe9f59ef867..4a810046dcf3 100644
--- a/fs/bpf_fs_kfuncs.c
+++ b/fs/bpf_fs_kfuncs.c
@@ -152,6 +152,23 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str,
 	return bpf_get_dentry_xattr(dentry, name__str, value_p);
 }
 
+/**
+ * bpf_fget_task() - Get a pointer to the struct file corresponding to
+ * the task file descriptor
+ *
+ * Note that this function acquires a reference to struct file.
+ *
+ * @task: the specified struct task_struct
+ * @fd: the file descriptor
+ *
+ * @returns the corresponding struct file pointer if found,
+ * otherwise returns NULL
+ */
+__bpf_kfunc struct file *bpf_fget_task(struct task_struct *task, unsigned int fd)
+{
+	return fget_task(task, fd);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(bpf_fs_kfunc_set_ids)
@@ -161,6 +178,7 @@ BTF_ID_FLAGS(func, bpf_put_file, KF_RELEASE)
 BTF_ID_FLAGS(func, bpf_path_d_path, KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_fget_task, KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
 BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)
 
 static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v9 4/5] bpf: Make fs kfuncs available for SYSCALL program type
  2025-01-27 23:44 [PATCH bpf-next v9 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc Juntong Deng
                   ` (2 preceding siblings ...)
  2025-01-27 23:46 ` [PATCH bpf-next v9 3/5] bpf: Add bpf_fget_task() kfunc Juntong Deng
@ 2025-01-27 23:46 ` Juntong Deng
  2025-01-30 15:32   ` Christian Brauner
  2025-01-27 23:46 ` [PATCH bpf-next v9 5/5] selftests/bpf: Add tests for bpf_fget_task() kfunc Juntong Deng
  4 siblings, 1 reply; 11+ messages in thread
From: Juntong Deng @ 2025-01-27 23:46 UTC (permalink / raw)
  To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, memxor, snorcht,
	brauner
  Cc: bpf, linux-kernel, linux-fsdevel

Currently fs kfuncs are only available for LSM program type, but fs
kfuncs are general and useful for scenarios other than LSM.

This patch makes fs kfuncs available for SYSCALL program type.

Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
---
 fs/bpf_fs_kfuncs.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
index 4a810046dcf3..8a7e9ed371de 100644
--- a/fs/bpf_fs_kfuncs.c
+++ b/fs/bpf_fs_kfuncs.c
@@ -26,8 +26,6 @@ __bpf_kfunc_start_defs();
  * acquired by this BPF kfunc will result in the BPF program being rejected by
  * the BPF verifier.
  *
- * This BPF kfunc may only be called from BPF LSM programs.
- *
  * Internally, this BPF kfunc leans on get_task_exe_file(), such that calling
  * bpf_get_task_exe_file() would be analogous to calling get_task_exe_file()
  * directly in kernel context.
@@ -49,8 +47,6 @@ __bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task)
  * passed to this BPF kfunc. Attempting to pass an unreferenced file pointer, or
  * any other arbitrary pointer for that matter, will result in the BPF program
  * being rejected by the BPF verifier.
- *
- * This BPF kfunc may only be called from BPF LSM programs.
  */
 __bpf_kfunc void bpf_put_file(struct file *file)
 {
@@ -70,8 +66,6 @@ __bpf_kfunc void bpf_put_file(struct file *file)
  * reference, or else the BPF program will be outright rejected by the BPF
  * verifier.
  *
- * This BPF kfunc may only be called from BPF LSM programs.
- *
  * Return: A positive integer corresponding to the length of the resolved
  * pathname in *buf*, including the NUL termination character. On error, a
  * negative integer is returned.
@@ -184,7 +178,8 @@ BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)
 static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
 {
 	if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
-	    prog->type == BPF_PROG_TYPE_LSM)
+	    prog->type == BPF_PROG_TYPE_LSM ||
+	    prog->type == BPF_PROG_TYPE_SYSCALL)
 		return 0;
 	return -EACCES;
 }
@@ -197,7 +192,10 @@ static const struct btf_kfunc_id_set bpf_fs_kfunc_set = {
 
 static int __init bpf_fs_kfuncs_init(void)
 {
-	return register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set);
+	int ret;
+
+	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set);
+	return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &bpf_fs_kfunc_set);
 }
 
 late_initcall(bpf_fs_kfuncs_init);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v9 5/5] selftests/bpf: Add tests for bpf_fget_task() kfunc
  2025-01-27 23:44 [PATCH bpf-next v9 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc Juntong Deng
                   ` (3 preceding siblings ...)
  2025-01-27 23:46 ` [PATCH bpf-next v9 4/5] bpf: Make fs kfuncs available for SYSCALL program type Juntong Deng
@ 2025-01-27 23:46 ` Juntong Deng
  4 siblings, 0 replies; 11+ messages in thread
From: Juntong Deng @ 2025-01-27 23:46 UTC (permalink / raw)
  To: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, memxor, snorcht,
	brauner
  Cc: bpf, linux-kernel, linux-fsdevel

This patch adds test cases for bpf_fget_task() kfunc.

test_bpf_fget_task is used to test obtaining struct file based on
the file descriptor in the current process.

bpf_fget_task_null_task and bpf_fget_task_untrusted_task are used to
test the failure cases of passing NULL or untrusted pointer as argument.

Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
---
 .../testing/selftests/bpf/bpf_experimental.h  |  8 +++
 .../selftests/bpf/prog_tests/fs_kfuncs.c      | 46 ++++++++++++++
 .../selftests/bpf/progs/fs_kfuncs_failure.c   | 33 ++++++++++
 .../selftests/bpf/progs/test_fget_task.c      | 63 +++++++++++++++++++
 4 files changed, 150 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/fs_kfuncs_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_fget_task.c

diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index ce1520c56b55..e0c9e7d9ba0a 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -221,6 +221,14 @@ extern void bpf_put_file(struct file *file) __ksym;
  */
 extern int bpf_path_d_path(struct path *path, char *buf, size_t buf__sz) __ksym;
 
+/* Description
+ *	Get a pointer to the struct file corresponding to the task file descriptor
+ *	Note that this function acquires a reference to struct file.
+ * Returns
+ *	The corresponding struct file pointer if found, otherwise returns NULL
+ */
+extern struct file *bpf_fget_task(struct task_struct *task, unsigned int fd) __ksym;
+
 /* This macro must be used to mark the exception callback corresponding to the
  * main program. For example:
  *
diff --git a/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c b/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
index 5a0b51157451..89f5e09672b3 100644
--- a/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
+++ b/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
@@ -9,6 +9,8 @@
 #include <test_progs.h>
 #include "test_get_xattr.skel.h"
 #include "test_fsverity.skel.h"
+#include "test_fget_task.skel.h"
+#include "fs_kfuncs_failure.skel.h"
 
 static const char testfile[] = "/tmp/test_progs_fs_kfuncs";
 
@@ -139,6 +141,45 @@ static void test_fsverity(void)
 	remove(testfile);
 }
 
+static void test_fget_task(void)
+{
+	int pipefd[2], prog_fd, err;
+	struct test_fget_task *skel;
+	struct bpf_program *prog;
+
+	skel = test_fget_task__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		return;
+
+	if (!ASSERT_OK(skel->bss->err, "pre_test_err"))
+		goto cleanup_skel;
+
+	prog = bpf_object__find_program_by_name(skel->obj, "test_bpf_fget_task");
+	if (!ASSERT_OK_PTR(prog, "find_program_by_name"))
+		goto cleanup_skel;
+
+	prog_fd = bpf_program__fd(prog);
+	if (!ASSERT_GT(prog_fd, -1, "bpf_program__fd"))
+		goto cleanup_skel;
+
+	if (pipe(pipefd) < 0)
+		goto cleanup_skel;
+
+	skel->bss->test_fd1 = pipefd[0];
+	skel->bss->test_fd2 = pipefd[1];
+
+	err = bpf_prog_test_run_opts(prog_fd, NULL);
+	if (!ASSERT_OK(err, "prog_test_run"))
+		goto cleanup_pipe;
+
+	ASSERT_OK(skel->bss->err, "run_bpf_fget_task_test_failure");
+cleanup_pipe:
+	close(pipefd[0]);
+	close(pipefd[1]);
+cleanup_skel:
+	test_fget_task__destroy(skel);
+}
+
 void test_fs_kfuncs(void)
 {
 	if (test__start_subtest("xattr"))
@@ -146,4 +187,9 @@ void test_fs_kfuncs(void)
 
 	if (test__start_subtest("fsverity"))
 		test_fsverity();
+
+	if (test__start_subtest("fget_task"))
+		test_fget_task();
+
+	RUN_TESTS(fs_kfuncs_failure);
 }
diff --git a/tools/testing/selftests/bpf/progs/fs_kfuncs_failure.c b/tools/testing/selftests/bpf/progs/fs_kfuncs_failure.c
new file mode 100644
index 000000000000..57aa6d2787ac
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/fs_kfuncs_failure.c
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+
+char _license[] SEC("license") = "GPL";
+
+SEC("syscall")
+__failure __msg("Possibly NULL pointer passed to trusted arg0")
+int bpf_fget_task_null_task(void *ctx)
+{
+	struct task_struct *task = NULL;
+
+	bpf_fget_task(task, 1);
+
+	return 0;
+}
+
+SEC("syscall")
+__failure __msg("R1 must be referenced or trusted")
+int bpf_fget_task_untrusted_task(void *ctx)
+{
+	struct task_struct *task;
+
+	task = bpf_get_current_task_btf()->parent;
+
+	bpf_fget_task(task, 1);
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/test_fget_task.c b/tools/testing/selftests/bpf/progs/test_fget_task.c
new file mode 100644
index 000000000000..fee5d5e1244a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_fget_task.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+#include "task_kfunc_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+int err, test_fd1, test_fd2;
+
+extern const void pipefifo_fops __ksym;
+
+SEC("syscall")
+int test_bpf_fget_task(void *ctx)
+{
+	struct task_struct *task;
+	struct file *file;
+
+	task = bpf_get_current_task_btf();
+	if (task == NULL) {
+		err = 1;
+		return 0;
+	}
+
+	file = bpf_fget_task(task, test_fd1);
+	if (file == NULL) {
+		err = 2;
+		return 0;
+	}
+
+	if (file->f_op != &pipefifo_fops) {
+		err = 3;
+		bpf_put_file(file);
+		return 0;
+	}
+
+	bpf_put_file(file);
+
+	file = bpf_fget_task(task, test_fd2);
+	if (file == NULL) {
+		err = 4;
+		return 0;
+	}
+
+	if (file->f_op != &pipefifo_fops) {
+		err = 5;
+		bpf_put_file(file);
+		return 0;
+	}
+
+	bpf_put_file(file);
+
+	file = bpf_fget_task(task, 9999);
+	if (file != NULL) {
+		err = 6;
+		bpf_put_file(file);
+	}
+
+	return 0;
+}
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs
  2025-01-27 23:46 ` [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs Juntong Deng
@ 2025-01-30  2:35   ` Alexei Starovoitov
  2025-01-30 16:04   ` Christian Brauner
  1 sibling, 0 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2025-01-30  2:35 UTC (permalink / raw)
  To: Juntong Deng
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eddy Z, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Kumar Kartikeya Dwivedi, snorcht, Christian Brauner, bpf, LKML,
	Linux-Fsdevel

On Mon, Jan 27, 2025 at 3:48 PM Juntong Deng <juntong.deng@outlook.com> wrote:
>
> This patch adds the open-coded iterator style process file iterator
> kfuncs bpf_iter_task_file_{new,next,destroy} that iterates over all
> files opened by the specified process.
>
> bpf_iter_task_file_next returns a pointer to bpf_iter_task_file_item,
> which currently contains *task, *file, fd. This is an extensible
> structure that enables compatibility with different versions
> through CO-RE.
>
> The reference to struct file acquired by the previous
> bpf_iter_task_file_next() is released in the next
> bpf_iter_task_file_next(), and the last reference is released in the
> last bpf_iter_task_file_next() that returns NULL.
>
> In the bpf_iter_task_file_destroy(), if the iterator does not iterate to
> the end, then the last struct file reference is released at this time.
>
> Signed-off-by: Juntong Deng <juntong.deng@outlook.com>

All patches look fine,
but we need an ack from Christian for patches 3 and 4 to proceed.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v9 4/5] bpf: Make fs kfuncs available for SYSCALL program type
  2025-01-27 23:46 ` [PATCH bpf-next v9 4/5] bpf: Make fs kfuncs available for SYSCALL program type Juntong Deng
@ 2025-01-30 15:32   ` Christian Brauner
  0 siblings, 0 replies; 11+ messages in thread
From: Christian Brauner @ 2025-01-30 15:32 UTC (permalink / raw)
  To: Juntong Deng
  Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, memxor, snorcht, bpf,
	linux-kernel, linux-fsdevel

On Mon, Jan 27, 2025 at 11:46:53PM +0000, Juntong Deng wrote:
> Currently fs kfuncs are only available for LSM program type, but fs
> kfuncs are general and useful for scenarios other than LSM.
> 
> This patch makes fs kfuncs available for SYSCALL program type.
> 
> Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
> ---

I still have a hard time understanding what syscall program types do and
why we should want to allow the usage of all current fs functions that
were added for LSMs specifically to such program types. I can't say
anything about this until I have a rough understanding what a syscall
bpf program allows you to do and what it's used for. Preferably some
example.

>  fs/bpf_fs_kfuncs.c | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
> index 4a810046dcf3..8a7e9ed371de 100644
> --- a/fs/bpf_fs_kfuncs.c
> +++ b/fs/bpf_fs_kfuncs.c
> @@ -26,8 +26,6 @@ __bpf_kfunc_start_defs();
>   * acquired by this BPF kfunc will result in the BPF program being rejected by
>   * the BPF verifier.
>   *
> - * This BPF kfunc may only be called from BPF LSM programs.
> - *
>   * Internally, this BPF kfunc leans on get_task_exe_file(), such that calling
>   * bpf_get_task_exe_file() would be analogous to calling get_task_exe_file()
>   * directly in kernel context.
> @@ -49,8 +47,6 @@ __bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task)
>   * passed to this BPF kfunc. Attempting to pass an unreferenced file pointer, or
>   * any other arbitrary pointer for that matter, will result in the BPF program
>   * being rejected by the BPF verifier.
> - *
> - * This BPF kfunc may only be called from BPF LSM programs.
>   */
>  __bpf_kfunc void bpf_put_file(struct file *file)
>  {
> @@ -70,8 +66,6 @@ __bpf_kfunc void bpf_put_file(struct file *file)
>   * reference, or else the BPF program will be outright rejected by the BPF
>   * verifier.
>   *
> - * This BPF kfunc may only be called from BPF LSM programs.
> - *
>   * Return: A positive integer corresponding to the length of the resolved
>   * pathname in *buf*, including the NUL termination character. On error, a
>   * negative integer is returned.
> @@ -184,7 +178,8 @@ BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)
>  static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
>  {
>  	if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
> -	    prog->type == BPF_PROG_TYPE_LSM)
> +	    prog->type == BPF_PROG_TYPE_LSM ||
> +	    prog->type == BPF_PROG_TYPE_SYSCALL)
>  		return 0;
>  	return -EACCES;
>  }
> @@ -197,7 +192,10 @@ static const struct btf_kfunc_id_set bpf_fs_kfunc_set = {
>  
>  static int __init bpf_fs_kfuncs_init(void)
>  {
> -	return register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set);
> +	int ret;
> +
> +	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set);
> +	return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &bpf_fs_kfunc_set);
>  }
>  
>  late_initcall(bpf_fs_kfuncs_init);
> -- 
> 2.39.5
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs
  2025-01-27 23:46 ` [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs Juntong Deng
  2025-01-30  2:35   ` Alexei Starovoitov
@ 2025-01-30 16:04   ` Christian Brauner
  2025-01-30 16:35     ` Linus Torvalds
  2025-01-31  5:52     ` Al Viro
  1 sibling, 2 replies; 11+ messages in thread
From: Christian Brauner @ 2025-01-30 16:04 UTC (permalink / raw)
  To: Juntong Deng, Alexander Viro, Linus Torvalds
  Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, memxor, snorcht, bpf,
	linux-kernel, linux-fsdevel

On Mon, Jan 27, 2025 at 11:46:50PM +0000, Juntong Deng wrote:
> This patch adds the open-coded iterator style process file iterator
> kfuncs bpf_iter_task_file_{new,next,destroy} that iterates over all
> files opened by the specified process.
> 
> bpf_iter_task_file_next returns a pointer to bpf_iter_task_file_item,
> which currently contains *task, *file, fd. This is an extensible
> structure that enables compatibility with different versions
> through CO-RE.
> 
> The reference to struct file acquired by the previous
> bpf_iter_task_file_next() is released in the next
> bpf_iter_task_file_next(), and the last reference is released in the
> last bpf_iter_task_file_next() that returns NULL.
> 
> In the bpf_iter_task_file_destroy(), if the iterator does not iterate to
> the end, then the last struct file reference is released at this time.
> 
> Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
> ---

I deeply dislike that this allows bpf programs to iterate through
another tasks files more than what is already possible with the
task_file_seq_* bpf api.

This here means that bpf programs have access to all file types that
exist in the kernel. From general simple filesystem files to pidfds, kvm
fd, epoll fd, drm fds - anything you can think of. And then do arbitrary
and ever expanding stuff with those files from the iterator with less
restrictions (if I'm reading this right) than the task_file_seq_*
iterator.

Possibly even keeping that reference for a long time leading to weird
EBUSY issues for filesystem shutdown and similar problems.

This is a bad idea. Even in the kernel we only allow this type of
iteration for procfs and procfs-like usage and there we hold references
to files from another task for a very short time when we e.g., access
/proc/<PID>/fd/.

And you already have an iterator for that with task_file_seq_get_*()
even if it is more work.

I'm also not at all swayed by the fact that this is coming out of an
effort to move CRIU into bpf just to make things easier. Not a selling
point as we do have CRIU and I don't think we need to now put more CRIU
related stuff into the kernel.

So this will not get an ACK from me. I'm putting Al and Linus here as
well as they might have opinions on this and might disagree with me.

>  kernel/bpf/helpers.c   |  3 ++
>  kernel/bpf/task_iter.c | 90 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 93 insertions(+)
> 
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index f27ce162427a..359c5bbf4814 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -3157,6 +3157,9 @@ BTF_ID_FLAGS(func, bpf_iter_css_destroy, KF_ITER_DESTROY)
>  BTF_ID_FLAGS(func, bpf_iter_task_new, KF_ITER_NEW | KF_TRUSTED_ARGS | KF_RCU_PROTECTED)
>  BTF_ID_FLAGS(func, bpf_iter_task_next, KF_ITER_NEXT | KF_RET_NULL)
>  BTF_ID_FLAGS(func, bpf_iter_task_destroy, KF_ITER_DESTROY)
> +BTF_ID_FLAGS(func, bpf_iter_task_file_new, KF_ITER_NEW | KF_TRUSTED_ARGS)
> +BTF_ID_FLAGS(func, bpf_iter_task_file_next, KF_ITER_NEXT | KF_RET_NULL)
> +BTF_ID_FLAGS(func, bpf_iter_task_file_destroy, KF_ITER_DESTROY)
>  BTF_ID_FLAGS(func, bpf_dynptr_adjust)
>  BTF_ID_FLAGS(func, bpf_dynptr_is_null)
>  BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index 98d9b4c0daff..24a5af67e6c8 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -1027,6 +1027,96 @@ __bpf_kfunc void bpf_iter_task_destroy(struct bpf_iter_task *it)
>  {
>  }
>  
> +struct bpf_iter_task_file_item {
> +	struct task_struct *task;
> +	struct file *file;
> +	unsigned int fd;
> +};
> +
> +struct bpf_iter_task_file {
> +	__u64 __opaque[4];
> +} __aligned(8);
> +
> +struct bpf_iter_task_file_kern {
> +	struct bpf_iter_task_file_item item;
> +	unsigned int next_fd;
> +} __aligned(8);
> +
> +/**
> + * bpf_iter_task_file_new() - Initialize a new task file iterator for a task,
> + * used to iterate over all files opened by a specified task
> + *
> + * @it: the new bpf_iter_task_file to be created
> + * @task: a pointer pointing to the task to be iterated over
> + */
> +__bpf_kfunc int bpf_iter_task_file_new(struct bpf_iter_task_file *it, struct task_struct *task)
> +{
> +	struct bpf_iter_task_file_kern *kit = (void *)it;
> +	struct bpf_iter_task_file_item *item = &kit->item;
> +
> +	BUILD_BUG_ON(sizeof(struct bpf_iter_task_file_kern) > sizeof(struct bpf_iter_task_file));
> +	BUILD_BUG_ON(__alignof__(struct bpf_iter_task_file_kern) !=
> +		     __alignof__(struct bpf_iter_task_file));
> +
> +	item->task = get_task_struct(task);
> +	item->file = NULL;
> +	item->fd = 0;
> +	kit->next_fd = 0;
> +
> +	return 0;
> +}
> +
> +/**
> + * bpf_iter_task_file_next() - Get the next file in bpf_iter_task_file
> + *
> + * bpf_iter_task_file_next acquires a reference to the struct file.
> + *
> + * The reference to struct file acquired by the previous
> + * bpf_iter_task_file_next() is released in the next bpf_iter_task_file_next(),
> + * and the last reference is released in the last bpf_iter_task_file_next()
> + * that returns NULL.
> + *
> + * @it: the bpf_iter_task_file to be checked
> + *
> + * @returns a pointer to bpf_iter_task_file_item
> + */
> +__bpf_kfunc struct bpf_iter_task_file_item *bpf_iter_task_file_next(struct bpf_iter_task_file *it)
> +{
> +	struct bpf_iter_task_file_kern *kit = (void *)it;
> +	struct bpf_iter_task_file_item *item = &kit->item;
> +
> +	if (item->file)
> +		fput(item->file);
> +
> +	item->file = fget_task_next(item->task, &kit->next_fd);
> +	if (!item->file)
> +		return NULL;
> +
> +	item->fd = kit->next_fd;
> +	kit->next_fd++;
> +
> +	return item;
> +}
> +
> +/**
> + * bpf_iter_task_file_destroy() - Destroy a bpf_iter_task_file
> + *
> + * If the iterator does not iterate to the end, then the last
> + * struct file reference is released at this time.
> + *
> + * @it: the bpf_iter_task_file to be destroyed
> + */
> +__bpf_kfunc void bpf_iter_task_file_destroy(struct bpf_iter_task_file *it)
> +{
> +	struct bpf_iter_task_file_kern *kit = (void *)it;
> +	struct bpf_iter_task_file_item *item = &kit->item;
> +
> +	if (item->file)
> +		fput(item->file);
> +
> +	put_task_struct(item->task);
> +}
> +
>  __bpf_kfunc_end_defs();
>  
>  DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work);
> -- 
> 2.39.5
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs
  2025-01-30 16:04   ` Christian Brauner
@ 2025-01-30 16:35     ` Linus Torvalds
  2025-01-31  5:52     ` Al Viro
  1 sibling, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2025-01-30 16:35 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Juntong Deng, Alexander Viro, ast, daniel, john.fastabend, andrii,
	martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
	jolsa, memxor, snorcht, bpf, linux-kernel, linux-fsdevel

On Thu, 30 Jan 2025 at 08:04, Christian Brauner <brauner@kernel.org> wrote:
>
> I deeply dislike that this allows bpf programs to iterate through
> another tasks files more than what is already possible with the
> task_file_seq_* bpf api.

Ack. This needs to just die.

There is no excuse for this, and no, CRIU is absolutely *not* that excuse.

In fact, CRIU is a huge red flag, and has caused endless issues before.

We should absolutely not add special BPF interfaces for CRIU. It only
leads to pain and misery.

           Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs
  2025-01-30 16:04   ` Christian Brauner
  2025-01-30 16:35     ` Linus Torvalds
@ 2025-01-31  5:52     ` Al Viro
  1 sibling, 0 replies; 11+ messages in thread
From: Al Viro @ 2025-01-31  5:52 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Juntong Deng, Linus Torvalds, ast, daniel, john.fastabend, andrii,
	martin.lau, eddyz87, song, yonghong.song, kpsingh, sdf, haoluo,
	jolsa, memxor, snorcht, bpf, linux-kernel, linux-fsdevel

On Thu, Jan 30, 2025 at 05:04:42PM +0100, Christian Brauner wrote:

> I'm also not at all swayed by the fact that this is coming out of an
> effort to move CRIU into bpf just to make things easier. Not a selling
> point as we do have CRIU and I don't think we need to now put more CRIU
> related stuff into the kernel.
> 
> So this will not get an ACK from me. I'm putting Al and Linus here as
> well as they might have opinions on this and might disagree with me.

Strongly seconded.  While we are at it, one thing I really hate about
BPF access to descriptor tables is the idiotic idea of using descriptor
table as private data structure.  Sure, when talking to userland it's
perfectly fine to stash your object into descriptor table and use
descriptors for marshalling that.  Doing that kernel-side is inherently
racy, especially if you end up assuming that underlying object won't
go away until your skel_closenz() - or that it will go away as soon
as that thing is called.  And judging by experience with regular
kernel code, that's an assumption that gets made again and again;
see anon_inode_getfile() callers that used to be anon_inode_getfd() -
a lot of those appeared while whacking those moles...

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-01-31  5:52 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-27 23:44 [PATCH bpf-next v9 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc Juntong Deng
2025-01-27 23:46 ` [PATCH bpf-next v9 1/5] bpf: Introduce task_file open-coded iterator kfuncs Juntong Deng
2025-01-30  2:35   ` Alexei Starovoitov
2025-01-30 16:04   ` Christian Brauner
2025-01-30 16:35     ` Linus Torvalds
2025-01-31  5:52     ` Al Viro
2025-01-27 23:46 ` [PATCH bpf-next v9 2/5] selftests/bpf: Add tests for open-coded style process file iterator Juntong Deng
2025-01-27 23:46 ` [PATCH bpf-next v9 3/5] bpf: Add bpf_fget_task() kfunc Juntong Deng
2025-01-27 23:46 ` [PATCH bpf-next v9 4/5] bpf: Make fs kfuncs available for SYSCALL program type Juntong Deng
2025-01-30 15:32   ` Christian Brauner
2025-01-27 23:46 ` [PATCH bpf-next v9 5/5] selftests/bpf: Add tests for bpf_fget_task() kfunc Juntong Deng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox