linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
@ 2024-03-06  7:39 Matt Bobrowski
  2024-03-06  7:39 ` [PATCH v2 bpf-next 1/9] bpf: rename fs_kfunc_set_ids to lsm_kfunc_set_ids Matt Bobrowski
                   ` (9 more replies)
  0 siblings, 10 replies; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:39 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

G'day All,

The original cover letter providing background context and motivating
factors around the need for the BPF kfuncs introduced within this
patch series can be found here [0], so please do reference that if
need be.

Notably, one of the main contention points within v1 of this patch
series was that we were effectively leaning on some preexisting
in-kernel APIs such as get_task_exe_file() and get_mm_exe_file()
within some of the newly introduced BPF kfuncs. As noted in my
response here [1] though, I struggle to understand the technical
reasoning behind why exposing such in-kernel helpers, specifically
only to BPF LSM program types in the form of BPF kfuncs, is inherently
a terrible idea. So, until someone provides me with a sound technical
explanation as to why this cannot or should not be done, I'll continue
to lean on them. The alternative is to reimplement the necessary
in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO.

Changes since v1:
   * Dropped the probe-read related patches [2, 3], which focused on
     retroactively fixing bpf_d_path() such that its susceptibility
     to memory corruption issues is drastically reduced. It was
     rightfully deemed, though, that reimplementing a semi-functional
     variant of d_path() effectively backed by
     copy_from_kernel_nofault() is suboptimal.

[0] https://lore.kernel.org/bpf/cover.1708377880.git.mattbobrowski@google.com/
[1] https://lore.kernel.org/bpf/ZdX83H7rTEwMYvs2@google.com/
[2] https://lore.kernel.org/bpf/5643840bd57d0c2345635552ae228dfb2ed3428c.1708377880.git.mattbobrowski@google.com/
[3] https://lore.kernel.org/bpf/18c7b587d43bbc7e80593bf51ea9d3eb99e47bc1.1708377880.git.mattbobrowski@google.com/

Matt Bobrowski (9):
  bpf: rename fs_kfunc_set_ids to lsm_kfunc_set_ids
  bpf: add new acquire/release BPF kfuncs for mm_struct
  bpf/selftests: add selftests for mm_struct acquire/release BPF kfuncs
  bpf: add new acquire/release based BPF kfuncs for exe_file
  bpf/selftests: add selftests for exe_file acquire/release BPF kfuncs
  bpf: add acquire/release based BPF kfuncs for fs_struct's paths
  bpf/selftests: add selftests for root/pwd path based BPF kfuncs
  bpf: add trusted d_path() based BPF kfunc bpf_path_d_path()
  bpf/selftests: adapt selftests test_d_path for BPF kfunc
    bpf_path_d_path()

 kernel/trace/bpf_trace.c                      | 248 +++++++++++++++++-
 .../testing/selftests/bpf/prog_tests/d_path.c |  80 ++++++
 .../selftests/bpf/prog_tests/exe_file_kfunc.c |  49 ++++
 .../selftests/bpf/prog_tests/mm_kfunc.c       |  48 ++++
 .../selftests/bpf/prog_tests/path_kfunc.c     |  48 ++++
 .../selftests/bpf/progs/d_path_common.h       |  35 +++
 .../bpf/progs/d_path_kfunc_failure.c          |  66 +++++
 .../bpf/progs/d_path_kfunc_success.c          |  25 ++
 .../bpf/progs/exe_file_kfunc_common.h         |  23 ++
 .../bpf/progs/exe_file_kfunc_failure.c        | 181 +++++++++++++
 .../bpf/progs/exe_file_kfunc_success.c        |  52 ++++
 .../selftests/bpf/progs/mm_kfunc_common.h     |  19 ++
 .../selftests/bpf/progs/mm_kfunc_failure.c    | 103 ++++++++
 .../selftests/bpf/progs/mm_kfunc_success.c    |  30 +++
 .../selftests/bpf/progs/path_kfunc_common.h   |  20 ++
 .../selftests/bpf/progs/path_kfunc_failure.c  | 114 ++++++++
 .../selftests/bpf/progs/path_kfunc_success.c  |  30 +++
 .../testing/selftests/bpf/progs/test_d_path.c |  20 +-
 .../bpf/progs/test_d_path_check_rdonly_mem.c  |   8 +-
 .../bpf/progs/test_d_path_check_types.c       |   8 +-
 20 files changed, 1160 insertions(+), 47 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/exe_file_kfunc.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/mm_kfunc.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/path_kfunc.c
 create mode 100644 tools/testing/selftests/bpf/progs/d_path_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/d_path_kfunc_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/d_path_kfunc_success.c
 create mode 100644 tools/testing/selftests/bpf/progs/exe_file_kfunc_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/exe_file_kfunc_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/exe_file_kfunc_success.c
 create mode 100644 tools/testing/selftests/bpf/progs/mm_kfunc_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/mm_kfunc_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/mm_kfunc_success.c
 create mode 100644 tools/testing/selftests/bpf/progs/path_kfunc_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/path_kfunc_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/path_kfunc_success.c

-- 
2.44.0.278.ge034bb2e1d-goog

/M

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 bpf-next 1/9] bpf: rename fs_kfunc_set_ids to lsm_kfunc_set_ids
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
@ 2024-03-06  7:39 ` Matt Bobrowski
  2024-03-06  7:39 ` [PATCH v2 bpf-next 2/9] bpf: add new acquire/release BPF kfuncs for mm_struct Matt Bobrowski
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:39 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

fs_kfunc_set_ids is rather specific to a single BPF kfunc at the
moment. Rename it to something a little more generic such that other
future BPF kfuncs that are also restricted to BPF LSM program types
can reside in the same btf_kfunc_id_set and make use of the same
btf_kfunc_filter_t.

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
---
 kernel/trace/bpf_trace.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 241ddf5e3895..f639663ac339 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1435,7 +1435,7 @@ static int __init bpf_key_sig_kfuncs_init(void)
 late_initcall(bpf_key_sig_kfuncs_init);
 #endif /* CONFIG_KEYS */
 
-/* filesystem kfuncs */
+/* A set of kfuncs that may only be called from BPF LSM program types. */
 __bpf_kfunc_start_defs();
 
 /**
@@ -1475,31 +1475,33 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str,
 
 __bpf_kfunc_end_defs();
 
-BTF_KFUNCS_START(fs_kfunc_set_ids)
+BTF_KFUNCS_START(lsm_kfunc_set_ids)
 BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
-BTF_KFUNCS_END(fs_kfunc_set_ids)
+BTF_KFUNCS_END(lsm_kfunc_set_ids)
 
-static int bpf_get_file_xattr_filter(const struct bpf_prog *prog, u32 kfunc_id)
+static int bpf_lsm_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
 {
-	if (!btf_id_set8_contains(&fs_kfunc_set_ids, kfunc_id))
+	if (!btf_id_set8_contains(&lsm_kfunc_set_ids, kfunc_id))
 		return 0;
 
-	/* Only allow to attach from LSM hooks, to avoid recursion */
+	/* To avoid recursion, only permit kfuncs included within
+	 * lsm_kfunc_set_ids to be called from BPF LSM program types.
+	 */
 	return prog->type != BPF_PROG_TYPE_LSM ? -EACCES : 0;
 }
 
-static const struct btf_kfunc_id_set bpf_fs_kfunc_set = {
+static const struct btf_kfunc_id_set bpf_lsm_kfunc_set = {
 	.owner = THIS_MODULE,
-	.set = &fs_kfunc_set_ids,
-	.filter = bpf_get_file_xattr_filter,
+	.set = &lsm_kfunc_set_ids,
+	.filter = bpf_lsm_kfunc_filter,
 };
 
-static int __init bpf_fs_kfuncs_init(void)
+static int __init bpf_lsm_kfuncs_init(void)
 {
-	return register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set);
+	return register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_lsm_kfunc_set);
 }
 
-late_initcall(bpf_fs_kfuncs_init);
+late_initcall(bpf_lsm_kfuncs_init);
 
 static const struct bpf_func_proto *
 bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
-- 
2.44.0.278.ge034bb2e1d-goog

/M


* [PATCH v2 bpf-next 2/9] bpf: add new acquire/release BPF kfuncs for mm_struct
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
  2024-03-06  7:39 ` [PATCH v2 bpf-next 1/9] bpf: rename fs_kfunc_set_ids to lsm_kfunc_set_ids Matt Bobrowski
@ 2024-03-06  7:39 ` Matt Bobrowski
  2024-03-06 11:50   ` Christian Brauner
  2024-03-06  7:39 ` [PATCH v2 bpf-next 3/9] bpf/selftests: add selftests for mm_struct acquire/release BPF kfuncs Matt Bobrowski
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:39 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

A BPF LSM program will at times introspect the mm_struct that is
nested within a given task_struct. Such introspection performed by a
BPF LSM program may involve reading virtual addresses out from fields
like arg_start/arg_end and env_start/env_end, or reading fields
directly out from the backing exe_file. In order to perform reliable
reads against fields contained within mm_struct, we need to introduce
a new set of BPF kfuncs that have the ability to acquire and release
references on the mm_struct that is nested within a task_struct.

The following BPF kfuncs have been added in order to support this
capability:

struct mm_struct *bpf_task_mm_grab(struct task_struct *task);
void bpf_mm_drop(struct mm_struct *mm);

These new BPF kfuncs are pretty self-explanatory, but in kernel terms
bpf_task_mm_grab() effectively allows you to take a reference on the
mm_struct nested within a supplied task_struct, whereas bpf_mm_drop()
allows you to put a reference on a previously taken mm_struct
reference. Both BPF kfuncs are also backed by BPF's respective
KF_ACQUIRE/KF_RELEASE semantics, ensuring that a BPF program behaves
in accordance with the constraints enforced upon it when operating on
reference-counted in-kernel data structures.

Notably, these newly added BPF kfuncs are simple wrappers around the
mmgrab() and mmdrop() in-kernel helpers. Both mmgrab() and mmdrop()
are used in favour of their somewhat similar counterparts mmget() and
mmput() as they're considered to be the more lightweight variants in
comparison, and there's no requirement to also pin the underlying
address spaces just yet.

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
---
 kernel/trace/bpf_trace.c | 47 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index f639663ac339..801808b6efb0 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1473,10 +1473,57 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str,
 	return __vfs_getxattr(dentry, dentry->d_inode, name__str, value, value_len);
 }
 
+/**
+ * bpf_task_mm_grab - get a reference on the mm_struct nested within the
+ * 		      supplied task_struct
+ * @task: task_struct nesting the mm_struct that is to be referenced
+ *
+ * Grab a reference on the mm_struct that is nested within the supplied
+ * *task*. This kfunc will return NULL for threads that do not possess a valid
+ * mm_struct, e.g. those that are flagged as PF_KTHREAD. A reference on
+ * a mm_struct acquired by this kfunc must be released using bpf_mm_drop().
+ *
+ * This helper only pins the mm_struct and not necessarily the address space
+ * associated with the referenced mm_struct that is returned from this
+ * kfunc. Internally, this kfunc leans on mmgrab(), such that calling
+ * bpf_task_mm_grab() would be analogous to calling mmgrab() outside of BPF
+ * program context.
+ *
+ * Return: A referenced pointer to the mm_struct nested within the supplied
+ * *task*, or NULL.
+ */
+__bpf_kfunc struct mm_struct *bpf_task_mm_grab(struct task_struct *task)
+{
+	struct mm_struct *mm;
+
+	task_lock(task);
+	mm = task->mm;
+	if (likely(mm))
+		mmgrab(mm);
+	task_unlock(task);
+
+	return mm;
+}
+
+/**
+ * bpf_mm_drop - put a reference on the supplied mm_struct
+ * @mm: mm_struct to put a reference on
+ *
+ * Put a reference on the supplied *mm*. This kfunc internally leans on
+ * mmdrop(), such that calling bpf_mm_drop() would be analogous to calling
+ * mmdrop() outside of BPF program context.
+ */
+__bpf_kfunc void bpf_mm_drop(struct mm_struct *mm)
+{
+	mmdrop(mm);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(lsm_kfunc_set_ids)
 BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_task_mm_grab, KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_mm_drop, KF_RELEASE)
 BTF_KFUNCS_END(lsm_kfunc_set_ids)
 
 static int bpf_lsm_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
-- 
2.44.0.278.ge034bb2e1d-goog

/M


* [PATCH v2 bpf-next 3/9] bpf/selftests: add selftests for mm_struct acquire/release BPF kfuncs
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
  2024-03-06  7:39 ` [PATCH v2 bpf-next 1/9] bpf: rename fs_kfunc_set_ids to lsm_kfunc_set_ids Matt Bobrowski
  2024-03-06  7:39 ` [PATCH v2 bpf-next 2/9] bpf: add new acquire/release BPF kfuncs for mm_struct Matt Bobrowski
@ 2024-03-06  7:39 ` Matt Bobrowski
  2024-03-06  7:40 ` [PATCH v2 bpf-next 4/9] bpf: add new acquire/release based BPF kfuncs for exe_file Matt Bobrowski
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:39 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

Add a new mm_kfunc test suite that is responsible for verifying the
behaviour of the newly added mm_struct based BPF kfuncs. As of now,
these selftests cover the operability of the following:

struct mm_struct *bpf_task_mm_grab(struct task_struct *task);
void bpf_mm_drop(struct mm_struct *mm);

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
---
 .../selftests/bpf/prog_tests/mm_kfunc.c       |  48 ++++++++
 .../selftests/bpf/progs/mm_kfunc_common.h     |  19 ++++
 .../selftests/bpf/progs/mm_kfunc_failure.c    | 103 ++++++++++++++++++
 .../selftests/bpf/progs/mm_kfunc_success.c    |  30 +++++
 4 files changed, 200 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/mm_kfunc.c
 create mode 100644 tools/testing/selftests/bpf/progs/mm_kfunc_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/mm_kfunc_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/mm_kfunc_success.c

diff --git a/tools/testing/selftests/bpf/prog_tests/mm_kfunc.c b/tools/testing/selftests/bpf/prog_tests/mm_kfunc.c
new file mode 100644
index 000000000000..aece5c25486d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/mm_kfunc.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#define _GNU_SOURCE
+#include <test_progs.h>
+
+#include "mm_kfunc_failure.skel.h"
+#include "mm_kfunc_success.skel.h"
+
+static void run_test(const char *prog_name)
+{
+	struct bpf_link *link;
+	struct bpf_program *prog;
+	struct mm_kfunc_success *skel;
+
+	skel = mm_kfunc_success__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "mm_kfunc_success__open_and_load"))
+		return;
+
+	link = NULL;
+	prog = bpf_object__find_program_by_name(skel->obj, prog_name);
+	if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+		goto cleanup;
+
+	link = bpf_program__attach(prog);
+	ASSERT_OK_PTR(link, "bpf_program__attach");
+cleanup:
+	bpf_link__destroy(link);
+	mm_kfunc_success__destroy(skel);
+}
+
+static const char * const success_tests[] = {
+	"task_mm_grab_drop_from_argument",
+	"task_mm_acquire_release_from_current",
+};
+
+void test_mm_kfunc(void)
+{
+	int i = 0;
+
+	for (; i < ARRAY_SIZE(success_tests); i++) {
+		if (!test__start_subtest(success_tests[i]))
+			continue;
+		run_test(success_tests[i]);
+	}
+
+	RUN_TESTS(mm_kfunc_failure);
+}
diff --git a/tools/testing/selftests/bpf/progs/mm_kfunc_common.h b/tools/testing/selftests/bpf/progs/mm_kfunc_common.h
new file mode 100644
index 000000000000..043d74d4148b
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mm_kfunc_common.h
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#ifndef _MM_KFUNC_COMMON_H
+#define _MM_KFUNC_COMMON_H
+
+#include <vmlinux.h>
+#include <errno.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#include "bpf_misc.h"
+
+struct mm_struct *bpf_task_mm_grab(struct task_struct *task) __ksym;
+void bpf_mm_drop(struct mm_struct *mm) __ksym;
+
+char _license[] SEC("license") = "GPL";
+
+#endif /* _MM_KFUNC_COMMON_H */
diff --git a/tools/testing/selftests/bpf/progs/mm_kfunc_failure.c b/tools/testing/selftests/bpf/progs/mm_kfunc_failure.c
new file mode 100644
index 000000000000..d818dfcab20e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mm_kfunc_failure.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#include "mm_kfunc_common.h"
+
+SEC("lsm.s/file_open")
+__failure __msg("Possibly NULL pointer passed to trusted arg0")
+int BPF_PROG(task_mm_grab_null_kfunc)
+{
+	struct mm_struct *acquired;
+
+	/* Can't pass a NULL pointer to bpf_task_mm_grab(). */
+	acquired = bpf_task_mm_grab(NULL);
+	if (!acquired)
+		return 0;
+	bpf_mm_drop(acquired);
+
+	return 0;
+}
+
+SEC("lsm/task_free")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(task_mm_grab_from_lsm_task_free_kfunc, struct task_struct *task)
+{
+	struct mm_struct *acquired;
+
+	/* The task_struct supplied to this LSM hook isn't trusted. */
+	acquired = bpf_task_mm_grab(task);
+	if (!acquired)
+		return 0;
+	bpf_mm_drop(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/task_alloc")
+__failure __msg("arg#0 pointer type STRUCT task_struct must point")
+int BPF_PROG(task_mm_grab_fp_kfunc, struct task_struct *task, u64 clone_flags)
+{
+	struct task_struct *fp;
+	struct mm_struct *acquired;
+
+	fp = (struct task_struct *)&clone_flags;
+	/* Can't pass random frame pointer to bpf_task_mm_grab(). */
+	acquired = bpf_task_mm_grab(fp);
+	if (!acquired)
+		return 0;
+	bpf_mm_drop(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/task_alloc")
+__failure __msg("Unreleased reference")
+int BPF_PROG(task_mm_grab_unreleased_kfunc, struct task_struct *task)
+{
+	struct mm_struct *acquired;
+
+	acquired = bpf_task_mm_grab(task);
+	__sink(acquired);
+
+	/* Acquired but never released. */
+	return 0;
+}
+
+SEC("lsm.s/task_alloc")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(task_mm_drop_untrusted_kfunc, struct task_struct *task)
+{
+	struct mm_struct *acquired;
+
+	/* task->mm from struct task_struct yields an untrusted pointer. */
+	acquired = task->mm;
+	if (!acquired)
+		return 0;
+	bpf_mm_drop(acquired);
+
+	return 0;
+}
+
+SEC("lsm/vm_enough_memory")
+__failure __msg("release kernel function bpf_mm_drop expects")
+int BPF_PROG(mm_drop_unacquired_kfunc, struct mm_struct *mm)
+{
+	/* Can't release an unacquired pointer. */
+	bpf_mm_drop(mm);
+
+	return 0;
+}
+
+SEC("lsm/vm_enough_memory")
+__failure __msg("arg#0 pointer type STRUCT mm_struct must point")
+int BPF_PROG(mm_drop_fp_kfunc, struct mm_struct *mm, long pages)
+{
+	struct mm_struct *fp;
+
+	fp = (struct mm_struct *)&pages;
+
+	/* Can't release random frame pointer. */
+	bpf_mm_drop(fp);
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/mm_kfunc_success.c b/tools/testing/selftests/bpf/progs/mm_kfunc_success.c
new file mode 100644
index 000000000000..5400abd2ee2d
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mm_kfunc_success.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#include "mm_kfunc_common.h"
+
+SEC("lsm.s/task_alloc")
+int BPF_PROG(task_mm_grab_drop_from_argument, struct task_struct *task)
+{
+	struct mm_struct *acquired;
+
+	acquired = bpf_task_mm_grab(task);
+	if (!acquired)
+		return 0;
+	bpf_mm_drop(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+int BPF_PROG(task_mm_acquire_release_from_current)
+{
+	struct mm_struct *acquired;
+
+	acquired = bpf_task_mm_grab(bpf_get_current_task_btf());
+	if (!acquired)
+		return 0;
+	bpf_mm_drop(acquired);
+
+	return 0;
+}
-- 
2.44.0.278.ge034bb2e1d-goog

/M


* [PATCH v2 bpf-next 4/9] bpf: add new acquire/release based BPF kfuncs for exe_file
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
                   ` (2 preceding siblings ...)
  2024-03-06  7:39 ` [PATCH v2 bpf-next 3/9] bpf/selftests: add selftests for mm_struct acquire/release BPF kfuncs Matt Bobrowski
@ 2024-03-06  7:40 ` Matt Bobrowski
  2024-03-06 11:31   ` Christian Brauner
  2024-03-06  7:40 ` [PATCH v2 bpf-next 5/9] bpf/selftests: add selftests for exe_file acquire/release BPF kfuncs Matt Bobrowski
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:40 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

It is rather common for BPF LSM program types to perform the struct
walk current->mm->exe_file and subsequently operate on fields of the
backing file. At times, some of these operations involve passing a
exe_file's field on to BPF helpers and such
i.e. bpf_d_path(&current->mm->exe_file->f_path). However, doing so
isn't necessarily always reliable as the backing file that exe_file is
pointing to may be in the midst of being torn down and handing
anything contained within this file to BPF helpers and such can lead
to memory corruption issues [0].

To alleviate possibly operating on semi-torn-down instances of
current->mm->exe_file, we introduce a set of BPF kfuncs that possess
KF_ACQUIRE/KF_RELEASE semantics. Such BPF kfuncs will allow BPF
LSM program types to reliably get/put a reference on
current->mm->exe_file.

The following new BPF kfuncs have been added:

struct file *bpf_get_task_exe_file(struct task_struct *task);
struct file *bpf_get_mm_exe_file(struct mm_struct *mm);
void bpf_put_file(struct file *f);

Internally, these new BPF kfuncs simply call the preexisting in-kernel
functions get_task_exe_file(), get_mm_exe_file(), and fput()
accordingly. From a technical standpoint, there's absolutely no need
to re-implement such helpers just for BPF as they're currently scoped
to BPF LSM program types.

Note that we deliberately do not rely on the use of very low-level
in-kernel functions like get_file_rcu() and get_file_active() to
acquire a reference on current->mm->exe_file and such. This is super
subtle code, and we probably want to avoid exposing any such subtleties
to BPF in the form of BPF kfuncs. Additionally, the usage of a double
pointer, i.e. struct file **, isn't something that the BPF verifier
currently recognizes, nor has any intention of recognizing for the
foreseeable future.

[0] https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@mail.gmail.com/

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
---
 kernel/trace/bpf_trace.c | 56 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 801808b6efb0..539c58db74d7 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1518,12 +1518,68 @@ __bpf_kfunc void bpf_mm_drop(struct mm_struct *mm)
 	mmdrop(mm);
 }
 
+/**
+ * bpf_get_task_exe_file - get a reference on the exe_file associated with the
+ * 	       		   mm_struct that is nested within the supplied
+ * 	       		   task_struct
+ * @task: task_struct of which the nested mm_struct's exe_file is to be
+ * referenced
+ *
+ * Get a reference on the exe_file that is associated with the mm_struct nested
+ * within the supplied *task*. A reference on a file pointer acquired by this
+ * kfunc must be released using bpf_put_file(). Internally, this kfunc leans on
+ * get_task_exe_file(), such that calling bpf_get_task_exe_file() would be
+ * analogous to calling get_task_exe_file() outside of BPF program context.
+ *
+ * Return: A referenced pointer to the exe_file associated with the mm_struct
+ * nested in the supplied *task*, or NULL.
+ */
+__bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task)
+{
+	return get_task_exe_file(task);
+}
+
+/**
+ * bpf_get_mm_exe_file - get a reference on the exe_file for the supplied
+ * 			 mm_struct.
+ * @mm: mm_struct whose exe_file is to be referenced
+ *
+ * Get a reference on the exe_file associated with the supplied *mm*. A
+ * reference on a file pointer acquired by this kfunc must be released using
+ * bpf_put_file(). Internally, this kfunc leans on get_mm_exe_file(), such that
+ * calling bpf_get_mm_exe_file() would be analogous to calling get_mm_exe_file()
+ * outside of BPF program context.
+ *
+ * Return: A referenced file pointer to the exe_file for the supplied *mm*, or
+ * NULL.
+ */
+__bpf_kfunc struct file *bpf_get_mm_exe_file(struct mm_struct *mm)
+{
+	return get_mm_exe_file(mm);
+}
+
+/**
+ * bpf_put_file - put a reference on the supplied file
+ * @f: file to put a reference on
+ *
+ * Put a reference on the supplied *f*.
+ */
+__bpf_kfunc void bpf_put_file(struct file *f)
+{
+	fput(f);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(lsm_kfunc_set_ids)
 BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_task_mm_grab, KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_mm_drop, KF_RELEASE)
+BTF_ID_FLAGS(func, bpf_get_task_exe_file,
+	     KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_get_mm_exe_file,
+	     KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_put_file, KF_RELEASE | KF_SLEEPABLE)
 BTF_KFUNCS_END(lsm_kfunc_set_ids)
 
 static int bpf_lsm_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
-- 
2.44.0.278.ge034bb2e1d-goog

/M


* [PATCH v2 bpf-next 5/9] bpf/selftests: add selftests for exe_file acquire/release BPF kfuncs
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
                   ` (3 preceding siblings ...)
  2024-03-06  7:40 ` [PATCH v2 bpf-next 4/9] bpf: add new acquire/release based BPF kfuncs for exe_file Matt Bobrowski
@ 2024-03-06  7:40 ` Matt Bobrowski
  2024-03-06  7:40 ` [PATCH v2 bpf-next 6/9] bpf: add acquire/release based BPF kfuncs for fs_struct's paths Matt Bobrowski
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:40 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

Add a new exe_file_kfunc test suite that is responsible for verifying
the behaviour of the newly added exe_file based BPF kfuncs.

For now, this new exe_file_kfunc test suite covers the following BPF
kfuncs:

struct file *bpf_get_task_exe_file(struct task_struct *task);
struct file *bpf_get_mm_exe_file(struct mm_struct *mm);
void bpf_put_file(struct file *f);

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
---
 .../selftests/bpf/prog_tests/exe_file_kfunc.c |  49 +++++
 .../bpf/progs/exe_file_kfunc_common.h         |  23 +++
 .../bpf/progs/exe_file_kfunc_failure.c        | 181 ++++++++++++++++++
 .../bpf/progs/exe_file_kfunc_success.c        |  52 +++++
 4 files changed, 305 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/exe_file_kfunc.c
 create mode 100644 tools/testing/selftests/bpf/progs/exe_file_kfunc_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/exe_file_kfunc_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/exe_file_kfunc_success.c

diff --git a/tools/testing/selftests/bpf/prog_tests/exe_file_kfunc.c b/tools/testing/selftests/bpf/prog_tests/exe_file_kfunc.c
new file mode 100644
index 000000000000..5900c1d4e820
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/exe_file_kfunc.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#define _GNU_SOURCE
+#include <test_progs.h>
+
+#include "exe_file_kfunc_failure.skel.h"
+#include "exe_file_kfunc_success.skel.h"
+
+static void run_test(const char *prog_name)
+{
+	struct bpf_link *link;
+	struct bpf_program *prog;
+	struct exe_file_kfunc_success *skel;
+
+	skel = exe_file_kfunc_success__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "file_kfunc_success__open_and_load"))
+		return;
+
+	link = NULL;
+	prog = bpf_object__find_program_by_name(skel->obj, prog_name);
+	if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+		goto cleanup;
+
+	link = bpf_program__attach(prog);
+	ASSERT_OK_PTR(link, "bpf_program__attach");
+cleanup:
+	bpf_link__destroy(link);
+	exe_file_kfunc_success__destroy(skel);
+}
+
+static const char * const success_tests[] = {
+	"get_task_exe_file_and_put_kfunc_from_current",
+	"get_task_exe_file_and_put_kfunc_from_argument",
+	"get_mm_exe_file_and_put_kfunc_from_current",
+};
+
+void test_exe_file_kfunc(void)
+{
+	int i = 0;
+
+	for (; i < ARRAY_SIZE(success_tests); i++) {
+		if (!test__start_subtest(success_tests[i]))
+			continue;
+		run_test(success_tests[i]);
+	}
+
+	RUN_TESTS(exe_file_kfunc_failure);
+}
diff --git a/tools/testing/selftests/bpf/progs/exe_file_kfunc_common.h b/tools/testing/selftests/bpf/progs/exe_file_kfunc_common.h
new file mode 100644
index 000000000000..6623bcc130c3
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/exe_file_kfunc_common.h
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#ifndef _FILE_KFUNC_COMMON_H
+#define _FILE_KFUNC_COMMON_H
+
+#include <vmlinux.h>
+#include <errno.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#include "bpf_misc.h"
+
+struct mm_struct *bpf_task_mm_grab(struct task_struct *task) __ksym;
+void bpf_mm_drop(struct mm_struct *mm) __ksym;
+
+struct file *bpf_get_task_exe_file(struct task_struct *task) __ksym;
+struct file *bpf_get_mm_exe_file(struct mm_struct *mm) __ksym;
+void bpf_put_file(struct file *f) __ksym;
+
+char _license[] SEC("license") = "GPL";
+
+#endif /* _FILE_KFUNC_COMMON_H */
diff --git a/tools/testing/selftests/bpf/progs/exe_file_kfunc_failure.c b/tools/testing/selftests/bpf/progs/exe_file_kfunc_failure.c
new file mode 100644
index 000000000000..8a4464481531
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/exe_file_kfunc_failure.c
@@ -0,0 +1,181 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#include "exe_file_kfunc_common.h"
+
+SEC("lsm.s/file_open")
+__failure __msg("Possibly NULL pointer passed to trusted arg0")
+int BPF_PROG(get_task_exe_file_kfunc_null)
+{
+	struct file *acquired;
+
+	/* Can't pass a NULL pointer to bpf_get_task_exe_file(). */
+	acquired = bpf_get_task_exe_file(NULL);
+	bpf_put_file(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("Possibly NULL pointer passed to trusted arg0")
+int BPF_PROG(get_mm_exe_file_kfunc_null)
+{
+	struct file *acquired;
+
+	/* Can't pass a NULL pointer to bpf_get_mm_exe_file(). */
+	acquired = bpf_get_mm_exe_file(NULL);
+	bpf_put_file(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/inode_getxattr")
+__failure __msg("arg#0 pointer type STRUCT task_struct must point to scalar, or struct with scalar")
+int BPF_PROG(get_task_exe_file_kfunc_fp)
+{
+	u64 x;
+	struct file *acquired;
+	struct task_struct *fp;
+
+	fp = (struct task_struct *)&x;
+	/* Can't pass random frame pointer to bpf_get_task_exe_file(). */
+	acquired = bpf_get_task_exe_file(fp);
+	bpf_put_file(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/inode_getxattr")
+__failure __msg("arg#0 pointer type STRUCT mm_struct must point to scalar, or struct with scalar")
+int BPF_PROG(get_mm_exe_file_kfunc_fp)
+{
+	int x;
+	struct file *acquired;
+	struct mm_struct *fp;
+
+	fp = (struct mm_struct *)&x;
+	/* Can't pass random frame pointer to bpf_get_mm_exe_file(). */
+	acquired = bpf_get_mm_exe_file(fp);
+	if (!acquired)
+		return 0;
+	bpf_put_file(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(get_task_exe_file_kfunc_untrusted)
+{
+	struct file *acquired;
+	struct task_struct *parent;
+
+	/* Walking a trusted struct task_struct returned from
+	 * bpf_get_current_task_btf() yields an untrusted pointer. */
+	parent = bpf_get_current_task_btf()->parent;
+	/* Can't pass untrusted pointer to bpf_get_task_exe_file(). */
+	acquired = bpf_get_task_exe_file(parent);
+	if (!acquired)
+		return 0;
+	bpf_put_file(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(get_mm_exe_file_kfunc_untrusted)
+{
+	struct file *acquired;
+	struct mm_struct *mm;
+
+	/* Walking a struct task_struct obtained from bpf_get_current_task_btf()
+	 * yields an untrusted pointer. */
+	mm = bpf_get_current_task_btf()->mm;
+	/* Can't pass untrusted pointer to bpf_get_mm_exe_file(). */
+	acquired = bpf_get_mm_exe_file(mm);
+	if (!acquired)
+		return 0;
+	bpf_put_file(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("Unreleased reference")
+int BPF_PROG(get_task_exe_file_kfunc_unreleased)
+{
+	struct file *acquired;
+
+	acquired = bpf_get_task_exe_file(bpf_get_current_task_btf());
+	if (!acquired)
+		return 0;
+	__sink(acquired);
+
+	/* Acquired but never released. */
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("Unreleased reference")
+int BPF_PROG(get_mm_exe_file_kfunc_unreleased)
+{
+	struct file *acquired;
+	struct mm_struct *mm;
+
+	mm = bpf_task_mm_grab(bpf_get_current_task_btf());
+	if (!mm)
+		return 0;
+
+	acquired = bpf_get_mm_exe_file(mm);
+	if (!acquired) {
+		bpf_mm_drop(mm);
+		return 0;
+	}
+	__sink(acquired);
+	bpf_mm_drop(mm);
+
+	/* Acquired but never released. */
+	return 0;
+}
+
+SEC("lsm/file_open")
+__failure __msg("program must be sleepable to call sleepable kfunc bpf_put_file")
+int BPF_PROG(put_file_kfunc_not_sleepable, struct file *f)
+{
+	struct file *acquired;
+
+	acquired = bpf_get_task_exe_file(bpf_get_current_task_btf());
+	if (!acquired)
+		return 0;
+
+	/* Can't call bpf_put_file() from non-sleepable BPF program. */
+	bpf_put_file(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("release kernel function bpf_put_file expects")
+int BPF_PROG(put_file_kfunc_unacquired, struct file *f)
+{
+	/* Can't release an unacquired pointer. */
+	bpf_put_file(f);
+
+	return 0;
+}
+
+SEC("tp_btf/task_newtask")
+__failure __msg("calling kernel function bpf_get_task_exe_file is not allowed")
+int BPF_PROG(get_task_exe_file_kfunc_not_lsm_prog, struct task_struct *task)
+{
+	struct file *acquired;
+
+	/* bpf_get_task_exe_file() can only be called from a BPF LSM program. */
+	acquired = bpf_get_task_exe_file(bpf_get_current_task_btf());
+	if (!acquired)
+		return 0;
+	bpf_put_file(acquired);
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/exe_file_kfunc_success.c b/tools/testing/selftests/bpf/progs/exe_file_kfunc_success.c
new file mode 100644
index 000000000000..ae789cb0a9ae
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/exe_file_kfunc_success.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#include "exe_file_kfunc_common.h"
+
+SEC("lsm.s/file_open")
+int BPF_PROG(get_task_exe_file_and_put_kfunc_from_current)
+{
+	struct file *acquired;
+
+	acquired = bpf_get_task_exe_file(bpf_get_current_task_btf());
+	if (!acquired)
+		return 0;
+	bpf_put_file(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/task_alloc")
+int BPF_PROG(get_task_exe_file_and_put_kfunc_from_argument,
+	     struct task_struct *task)
+{
+	struct file *acquired;
+
+	acquired = bpf_get_task_exe_file(task);
+	if (!acquired)
+		return 0;
+	bpf_put_file(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+int BPF_PROG(get_mm_exe_file_and_put_kfunc_from_current)
+{
+	struct file *acquired;
+	struct mm_struct *mm;
+
+	mm = bpf_task_mm_grab(bpf_get_current_task_btf());
+	if (!mm)
+		return 0;
+
+	acquired = bpf_get_mm_exe_file(mm);
+	if (!acquired) {
+		bpf_mm_drop(mm);
+		return 0;
+	}
+	bpf_put_file(acquired);
+	bpf_mm_drop(mm);
+
+	return 0;
+}
-- 
2.44.0.278.ge034bb2e1d-goog

/M

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 bpf-next 6/9] bpf: add acquire/release based BPF kfuncs for fs_struct's paths
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
                   ` (4 preceding siblings ...)
  2024-03-06  7:40 ` [PATCH v2 bpf-next 5/9] bpf/selftests: add selftests for exe_file acquire/release BPF kfuncs Matt Bobrowski
@ 2024-03-06  7:40 ` Matt Bobrowski
  2024-03-06 11:47   ` Christian Brauner
  2024-03-06  7:40 ` [PATCH v2 bpf-next 7/9] bpf/selftests: add selftests for root/pwd path based BPF kfuncs Matt Bobrowski
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:40 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

Add the ability to obtain a reference on the root and pwd paths which
are nested within the fs_struct associated with a supplied
task_struct. Both fs_struct's root and pwd are commonly operated on in
BPF LSM program types and at times are further handed off to BPF
helpers and such. There needs to be a mechanism that provides BPF LSM
program types with the ability to obtain stable handles to such paths
in order to avoid possible memory corruption bugs [0].

We provide this mechanism through the introduction of the following
new KF_ACQUIRE/KF_RELEASE BPF kfuncs:

struct path *bpf_get_task_fs_root(struct task_struct *task);
struct path *bpf_get_task_fs_pwd(struct task_struct *task);
void bpf_put_path(struct path *path);

Note that bpf_get_task_fs_root() and bpf_get_task_fs_pwd() are
effectively open-coded variants of the in-kernel helpers get_fs_root()
and get_fs_pwd(). We don't lean on these in-kernel helpers directly
within the newly introduced BPF kfuncs, as doing so would be rather
awkward given that we want to return referenced path pointers directly
to BPF LSM program types.

[0] https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@mail.gmail.com/
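For illustration, the acquire/release pairing these kfuncs impose on a
sleepable BPF LSM program might look like the following. This is a
hypothetical sketch and not part of this series; the program name and
hook choice are arbitrary:

```c
/* Hypothetical usage sketch, not part of this series. */
SEC("lsm.s/file_open")
int BPF_PROG(sketch_fs_root_usage, struct file *file)
{
	struct task_struct *current = bpf_get_current_task_btf();
	struct path *root;

	/* Acquire a stable, referenced handle to current's fs root. */
	root = bpf_get_task_fs_root(current);
	if (!root)
		return 0;

	/* ... operate on the referenced path here ... */

	/* Every acquired path must be released exactly once. */
	bpf_put_path(root);
	return 0;
}
```

The verifier enforces the pairing: leaking the acquired pointer, or
releasing an unacquired one, rejects the program at load time, as the
accompanying selftests demonstrate.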

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
---
 kernel/trace/bpf_trace.c | 83 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 539c58db74d7..84fd87ead20c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -10,6 +10,7 @@
 #include <linux/bpf_perf_event.h>
 #include <linux/btf.h>
 #include <linux/filter.h>
+#include <linux/fs_struct.h>
 #include <linux/uaccess.h>
 #include <linux/ctype.h>
 #include <linux/kprobes.h>
@@ -1569,6 +1570,83 @@ __bpf_kfunc void bpf_put_file(struct file *f)
 	fput(f);
 }
 
+/**
+ * bpf_get_task_fs_root - get a reference on the fs_struct's root path for the
+ * 			  supplied task_struct
+ * @task: task_struct whose fs_struct's root path to get a reference on
+ *
+ * Get a reference on the root path nested within the fs_struct of the
+ * associated *task*. The referenced path returned from this kfunc must be
+ * released using bpf_put_path().
+ *
+ * Return: A referenced path pointer to the root path nested within the
+ * fs_struct of the supplied *task*, or NULL.
+ */
+__bpf_kfunc struct path *bpf_get_task_fs_root(struct task_struct *task)
+{
+	struct path *root;
+	struct fs_struct *fs;
+
+	task_lock(task);
+	fs = task->fs;
+	if (unlikely(!fs)) {
+		task_unlock(task);
+		return NULL;
+	}
+
+	spin_lock(&fs->lock);
+	root = &fs->root;
+	path_get(root);
+	spin_unlock(&fs->lock);
+	task_unlock(task);
+
+	return root;
+}
+
+/**
+ * bpf_get_task_fs_pwd - get a reference on the fs_struct's pwd path for the
+ * 			 supplied task_struct
+ * @task: task_struct whose fs_struct's pwd path to get a reference on
+ *
+ * Get a reference on the pwd path nested within the fs_struct of the associated
+ * *task*. The referenced path returned from this kfunc must be released using
+ * bpf_put_path().
+ *
+ * Return: A referenced path pointer to the pwd path nested within the
+ * fs_struct of the supplied *task*, or NULL.
+ */
+__bpf_kfunc struct path *bpf_get_task_fs_pwd(struct task_struct *task)
+{
+	struct path *pwd;
+	struct fs_struct *fs;
+
+	task_lock(task);
+	fs = task->fs;
+	if (unlikely(!fs)) {
+		task_unlock(task);
+		return NULL;
+	}
+
+	spin_lock(&fs->lock);
+	pwd = &fs->pwd;
+	path_get(pwd);
+	spin_unlock(&fs->lock);
+	task_unlock(task);
+
+	return pwd;
+}
+
+/**
+ * bpf_put_path - put a reference on the supplied path
+ * @path: path on which to put a reference
+ *
+ * Put a reference on the supplied *path*.
+ */
+__bpf_kfunc void bpf_put_path(struct path *path)
+{
+	path_put(path);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(lsm_kfunc_set_ids)
@@ -1580,6 +1658,11 @@ BTF_ID_FLAGS(func, bpf_get_task_exe_file,
 BTF_ID_FLAGS(func, bpf_get_mm_exe_file,
 	     KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_put_file, KF_RELEASE | KF_SLEEPABLE)
+BTF_ID_FLAGS(func, bpf_get_task_fs_root,
+	     KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_get_task_fs_pwd,
+	     KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_put_path, KF_RELEASE | KF_SLEEPABLE)
 BTF_KFUNCS_END(lsm_kfunc_set_ids)
 
 static int bpf_lsm_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
-- 
2.44.0.278.ge034bb2e1d-goog

/M


* [PATCH v2 bpf-next 7/9] bpf/selftests: add selftests for root/pwd path based BPF kfuncs
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
                   ` (5 preceding siblings ...)
  2024-03-06  7:40 ` [PATCH v2 bpf-next 6/9] bpf: add acquire/release based BPF kfuncs for fs_struct's paths Matt Bobrowski
@ 2024-03-06  7:40 ` Matt Bobrowski
  2024-03-06  7:40 ` [PATCH v2 bpf-next 9/9] bpf/selftests: adapt selftests test_d_path for BPF kfunc bpf_path_d_path() Matt Bobrowski
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:40 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

Add a new path_kfunc test suite that is responsible for verifying the
operability of the newly added root/pwd path based BPF kfuncs. This
test suite covers the following BPF kfuncs:

struct path *bpf_get_task_fs_root(struct task_struct *task);
struct path *bpf_get_task_fs_pwd(struct task_struct *task);
void bpf_put_path(struct path *path);

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
---
 .../selftests/bpf/prog_tests/path_kfunc.c     |  48 ++++++++
 .../selftests/bpf/progs/path_kfunc_common.h   |  20 +++
 .../selftests/bpf/progs/path_kfunc_failure.c  | 114 ++++++++++++++++++
 .../selftests/bpf/progs/path_kfunc_success.c  |  30 +++++
 4 files changed, 212 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/path_kfunc.c
 create mode 100644 tools/testing/selftests/bpf/progs/path_kfunc_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/path_kfunc_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/path_kfunc_success.c

diff --git a/tools/testing/selftests/bpf/prog_tests/path_kfunc.c b/tools/testing/selftests/bpf/prog_tests/path_kfunc.c
new file mode 100644
index 000000000000..9a8701a7999c
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/path_kfunc.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#define _GNU_SOURCE
+#include <test_progs.h>
+
+#include "path_kfunc_failure.skel.h"
+#include "path_kfunc_success.skel.h"
+
+static void run_test(const char *prog_name)
+{
+	struct bpf_link *link;
+	struct bpf_program *prog;
+	struct path_kfunc_success *skel;
+
+	skel = path_kfunc_success__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "path_kfunc_success__open_and_load"))
+		return;
+
+	link = NULL;
+	prog = bpf_object__find_program_by_name(skel->obj, prog_name);
+	if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+		goto cleanup;
+
+	link = bpf_program__attach(prog);
+	ASSERT_OK_PTR(link, "bpf_program__attach");
+cleanup:
+	bpf_link__destroy(link);
+	path_kfunc_success__destroy(skel);
+}
+
+static const char * const success_tests[] = {
+	"get_task_fs_root_and_put_from_current",
+	"get_task_fs_pwd_and_put_from_current",
+};
+
+void test_path_kfunc(void)
+{
+	int i = 0;
+
+	for (; i < ARRAY_SIZE(success_tests); i++) {
+		if (!test__start_subtest(success_tests[i]))
+			continue;
+		run_test(success_tests[i]);
+	}
+
+	RUN_TESTS(path_kfunc_failure);
+}
diff --git a/tools/testing/selftests/bpf/progs/path_kfunc_common.h b/tools/testing/selftests/bpf/progs/path_kfunc_common.h
new file mode 100644
index 000000000000..837dc03c136d
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/path_kfunc_common.h
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#ifndef _PATH_KFUNC_COMMON_H
+#define _PATH_KFUNC_COMMON_H
+
+#include <vmlinux.h>
+#include <errno.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct path *bpf_get_task_fs_root(struct task_struct *task) __ksym;
+struct path *bpf_get_task_fs_pwd(struct task_struct *task) __ksym;
+void bpf_put_path(struct path *path) __ksym;
+
+#endif /* _PATH_KFUNC_COMMON_H */
diff --git a/tools/testing/selftests/bpf/progs/path_kfunc_failure.c b/tools/testing/selftests/bpf/progs/path_kfunc_failure.c
new file mode 100644
index 000000000000..a28797e245e3
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/path_kfunc_failure.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#include "path_kfunc_common.h"
+
+SEC("lsm.s/file_open")
+__failure __msg("Possibly NULL pointer passed to trusted arg0")
+int BPF_PROG(get_task_fs_root_kfunc_null)
+{
+	struct path *acquired;
+
+	/* Can't pass a NULL pointer to bpf_get_task_fs_root(). */
+	acquired = bpf_get_task_fs_root(NULL);
+	if (!acquired)
+		return 0;
+	bpf_put_path(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("Possibly NULL pointer passed to trusted arg0")
+int BPF_PROG(get_task_fs_pwd_kfunc_null)
+{
+	struct path *acquired;
+
+	/* Can't pass a NULL pointer to bpf_get_task_fs_pwd(). */
+	acquired = bpf_get_task_fs_pwd(NULL);
+	if (!acquired)
+		return 0;
+	bpf_put_path(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/task_alloc")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(get_task_fs_root_kfunc_untrusted, struct task_struct *task)
+{
+	struct path *acquired;
+	struct task_struct *parent;
+
+	/* Walking the struct task_struct will yield an untrusted pointer. */
+	parent = task->parent;
+	if (!parent)
+		return 0;
+
+	acquired = bpf_get_task_fs_root(parent);
+	if (!acquired)
+		return 0;
+	bpf_put_path(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/task_alloc")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(get_task_fs_pwd_kfunc_untrusted, struct task_struct *task)
+{
+	struct path *acquired;
+	struct task_struct *parent;
+
+	/* Walking the struct task_struct will yield an untrusted pointer. */
+	parent = task->parent;
+	if (!parent)
+		return 0;
+
+	acquired = bpf_get_task_fs_pwd(parent);
+	if (!acquired)
+		return 0;
+	bpf_put_path(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("Unreleased reference")
+int BPF_PROG(get_task_fs_root_kfunc_unreleased)
+{
+	struct path *acquired;
+
+	acquired = bpf_get_task_fs_root(bpf_get_current_task_btf());
+	if (!acquired)
+		return 0;
+	__sink(acquired);
+
+	/* Acquired but never released. */
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("Unreleased reference")
+int BPF_PROG(get_task_fs_pwd_kfunc_unreleased)
+{
+	struct path *acquired;
+
+	acquired = bpf_get_task_fs_pwd(bpf_get_current_task_btf());
+	if (!acquired)
+		return 0;
+	__sink(acquired);
+
+	/* Acquired but never released. */
+	return 0;
+}
+
+SEC("lsm.s/inode_getattr")
+__failure __msg("release kernel function bpf_put_path expects refcounted PTR_TO_BTF_ID")
+int BPF_PROG(put_path_kfunc_unacquired, struct path *path)
+{
+	/* Can't release an unacquired pointer. */
+	bpf_put_path(path);
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/path_kfunc_success.c b/tools/testing/selftests/bpf/progs/path_kfunc_success.c
new file mode 100644
index 000000000000..8fc8e3c51405
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/path_kfunc_success.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#include "path_kfunc_common.h"
+
+SEC("lsm.s/file_open")
+int BPF_PROG(get_task_fs_root_and_put_from_current)
+{
+	struct path *acquired;
+
+	acquired = bpf_get_task_fs_root(bpf_get_current_task_btf());
+	if (!acquired)
+		return 0;
+	bpf_put_path(acquired);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+int BPF_PROG(get_task_fs_pwd_and_put_from_current)
+{
+	struct path *acquired;
+
+	acquired = bpf_get_task_fs_pwd(bpf_get_current_task_btf());
+	if (!acquired)
+		return 0;
+	bpf_put_path(acquired);
+
+	return 0;
+}
-- 
2.44.0.278.ge034bb2e1d-goog

/M


* [PATCH v2 bpf-next 9/9] bpf/selftests: adapt selftests test_d_path for BPF kfunc bpf_path_d_path()
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
                   ` (6 preceding siblings ...)
  2024-03-06  7:40 ` [PATCH v2 bpf-next 7/9] bpf/selftests: add selftests for root/pwd path based BPF kfuncs Matt Bobrowski
@ 2024-03-06  7:40 ` Matt Bobrowski
  2024-03-06  7:40 ` [PATCH v2 bpf-next 8/9] bpf: add trusted d_path() based " Matt Bobrowski
  2024-03-06 11:21 ` [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Christian Brauner
  9 siblings, 0 replies; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:40 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

Adapt the existing test_d_path test suite to cover the operability of
the newly introduced trusted d_path() based BPF kfunc
bpf_path_d_path().

In terms of its path reconstruction capabilities, this new BPF kfunc
is functionally identical to its predecessor bpf_d_path(), so it makes
sense to recycle the existing test_d_path suite.

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
---
 .../testing/selftests/bpf/prog_tests/d_path.c | 80 +++++++++++++++++++
 .../selftests/bpf/progs/d_path_common.h       | 35 ++++++++
 .../bpf/progs/d_path_kfunc_failure.c          | 66 +++++++++++++++
 .../bpf/progs/d_path_kfunc_success.c          | 25 ++++++
 .../testing/selftests/bpf/progs/test_d_path.c | 20 +----
 .../bpf/progs/test_d_path_check_rdonly_mem.c  |  8 +-
 .../bpf/progs/test_d_path_check_types.c       |  8 +-
 7 files changed, 209 insertions(+), 33 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/d_path_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/d_path_kfunc_failure.c
 create mode 100644 tools/testing/selftests/bpf/progs/d_path_kfunc_success.c

diff --git a/tools/testing/selftests/bpf/prog_tests/d_path.c b/tools/testing/selftests/bpf/prog_tests/d_path.c
index ccc768592e66..cc5c886fe59b 100644
--- a/tools/testing/selftests/bpf/prog_tests/d_path.c
+++ b/tools/testing/selftests/bpf/prog_tests/d_path.c
@@ -11,6 +11,8 @@
 #include "test_d_path.skel.h"
 #include "test_d_path_check_rdonly_mem.skel.h"
 #include "test_d_path_check_types.skel.h"
+#include "d_path_kfunc_failure.skel.h"
+#include "d_path_kfunc_success.skel.h"
 
 /* sys_close_range is not around for long time, so let's
  * make sure we can call it on systems with older glibc
@@ -124,6 +126,13 @@ static void test_d_path_basic(void)
 	struct test_d_path *skel;
 	int err;
 
+	/*
+	 * Global state is carried across test function invocations, which is
+	 * far from ideal, so zero it out here to ensure that each subtest
+	 * starts from a clean slate.
+	 */
+	memset(&src, 0, sizeof(src));
+
 	skel = test_d_path__open_and_load();
 	if (CHECK(!skel, "setup", "d_path skeleton failed\n"))
 		goto cleanup;
@@ -195,8 +204,72 @@ static void test_d_path_check_types(void)
 	test_d_path_check_types__destroy(skel);
 }
 
+static struct bpf_path_d_path_t {
+	const char *prog_name;
+} success_test_cases[] = {
+	{
+		.prog_name = "path_d_path_from_path_argument",
+	},
+};
+
+static void test_bpf_path_d_path(struct bpf_path_d_path_t *t)
+{
+	int i, ret;
+	struct bpf_link *link;
+	struct bpf_program *prog;
+	struct d_path_kfunc_success__bss *bss;
+	struct d_path_kfunc_success *skel;
+
+	/*
+	 * Global state is carried across test function invocations, which is
+	 * far from ideal, so zero it out here to ensure that each subtest
+	 * starts from a clean slate.
+	 */
+	memset(&src, 0, sizeof(src));
+
+	skel = d_path_kfunc_success__open();
+	if (!ASSERT_OK_PTR(skel, "d_path_kfunc_success__open"))
+		return;
+
+	bss = skel->bss;
+	bss->my_pid = getpid();
+
+	ret = d_path_kfunc_success__load(skel);
+	if (CHECK(ret, "setup", "d_path_kfunc_success__load\n"))
+		goto cleanup;
+
+	link = NULL;
+	prog = bpf_object__find_program_by_name(skel->obj, t->prog_name);
+	if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+		goto cleanup;
+
+	link = bpf_program__attach(prog);
+	if (!ASSERT_OK_PTR(link, "bpf_program__attach"))
+		goto cleanup;
+
+	ret = trigger_fstat_events(bss->my_pid);
+	if (ret < 0)
+		goto cleanup;
+
+	for (i = 0; i < MAX_FILES; i++) {
+		CHECK(strncmp(src.paths[i], bss->paths_stat[i], MAX_PATH_LEN),
+		      "check", "failed to get stat path[%d]: %s vs %s\n", i,
+		      src.paths[i], bss->paths_stat[i]);
+		CHECK(bss->rets_stat[i] != strlen(bss->paths_stat[i]) + 1,
+		      "check",
+		      "failed to match stat return [%d]: %d vs %zd [%s]\n", i,
+		      bss->rets_stat[i], strlen(bss->paths_stat[i]) + 1,
+		      bss->paths_stat[i]);
+	}
+cleanup:
+	bpf_link__destroy(link);
+	d_path_kfunc_success__destroy(skel);
+}
+
 void test_d_path(void)
 {
+	int i = 0;
+
 	if (test__start_subtest("basic"))
 		test_d_path_basic();
 
@@ -205,4 +278,11 @@ void test_d_path(void)
 
 	if (test__start_subtest("check_alloc_mem"))
 		test_d_path_check_types();
+
+	for (; i < ARRAY_SIZE(success_test_cases); i++) {
+		if (!test__start_subtest(success_test_cases[i].prog_name))
+			continue;
+		test_bpf_path_d_path(&success_test_cases[i]);
+	}
+	RUN_TESTS(d_path_kfunc_failure);
 }
diff --git a/tools/testing/selftests/bpf/progs/d_path_common.h b/tools/testing/selftests/bpf/progs/d_path_common.h
new file mode 100644
index 000000000000..276331b5ff9f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/d_path_common.h
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#ifndef _D_PATH_COMMON_H
+#define _D_PATH_COMMON_H
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#include "bpf_misc.h"
+
+#define MAX_PATH_LEN 128
+#define MAX_FILES 7
+
+extern const int bpf_prog_active __ksym;
+int bpf_path_d_path(struct path *path, char *buf, int buflen) __ksym;
+
+pid_t my_pid = 0;
+
+__u32 cnt_stat = 0;
+__u32 cnt_close = 0;
+
+char paths_stat[MAX_FILES][MAX_PATH_LEN] = {};
+char paths_close[MAX_FILES][MAX_PATH_LEN] = {};
+
+int rets_stat[MAX_FILES] = {};
+int rets_close[MAX_FILES] = {};
+
+int called_stat = 0;
+int called_close = 0;
+
+char _license[] SEC("license") = "GPL";
+
+#endif /* _D_PATH_COMMON_H */
diff --git a/tools/testing/selftests/bpf/progs/d_path_kfunc_failure.c b/tools/testing/selftests/bpf/progs/d_path_kfunc_failure.c
new file mode 100644
index 000000000000..9da5f0d395c9
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/d_path_kfunc_failure.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#include "d_path_common.h"
+
+char buf[MAX_PATH_LEN] = {};
+
+SEC("lsm.s/file_open")
+__failure __msg("Possibly NULL pointer passed to trusted arg0")
+int BPF_PROG(path_d_path_kfunc_null)
+{
+	/* Can't pass a NULL value to the bpf_path_d_path() kfunc. */
+	bpf_path_d_path(NULL, buf, sizeof(buf));
+	return 0;
+}
+
+SEC("fentry/vfs_open")
+__failure __msg("calling kernel function bpf_path_d_path is not allowed")
+int BPF_PROG(path_d_path_kfunc_non_lsm, struct path *path, struct file *f)
+{
+	/* Calling bpf_path_d_path() kfunc from a non-sleepable and non-LSM
+	 * based program isn't permitted.
+	 */
+	bpf_path_d_path(path, buf, sizeof(buf));
+	return 0;
+}
+
+SEC("lsm.s/task_alloc")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(path_d_path_kfunc_untrusted_from_argument, struct task_struct *task)
+{
+	struct path *root;
+
+	/* Walking a trusted argument yields an untrusted pointer. */
+	root = &task->fs->root;
+	bpf_path_d_path(root, buf, sizeof(buf));
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(path_d_path_kfunc_untrusted_from_current)
+{
+	struct path *pwd;
+	struct task_struct *current;
+
+	current = bpf_get_current_task_btf();
+	/* Walking a trusted pointer returned from bpf_get_current_task_btf()
+	 * yields an untrusted pointer. */
+	pwd = &current->fs->pwd;
+	bpf_path_d_path(pwd, buf, sizeof(buf));
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("R1 must have zero offset when passed to release func or trusted arg to kfunc")
+int BPF_PROG(path_d_path_kfunc_trusted_variable_offset, struct file *file)
+{
+	/* Passing a variable offset from a trusted pointer isn't supported just
+	 * yet, despite being perfectly OK, e.g. file->f_path. Once the BPF verifier
+	 * has been updated to handle this case, this test can be removed. For
+	 * now, ensure we reject the BPF program upon load if this is attempted.
+	 */
+	bpf_path_d_path(&file->f_path, buf, sizeof(buf));
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/d_path_kfunc_success.c b/tools/testing/selftests/bpf/progs/d_path_kfunc_success.c
new file mode 100644
index 000000000000..72d1a64618d1
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/d_path_kfunc_success.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+
+#include "d_path_common.h"
+
+SEC("lsm.s/inode_getattr")
+int BPF_PROG(path_d_path_from_path_argument, struct path *path)
+{
+	u32 cnt = cnt_stat;
+	int ret;
+	pid_t pid;
+
+	pid = bpf_get_current_pid_tgid() >> 32;
+	if (pid != my_pid)
+		return 0;
+
+	if (cnt >= MAX_FILES)
+		return 0;
+
+	ret = bpf_path_d_path(path, paths_stat[cnt], MAX_PATH_LEN);
+	rets_stat[cnt] = ret;
+	cnt_stat++;
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/test_d_path.c b/tools/testing/selftests/bpf/progs/test_d_path.c
index 84e1f883f97b..5bdfa4abb5f6 100644
--- a/tools/testing/selftests/bpf/progs/test_d_path.c
+++ b/tools/testing/selftests/bpf/progs/test_d_path.c
@@ -1,22 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 
-#include "vmlinux.h"
-#include <bpf/bpf_helpers.h>
-#include <bpf/bpf_tracing.h>
-
-#define MAX_PATH_LEN		128
-#define MAX_FILES		7
-
-pid_t my_pid = 0;
-__u32 cnt_stat = 0;
-__u32 cnt_close = 0;
-char paths_stat[MAX_FILES][MAX_PATH_LEN] = {};
-char paths_close[MAX_FILES][MAX_PATH_LEN] = {};
-int rets_stat[MAX_FILES] = {};
-int rets_close[MAX_FILES] = {};
-
-int called_stat = 0;
-int called_close = 0;
+#include "d_path_common.h"
 
 SEC("fentry/security_inode_getattr")
 int BPF_PROG(prog_stat, struct path *path, struct kstat *stat,
@@ -61,5 +45,3 @@ int BPF_PROG(prog_close, struct file *file, void *id)
 	cnt_close++;
 	return 0;
 }
-
-char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/test_d_path_check_rdonly_mem.c b/tools/testing/selftests/bpf/progs/test_d_path_check_rdonly_mem.c
index 27c27cff6a3a..6094a58321a4 100644
--- a/tools/testing/selftests/bpf/progs/test_d_path_check_rdonly_mem.c
+++ b/tools/testing/selftests/bpf/progs/test_d_path_check_rdonly_mem.c
@@ -1,11 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2022 Google */
 
-#include "vmlinux.h"
-#include <bpf/bpf_helpers.h>
-#include <bpf/bpf_tracing.h>
-
-extern const int bpf_prog_active __ksym;
+#include "d_path_common.h"
 
 SEC("fentry/security_inode_getattr")
 int BPF_PROG(d_path_check_rdonly_mem, struct path *path, struct kstat *stat,
@@ -24,5 +20,3 @@ int BPF_PROG(d_path_check_rdonly_mem, struct path *path, struct kstat *stat,
 	}
 	return 0;
 }
-
-char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/test_d_path_check_types.c b/tools/testing/selftests/bpf/progs/test_d_path_check_types.c
index 7e02b7361307..6d4204136489 100644
--- a/tools/testing/selftests/bpf/progs/test_d_path_check_types.c
+++ b/tools/testing/selftests/bpf/progs/test_d_path_check_types.c
@@ -1,10 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 
-#include "vmlinux.h"
-#include <bpf/bpf_helpers.h>
-#include <bpf/bpf_tracing.h>
-
-extern const int bpf_prog_active __ksym;
+#include "d_path_common.h"
 
 struct {
 	__uint(type, BPF_MAP_TYPE_RINGBUF);
@@ -28,5 +24,3 @@ int BPF_PROG(d_path_check_rdonly_mem, struct path *path, struct kstat *stat,
 	}
 	return 0;
 }
-
-char _license[] SEC("license") = "GPL";
-- 
2.44.0.278.ge034bb2e1d-goog

/M


* [PATCH v2 bpf-next 8/9] bpf: add trusted d_path() based BPF kfunc bpf_path_d_path()
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
                   ` (7 preceding siblings ...)
  2024-03-06  7:40 ` [PATCH v2 bpf-next 9/9] bpf/selftests: adapt selftests test_d_path for BPF kfunc bpf_path_d_path() Matt Bobrowski
@ 2024-03-06  7:40 ` Matt Bobrowski
  2024-03-06 11:21 ` [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Christian Brauner
  9 siblings, 0 replies; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-06  7:40 UTC (permalink / raw)
  To: bpf
  Cc: ast, andrii, kpsingh, jannh, jolsa, daniel, brauner, torvalds,
	linux-fsdevel

The legacy bpf_d_path() helper did not operate on trusted pointer
arguments and was therefore, under certain circumstances, susceptible
to memory corruption issues [0]. The new d_path() based BPF kfunc
bpf_path_d_path() makes use of the trusted pointer argument constraint
KF_TRUSTED_ARGS, which ensures that d_path() may only be called when
the underlying BPF program holds a stable handle to the given path.

For now, we restrict bpf_path_d_path() to BPF LSM program types, but
this may be relaxed in the future.

Notably, we are consciously not retroactively enforcing the
KF_TRUSTED_ARGS constraint onto the legacy bpf_d_path() helper, as
that would lead to wide-scale BPF program breakage.

[0]
https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@mail.gmail.com/
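For illustration, a BPF LSM program might combine bpf_path_d_path()
with one of the path acquire kfuncs from earlier in this series. This
is a hypothetical sketch and not part of this series; the program name
and buffer size are arbitrary:

```c
/* Hypothetical usage sketch, not part of this series. */
char buf[256];

SEC("lsm.s/file_open")
int BPF_PROG(sketch_d_path_usage, struct file *file)
{
	struct task_struct *current = bpf_get_current_task_btf();
	struct path *root;
	int len;

	/* A referenced path satisfies the KF_TRUSTED_ARGS constraint. */
	root = bpf_get_task_fs_root(current);
	if (!root)
		return 0;

	/* On success, len includes the NUL termination character. */
	len = bpf_path_d_path(root, buf, sizeof(buf));
	if (len > 0)
		bpf_printk("root: %s", buf);

	bpf_put_path(root);
	return 0;
}
```

Passing an untrusted or unreferenced path, or calling the kfunc from a
non-sleepable or non-LSM program, is rejected by the verifier at load
time, as the adapted test_d_path selftests demonstrate.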

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
---
 kernel/trace/bpf_trace.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 84fd87ead20c..4989639153cd 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1647,6 +1647,37 @@ __bpf_kfunc void bpf_put_path(struct path *path)
 	path_put(path);
 }
 
+/**
+ * bpf_path_d_path - resolve the pathname for a given path
+ * @path: path to resolve the pathname for
+ * @buf: buffer to return the resolved path value in
+ * @buflen: length of the supplied buffer
+ *
+ * Resolve the pathname for the supplied trusted *path* in *buf*. This kfunc is
+ * the trusted/safer variant of the legacy bpf_d_path() helper and should be
+ * used in place of bpf_d_path() whenever possible.
+ *
+ * Return: A strictly positive integer corresponding to the length of the string
+ * representing the resolved pathname, including the NUL termination
+ * character. On error, a negative integer is returned.
+ */
+__bpf_kfunc int bpf_path_d_path(struct path *path, char *buf, int buflen)
+{
+	int len;
+	char *ret;
+
+	if (buflen <= 0)
+		return -EINVAL;
+
+	ret = d_path(path, buf, buflen);
+	if (IS_ERR(ret))
+		return PTR_ERR(ret);
+
+	len = buf + buflen - ret;
+	memmove(buf, ret, len);
+	return len;
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(lsm_kfunc_set_ids)
@@ -1663,6 +1694,7 @@ BTF_ID_FLAGS(func, bpf_get_task_fs_root,
 BTF_ID_FLAGS(func, bpf_get_task_fs_pwd,
 	     KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_put_path, KF_RELEASE | KF_SLEEPABLE)
+BTF_ID_FLAGS(func, bpf_path_d_path, KF_TRUSTED_ARGS | KF_SLEEPABLE)
 BTF_KFUNCS_END(lsm_kfunc_set_ids)
 
 static int bpf_lsm_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
-- 
2.44.0.278.ge034bb2e1d-goog

/M

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
                   ` (8 preceding siblings ...)
  2024-03-06  7:40 ` [PATCH v2 bpf-next 8/9] bpf: add trusted d_path() based " Matt Bobrowski
@ 2024-03-06 11:21 ` Christian Brauner
  2024-03-06 12:13   ` Christian Brauner
  9 siblings, 1 reply; 31+ messages in thread
From: Christian Brauner @ 2024-03-06 11:21 UTC (permalink / raw)
  To: Matt Bobrowski
  Cc: bpf, ast, andrii, kpsingh, jannh, jolsa, daniel, torvalds,
	linux-fsdevel

On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote:
> G'day All,
> 
> The original cover letter providing background context and motivating
> factors around the needs for the BPF kfuncs introduced within this
> patch series can be found here [0], so please do reference that if
> need be.
> 
> Notably, one of the main contention points within v1 of this patch
> series was that we were effectively leaning on some preexisting
> in-kernel APIs such as get_task_exe_file() and get_mm_exe_file()
> within some of the newly introduced BPF kfuncs. As noted in my
> response here [1] though, I struggle to understand the technical
> reasoning behind why exposing such in-kernel helpers, specifically
> only to BPF LSM program types in the form of BPF kfuncs, is inherently
> a terrible idea. So, until someone provides me with a sound technical
> explanation as to why this cannot or should not be done, I'll continue
> to lean on them. The alternative is to reimplement the necessary
> in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO.

You may lean as much as you like. What I've reacted to is that you've
(not you specifically, I'm sure) messed up. You've exposed d_path() to
users without understanding that it apparently wasn't safe.

And now we get patches that use the self-inflicted brokenness as an
argument to expose a bunch of other low-level helpers to fix that.

The fact that it's "just bpf LSM" programs doesn't alleviate any
concerns whatsoever. Not just because that is just another entry vector,
but also because we have LSM-induced API abuse whose fallout we only
ever see when we refactor APIs, and then it causes pain for the VFS.

I'll take another look at the proposed helpers you need as bpf kfuncs
and I'll do my best not to be overly annoyed by all of this. I have no
intention of not helping you, quite the opposite, but I'm annoyed that
we're here in the first place.

What I want is to stop this madness of exposing stuff to users without
fully understanding its semantics and required guarantees.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 4/9] bpf: add new acquire/release based BPF kfuncs for exe_file
  2024-03-06  7:40 ` [PATCH v2 bpf-next 4/9] bpf: add new acquire/release based BPF kfuncs for exe_file Matt Bobrowski
@ 2024-03-06 11:31   ` Christian Brauner
  0 siblings, 0 replies; 31+ messages in thread
From: Christian Brauner @ 2024-03-06 11:31 UTC (permalink / raw)
  To: Matt Bobrowski
  Cc: bpf, ast, andrii, kpsingh, jannh, jolsa, daniel, torvalds,
	linux-fsdevel

On Wed, Mar 06, 2024 at 07:40:00AM +0000, Matt Bobrowski wrote:
> It is rather common for BPF LSM program types to perform the struct
> walk current->mm->exe_file and subsequently operate on fields of the
> backing file. At times, some of these operations involve passing an
> exe_file's field on to BPF helpers,
> e.g. bpf_d_path(&current->mm->exe_file->f_path). However, doing so
> isn't always reliable, as the backing file that exe_file points to may
> be in the midst of being torn down, and handing anything contained
> within this file to BPF helpers can lead to memory corruption issues
> [0].
> 
> To avoid possibly operating on semi-torn-down instances of
> current->mm->exe_file, we introduce a set of BPF kfuncs that possess
> KF_ACQUIRE/KF_RELEASE based semantics. Such BPF kfuncs will allow BPF
> LSM program types to reliably get/put a reference on a
> current->mm->exe_file.
> 
> The following new BPF kfuncs have been added:
> 
> struct file *bpf_get_task_exe_file(struct task_struct *task);
> struct file *bpf_get_mm_exe_file(struct mm_struct *mm);
> void bpf_put_file(struct file *f);
> 
> Internally, these new BPF kfuncs simply call the preexisting in-kernel
> functions get_task_exe_file(), get_mm_exe_file(), and fput()
> accordingly. From a technical standpoint, there's absolutely no need
> to re-implement such helpers just for BPF as they're currently scoped
> to BPF LSM program types.
> 
> Note that we deliberately do not rely on very low-level in-kernel
> functions like get_file_rcu() and get_file_active() to
> acquire a reference on current->mm->exe_file and such. This is super
> subtle code and we probably want to avoid exposing any such subtleties
> to BPF in the form of BPF kfuncs. Additionally, the usage of a double
> pointer i.e. struct file **, isn't something that the BPF verifier
> currently recognizes nor has any intention to recognize for the
> foreseeable future.
> 
> [0] https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@mail.gmail.com/
> 
> Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
> ---

get_mm_exe_file() and get_task_exe_file() aren't even exported to
modules which makes me a lot more hesitant.

>  kernel/trace/bpf_trace.c | 56 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 56 insertions(+)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 801808b6efb0..539c58db74d7 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1518,12 +1518,68 @@ __bpf_kfunc void bpf_mm_drop(struct mm_struct *mm)
>  	mmdrop(mm);
>  }
>  
> +/**
> + * bpf_get_task_exe_file - get a reference on the exe_file associated with the
> + * 	       		   mm_struct that is nested within the supplied
> + * 	       		   task_struct
> + * @task: task_struct of which the nested mm_struct's exe_file is to be
> + * referenced
> + *
> + * Get a reference on the exe_file that is associated with the mm_struct nested
> + * within the supplied *task*. A reference on a file pointer acquired by this
> + * kfunc must be released using bpf_put_file(). Internally, this kfunc leans on
> + * get_task_exe_file(), such that calling bpf_get_task_exe_file() would be
> + * analogous to calling get_task_exe_file() outside of BPF program context.
> + *
> + * Return: A referenced pointer to the exe_file associated with the mm_struct
> + * nested in the supplied *task*, or NULL.
> + */
> +__bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task)
> +{
> +	return get_task_exe_file(task);
> +}
> +
> +/**
> + * bpf_get_mm_exe_file - get a reference on the exe_file for the supplied
> + * 			 mm_struct.
> + * @mm: mm_struct of which the exe_file to get a reference on
> + *
> + * Get a reference on the exe_file associated with the supplied *mm*. A
> + * reference on a file pointer acquired by this kfunc must be released using
> + * bpf_put_file(). Internally, this kfunc leans on get_mm_exe_file(), such that
> + * calling bpf_get_mm_exe_file() would be analogous to calling get_mm_exe_file()
> + * outside of BPF program context.
> + *
> + * Return: A referenced file pointer to the exe_file for the supplied *mm*, or
> + * NULL.
> + */
> +__bpf_kfunc struct file *bpf_get_mm_exe_file(struct mm_struct *mm)
> +{
> +	return get_mm_exe_file(mm);
> +}
> +
> +/**
> + * bpf_put_file - put a reference on the supplied file
> + * @f: file of which to put a reference on
> + *
> + * Put a reference on the supplied *f*.
> + */
> +__bpf_kfunc void bpf_put_file(struct file *f)
> +{
> +	fput(f);
> +}

That's probably reasonable because this is already exported to modules.
But that's not a generic argument. So there'll never be kfuncs for
__fput_sync(), flush_delayed_fput() and other special-purpose file put
or get helpers.

Conditio sine qua non is that all such kfunc definitions must live in
the respective file in fs/. They won't be hidden somewhere in
kernel/bpf/.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 6/9] bpf: add acquire/release based BPF kfuncs for fs_struct's paths
  2024-03-06  7:40 ` [PATCH v2 bpf-next 6/9] bpf: add acquire/release based BPF kfuncs for fs_struct's paths Matt Bobrowski
@ 2024-03-06 11:47   ` Christian Brauner
  0 siblings, 0 replies; 31+ messages in thread
From: Christian Brauner @ 2024-03-06 11:47 UTC (permalink / raw)
  To: Matt Bobrowski
  Cc: bpf, ast, andrii, kpsingh, jannh, jolsa, daniel, torvalds,
	linux-fsdevel

On Wed, Mar 06, 2024 at 07:40:12AM +0000, Matt Bobrowski wrote:
> Add the ability to obtain a reference on the root and pwd paths which
> are nested within the fs_struct associated with a supplied
> task_struct. Both fs_struct's root and pwd are commonly operated on in
> BPF LSM program types and at times are further handed off to BPF
> helpers and such. There needs to be a mechanism that provides BPF LSM
> program types with the ability to obtain stable handles to such paths in
> order to avoid possible memory corruption bugs [0].
> 
> We provide this mechanism through the introduction of the following
> new KF_ACQUIRE/KF_RELEASE BPF kfuncs:
> 
> struct path *bpf_get_task_fs_root(struct task_struct *task);
> struct path *bpf_get_task_fs_pwd(struct task_struct *task);
> void bpf_put_path(struct path *path);
> 
> Note that bpf_get_task_fs_root() and bpf_get_task_fs_pwd() are

Right now all I'm seeing are requests for exporting a bunch of helpers
with no clear explanation other than "This is common in BPF LSM
programs.". That's not going to happen if this is some private user's
pet bpf program. Where's the bpf lsm program that has to use this?

> effectively open-coded variants of the in-kernel helpers get_fs_root()
> and get_fs_pwd(). We don't lean on these in-kernel helpers directly
> within the newly introduced BPF kfuncs, as doing so would be rather
> awkward given that we want to return referenced path pointers directly
> to BPF LSM program types.
> 
> [0] https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@mail.gmail.com/
> 
> Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
> ---
>  kernel/trace/bpf_trace.c | 83 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 83 insertions(+)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 539c58db74d7..84fd87ead20c 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -10,6 +10,7 @@
>  #include <linux/bpf_perf_event.h>
>  #include <linux/btf.h>
>  #include <linux/filter.h>
> +#include <linux/fs_struct.h>
>  #include <linux/uaccess.h>
>  #include <linux/ctype.h>
>  #include <linux/kprobes.h>
> @@ -1569,6 +1570,83 @@ __bpf_kfunc void bpf_put_file(struct file *f)
>  	fput(f);
>  }
>  
> +/**
> + * bpf_get_task_fs_root - get a reference on the fs_struct's root path for the
> + * 			  supplied task_struct
> + * @task: task_struct of which the fs_struct's root path to get a reference on
> + *
> + * Get a reference on the root path nested within the fs_struct of the
> + * associated *task*. The referenced path returned from this kfunc must be
> + * released using bpf_put_path().
> + *
> + * Return: A referenced path pointer to the root path nested within the
> + * fs_struct of the supplied *task*, or NULL.
> + */
> +__bpf_kfunc struct path *bpf_get_task_fs_root(struct task_struct *task)
> +{
> +	struct path *root;
> +	struct fs_struct *fs;
> +
> +	task_lock(task);
> +	fs = task->fs;
> +	if (unlikely(!fs)) {
> +		task_unlock(task);
> +		return NULL;
> +	}
> +
> +	spin_lock(&fs->lock);
> +	root = &fs->root;
> +	path_get(root);
> +	spin_unlock(&fs->lock);
> +	task_unlock(task);
> +
> +	return root;
> +}
> +
> +/**
> + * bpf_get_task_fs_pwd - get a reference on the fs_struct's pwd path for the
> + * 			 supplied task_struct
> + * @task: task_struct of which the fs_struct's pwd path to get a reference on
> + *
> + * Get a reference on the pwd path nested within the fs_struct of the associated
> + * *task*. The referenced path returned from this kfunc must be released using
> + * bpf_put_path().
> + *
> + * Return: A referenced path pointer to the pwd path nested within the
> + * fs_struct of the supplied *task*, or NULL.
> + */
> +__bpf_kfunc struct path *bpf_get_task_fs_pwd(struct task_struct *task)
> +{
> +	struct path *pwd;
> +	struct fs_struct *fs;
> +
> +	task_lock(task);
> +	fs = task->fs;
> +	if (unlikely(!fs)) {
> +		task_unlock(task);
> +		return NULL;
> +	}
> +
> +	spin_lock(&fs->lock);
> +	pwd = &fs->pwd;
> +	path_get(pwd);
> +	spin_unlock(&fs->lock);
> +	task_unlock(task);
> +
> +	return pwd;
> +}
> +
> +/**
> + * bpf_put_path - put a reference on the supplied path
> + * @path: path of which to put a reference on
> + *
> + * Put a reference on the supplied *path*.
> + */
> +__bpf_kfunc void bpf_put_path(struct path *path)
> +{
> +	path_put(path);
> +}

Probably ok since it's exported to modules but same condition as
mentioned in my earlier mail.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 2/9] bpf: add new acquire/release BPF kfuncs for mm_struct
  2024-03-06  7:39 ` [PATCH v2 bpf-next 2/9] bpf: add new acquire/release BPF kfuncs for mm_struct Matt Bobrowski
@ 2024-03-06 11:50   ` Christian Brauner
  0 siblings, 0 replies; 31+ messages in thread
From: Christian Brauner @ 2024-03-06 11:50 UTC (permalink / raw)
  To: Matt Bobrowski, linux-mm
  Cc: bpf, ast, andrii, kpsingh, jannh, jolsa, daniel, torvalds,
	linux-fsdevel, Andrew Morton

On Wed, Mar 06, 2024 at 07:39:31AM +0000, Matt Bobrowski wrote:
> A BPF LSM program will at times introspect the mm_struct that is
> nested within a given task_struct. Such introspection performed by a
> BPF LSM program may involve reading virtual addresses out from fields
> like arg_start/arg_end and env_start/env_end, or reading fields
> directly out from the backing exe_file. In order to perform reliable
> reads against fields contained within mm_struct, we need to introduce
> a new set of BPF kfuncs that have the ability to acquire and release
> references on the mm_struct that is nested within a task_struct.
> 
> The following BPF kfuncs have been added in order to support this
> capability:
> 
> struct mm_struct *bpf_task_mm_grab(struct task_struct *task);
> void bpf_mm_drop(struct mm_struct *mm);
> 
> These new BPF kfuncs are pretty self-explanatory, but in kernel terms
> bpf_task_mm_grab() effectively allows you to get a reference on the
> mm_struct nested within a supplied task_struct, whereas bpf_mm_drop()
> allows you to put a reference on a previously acquired mm_struct
> reference. Both BPF kfuncs are also backed by BPF's respective
> KF_ACQUIRE/KF_RELEASE semantics, ensuring that the BPF program behaves
> in accordance to the constraints enforced upon it when operating on
> reference counted in-kernel data structures.
> 
> Notably, these newly added BPF kfuncs are simple wrappers around the
> mmgrab() and mmdrop() in-kernel helpers. Both mmgrab() and mmdrop()
> are used in favour of their somewhat similar counterparts mmget() and
> mmput() as they're considered to be the more lightweight variants in
> comparison, and there's no requirement to also pin the underlying
> address spaces just yet.
> 
> Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
> ---

That's not something I can in any way ACK or NAK. That's clearly mm.
And same question as in the other mail: what's the user of this? I find
it extremely strange that the justification is that "some LSM program"
needs this. It's really an annoying justification when we can't even see
the users. With LSMs we can at least see what they're doing with this in
their hooks.

>  kernel/trace/bpf_trace.c | 47 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index f639663ac339..801808b6efb0 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1473,10 +1473,57 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str,
>  	return __vfs_getxattr(dentry, dentry->d_inode, name__str, value, value_len);
>  }
>  
> +/**
> + * bpf_task_mm_grab - get a reference on the mm_struct nested within the
> + * 		      supplied task_struct
> + * @task: task_struct nesting the mm_struct that is to be referenced
> + *
> + * Grab a reference on the mm_struct that is nested within the supplied
> + * *task*. This kfunc will return NULL for threads that do not possess a
> + * valid mm_struct, e.g. those flagged as PF_KTHREAD. A reference on
> + * a mm_struct acquired by this kfunc must be released using bpf_mm_drop().
> + *
> + * This helper only pins the mm_struct and not necessarily the address space
> + * associated with the referenced mm_struct that is returned from this
> + * kfunc. Internally, this kfunc leans on mmgrab(), such that calling
> + * bpf_task_mm_grab() would be analogous to calling mmgrab() outside of BPF
> + * program context.
> + *
> + * Return: A referenced pointer to the mm_struct nested within the supplied
> + * *task*, or NULL.
> + */
> +__bpf_kfunc struct mm_struct *bpf_task_mm_grab(struct task_struct *task)
> +{
> +	struct mm_struct *mm;
> +
> +	task_lock(task);
> +	mm = task->mm;
> +	if (likely(mm))
> +		mmgrab(mm);
> +	task_unlock(task);
> +
> +	return mm;
> +}
> +
> +/**
> + * bpf_mm_drop - put a reference on the supplied mm_struct
> + * @mm: mm_struct of which to put a reference on
> + *
> + * Put a reference on the supplied *mm*. This kfunc internally leans on
> + * mmdrop(), such that calling bpf_mm_drop() would be analogous to calling
> + * mmdrop() outside of BPF program context.
> + */
> +__bpf_kfunc void bpf_mm_drop(struct mm_struct *mm)
> +{
> +	mmdrop(mm);
> +}
> +
>  __bpf_kfunc_end_defs();
>  
>  BTF_KFUNCS_START(lsm_kfunc_set_ids)
>  BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
> +BTF_ID_FLAGS(func, bpf_task_mm_grab, KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
> +BTF_ID_FLAGS(func, bpf_mm_drop, KF_RELEASE)
>  BTF_KFUNCS_END(lsm_kfunc_set_ids)
>  
>  static int bpf_lsm_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
> -- 
> 2.44.0.278.ge034bb2e1d-goog
> 
> /M

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-06 11:21 ` [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Christian Brauner
@ 2024-03-06 12:13   ` Christian Brauner
  2024-03-06 21:44     ` Paul Moore
  2024-03-07  4:05     ` Alexei Starovoitov
  0 siblings, 2 replies; 31+ messages in thread
From: Christian Brauner @ 2024-03-06 12:13 UTC (permalink / raw)
  To: Matt Bobrowski
  Cc: bpf, ast, andrii, kpsingh, jannh, jolsa, daniel, torvalds,
	linux-fsdevel, Andrew Morton, linux-mm, linux-security-module

On Wed, Mar 06, 2024 at 12:21:28PM +0100, Christian Brauner wrote:
> On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote:
> > G'day All,
> > 
> > The original cover letter providing background context and motivating
> > factors around the needs for the BPF kfuncs introduced within this
> > patch series can be found here [0], so please do reference that if
> > need be.
> > 
> > Notably, one of the main contention points within v1 of this patch
> > series was that we were effectively leaning on some preexisting
> > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file()
> > within some of the newly introduced BPF kfuncs. As noted in my
> > response here [1] though, I struggle to understand the technical
> > reasoning behind why exposing such in-kernel helpers, specifically
> > only to BPF LSM program types in the form of BPF kfuncs, is inherently
> > a terrible idea. So, until someone provides me with a sound technical
> > explanation as to why this cannot or should not be done, I'll continue
> > to lean on them. The alternative is to reimplement the necessary
> > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO.
> 
> You may lean as much as you like. What I've reacted to is that you've
> (not you specifically, I'm sure) messed up. You've exposed d_path() to
> users without understanding that it apparently wasn't safe.
> 
> And now we get patches that use the self-inflicted brokenness as an
> argument to expose a bunch of other low-level helpers to fix that.
> 
> The fact that it's "just bpf LSM" programs doesn't alleviate any
> concerns whatsoever. Not just because that is just another entry vector,
> but also because we have LSM-induced API abuse whose fallout we only
> ever see when we refactor APIs, and then it causes pain for the VFS.
> 
> I'll take another look at the proposed helpers you need as bpf kfuncs
> and I'll do my best not to be overly annoyed by all of this. I have no
> intention of not helping you, quite the opposite, but I'm annoyed that
> we're here in the first place.
> 
> What I want is to stop this madness of exposing stuff to users without
> fully understanding its semantics and required guarantees.

So, looking at this series you're now asking us to expose:

(1) mmgrab()
(2) mmput()
(3) fput()
(4) get_mm_exe_file()
(5) get_task_exe_file()
(6) get_task_fs_root()
(7) get_task_fs_pwd()
(8) path_get()
(9) path_put()

in one go and the justification in all patches amounts to "This is
common in some BPF LSM programs".

So, broken stuff got exposed to users or at least a broken BPF LSM
program was written somewhere out there that is susceptible to UAFs
because you didn't restrict bpf_d_path() to trusted pointer arguments.
So you're now scrambling to fix this by asking for a bunch of low-level
exports.

What is the guarantee that you don't end up writing another BPF LSM that
abuses these exports in a way that causes even more issues and then
someone else comes back asking for the next round of bpf funcs to be
exposed to fix it?

The difference between a regular LSM asking about this and a BPF LSM
program is that we can see in the hook implementation what the LSM
intends to do with this and we can judge whether that's safe or not.

Here you're asking us to do this blindfolded. So I feel we can't really
ACK a series such as this without actually seeing what is intended to be
done with all these helpers that you want as kfuncs. Not after the
previous brokenness.

In any case, you need separate ACKs from mm for the mmgrab()/mmput()
kfuncs as well.

Because really, all I see as immediately supportable is the addition of
a safe bpf_d_path() variant making use of the trusted pointer argument
constraint:

[PATCH v2 bpf-next 8/9] bpf: add trusted d_path() based BPF kfunc bpf_path_d_path()

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-06 12:13   ` Christian Brauner
@ 2024-03-06 21:44     ` Paul Moore
  2024-03-07  4:05     ` Alexei Starovoitov
  1 sibling, 0 replies; 31+ messages in thread
From: Paul Moore @ 2024-03-06 21:44 UTC (permalink / raw)
  To: Christian Brauner, Matt Bobrowski
  Cc: bpf, ast, andrii, kpsingh, jannh, jolsa, daniel, torvalds,
	linux-fsdevel, Andrew Morton, linux-mm, linux-security-module

On Wed, Mar 6, 2024 at 7:14 AM Christian Brauner <brauner@kernel.org> wrote:
> On Wed, Mar 06, 2024 at 12:21:28PM +0100, Christian Brauner wrote:
> > On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote:
> > > G'day All,
> > >
> > > The original cover letter providing background context and motivating
> > > factors around the needs for the BPF kfuncs introduced within this
> > > patch series can be found here [0], so please do reference that if
> > > need be.
> > >
> > > Notably, one of the main contention points within v1 of this patch
> > > series was that we were effectively leaning on some preexisting
> > > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file()
> > > within some of the newly introduced BPF kfuncs. As noted in my
> > > response here [1] though, I struggle to understand the technical
> > > reasoning behind why exposing such in-kernel helpers, specifically
> > > only to BPF LSM program types in the form of BPF kfuncs, is inherently
> > > a terrible idea. So, until someone provides me with a sound technical
> > > explanation as to why this cannot or should not be done, I'll continue
> > > to lean on them. The alternative is to reimplement the necessary
> > > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO.
> >
> > You may lean as much as you like. What I've reacted to is that you've
> > (not you specifically, I'm sure) messed up. You've exposed d_path() to
> > users without understanding that it apparently wasn't safe.
> >
> > And now we get patches that use the self-inflicted brokenness as an
> > argument to expose a bunch of other low-level helpers to fix that.
> >
> > The fact that it's "just bpf LSM" programs doesn't alleviate any
> > concerns whatsoever. Not just because that is just another entry vector,
> > but also because we have LSM-induced API abuse whose fallout we only
> > ever see when we refactor APIs, and then it causes pain for the VFS.

Let's not start throwing stones at the in-tree, native LSM folks;
instead let's just all admit that it's often a challenge working
across subsystem lines, but we usually find a way because we all want
the kernel to move forward.  We've all got our grudges and scars,
let's not start poking each other with sticks.

> > I'll take another look at the proposed helpers you need as bpf kfuncs
> > and I'll do my best not to be overly annoyed by all of this. I have no
> > intention of not helping you, quite the opposite, but I'm annoyed that
> > we're here in the first place.
> >
> > What I want is to stop this madness of exposing stuff to users without
> > fully understanding its semantics and required guarantees.
>
> So, looking at this series you're now asking us to expose:
>
> (1) mmgrab()
> (2) mmput()
> (3) fput()
> (4) get_mm_exe_file()
> (5) get_task_exe_file()
> (6) get_task_fs_root()
> (7) get_task_fs_pwd()
> (8) path_get()
> (9) path_put()
>
> in one go and the justification in all patches amounts to "This is
> common in some BPF LSM programs".
>
> So, broken stuff got exposed to users or at least a broken BPF LSM
> program was written somewhere out there that is susceptible to UAFs
> because you didn't restrict bpf_d_path() to trusted pointer arguments.
> So you're now scrambling to fix this by asking for a bunch of low-level
> exports.
>
> What is the guarantee that you don't end up writing another BPF LSM that
> abuses these exports in a way that causes even more issues and then
> someone else comes back asking for the next round of bpf funcs to be
> exposed to fix it?
>
> The difference between a regular LSM asking about this and a BPF LSM
> program is that we can see in the hook implementation what the LSM
> intends to do with this and we can judge whether that's safe or not.

I wish things were different, but the current reality is that from a
practical perspective BPF LSMs are not much different than
out-of-tree, native LSMs.  There is no requirement, or even
convention, for BPF LSMs to post their designs, docs, or
implementations on the LSM list for review like we do with other LSMs;
maybe it has happened once or twice, but I can't remember the last
time I saw a serious BPF LSM posted to the LSM list.  Even if someone
did do The Right Thing with their BPF LSM and proposed it for review
just like a native LSM, there is no guarantee someone else couldn't
fork that LSM and do something wrong (intentional or not).

Unless something changes, I'm left to treat BPF LSMs the same as any
other out-of-tree kernel code: I'm not going to make life difficult
for a /out-of-tree/BPF/ LSM simply because it is /out-of-tree/BPF/,
but I'm not going to do anything to negatively impact the current, or
future, state of the in-tree LSMs simply to appease the
/out-of-tree/BPF/ LSMs.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-06 12:13   ` Christian Brauner
  2024-03-06 21:44     ` Paul Moore
@ 2024-03-07  4:05     ` Alexei Starovoitov
  2024-03-07  9:54       ` Christian Brauner
  1 sibling, 1 reply; 31+ messages in thread
From: Alexei Starovoitov @ 2024-03-07  4:05 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Matt Bobrowski, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Wed, Mar 6, 2024 at 4:13 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Wed, Mar 06, 2024 at 12:21:28PM +0100, Christian Brauner wrote:
> > On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote:
> > > G'day All,
> > >
> > > The original cover letter providing background context and motivating
> > > factors around the needs for the BPF kfuncs introduced within this
> > > patch series can be found here [0], so please do reference that if
> > > need be.
> > >
> > > Notably, one of the main contention points within v1 of this patch
> > > series was that we were effectively leaning on some preexisting
> > > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file()
> > > within some of the newly introduced BPF kfuncs. As noted in my
> > > response here [1] though, I struggle to understand the technical
> > > reasoning behind why exposing such in-kernel helpers, specifically
> > > only to BPF LSM program types in the form of BPF kfuncs, is inherently
> > > a terrible idea. So, until someone provides me with a sound technical
> > > explanation as to why this cannot or should not be done, I'll continue
> > > to lean on them. The alternative is to reimplement the necessary
> > > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO.
> >
> > You may lean as much as you like. What I've reacted to is that you've
> > (not you specifically, I'm sure) messed up. You've exposed d_path() to
> > users  without understanding that it wasn't safe apparently.
> >
> > And now we get patches that use the self-inflicted brokeness as an
> > argument to expose a bunch of other low-level helpers to fix that.
> >
> > The fact that it's "just bpf LSM" programs doesn't alleviate any
> > concerns whatsoever. Not just because that is just an entry vector but
> > also because we have LSM-induced API abuse that we only ever get to see
> > the fallout from when we refactor APIs, and then it causes pain for the VFS.
> >
> > I'll take another look at the proposed helpers you need as bpf kfuncs
> > and I'll do my best not to be overly annoyed by all of this. I have no
> > intention of not helping you, quite the opposite, but I'm annoyed that
> > we're here in the first place.
> >
> > What I want is to stop this madness of exposing stuff to users without
> > fully understanding its semantics and required guarantees.
>
> So, looking at this series you're now asking us to expose:
>
> (1) mmgrab()
> (2) mmput()
> (3) fput()
> (4) get_mm_exe_file()
> (5) get_task_exe_file()
> (6) get_task_fs_pwd()
> (7) get_task_fs_root()
> (8) path_get()
> (9) path_put()
>
> in one go and the justification in all patches amounts to "This is
> common in some BPF LSM programs".
>
> So, broken stuff got exposed to users or at least a broken BPF LSM
> program was written somewhere out there that is susceptible to UAFs
> > because you didn't restrict bpf_d_path() to trusted pointer arguments.
> So you're now scrambling to fix this by asking for a bunch of low-level
> exports.
>
> What is the guarantee that you don't end up writing another BPF LSM that
> abuses these exports in a way that causes even more issues and then
> someone else comes back asking for the next round of bpf funcs to be
> exposed to fix it.

There is no guarantee.
We made a safety mistake with bpf_d_path() even though
we restricted it very tightly. And that UAF is tricky.
I'm still amazed how Jann managed to find it.
We all make mistakes.
It's not the first one and not going to be the last.

What Matt is doing is an honest effort to fix it
in the upstream kernel for all bpf users to benefit.
He could have done it with a kernel module.
The above "low level" helpers are all either static inline
in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt.

One can implement such kfuncs in an out of tree kernel module
and be done with it, but in the bpf community we encourage
everyone to upstream their work.

So kudos to Matt for working on these patches.

His bpf-lsm use case is not special.
It just needs a safe way to call d_path.
Let's look at one of the test in patch 9:

+SEC("lsm.s/file_open")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(path_d_path_kfunc_untrusted_from_current)
+{
+       struct path *pwd;
+       struct task_struct *current;
+
+       current = bpf_get_current_task_btf();
+       /* Walking a trusted pointer returned from bpf_get_current_task_btf()
+        * yields an untrusted pointer. */
+       pwd = &current->fs->pwd;
+       bpf_path_d_path(pwd, buf, sizeof(buf));
+       return 0;
+}

This test checks that such an access pattern is unsafe and
the verifier will catch it.

To make it safe one needs to do:

  current = bpf_get_current_task_btf();
  pwd = bpf_get_task_fs_pwd(current);
  if (!pwd)
      return 0; /* error path */
  bpf_path_d_path(pwd, ...);
  bpf_put_path(pwd);

these are the kfuncs from patch 6.

And notice that they have KF_ACQUIRE and KF_RELEASE flags.

They tell the verifier to recognize that bpf_get_task_fs_pwd()
kfunc acquires 'struct path *'.
Meaning that bpf prog cannot just return without releasing it.

The bpf prog cannot use-after-free that 'pwd' either
after it was released by bpf_put_path(pwd).

The verifier's static analysis catches such UAFs.
It didn't catch Jann's UAF earlier, because we didn't have
these kfuncs! Hence the fix is to add such kfuncs with
acquire/release semantics.
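
For readers following along, this is roughly the shape such
acquire/release kfuncs take on the kernel side. This is only a sketch
based on the pattern described above, not the actual patch: the
function bodies, flag choices, and locking details (the real code
would need to take fs->lock when reading fs->pwd) are illustrative.

  /* An acquiring kfunc takes a trusted task pointer, grabs a
   * reference on the path, and returns it (or NULL on error). */
  __bpf_kfunc struct path *bpf_get_task_fs_pwd(struct task_struct *task)
  {
          struct path *pwd = &task->fs->pwd;

          path_get(pwd);  /* take the reference the verifier will track */
          return pwd;
  }

  /* The matching releasing kfunc drops that reference. */
  __bpf_kfunc void bpf_put_path(struct path *path)
  {
          path_put(path);
  }

  /* Registration: the flags are what teach the verifier the
   * acquire/release semantics and the NULL check requirement. */
  BTF_SET8_START(lsm_kfunc_set_ids)
  BTF_ID_FLAGS(func, bpf_get_task_fs_pwd,
               KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
  BTF_ID_FLAGS(func, bpf_put_path, KF_RELEASE)
  BTF_SET8_END(lsm_kfunc_set_ids)

KF_RET_NULL is why the "if (!pwd)" check above is mandatory: the
verifier rejects programs that dereference the result without it.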

> The difference between a regular LSM asking about this and a BPF LSM
> program is that we can see in the hook implementation what the LSM
> intends to do with this and we can judge whether that's safe or not.

See above example.
The verifier is doing a much better job than humans when it comes
to safety.

> Here you're asking us to do this blindfolded.

If you don't trust the verifier to enforce safety,
you shouldn't trust Rust compiler to generate safe code either.

In another reply you've compared kfuncs to EXPORT_SYMBOL_GPL.
Such analogy is correct to some extent,
but unlike exported symbols kfuncs are restricted to particular
program types. They don't accept arbitrary pointers,
and reference count is enforced as well.
That's a pretty big difference vs EXPORT_SYMBOL.

> Because really, all I see immediately supportable is the addition of a
> safe variant of bpf making use of the trusted pointer argument
> constraint:
>
> [PATCH v2 bpf-next 8/9] bpf: add trusted d_path() based BPF kfunc bpf_path_d_path()

This kfunc alone is useful in limited cases.
See the above example. For a simple LSM file_open hook,
the bpf prog needs bpf_get_task_fs_pwd() to use this d_path kfunc.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-07  4:05     ` Alexei Starovoitov
@ 2024-03-07  9:54       ` Christian Brauner
  2024-03-07 20:50         ` Paul Moore
  2024-03-08  3:11         ` Alexei Starovoitov
  0 siblings, 2 replies; 31+ messages in thread
From: Christian Brauner @ 2024-03-07  9:54 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Matt Bobrowski, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Wed, Mar 06, 2024 at 08:05:05PM -0800, Alexei Starovoitov wrote:
> On Wed, Mar 6, 2024 at 4:13 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Wed, Mar 06, 2024 at 12:21:28PM +0100, Christian Brauner wrote:
> > > On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote:
> > > > G'day All,
> > > >
> > > > The original cover letter providing background context and motivating
> > > > factors around the needs for the BPF kfuncs introduced within this
> > > > patch series can be found here [0], so please do reference that if
> > > > need be.
> > > >
> > > > Notably, one of the main contention points within v1 of this patch
> > > > series was that we were effectively leaning on some preexisting
> > > > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file()
> > > > within some of the newly introduced BPF kfuncs. As noted in my
> > > > response here [1] though, I struggle to understand the technical
> > > > reasoning behind why exposing such in-kernel helpers, specifically
> > > > only to BPF LSM program types in the form of BPF kfuncs, is inherently
> > > > a terrible idea. So, until someone provides me with a sound technical
> > > > explanation as to why this cannot or should not be done, I'll continue
> > > > to lean on them. The alternative is to reimplement the necessary
> > > > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO.
> > >
> > > You may lean as much as you like. What I've reacted to is that you've
> > > (not you specifically, I'm sure) messed up. You've exposed d_path() to
> > > users without understanding that it apparently wasn't safe.
> > >
> > > And now we get patches that use the self-inflicted brokenness as an
> > > argument to expose a bunch of other low-level helpers to fix that.
> > >
> > > The fact that it's "just bpf LSM" programs doesn't alleviate any
> > > concerns whatsoever. Not just because that is just an entry vector but
> > > also because we have LSM-induced API abuse that we only ever get to see
> > > the fallout from when we refactor APIs, and then it causes pain for the VFS.
> > >
> > > I'll take another look at the proposed helpers you need as bpf kfuncs
> > > and I'll do my best not to be overly annoyed by all of this. I have no
> > > intention of not helping you, quite the opposite, but I'm annoyed that
> > > we're here in the first place.
> > >
> > > What I want is to stop this madness of exposing stuff to users without
> > > fully understanding its semantics and required guarantees.
> >
> > So, looking at this series you're now asking us to expose:
> >
> > (1) mmgrab()
> > (2) mmput()
> > (3) fput()
> > (4) get_mm_exe_file()
> > (5) get_task_exe_file()
> > (6) get_task_fs_pwd()
> > (7) get_task_fs_root()
> > (8) path_get()
> > (9) path_put()
> >
> > in one go and the justification in all patches amounts to "This is
> > common in some BPF LSM programs".
> >
> > So, broken stuff got exposed to users or at least a broken BPF LSM
> > program was written somewhere out there that is susceptible to UAFs
> > because you didn't restrict bpf_d_path() to trusted pointer arguments.
> > So you're now scrambling to fix this by asking for a bunch of low-level
> > exports.
> >
> > What is the guarantee that you don't end up writing another BPF LSM that
> > abuses these exports in a way that causes even more issues and then
> > someone else comes back asking for the next round of bpf funcs to be
> > exposed to fix it.
> 
> There is no guarantee.
> We made a safety mistake with bpf_d_path() even though
> we restricted it very tightly. And that UAF is tricky.
> I'm still amazed how Jann managed to find it.
> We all make mistakes.
> It's not the first one and not going to be the last.
> 
> What Matt is doing is an honest effort to fix it
> in the upstream kernel for all bpf users to benefit.
> He could have done it with a kernel module.
> The above "low level" helpers are all either static inline
> in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt.
> 
> One can implement such kfuncs in an out of tree kernel module
> and be done with it, but in the bpf community we encourage
> everyone to upstream their work.
> 
> So kudos to Matt for working on these patches.
> 
> His bpf-lsm use case is not special.
> It just needs a safe way to call d_path.
> 
> +SEC("lsm.s/file_open")
> +__failure __msg("R1 must be referenced or trusted")
> +int BPF_PROG(path_d_path_kfunc_untrusted_from_current)
> +{
> +       struct path *pwd;
> +       struct task_struct *current;
> +
> +       current = bpf_get_current_task_btf();
> +       /* Walking a trusted pointer returned from bpf_get_current_task_btf()
> +        * yields an untrusted pointer. */
> +       pwd = &current->fs->pwd;
> +       bpf_path_d_path(pwd, buf, sizeof(buf));
> +       return 0;
> +}
> 
> This test checks that such an access pattern is unsafe and
> the verifier will catch it.
> 
> To make it safe one needs to do:
> 
>   current = bpf_get_current_task_btf();
>   pwd = bpf_get_task_fs_pwd(current);
>   if (!pwd)
>       return 0; /* error path */
>   bpf_path_d_path(pwd, ...);
>   bpf_put_path(pwd);
> 
> these are the kfuncs from patch 6.
> 
> And notice that they have KF_ACQUIRE and KF_RELEASE flags.
> 
> They tell the verifier to recognize that bpf_get_task_fs_pwd()
> kfunc acquires 'struct path *'.
> Meaning that bpf prog cannot just return without releasing it.
> 
> The bpf prog cannot use-after-free that 'pwd' either
> after it was released by bpf_put_path(pwd).
> 
> The verifier's static analysis catches such UAFs.
> It didn't catch Jann's UAF earlier, because we didn't have
> these kfuncs! Hence the fix is to add such kfuncs with
> acquire/release semantics.
> 
> > The difference between a regular LSM asking about this and a BPF LSM
> > program is that we can see in the hook implementation what the LSM
> > intends to do with this and we can judge whether that's safe or not.
> 
> See above example.
> The verifier is doing a much better job than humans when it comes
> to safety.
> 
> > Here you're asking us to do this blindfolded.
> 
> If you don't trust the verifier to enforce safety,
> you shouldn't trust Rust compiler to generate safe code either.
> 
> In another reply you've compared kfuncs to EXPORT_SYMBOL_GPL.
> Such analogy is correct to some extent,
> but unlike exported symbols kfuncs are restricted to particular
> program types. They don't accept arbitrary pointers,
> and reference count is enforced as well.
> That's a pretty big difference vs EXPORT_SYMBOL.

There's one fundamental question here that we'll need an official answer to:

Is it ok for an out-of-tree BPF LSM program that nobody has ever seen
to request access to various helpers in the kernel?

Because fundamentally this is what this patchset is asking to be done.

If the ZFS out-of-tree kernel module were to send us a similar patch
series asking us for a list of 9 functions that they'd like us to export
what would the answer to that be? It would be "no" - on principle alone.

So what is different between an out-of-tree BPF LSM program that no one
has ever even seen and an out-of-tree kernel module that one can at
least look at on GitHub? Why should we reject requests from the latter
but are supposed to accept requests from the former?

If we say yes to the BPF LSM program requests we would have to say yes
to ZFS as well.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-07  9:54       ` Christian Brauner
@ 2024-03-07 20:50         ` Paul Moore
  2024-03-08  3:25           ` Alexei Starovoitov
  2024-03-08  3:11         ` Alexei Starovoitov
  1 sibling, 1 reply; 31+ messages in thread
From: Paul Moore @ 2024-03-07 20:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexei Starovoitov, Matt Bobrowski, bpf, Alexei Starovoitov,
	Andrii Nakryiko, KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann,
	Linus Torvalds, Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Thu, Mar 7, 2024 at 4:55 AM Christian Brauner <brauner@kernel.org> wrote:
>
> There's one fundamental question here that we'll need an official answer to:
>
> Is it ok for an out-of-tree BPF LSM program that nobody has ever seen
> to request access to various helpers in the kernel?

Phrased in a slightly different way, and a bit more generalized: do we
treat out-of-tree BPF programs the same as we do with out-of-tree
kernel modules?  I believe that's the real question, and if we answer
that, we should also have our answer for the internal helper function
question.

> Because fundamentally this is what this patchset is asking to be done.
>
> If the ZFS out-of-tree kernel module were to send us a similar patch
> series asking us for a list of 9 functions that they'd like us to export
> what would the answer to that be? It would be "no" - on principle alone.
>
> So what is different between an out-of-tree BPF LSM program that no one
> has ever even seen and an out-of-tree kernel module that one can at
> least look at on GitHub? Why should we reject requests from the latter
> but are supposed to accept requests from the former?
>
> If we say yes to the BPF LSM program requests we would have to say yes
> to ZFS as well.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-07  9:54       ` Christian Brauner
  2024-03-07 20:50         ` Paul Moore
@ 2024-03-08  3:11         ` Alexei Starovoitov
  2024-03-08 10:35           ` Christian Brauner
  1 sibling, 1 reply; 31+ messages in thread
From: Alexei Starovoitov @ 2024-03-08  3:11 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Matt Bobrowski, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Thu, Mar 7, 2024 at 1:55 AM Christian Brauner <brauner@kernel.org> wrote:
>
> > >
> > > So, looking at this series you're now asking us to expose:
> > >
> > > (1) mmgrab()
> > > (2) mmput()
> > > (3) fput()
> > > (4) get_mm_exe_file()
> > > (5) get_task_exe_file()
> > > (6) get_task_fs_pwd()
> > > (7) get_task_fs_root()
> > > (8) path_get()
> > > (9) path_put()
> > >
> > > in one go and the justification in all patches amounts to "This is
> > > common in some BPF LSM programs".
> > >
> > > So, broken stuff got exposed to users or at least a broken BPF LSM
> > > program was written somewhere out there that is susceptible to UAFs
> > > because you didn't restrict bpf_d_path() to trusted pointer arguments.
> > > So you're now scrambling to fix this by asking for a bunch of low-level
> > > exports.
> > >
> > > What is the guarantee that you don't end up writing another BPF LSM that
> > > abuses these exports in a way that causes even more issues and then
> > > someone else comes back asking for the next round of bpf funcs to be
> > > exposed to fix it.
> >
> > There is no guarantee.
> > We made a safety mistake with bpf_d_path() even though
> > we restricted it very tightly. And that UAF is tricky.
> > I'm still amazed how Jann managed to find it.
> > We all make mistakes.
> > It's not the first one and not going to be the last.
> >
> > What Matt is doing is an honest effort to fix it
> > in the upstream kernel for all bpf users to benefit.
> > He could have done it with a kernel module.
> > The above "low level" helpers are all either static inline
> > in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt.
> >
> > One can implement such kfuncs in an out of tree kernel module
> > and be done with it, but in the bpf community we encourage
> > everyone to upstream their work.
> >
> > So kudos to Matt for working on these patches.
> >
> > His bpf-lsm use case is not special.
> > It just needs a safe way to call d_path.
> >
> > +SEC("lsm.s/file_open")
> > +__failure __msg("R1 must be referenced or trusted")
> > +int BPF_PROG(path_d_path_kfunc_untrusted_from_current)
> > +{
> > +       struct path *pwd;
> > +       struct task_struct *current;
> > +
> > +       current = bpf_get_current_task_btf();
> > +       /* Walking a trusted pointer returned from bpf_get_current_task_btf()
> > +        * yields an untrusted pointer. */
> > +       pwd = &current->fs->pwd;
> > +       bpf_path_d_path(pwd, buf, sizeof(buf));
> > +       return 0;
> > +}
> >
> > This test checks that such an access pattern is unsafe and
> > the verifier will catch it.
> >
> > To make it safe one needs to do:
> >
> >   current = bpf_get_current_task_btf();
> >   pwd = bpf_get_task_fs_pwd(current);
> >   if (!pwd)
> >       return 0; /* error path */
> >   bpf_path_d_path(pwd, ...);
> >   bpf_put_path(pwd);
> >
> > these are the kfuncs from patch 6.
> >
> > And notice that they have KF_ACQUIRE and KF_RELEASE flags.
> >
> > They tell the verifier to recognize that bpf_get_task_fs_pwd()
> > kfunc acquires 'struct path *'.
> > Meaning that bpf prog cannot just return without releasing it.
> >
> > The bpf prog cannot use-after-free that 'pwd' either
> > after it was released by bpf_put_path(pwd).
> >
> > The verifier's static analysis catches such UAFs.
> > It didn't catch Jann's UAF earlier, because we didn't have
> > these kfuncs! Hence the fix is to add such kfuncs with
> > acquire/release semantics.
> >
> > > The difference between a regular LSM asking about this and a BPF LSM
> > > program is that we can see in the hook implementation what the LSM
> > > intends to do with this and we can judge whether that's safe or not.
> >
> > See above example.
> > The verifier is doing a much better job than humans when it comes
> > to safety.
> >
> > > Here you're asking us to do this blindfolded.
> >
> > If you don't trust the verifier to enforce safety,
> > you shouldn't trust Rust compiler to generate safe code either.
> >
> > In another reply you've compared kfuncs to EXPORT_SYMBOL_GPL.
> > Such analogy is correct to some extent,
> > but unlike exported symbols kfuncs are restricted to particular
> > program types. They don't accept arbitrary pointers,
> > and reference count is enforced as well.
> > That's a pretty big difference vs EXPORT_SYMBOL.
>
> There's one fundamental question here that we'll need an official answer to:
>
> > Is it ok for an out-of-tree BPF LSM program that nobody has ever seen
> to request access to various helpers in the kernel?

Obviously not.

> Because fundamentally this is what this patchset is asking to be done.

Pardon?

> If the ZFS out-of-tree kernel module were to send us a similar patch
> series asking us for a list of 9 functions that they'd like us to export
> what would the answer to that be?

This patch set doesn't request any new EXPORT_SYMBOL.
Quoting previous reply:
"
 The above "low level" helpers are all either static inline
 in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt.
"

Let's look at your list:

> > > (1) mmgrab()
> > > (2) mmput()
> > > (3) fput()
> > > (5) get_mm_exe_file()
> > > (4) get_task_exe_file()
> > > (7) get_task_fs_pwd()
> > > (6) get_task_fs_root()
> > > (8) path_get()
> > > (9) path_put()

First of all, there is no such thing as get_task_fs_pwd/root
in the kernel.

The hypothetical out-of-tree ZFS can use the rest just fine.
One can argue that get_mm_exe_file() is not exported,
but it's nothing but an rcu_read_lock() wrapper around get_file_rcu(),
which is EXPORT_SYMBOL.

The kfuncs in patch 6 wrap get_fs_root()/get_fs_pwd() from
linux/fs_struct.h, which in turn call path_get(), an EXPORT_SYMBOL.

So, upon close examination, no new EXPORT_SYMBOLs are being requested.

Christian,

I understand your irritation with anything bpf-related
due to our mistake with fd=0, but as I said earlier it was
an honest mistake. There was no malicious intent.
Time to move on.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-07 20:50         ` Paul Moore
@ 2024-03-08  3:25           ` Alexei Starovoitov
  2024-03-08 10:58             ` Christian Brauner
  0 siblings, 1 reply; 31+ messages in thread
From: Alexei Starovoitov @ 2024-03-08  3:25 UTC (permalink / raw)
  To: Paul Moore
  Cc: Christian Brauner, Matt Bobrowski, bpf, Alexei Starovoitov,
	Andrii Nakryiko, KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann,
	Linus Torvalds, Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Thu, Mar 7, 2024 at 12:51 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Thu, Mar 7, 2024 at 4:55 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > There's one fundamental question here that we'll need an official answer to:
> >
> > Is it ok for an out-of-tree BPF LSM program, that nobody has ever seen
> > to request access to various helpers in the kernel?
>
> Phrased in a slightly different way, and a bit more generalized: do we
> treat out-of-tree BPF programs the same as we do with out-of-tree
> kernel modules?  I believe that's the real question, and if we answer
> that, we should also have our answer for the internal helper function
> question.

From a 10,000 ft view, bpf programs may look like kernel modules,
but looking closely they are very different.

Modules can read/write any data structure and can call any exported function.
All modules fall into two categories: GPL or not.
While bpf progs are divided by program type.
Tracing progs can read any kernel memory safely via probe_read_kernel.
Networking progs can read/write packets, but cannot read kernel memory.
bpf_lsm programs can be called from lsm hooks and
call only kfuncs that were explicitly allowlisted to bpf_lsm prog type.
Furthermore kfuncs have acquire/release semantics enforced by
the verifier.
For example, bpf progs can do bpf_rcu_read_lock() which is
a wrapper around rcu_read_lock() and the verifier will make sure
that bpf_rcu_read_unlock() is called.
Under bpf_rcu_read_lock() bpf programs can dereference __rcu tagged
fields and the verifier will track them as rcu protected objects
until bpf_rcu_read_unlock().
In other words the verifier is doing sparse-on-steroids analysis
and enforcing it.
Kernel modules are not subject to such enforcement.

One more distinction: 99.9% of bpf features require a GPL-ed bpf program.
All kfuncs are GPL only.
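
The bpf_rcu_read_lock() point can be made concrete with a small
sketch. This is an illustrative program, not one from the patch
series; real_parent is just one example of an __rcu-tagged field:

  SEC("lsm.s/file_open")
  int BPF_PROG(rcu_example, struct file *file)
  {
          struct task_struct *task = bpf_get_current_task_btf();
          struct task_struct *parent;

          bpf_rcu_read_lock();
          /* real_parent is __rcu tagged; inside the section the
           * verifier tracks it as an RCU-protected trusted pointer. */
          parent = task->real_parent;
          if (parent)
                  bpf_printk("parent pid %d", parent->pid);
          bpf_rcu_read_unlock();
          /* Dereferencing 'parent' here, after the unlock, would be
           * rejected by the verifier. */
          return 0;
  }

A kernel module doing the equivalent gets no such enforcement: nothing
stops it from using the parent pointer after rcu_read_unlock().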

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-08  3:11         ` Alexei Starovoitov
@ 2024-03-08 10:35           ` Christian Brauner
  2024-03-09  1:23             ` Alexei Starovoitov
  0 siblings, 1 reply; 31+ messages in thread
From: Christian Brauner @ 2024-03-08 10:35 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Matt Bobrowski, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Thu, Mar 07, 2024 at 07:11:22PM -0800, Alexei Starovoitov wrote:
> On Thu, Mar 7, 2024 at 1:55 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > > >
> > > > So, looking at this series you're now asking us to expose:
> > > >
> > > > (1) mmgrab()
> > > > (2) mmput()
> > > > (3) fput()
> > > > (4) get_mm_exe_file()
> > > > (5) get_task_exe_file()
> > > > (6) get_task_fs_pwd()
> > > > (7) get_task_fs_root()
> > > > (8) path_get()
> > > > (9) path_put()
> > > >
> > > > in one go and the justification in all patches amounts to "This is
> > > > common in some BPF LSM programs".
> > > >
> > > > So, broken stuff got exposed to users or at least a broken BPF LSM
> > > > program was written somewhere out there that is susceptible to UAFs
> > > > because you didn't restrict bpf_d_path() to trusted pointer arguments.
> > > > So you're now scrambling to fix this by asking for a bunch of low-level
> > > > exports.
> > > >
> > > > What is the guarantee that you don't end up writing another BPF LSM that
> > > > abuses these exports in a way that causes even more issues and then
> > > > someone else comes back asking for the next round of bpf funcs to be
> > > > exposed to fix it.
> > >
> > > There is no guarantee.
> > > We made a safety mistake with bpf_d_path() even though
> > > we restricted it very tightly. And that UAF is tricky.
> > > I'm still amazed how Jann managed to find it.
> > > We all make mistakes.
> > > It's not the first one and not going to be the last.
> > >
> > > What Matt is doing is an honest effort to fix it
> > > in the upstream kernel for all bpf users to benefit.
> > > He could have done it with a kernel module.
> > > The above "low level" helpers are all either static inline
> > > in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt.
> > >
> > > One can implement such kfuncs in an out of tree kernel module
> > > and be done with it, but in the bpf community we encourage
> > > everyone to upstream their work.
> > >
> > > So kudos to Matt for working on these patches.
> > >
> > > His bpf-lsm use case is not special.
> > > It just needs a safe way to call d_path.
> > >
> > > +SEC("lsm.s/file_open")
> > > +__failure __msg("R1 must be referenced or trusted")
> > > +int BPF_PROG(path_d_path_kfunc_untrusted_from_current)
> > > +{
> > > +       struct path *pwd;
> > > +       struct task_struct *current;
> > > +
> > > +       current = bpf_get_current_task_btf();
> > > +       /* Walking a trusted pointer returned from bpf_get_current_task_btf()
> > > +        * yields an untrusted pointer. */
> > > +       pwd = &current->fs->pwd;
> > > +       bpf_path_d_path(pwd, buf, sizeof(buf));
> > > +       return 0;
> > > +}
> > >
> > > This test checks that such an access pattern is unsafe and
> > > the verifier will catch it.
> > >
> > > To make it safe one needs to do:
> > >
> > >   current = bpf_get_current_task_btf();
> > >   pwd = bpf_get_task_fs_pwd(current);
> > >   if (!pwd)
> > >       return 0; /* error path */
> > >   bpf_path_d_path(pwd, ...);
> > >   bpf_put_path(pwd);
> > >
> > > these are the kfuncs from patch 6.
> > >
> > > And notice that they have KF_ACQUIRE and KF_RELEASE flags.
> > >
> > > They tell the verifier to recognize that bpf_get_task_fs_pwd()
> > > kfunc acquires 'struct path *'.
> > > Meaning that bpf prog cannot just return without releasing it.
> > >
> > > The bpf prog cannot use-after-free that 'pwd' either
> > > after it was released by bpf_put_path(pwd).
> > >
> > > The verifier's static analysis catches such UAFs.
> > > It didn't catch Jann's UAF earlier, because we didn't have
> > > these kfuncs! Hence the fix is to add such kfuncs with
> > > acquire/release semantics.
> > >
> > > > The difference between a regular LSM asking about this and a BPF LSM
> > > > program is that we can see in the hook implementation what the LSM
> > > > intends to do with this and we can judge whether that's safe or not.
> > >
> > > See above example.
> > > The verifier is doing a much better job than humans when it comes
> > > to safety.
> > >
> > > > Here you're asking us to do this blindfolded.
> > >
> > > If you don't trust the verifier to enforce safety,
> > > you shouldn't trust Rust compiler to generate safe code either.
> > >
> > > In another reply you've compared kfuncs to EXPORT_SYMBOL_GPL.
> > > Such analogy is correct to some extent,
> > > but unlike exported symbols kfuncs are restricted to particular
> > > program types. They don't accept arbitrary pointers,
> > > and reference count is enforced as well.
> > > That's a pretty big difference vs EXPORT_SYMBOL.
> >
> > There's one fundamental question here that we'll need an official answer to:
> >
> > Is it ok for an out-of-tree BPF LSM program that nobody has ever seen
> > to request access to various helpers in the kernel?
> 
> obviously not.
> 
> > Because fundamentally this is what this patchset is asking to be done.
> 
> Pardon ?
> 
> > If the ZFS out-of-tree kernel module were to send us a similar patch
> > series asking us for a list of 9 functions that they'd like us to export
> > what would the answer to that be?
> 
> This patch set doesn't request any new EXPORT_SYMBOL.

In order for that out-of-tree BPF LSM program to get guarantees similar
to those of an equivalent kernel module, we need to export all nine
functions for BPF. Included in that set are functions that aren't
currently even exported to modules.

These exports are specifically for an out-of-tree BPF LSM program that
is not accessible to the public. The question in the other mail stands.

> First of all, there is no such thing as get_task_fs_pwd/root
> in the kernel.

Yeah, we'd need specific helpers for a never-seen-before out-of-tree BPF
LSM. I don't see how that's different from an out-of-tree kernel module.

> One can argue that get_mm_exe_file() is not exported,
> but it's nothing but an rcu_read_lock() wrapper around get_file_rcu(),
> which is EXPORT_SYMBOL.

Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't
be exported. So that'll be removed asap.

> Christian,
> 
> I understand your irritation with anything bpf-related
> due to our mistake with fd=0, but as I said earlier it was
> an honest mistake. There was no malicious intent.
> Time to move on.

I understand your concern but I'm neither irritated by bpf nor do I hold
grudges. As I'm writing this bpf's bpffs and token changes are on their
way into mainline for v6.9 which I reviewed and acked immediately after
that discussion.

Rest assured, you can move on from that concern. There's no need to
keep bringing it up in unrelated discussions.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-08  3:25           ` Alexei Starovoitov
@ 2024-03-08 10:58             ` Christian Brauner
  0 siblings, 0 replies; 31+ messages in thread
From: Christian Brauner @ 2024-03-08 10:58 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Paul Moore, Matt Bobrowski, bpf, Alexei Starovoitov,
	Andrii Nakryiko, KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann,
	Linus Torvalds, Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Thu, Mar 07, 2024 at 07:25:05PM -0800, Alexei Starovoitov wrote:
> On Thu, Mar 7, 2024 at 12:51 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > On Thu, Mar 7, 2024 at 4:55 AM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > There's one fundamental question here that we'll need an official answer to:
> > >
> > > Is it ok for an out-of-tree BPF LSM program that nobody has ever seen
> > > to request access to various helpers in the kernel?
> >
> > Phrased in a slightly different way, and a bit more generalized: do we
> > treat out-of-tree BPF programs the same as we do with out-of-tree
> > kernel modules?  I believe that's the real question, and if we answer
> > that, we should also have our answer for the internal helper function
> > question.
> 
> From 10k ft view bpf programs may look like kernel modules,
> but looking closely they are very different.
> 
> Modules can read/write any data structure and can call any exported function.
> All modules fall into two categories GPL or not.
> While bpf progs are divided by program type.
> Tracing progs can read any kernel memory safely via probe_read_kernel.
> Networking prog can read/write packets, but cannot read kernel memory.
> bpf_lsm programs can be called from lsm hooks and
> call only kfuncs that were explicitly allowlisted to bpf_lsm prog type.
> Furthermore kfuncs have acquire/release semantics enforced by
> the verifier.
> For example, bpf progs can do bpf_rcu_read_lock() which is
> a wrapper around rcu_read_lock() and the verifier will make sure
> that bpf_rcu_read_unlock() is called.
> Under bpf_rcu_read_lock() bpf programs can dereference __rcu tagged
> fields and the verifier will track them as rcu protected objects
> until bpf_rcu_read_unlock().
> In other words the verifier is doing sparse-on-steroids analysis
> and enforcing it.
> Kernel modules are not subject to such enforcement.
> 
> One more distinction: 99.9% of bpf features require a GPL-ed bpf program.
> All kfuncs are GPL only.

While these are certainly all very nice properties it doesn't change the
core question.

An out-of-tree BPF LSM program is requesting us to add nine vfs helpers
to the BPF kfunc api. That out-of-tree BPF LSM program is not even
accessible to the public. There is no meaningful difference to an
out-of-tree kernel module asking us to add nine functions to the export
api. The safety properties don't matter for this. Clearly, they didn't
prevent the previous bpf_d_path() stuff from happening.

We need rules for how this is supposed to be handled. Unless those rules
are clearly specified and make sense we shouldn't be adding BPF kfuncs
based on requests from out-of-tree BPF LSMs that amount to "we need
this".


* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-08 10:35           ` Christian Brauner
@ 2024-03-09  1:23             ` Alexei Starovoitov
  2024-03-11 12:00               ` Christian Brauner
  0 siblings, 1 reply; 31+ messages in thread
From: Alexei Starovoitov @ 2024-03-09  1:23 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Matt Bobrowski, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote:
>
>
> These exports are specifically for an out-of-tree BPF LSM program that
> is not accessible to the public. The question in the other mail stands.

The question was already answered. You just don't like the answer.
bpf progs are not equivalent to kernel modules.
They have completely different safety and visibility properties.
The safety part I already talked about.
Sounds like the visibility has to be explained.
Kernel modules are opaque binary blobs.
bpf programs are fully transparent. The intent is known
to the verifier and to anyone with understanding
of bpf assembly.
Those that cannot read bpf asm can read C source code that is
embedded in the bpf program in kernel memory.
It's not the same as "llvm-dwarfdump module.ko" on disk.
The bpf prog source code is loaded into the kernel
at program verification time for debugging and visibility reasons.
If there is a verifier bug and bpf manages to crash the kernel
vmcore will have relevant lines of program C source code right there.

Hence out-of-tree or in-tree bpf makes no practical difference.
The program cannot hide its meaning and doesn't hamper debugging.

Hence adding EXPORT_SYMBOL == Brace for impact!
Expect crashes, api misuse and what not.

While adding bpf_kfunc is a nop for kernel development.
If kfunc is in the way of code refactoring it can be removed
(as we demonstrated several times).
A kfunc won't cause headaches for the kernel code it is
calling (assuming no verifier bugs).
If there is a bug it's on us to fix it as we demonstrated in the past.
For example: bpf_probe_read_kernel().
It's a wrapper of copy_from_kernel_nofault() and over the years
bpf users hit various bugs in copy_from_kernel_nofault(),
reported them, and _bpf developers_ fixed them.
Though copy_from_kernel_nofault() is as generic as it can get
and the same bugs could have been reproduced without bpf
we took care of fixing these parts of the kernel.

Look at path_put().
It's EXPORT_SYMBOL and any kernel module can easily screw up
reference counting, so that sooner or later distro folks
will experience debug pains due to out-of-tree drivers.

kfunc that calls path_put() won't have such consequences.
The verifier will prevent path_put() on a pointer that wasn't
acquired by the same bpf program. No support pains.
It's a nop for vfs folks.

> > First of all, there is no such thing as get_task_fs_pwd/root
> > in the kernel.
>
> Yeah, we'd need specific helpers for a never seen before out-of-tree BPF
> LSM. I don't see how that's different from an out-of-tree kernel module.

Sorry, but you don't seem to understand what bpf can and cannot do,
hence they look similar.

> > One can argue that get_mm_exe_file() is not exported,
> > but it's nothing but rcu_lock-wrap plus get_file_rcu()
> > which is EXPORT_SYMBOL.
>
> Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't
> be exported. So that'll be removed asap.

So, just to make a point that
"Included in that set are functions that aren't currently even
exported to modules"
you want to un-export get_file_rcu() ?

Because as the patch stands today everything that these kfuncs are
calling is EXPORT_SYMBOL.


* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-09  1:23             ` Alexei Starovoitov
@ 2024-03-11 12:00               ` Christian Brauner
  2024-03-12 17:06                 ` Matt Bobrowski
  2024-03-13 21:05                 ` Alexei Starovoitov
  0 siblings, 2 replies; 31+ messages in thread
From: Christian Brauner @ 2024-03-11 12:00 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Matt Bobrowski, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Fri, Mar 08, 2024 at 05:23:30PM -0800, Alexei Starovoitov wrote:
> On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> >
> > These exports are specifically for an out-of-tree BPF LSM program that
> > is not accessible to the public. The question in the other mail stands.
> 
> The question was already answered. You just don't like the answer.
> bpf progs are not equivalent to kernel modules.
> They have completely different safety and visibility properties.
> The safety part I already talked about.
> Sounds like the visibility has to be explained.
> Kernel modules are opaque binary blobs.
> bpf programs are fully transparent. The intent is known
> to the verifier and to anyone with understanding
> of bpf assembly.
> Those that cannot read bpf asm can read C source code that is
> embedded in the bpf program in kernel memory.
> It's not the same as "llvm-dwarfdump module.ko" on disk.
> The bpf prog source code is loaded into the kernel
> at program verification time for debugging and visibility reasons.
> If there is a verifier bug and bpf manages to crash the kernel
> vmcore will have relevant lines of program C source code right there.
> 
> Hence out-of-tree or in-tree bpf makes no practical difference.
> The program cannot hide its meaning and doesn't hamper debugging.
> 
> Hence adding EXPORT_SYMBOL == Brace for impact!
> Expect crashes, api misuse and what not.
> 
> While adding bpf_kfunc is a nop for kernel development.
> If kfunc is in the way of code refactoring it can be removed
> (as we demonstrated several times).
> A kfunc won't cause headaches for the kernel code it is
> calling (assuming no verifier bugs).
> If there is a bug it's on us to fix it as we demonstrated in the past.
> For example: bpf_probe_read_kernel().
> It's a wrapper of copy_from_kernel_nofault() and over the years
> bpf users hit various bugs in copy_from_kernel_nofault(),
> reported them, and _bpf developers_ fixed them.
> Though copy_from_kernel_nofault() is as generic as it can get
> and the same bugs could have been reproduced without bpf
> we took care of fixing these parts of the kernel.
> 
> Look at path_put().
> It's EXPORT_SYMBOL and any kernel module can easily screw up
> reference counting, so that sooner or later distro folks
> will experience debug pains due to out-of-tree drivers.
> 
> kfunc that calls path_put() won't have such consequences.
> The verifier will prevent path_put() on a pointer that wasn't
> acquired by the same bpf program. No support pains.
> It's a nop for vfs folks.
> 
> > > First of all, there is no such thing as get_task_fs_pwd/root
> > > in the kernel.
> >
> > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF
> > LSM. I don't see how that's different from an out-of-tree kernel module.
> 
> Sorry, but you don't seem to understand what bpf can and cannot do,
> hence they look similar.

Maybe. On the other hand you seem to ignore what I'm saying. You
currently don't have a clear set of rules for when it's ok for someone
to send patches and request access to bpf kfuncs to implement a new BPF
program. This patchset very much illustrates this point. The safety
properties of bpf don't matter for this. And again, your safety
properties very much didn't protect you from your bpf_d_path() mess.

We're not even clearly told where and how these helpers are supposed to be
used. That's not ok and will never be ok. As long as there are no clear
criteria to operate under this is highly problematic. This may be fine
from a bpf perspective and one can even understand why because that's
apparently your model or promise to your users. But there's no reason to
expect the same level of laxness from any of the subsystems you're
requesting kfuncs from.

> > > One can argue that get_mm_exe_file() is not exported,
> > > but it's nothing but rcu_lock-wrap plus get_file_rcu()
> > > which is EXPORT_SYMBOL.
> >
> > Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't
> > be exported. So that'll be removed asap.
> 
> So, just to make a point that
> "Included in that set are functions that aren't currently even
> exported to modules"
> you want to un-export get_file_rcu() ?

No. The reason it was exported was because of the drm subsystem and we
already quite disliked that. But it turned out that's not needed so in
commit 61d4fb0b349e ("file, i915: fix file reference for
mmap_singleton()") they were moved away from this helper.

And then we simply forgot to unexport it.

A helper such as get_file_rcu() is called on a file object that is
subject to SLAB_TYPESAFE_BY_RCU semantics where the caller doesn't hold
a reference. The semantics of that are maybe understood by a couple of
people in the kernel. There is absolutely no way that any userspace will
get access to such low-level helpers. They have zero business being
involved in the lifetimes of objects at this level, just as no module has.

So really, this is an orthogonal cleanup.


* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-11 12:00               ` Christian Brauner
@ 2024-03-12 17:06                 ` Matt Bobrowski
  2024-03-12 20:11                   ` Matt Bobrowski
  2024-03-18 13:24                   ` Christian Brauner
  2024-03-13 21:05                 ` Alexei Starovoitov
  1 sibling, 2 replies; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-12 17:06 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

Hey Christian,

On Mon, Mar 11, 2024 at 01:00:56PM +0100, Christian Brauner wrote:
> On Fri, Mar 08, 2024 at 05:23:30PM -0800, Alexei Starovoitov wrote:
> > On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > >
> > > These exports are specifically for an out-of-tree BPF LSM program that
> > > is not accessible to the public. The question in the other mail stands.
> > 
> > The question was already answered. You just don't like the answer.
> > bpf progs are not equivalent to kernel modules.
> > They have completely different safety and visibility properties.
> > The safety part I already talked about.
> > Sounds like the visibility has to be explained.
> > Kernel modules are opaque binary blobs.
> > bpf programs are fully transparent. The intent is known
> > to the verifier and to anyone with understanding
> > of bpf assembly.
> > Those that cannot read bpf asm can read C source code that is
> > embedded in the bpf program in kernel memory.
> > It's not the same as "llvm-dwarfdump module.ko" on disk.
> > The bpf prog source code is loaded into the kernel
> > at program verification time for debugging and visibility reasons.
> > If there is a verifier bug and bpf manages to crash the kernel
> > vmcore will have relevant lines of program C source code right there.
> > 
> > Hence out-of-tree or in-tree bpf makes no practical difference.
> > The program cannot hide its meaning and doesn't hamper debugging.
> > 
> > Hence adding EXPORT_SYMBOL == Brace for impact!
> > Expect crashes, api misuse and what not.
> > 
> > While adding bpf_kfunc is a nop for kernel development.
> > If kfunc is in the way of code refactoring it can be removed
> > (as we demonstrated several times).
> > A kfunc won't cause headaches for the kernel code it is
> > calling (assuming no verifier bugs).
> > If there is a bug it's on us to fix it as we demonstrated in the past.
> > For example: bpf_probe_read_kernel().
> > It's a wrapper of copy_from_kernel_nofault() and over the years
> > bpf users hit various bugs in copy_from_kernel_nofault(),
> > reported them, and _bpf developers_ fixed them.
> > Though copy_from_kernel_nofault() is as generic as it can get
> > and the same bugs could have been reproduced without bpf
> > we took care of fixing these parts of the kernel.
> > 
> > Look at path_put().
> > It's EXPORT_SYMBOL and any kernel module can easily screw up
> > reference counting, so that sooner or later distro folks
> > will experience debug pains due to out-of-tree drivers.
> > 
> > kfunc that calls path_put() won't have such consequences.
> > The verifier will prevent path_put() on a pointer that wasn't
> > acquired by the same bpf program. No support pains.
> > It's a nop for vfs folks.
> > 
> > > > First of all, there is no such thing as get_task_fs_pwd/root
> > > > in the kernel.
> > >
> > > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF
> > > LSM. I don't see how that's different from an out-of-tree kernel module.
> > 
> > Sorry, but you don't seem to understand what bpf can and cannot do,
> > hence they look similar.
> 
> Maybe. On the other hand you seem to ignore what I'm saying. You
> currently don't have a clear set of rules for when it's ok for someone
> to send patches and request access to bpf kfuncs to implement a new BPF
> program. This patchset very much illustrates this point. The safety
> properties of bpf don't matter for this. And again, your safety
> properties very much didn't protect you from your bpf_d_path() mess.
> 
> We're not even clearly told where and how these helper are supposed to be
> used. That's not ok and will never be ok. As long as there are no clear
> criteria to operate under this is highly problematic. This may be fine
> from a bpf perspective and one can even understand why because that's
> apparently your model or promise to your users. But there's no reason to
> expect the same level of laxness from any of the subsystems you're
> requesting kfuncs from.

You raise a completely fair point, and I truly do apologize for the
lack of context and in-depth explanations around the specific
situations that the proposed BPF kfuncs are intended to be used
from. Admittedly, that's a failure on my part, and I can completely
understand why from a maintainers point of view there would be
reservations around acknowledging requests for adding such invisible
dependencies.

Now, I'm in a little bit of a tough situation as I'm unable to point
you to an open-source BPF LSM implementation that intends to make use
of such newly proposed BPF kfuncs. That's just an unfortunate
constraint and circumstance that I'm having to deal with, so I'm just
going to have to provide a heavily redacted and incomplete example to
illustrate how these BPF kfuncs intend to be used from BPF LSM
programs that I personally work on here at Google. Notably though, the
contexts that I do share here may obviously be a non-holistic view of
how these newly introduced BPF kfuncs end up getting used in practice
by some other completely arbitrary open-source BPF LSM programs.

Anyway, as Alexei had pointed out in one of the prior responses, the
core motivating factor behind introducing these newly proposed BPF
kfuncs purely stems from the requirement of needing to call
bpf_d_path() safely on a struct path from the context of a BPF LSM
program, specifically within the security_file_open() and
security_mmap_file() LSM hooks. Now, as noted within the original bug
report [0], it's currently not considered safe to pluck a struct path
out from an arbitrary in-kernel data structure, which in our case was
current->mm->exe_file->f_path, and have it passed to bpf_d_path() from
the aforementioned LSM hook points, or any other LSM hook point for
that matter.

So, without using these newly introduced BPF kfuncs, our BPF LSM
program hanging off security_file_open() looks as follows:

```
int BPF_PROG(file_open, struct file *file)
{
  // Perform a whole bunch of operations on the supplied file argument. This
  // includes some form of policy evaluation, and if there's a violation against
  // policy and auditing is enabled, then we eventually call bpf_d_path() on
  // file->f_path. Calling bpf_d_path() on the file argument isn't problematic
  // because the file argument is reference counted, giving us a stable path.
  struct path *target = &file->f_path;

  // ...

  struct task_struct *current = bpf_get_current_task_btf();

  // ...
  
  bpf_rcu_read_lock();
  // Reserve a slot on the BPF ring buffer such that the actor's path can be
  // passed back to userspace.
  void *buf = bpf_ringbuf_reserve(&ringbuf, PATH_MAX, 0);
  if (!buf) {
    goto unlock;
  }

  // For contextual purposes when performing an audit we also call bpf_d_path()
  // on the actor, being current->mm->exe_file->f_path.
  struct path *actor = &current->mm->exe_file->f_path;

  // Now perform the path resolution on the actor via bpf_d_path().
  u64 ret = bpf_d_path(actor, buf, PATH_MAX);
  if (ret > 0) {
    bpf_ringbuf_submit(buf, BPF_RB_NO_WAKEUP);
  } else {
    bpf_ringbuf_discard(buf, 0);
  }

unlock:
  bpf_rcu_read_unlock();
  return 0;
}
```

Post landing these BPF kfuncs, the BPF LSM program hanging off
security_file_open() would be updated to make use of the
acquire/release BPF kfuncs as below. Here I'm only making use of
bpf_get_task_exe_file(), but similar usage also extends to
bpf_get_task_fs_root() and bpf_get_task_fs_pwd().

```
int BPF_PROG(file_open, struct file *file)
{
  // Perform a whole bunch of operations on the supplied file argument. This
  // includes some form of policy evaluation, and if there's a violation against
  // policy and auditing is enabled, then we eventually call bpf_path_d_path()
  // on file->f_path. Calling bpf_path_d_path() on the file argument isn't
  // problematic because the file argument is trusted and reference
  // counted, giving us a stable path.
  struct path *target = &file->f_path;

  // ...

  struct task_struct *current = bpf_get_current_task_btf();

  // ...
  
  // Reserve a slot on the BPF ring buffer such that the actor's path can be
  // passed back to userspace.
  void *buf = bpf_ringbuf_reserve(&ringbuf, PATH_MAX, 0);
  if (!buf) {
    return 0;
  }

  // For contextual purposes when performing an audit we also call
  // bpf_path_d_path() on the actor, being current->mm->exe_file->f_path.
  // Here we're operating on a stable, trusted, and reference-counted file,
  // thanks to bpf_get_task_exe_file().
  struct file *exe_file = bpf_get_task_exe_file(current);
  if (!exe_file) {
    bpf_ringbuf_discard(buf, 0);
    return 0;
  }

  // Now perform the path resolution on the actor via bpf_path_d_path(), which
  // only accepts a trusted struct path.
  u64 ret = bpf_path_d_path(&exe_file->f_path, buf, PATH_MAX);
  if (ret > 0) {
    bpf_ringbuf_submit(buf, BPF_RB_NO_WAKEUP);
  } else {
    bpf_ringbuf_discard(buf, 0);
  }

  // Drop the reference on exe_file.
  bpf_put_file(exe_file);
  return 0;
}
```

This is all rather straightforward, but the fundamental
difference between the two implementations is that one allows us to
work on a stable file, whereas the other does not. That's really
it. Similarly, we do more or less the same in our BPF LSM program
that hangs off security_mmap_file().

Do you need anything else which illustrates the proposed BPF kfunc
usage?

[0] https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@mail.gmail.com/

/M


* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-12 17:06                 ` Matt Bobrowski
@ 2024-03-12 20:11                   ` Matt Bobrowski
  2024-03-18 13:24                   ` Christian Brauner
  1 sibling, 0 replies; 31+ messages in thread
From: Matt Bobrowski @ 2024-03-12 20:11 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Tue, Mar 12, 2024 at 05:06:36PM +0000, Matt Bobrowski wrote:
> Hey Christian,
> 
> On Mon, Mar 11, 2024 at 01:00:56PM +0100, Christian Brauner wrote:
> > On Fri, Mar 08, 2024 at 05:23:30PM -0800, Alexei Starovoitov wrote:
> > > On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote:
> > > >
> > > >
> > > > These exports are specifically for an out-of-tree BPF LSM program that
> > > > is not accessible to the public. The question in the other mail stands.
> > > 
> > > The question was already answered. You just don't like the answer.
> > > bpf progs are not equivalent to kernel modules.
> > > They have completely different safety and visibility properties.
> > > The safety part I already talked about.
> > > Sounds like the visibility has to be explained.
> > > Kernel modules are opaque binary blobs.
> > > bpf programs are fully transparent. The intent is known
> > > to the verifier and to anyone with understanding
> > > of bpf assembly.
> > > Those that cannot read bpf asm can read C source code that is
> > > embedded in the bpf program in kernel memory.
> > > It's not the same as "llvm-dwarfdump module.ko" on disk.
> > > The bpf prog source code is loaded into the kernel
> > > at program verification time for debugging and visibility reasons.
> > > If there is a verifier bug and bpf manages to crash the kernel
> > > vmcore will have relevant lines of program C source code right there.
> > > 
> > > Hence out-of-tree or in-tree bpf makes no practical difference.
> > > The program cannot hide its meaning and doesn't hamper debugging.
> > > 
> > > Hence adding EXPORT_SYMBOL == Brace for impact!
> > > Expect crashes, api misuse and what not.
> > > 
> > > While adding bpf_kfunc is a nop for kernel development.
> > > If kfunc is in the way of code refactoring it can be removed
> > > (as we demonstrated several times).
> > > A kfunc won't cause headaches for the kernel code it is
> > > calling (assuming no verifier bugs).
> > > If there is a bug it's on us to fix it as we demonstrated in the past.
> > > For example: bpf_probe_read_kernel().
> > > It's a wrapper of copy_from_kernel_nofault() and over the years
> > > bpf users hit various bugs in copy_from_kernel_nofault(),
> > > reported them, and _bpf developers_ fixed them.
> > > Though copy_from_kernel_nofault() is as generic as it can get
> > > and the same bugs could have been reproduced without bpf
> > > we took care of fixing these parts of the kernel.
> > > 
> > > Look at path_put().
> > > It's EXPORT_SYMBOL and any kernel module can easily screw up
> > > reference counting, so that sooner or later distro folks
> > > will experience debug pains due to out-of-tree drivers.
> > > 
> > > kfunc that calls path_put() won't have such consequences.
> > > The verifier will prevent path_put() on a pointer that wasn't
> > > acquired by the same bpf program. No support pains.
> > > It's a nop for vfs folks.
> > > 
> > > > > First of all, there is no such thing as get_task_fs_pwd/root
> > > > > in the kernel.
> > > >
> > > > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF
> > > > LSM. I don't see how that's different from an out-of-tree kernel module.
> > > 
> > > Sorry, but you don't seem to understand what bpf can and cannot do,
> > > hence they look similar.
> > 
> > Maybe. On the other hand you seem to ignore what I'm saying. You
> > currently don't have a clear set of rules for when it's ok for someone
> > to send patches and request access to bpf kfuncs to implement a new BPF
> > program. This patchset very much illustrates this point. The safety
> > properties of bpf don't matter for this. And again, your safety
> > properties very much didn't protect you from your bpf_d_path() mess.
> > 
> > We're not even clearly told where and how these helper are supposed to be
> > used. That's not ok and will never be ok. As long as there are no clear
> > criteria to operate under this is highly problematic. This may be fine
> > from a bpf perspective and one can even understand why because that's
> > apparently your model or promise to your users. But there's no reason to
> > expect the same level of laxness from any of the subsystems you're
> > requesting kfuncs from.
> 
> You raise a completely fair point, and I truly do apologies for the
> lack of context and in depth explanations around the specific
> situations that the proposed BPF kfuncs are intended to be used
> from. Admittedly, that's a failure on my part, and I can completely
> understand why from a maintainers point of view there would be
> reservations around acknowledging requests for adding such invisible
> dependencies.
> 
> Now, I'm in a little bit of a tough situation as I'm unable to point
> you to an open-source BPF LSM implementation that intends to make use
> of such newly proposed BPF kfuncs. That's just an unfortunate
> constraint and circumstance that I'm having to deal with, so I'm just
> going to have to provide heavily redacted and incomplete example to
> illustrate how these BPF kfuncs intend to be used from BPF LSM
> programs that I personally work on here at Google. Notably though, the
> contexts that I do share here may obviously be a nonholistic view on
> how these newly introduced BPF kfuncs end up getting used in practice
> by some other completely arbitrary open-source BPF LSM programs.
> 
> Anyway, as Alexei had pointed out in one of the prior responses, the
> core motivating factor behind introducing these newly proposed BPF
> kfuncs purely stems from the requirement of needing to call
> bpf_d_path() safely on a struct path from the context of a BPF LSM
> program, specifically within the security_file_open() and
> security_mmap_file() LSM hooks. Now, as noted within the original bug
> report [0], it's currently not considered safe to pluck a struct path
> out from an arbitrary in-kernel data structure, which in our case was
> current->mm->exe_file->f_path, and have it passed to bpf_d_path() from
> the aforementioned LSM hook points, or any other LSM hook point for
> that matter.
> 
> So, without using these newly introduced BPF kfuncs, our BPF LSM
> program hanging off security_file_open() looks as follows:
> 
> ```
> int BPF_PROG(file_open, struct file *file)
> {
>   // Perform a whole bunch of operations on the supplied file argument. This
>   // includes some form of policy evaluation, and if there's a violation against
>   // policy and auditing is enabled, then we eventually call bpf_d_path() on
>   // file->f_path. Calling bpf_d_path() on the file argument isn't problematic
>   // as we have a stable path here as the file argument is reference counted.
>   struct path *target = &file->f_path;
> 
>   // ...
> 
>   struct task_struct *current = bpf_get_current_task_btf();
> 
>   // ...
>   
>   bpf_rcu_read_lock();
>   // Reserve a slot on the BPF ring buffer such that the actor's path can be
>   // passed back to userspace.
>   void *buf = bpf_ringbuf_reserve(&ringbuf, PATH_MAX, 0);
>   if (!buf) {
>     goto unlock;
>   }
> 
>   // For contextual purposes when performing an audit we also call bpf_d_path()
>   // on the actor, being current->mm->exe_file->f_path.
>   struct path *actor = &current->mm->exe_file->f_path;
> 
>   // Now perform the path resolution on the actor via bpf_d_path().
>   u64 ret = bpf_d_path(actor, buf, PATH_MAX);
>   if (ret > 0) {
>     bpf_ringbuf_submit(buf, BPF_RB_NO_WAKEUP);
>   } else {
>     bpf_ringbuf_discard(buf, 0);
>   }
> 
> unlock:
>   bpf_rcu_read_unlock();
>   return 0;
> }
> ```

Note that we're also aware of the fact that calling bpf_d_path()
within an RCU read-side critical section shouldn't be permitted. I have
a patch teed up which addresses this. bpf_path_d_path() OTOH isn't
susceptible to this problem, as the BPF verifier ensures that BPF kfuncs
annotated KF_SLEEPABLE can't be called whilst in an RCU read-side
critical section.

/M


* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-11 12:00               ` Christian Brauner
  2024-03-12 17:06                 ` Matt Bobrowski
@ 2024-03-13 21:05                 ` Alexei Starovoitov
  2024-03-18 13:14                   ` Christian Brauner
  1 sibling, 1 reply; 31+ messages in thread
From: Alexei Starovoitov @ 2024-03-13 21:05 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Matt Bobrowski, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Mon, Mar 11, 2024 at 5:01 AM Christian Brauner <brauner@kernel.org> wrote:
>
> > > > One can argue that get_mm_exe_file() is not exported,
> > > > but it's nothing but rcu_lock-wrap plus get_file_rcu()
> > > > which is EXPORT_SYMBOL.
> > >
> > > Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't
> > > be exported. So that'll be removed asap.
> >
> > So, just to make a point that
> > "Included in that set are functions that aren't currently even
> > exported to modules"
> > you want to un-export get_file_rcu() ?
>
> No. The reason it was exported was because of the drm subsystem and we
> already quite disliked that. But it turned out that's not needed so in
> commit 61d4fb0b349e ("file, i915: fix file reference for
> mmap_singleton()") they were moved away from this helper.

Arguably that commit 61d4fb0b349e should have had
Fixes: 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU")
i915 was buggy before you touched it
and safe_by_rcu exposed the bug.
I can see why you guys looked at it, saw issues,
and decided to look away.
Though your guess in commit 61d4fb0b349e
"
    Now, there might be delays until
    file->f_op->release::singleton_release() is called and
    i915->gem.mmap_singleton is set to NULL.
"
feels unlikely.
I suspect the release() delay cannot be long enough to cause an RCU stall.
In the log prior to the splat there are just two mmap related calls
from selftests in i915_gem_mman_live_selftests():
i915: Running i915_gem_mman_live_selftests/igt_mmap_offset_exhaustion
i915: Running i915_gem_mman_live_selftests/igt_mmap
1st mmap test passed, but 2nd failed.
So it looks like it's not a race, but an issue with cleanup in that driver.
And instead of getting to the bottom of the issue
you've decided to paper over with get_file_active().
I agree with that trade-off.
But the bug in i915 is still there and it's probably a UAF.
get_file_active() is probably operating on a broken 'struct file'
that got to zero but is somehow still around,
or it's just garbage memory and file->f_count
just happened to be zero.

My point is that it's not ok to have such double standards.
On one side you're arguing that we shouldn't introduce kfunc:
+__bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task)
+{
+ return get_task_exe_file(task);
+}
that cleanly takes a ref cnt on task->mm->exe_file and does _not_ use the
lower-level get_file/get_file_rcu/get_file_active APIs directly, which
are certainly problematic to expose anywhere, since the safe_by_rcu
protocol is delicate.

But on the other side there is buggy i915 that does
questionable dance with get_file_active().
It's EXPORT_SYMBOL_GPL as well, and an out-of-tree driver can
ruin safe_by_rcu file properties with hard-to-debug consequences.

> There is absolutely no way that any userspace will
> get access to such low-level helpers. They have zero business to be
> involved in the lifetimes of objects on this level just as no module has.

correct, and kfuncs do not give a bpf prog direct get_file*() access,
because we saw how tricky safe_by_rcu is.
Hence kfuncs acquire file via get_task_exe_file or get_mm_exe_file
and release via fput.
That's the same pattern that security/tomoyo/util.c is doing:
   exe_file = get_mm_exe_file(mm);
   if (!exe_file)
        return NULL;

   cp = tomoyo_realpath_from_path(&exe_file->f_path);
   fput(exe_file);

in bpf_lsm case it will be:

   exe_file = bpf_get_mm_exe_file(mm);
   if (!exe_file)
   // the verifier will enforce that bpf prog has this NULL check here
   // because we annotate kfunc as:
BTF_ID_FLAGS(func, bpf_get_mm_exe_file, KF_ACQUIRE | KF_TRUSTED_ARGS |
KF_RET_NULL)

 bpf_path_d_path(&exe_file->f_path, ...);
 bpf_put_file(exe_file);
// and the verifier will enforce that bpf_put_file() is called too.
// and there is no path out of this bpf program that can take file refcnt
// without releasing.

So really these kfuncs are a nop from vfs pov.
If there is a bug in the verifier we will debug it and we will fix it.

You keep saying that bpf_d_path() is a mess.
Right. It is a mess now and we're fixing it.
When it was introduced 4 years ago it was safe at that time.
The unrelated verifier "smartness" made it possible to use it in a UAF.
We found the issue now and we're fixing it.
Over these years we didn't ask vfs folks to help fix such bugs,
and we're not asking for help now.
You're being cc-ed on the patches so that you're aware of how we plan to fix
this bpf_d_path() mess. If you have a viable alternative, please suggest it.
As it stands the new kfuncs are clean and safe way to solve this mess.


* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-13 21:05                 ` Alexei Starovoitov
@ 2024-03-18 13:14                   ` Christian Brauner
  2024-03-27 21:41                     ` Alexei Starovoitov
  0 siblings, 1 reply; 31+ messages in thread
From: Christian Brauner @ 2024-03-18 13:14 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Matt Bobrowski, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Wed, Mar 13, 2024 at 02:05:13PM -0700, Alexei Starovoitov wrote:
> On Mon, Mar 11, 2024 at 5:01 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > > > > One can argue that get_mm_exe_file() is not exported,
> > > > > but it's nothing but rcu_lock-wrap plus get_file_rcu()
> > > > > which is EXPORT_SYMBOL.
> > > >
> > > > Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't
> > > > be exported. So that'll be removed asap.
> > >
> > > So, just to make a point that
> > > "Included in that set are functions that aren't currently even
> > > exported to modules"
> > > you want to un-export get_file_rcu() ?
> >
> > No. The reason it was exported was because of the drm subsystem and we
> > already quite disliked that. But it turned out that's not needed so in
> > commit 61d4fb0b349e ("file, i915: fix file reference for
> > mmap_singleton()") they were moved away from this helper.
> 
> Arguably that commit 61d4fb0b349e should have had
> Fixes: 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU")
> i915 was buggy before you touched it
> and safe_by_rcu exposed the bug.
> I can see why you guys looked at it, saw issues,
> and decided to look away.
> Though your guess in commit 61d4fb0b349e
> "
>     Now, there might be delays until
>     file->f_op->release::singleton_release() is called and
>     i915->gem.mmap_singleton is set to NULL.
> "
> feels unlikely.
> I suspect the release() delay cannot be long enough to cause an RCU stall.
> In the log prior to the splat there are just two mmap related calls
> from selftests in i915_gem_mman_live_selftests():
> i915: Running i915_gem_mman_live_selftests/igt_mmap_offset_exhaustion
> i915: Running i915_gem_mman_live_selftests/igt_mmap
> 1st mmap test passed, but 2nd failed.
> So it looks like it's not a race, but an issue with cleanup in that driver.
> And instead of getting to the bottom of the issue
> you've decided to paper over with get_file_active().
> I agree with that trade-off.
> But the bug in i915 is still there and it's probably a UAF.
> get_file_active() is probably operating on a broken 'struct file'
> that got to zero but is somehow still around,
> or it's just garbage memory and file->f_count
> just happened to be zero.
> 
> My point is that it's not ok to have such double standards.
> On one side you're arguing that we shouldn't introduce kfunc:
> +__bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task)
> +{
> + return get_task_exe_file(task);
> +}
> that cleanly takes a ref cnt on task->mm->exe_file and does _not_ use the
> lower-level get_file/get_file_rcu/get_file_active APIs directly, which
> are certainly problematic to expose anywhere, since the safe_by_rcu
> protocol is delicate.
> 
> But on the other side there is buggy i915 that does
> questionable dance with get_file_active().
> It's EXPORT_SYMBOL_GPL as well, and an out-of-tree driver can
> ruin safe_by_rcu file properties with hard-to-debug consequences.

You're lending strong support to my earlier point, because this is a
clear example of a subsystem getting access to a helper that it
shouldn't have had access to. So we fixed the issue.

But this whole polemic just illustrates that you simply didn't bother to
understand how the code works. The way you talk about UAF together with
SLAB_TYPESAFE_BY_RCU is telling. Please read the code instead of
guessing.

So the same way we don't have to take responsibility for bpf
misunderstanding the expectations of d_path() we don't have to take
responsibility for misusing an internal helper by another subsystem.

So your argument here is moot at best and polemical and opportunistic at
worst. It certainly doesn't illustrate what you think it does.

And the above is fundamentally a suspiciously long sideshow. So let's
get back to the core topic: unless you document your rules for when it's
ok for a bpf program to come along and request access to internal APIs,
patchsets such as this are not acceptable.

> 
> > There is absolutely no way that any userspace will
> > get access to such low-level helpers. They have zero business to be
> > involved in the lifetimes of objects on this level just as no module has.
> 
> correct, and kfuncs do not give a bpf prog direct get_file*() access,
> because we saw how tricky safe_by_rcu is.
> Hence kfuncs acquire file via get_task_exe_file or get_mm_exe_file
> and release via fput.
> That's the same pattern that security/tomoyo/util.c is doing:
>    exe_file = get_mm_exe_file(mm);
>    if (!exe_file)
>         return NULL;
> 
>    cp = tomoyo_realpath_from_path(&exe_file->f_path);
>    fput(exe_file);
> 
> in bpf_lsm case it will be:
> 
>    exe_file = bpf_get_mm_exe_file(mm);
>    if (!exe_file)
>    // the verifier will enforce that bpf prog has this NULL check here
>    // because we annotate kfunc as:
> BTF_ID_FLAGS(func, bpf_get_mm_exe_file, KF_ACQUIRE | KF_TRUSTED_ARGS |
> KF_RET_NULL)
> 
>  bpf_path_d_path(&exe_file->f_path, ...);
>  bpf_put_file(exe_file);
> // and the verifier will enforce that bpf_put_file() is called too.
> // and there is no path out of this bpf program that can take file refcnt
> // without releasing.
> 
> So really these kfuncs are a nop from vfs pov.
> If there is a bug in the verifier we will debug it and we will fix it.
> 
> You keep saying that bpf_d_path() is a mess.
> Right. It is a mess now and we're fixing it.
> When it was introduced 4 years ago it was safe at that time.

Uhm, no, it was always sketchy.

> The unrelated verifier "smartness" made it possible to use it in a UAF.
> We found the issue now and we're fixing it.
> Over these years we didn't ask vfs folks to help fix such bugs,
> and we're not asking for help now.
> You're being cc-ed on the patches so that you're aware of how we plan to fix
> this bpf_d_path() mess. If you have a viable alternative, please suggest it.

The fix is to export a variant with trusted args.

> As it stands the new kfuncs are clean and safe way to solve this mess.

I will remind you of what you have been told in [1]:

"No. It's not up to maintainers to suggest alternatives. Sometimes it's
simply enough to explain *why* something isn't acceptable.

A plain "no" without explanation isn't sufficient. NAKs need a good
reason. But they don't need more than that.

The onus of coming up with an acceptable solution is on the person who
needs something new."

You've been provided:

a) good reasons why the patchset in its current form isn't acceptable,
   repeated multiple times
b) support for exporting a variant of bpf_d_path() that is safe to use
c) a request that all kfunc exports for the vfs will have to be located
   under fs/, not in kernel/bpf/
d) a path on how to move forward with additional kfunc requests:
   clear and documented rules for when it's ok for someone to come along
   and request access to bpf kfuncs, when such a request is to be
   rejected, and when it's ok to be supported.

Your repeatedly threatening to go over people's heads will not make
them more amenable to happily integrating with your subsystem.

[1]: https://lore.kernel.org/all/CAHk-=whD2HMe4ja5nR6WWofUh3nLmhjoSPDvZm2-XMGjeie5Tg@mail.gmail.com


* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-12 17:06                 ` Matt Bobrowski
  2024-03-12 20:11                   ` Matt Bobrowski
@ 2024-03-18 13:24                   ` Christian Brauner
  1 sibling, 0 replies; 31+ messages in thread
From: Christian Brauner @ 2024-03-18 13:24 UTC (permalink / raw)
  To: Matt Bobrowski
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Tue, Mar 12, 2024 at 05:06:36PM +0000, Matt Bobrowski wrote:
> Hey Christian,
> 
> On Mon, Mar 11, 2024 at 01:00:56PM +0100, Christian Brauner wrote:
> > On Fri, Mar 08, 2024 at 05:23:30PM -0800, Alexei Starovoitov wrote:
> > > On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote:
> > > >
> > > >
> > > > These exports are specifically for an out-of-tree BPF LSM program that
> > > > is not accessible to the public. The question in the other mail stands.
> > > 
> > > The question was already answered. You just don't like the answer.
> > > bpf progs are not equivalent to kernel modules.
> > > They have completely different safety and visibility properties.
> > > The safety part I already talked about.
> > > Sounds like the visibility has to be explained.
> > > Kernel modules are opaque binary blobs.
> > > bpf programs are fully transparent. The intent is known
> > > to the verifier and to anyone with understanding
> > > of bpf assembly.
> > > Those that cannot read bpf asm can read C source code that is
> > > embedded in the bpf program in kernel memory.
> > > It's not the same as "llvm-dwarfdump module.ko" on disk.
> > > The bpf prog source code is loaded into the kernel
> > > at program verification time for debugging and visibility reasons.
> > > If there is a verifier bug and bpf manages to crash the kernel
> > > vmcore will have relevant lines of program C source code right there.
> > > 
> > > Hence out-of-tree or in-tree bpf makes no practical difference.
> > > The program cannot hide its meaning and doesn't hamper debugging.
> > > 
> > > Hence adding EXPORT_SYMBOL == Brace for impact!
> > > Expect crashes, api misuse and what not.
> > > 
> > > While adding bpf_kfunc is a nop for kernel development.
> > > If kfunc is in the way of code refactoring it can be removed
> > > (as we demonstrated several times).
> > > A kfunc won't cause headaches for the kernel code it is
> > > calling (assuming no verifier bugs).
> > > If there is a bug it's on us to fix it as we demonstrated in the past.
> > > For example: bpf_probe_read_kernel().
> > > It's a wrapper of copy_from_kernel_nofault() and over the years
> > > bpf users hit various bugs in copy_from_kernel_nofault(),
> > > reported them, and _bpf developers_ fixed them.
> > > Though copy_from_kernel_nofault() is as generic as it can get
> > > and the same bugs could have been reproduced without bpf
> > > we took care of fixing these parts of the kernel.
> > > 
> > > Look at path_put().
> > > It's EXPORT_SYMBOL and any kernel module can easily screw up
> > > reference counting, so that sooner or later distro folks
> > > will experience debug pains due to out-of-tree drivers.
> > > 
> > > kfunc that calls path_put() won't have such consequences.
> > > The verifier will prevent path_put() on a pointer that wasn't
> > > acquired by the same bpf program. No support pains.
> > > It's a nop for vfs folks.
> > > 
> > > > > First of all, there is no such thing as get_task_fs_pwd/root
> > > > > in the kernel.
> > > >
> > > > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF
> > > > LSM. I don't see how that's different from an out-of-tree kernel module.
> > > 
> > > Sorry, but you don't seem to understand what bpf can and cannot do,
> > > hence they look similar.
> > 
> > Maybe. On the other hand you seem to ignore what I'm saying. You
> > currently don't have a clear set of rules for when it's ok for someone
> > to send patches and request access to bpf kfuncs to implement a new BPF
> > program. This patchset very much illustrates this point. The safety
> > properties of bpf don't matter for this. And again, your safety
> > properties very much didn't protect you from your bpf_d_path() mess.
> > 
> > We're not even clearly told where and how these helper are supposed to be
> > used. That's not ok and will never be ok. As long as there are no clear
> > criteria to operate under this is highly problematic. This may be fine
> > from a bpf perspective and one can even understand why because that's
> > apparently your model or promise to your users. But there's no reason to
> > expect the same level of laxness from any of the subsystems you're
> > requesting kfuncs from.
> 
> You raise a completely fair point, and I truly do apologise for the
> lack of context and in-depth explanations around the specific
> situations that the proposed BPF kfuncs are intended to be used
> in. Admittedly, that's a failure on my part, and I can completely
> understand why, from a maintainer's point of view, there would be
> reservations around acknowledging requests for adding such invisible
> dependencies.

Thanks for providing more background.

> Now, I'm in a little bit of a tough situation, as I'm unable to point
> you to an open-source BPF LSM implementation that intends to make use
> of these newly proposed BPF kfuncs. That's just an unfortunate
> constraint and circumstance that I'm having to deal with, so I'm just
> going to have to provide a heavily redacted and incomplete example to
> illustrate how these BPF kfuncs are intended to be used by the BPF LSM
> programs that I personally work on here at Google. Notably though, the
> contexts that I do share here may obviously be a non-holistic view of
> how these newly introduced BPF kfuncs end up getting used in practice
> by some other completely arbitrary open-source BPF LSM programs.

I have to say that this to me is wild. Essentially we're supposed to
allow access to our internal APIs based on internal use-cases that
aren't public and likely never will be.

If that's acceptable then I want bpf to document this in their kernel
Documentation and submit this for review to the wider community.


* Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs
  2024-03-18 13:14                   ` Christian Brauner
@ 2024-03-27 21:41                     ` Alexei Starovoitov
  0 siblings, 0 replies; 31+ messages in thread
From: Alexei Starovoitov @ 2024-03-27 21:41 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Matt Bobrowski, bpf, Alexei Starovoitov, Andrii Nakryiko,
	KP Singh, Jann Horn, Jiri Olsa, Daniel Borkmann, Linus Torvalds,
	Linux-Fsdevel, Andrew Morton, linux-mm, LSM List

On Mon, Mar 18, 2024 at 6:14 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Wed, Mar 13, 2024 at 02:05:13PM -0700, Alexei Starovoitov wrote:
>
> But this whole polemic just illustrates that you simply didn't bother to
> understand how the code works. The way you talk about UAF together with
> SLAB_TYPESAFE_BY_RCU is telling. Please read the code instead of
> guessing.

Ok. Fair enough. I've read the code and old threads from Sep-Nov.
I think the concerns about typesafe_by_rcu made folks believe in
races that don't exist.


           if (unlikely(!atomic_long_inc_not_zero(&file->f_count)))
..
            *  (a) the file ref already went down to zero and the
            *      file hasn't been reused yet or the file count
            *      isn't zero but the file has already been reused.

..
            if (unlikely(file != rcu_dereference_raw(*fdentry)) ||

The first part of the comment is partially incorrect.
(it's in the wrong place).

The file ref could have been down to zero and not reused yet,
but only before atomic_long_inc_not_zero().
Once the code reaches the 2nd check, the file is guaranteed to have been
reused and to have gone through init_file(), because that's the only
code that brings it back from zero.
This race is ok:

cpu0                                    cpu1
file = rcu_dereference_raw(*fdentry);
// file->f_count == 1
                                        rcu_assign_pointer(fdt->fd[fd], NULL);
                                        fput() // reaches zero

atomic_long_inc_not_zero()
// will not succeed.

This race is ok too:
cpu0                                    cpu1
file = rcu_dereference_raw(*fdentry);
// file->f_count == 1
                                        rcu_assign_pointer(fdt->fd[fd], NULL);

atomic_long_inc_not_zero()
// succeeds. f_count == 2
                                        fput() // f_count == 1

file != rcu_dereference_raw(*fdentry)
// will fail
but it doesn't have to.
This is a safe race.
It's no different if cpu0 went through
these steps including successful last check
                                        and then cpu1 did close(fd)

The file held by cpu0 is not on the
path to zero.

Similarly, back then, there was a concern about two parallel __fget_files_rcu()
where one cpu incremented refcnt, failed some check and didn't do fput yet.
In this case the file is not on the path to zero either.
Both cpus saw a non-zero f_count when they went through atomic_long_inc_not_zero().
The file is not on the path to be reused.

Now the second part of the comment
"the file count isn't zero but the file has already been reused"
is misleading.

The (file != rcu_dereference_raw(*fdentry)) check is racy.
Another cpu could have replaced that slot right after that check.
Example:
cpu0 doing lockless __fget_files_rcu() while cpu1 doing sys_close.
__fget_files_rcu() will be returning a file that doesn't exist in fdt.
And it's safe.

This race is possible:
cpu0                                    cpu1
file = rcu_dereference_raw(*fdentry);
                                        fput() reaches zero
                                        file_free
                                        alloc_empty_file // got same file
                                        init_file // f_count = 1
atomic_long_inc_not_zero()
// succeeds. f_count == 2
file != rcu_dereference_raw(*fdentry)
// preventing reuse of a file that
// was never in this fdt.

The only value of this check is to make sure that this file
either _is_ or _was_ at some point in this fdt.
It's not preventing reuse per se.

This race is possible:
cpu0                                    cpu1
file = rcu_dereference_raw(*fdentry);
                                        fput() reaches zero
                                        file_free
                                        alloc_empty_file // got same file
                                        init_file // f_count = 1
                                        fd_install
atomic_long_inc_not_zero()
// succeeds. f_count == 2
file == rcu_dereference_raw(*fdentry)
// success, though the file _was reused_.

I suggest to revise the comment.

>
> You've been provided:
>
> a) good reasons why the patchset in its current form isn't acceptable,
>    repeated multiple times

We will improve commit logs in the next revision.

> b) support for exporting a variant of bpf_d_path() that is safe to use

good, but bpf_d_path kfunc alone is not useful.
As Jann noted back in September:
https://lore.kernel.org/all/CAG48ez2d5CW=CDi+fBOU1YqtwHfubN3q6w=1LfD+ss+Q1PWHgQ@mail.gmail.com/

The conversion of files to typesafe_by_rcu broke that verifier
assumption about mm->exe_file, and we need kfuncs to safely
acquire/release a file reference to fix that breakage.

> c) a request that all kfunc exports for the vfs will have to be located
>    under fs/, not in kernel/bpf/

we've added kfuncs to
net/netfilter/, net/xfrm/, net/ipv4/, kernel/cgroup/, drivers/hid/
because maintainers of those subsystems demonstrated understanding
of what bpf is doing and what these kfuncs are for.

We can put them in fs/, but you need to demonstrate willingness to
understand the problem we're solving instead of arguing
about how hard file typesafe_by_rcu is to understand.

> d) a path on how to move forward with additional kfunc requests:
>    Clear and documented rules when it's ok for someone to come along and
>    request access to bpf kfuncs when it's to be rejected and when it's
>    ok to be supported.

There are ~36500 EXPORT_SYMBOLs in the kernel.
Are there "clear documented rules when it's ok for someone to"
add or remove them?
There is a gentleman's agreement that maintainers of subsystems need to
be cc-ed when new EXPORT_SYMBOL-s are added.
In this case no new EXPORT_SYMBOLs are requested.

Compare that to 221 bpf helpers (which are uapi, and for the last
2 years we didn't add a single one) and 151 bpf kfuncs, which are
not uapi, as clearly documented in Documentation/bpf/kfuncs.rst.
When developers want to add them they cc bpf@vger and the relevant
subsystems, just like we did with netfilter, xfrm, cgroup, and hid.
kfunc deprecation rules are also documented in kfuncs.rst.

> You repeatedly threatening to go over the heads of people will not make
> them more amenable to happily integrate with your subsystem.

This is not it. We made our own mistakes with bpf_d_path safety, and now
file typesafe_by_rcu broke bpf safety assumptions.
We have to fix it one way or the other.
It's bpf that got affected by your changes.
But we don't demand that you fix bpf bits. We're fixing them.
But you have to provide technical reasons why file acquire/release
kfuncs are not suitable.
"Only 3 people understand typesafe_by_rcu" is not it.


end of thread, other threads:[~2024-03-27 21:41 UTC | newest]

Thread overview: 31+ messages
2024-03-06  7:39 [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Matt Bobrowski
2024-03-06  7:39 ` [PATCH v2 bpf-next 1/9] bpf: rename fs_kfunc_set_ids to lsm_kfunc_set_ids Matt Bobrowski
2024-03-06  7:39 ` [PATCH v2 bpf-next 2/9] bpf: add new acquire/release BPF kfuncs for mm_struct Matt Bobrowski
2024-03-06 11:50   ` Christian Brauner
2024-03-06  7:39 ` [PATCH v2 bpf-next 3/9] bpf/selftests: add selftests for mm_struct acquire/release BPF kfuncs Matt Bobrowski
2024-03-06  7:40 ` [PATCH v2 bpf-next 4/9] bpf: add new acquire/release based BPF kfuncs for exe_file Matt Bobrowski
2024-03-06 11:31   ` Christian Brauner
2024-03-06  7:40 ` [PATCH v2 bpf-next 5/9] bpf/selftests: add selftests for exe_file acquire/release BPF kfuncs Matt Bobrowski
2024-03-06  7:40 ` [PATCH v2 bpf-next 6/9] bpf: add acquire/release based BPF kfuncs for fs_struct's paths Matt Bobrowski
2024-03-06 11:47   ` Christian Brauner
2024-03-06  7:40 ` [PATCH v2 bpf-next 7/9] bpf/selftests: add selftests for root/pwd path based BPF kfuncs Matt Bobrowski
2024-03-06  7:40 ` [PATCH v2 bpf-next 9/9] bpf/selftests: adapt selftests test_d_path for BPF kfunc bpf_path_d_path() Matt Bobrowski
2024-03-06  7:40 ` [PATCH v2 bpf-next 8/9] bpf: add trusted d_path() based " Matt Bobrowski
2024-03-06 11:21 ` [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs Christian Brauner
2024-03-06 12:13   ` Christian Brauner
2024-03-06 21:44     ` Paul Moore
2024-03-07  4:05     ` Alexei Starovoitov
2024-03-07  9:54       ` Christian Brauner
2024-03-07 20:50         ` Paul Moore
2024-03-08  3:25           ` Alexei Starovoitov
2024-03-08 10:58             ` Christian Brauner
2024-03-08  3:11         ` Alexei Starovoitov
2024-03-08 10:35           ` Christian Brauner
2024-03-09  1:23             ` Alexei Starovoitov
2024-03-11 12:00               ` Christian Brauner
2024-03-12 17:06                 ` Matt Bobrowski
2024-03-12 20:11                   ` Matt Bobrowski
2024-03-18 13:24                   ` Christian Brauner
2024-03-13 21:05                 ` Alexei Starovoitov
2024-03-18 13:14                   ` Christian Brauner
2024-03-27 21:41                     ` Alexei Starovoitov
