linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/4] bpf path iterator
@ 2025-05-28 22:26 Song Liu
  2025-05-28 22:26 ` [PATCH bpf-next 1/4] namei: Introduce new helper function path_parent() Song Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 45+ messages in thread
From: Song Liu @ 2025-05-28 22:26 UTC (permalink / raw)
  To: bpf, linux-fsdevel, linux-kernel, linux-security-module
  Cc: kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, mic, gnoack, Song Liu

In security use cases, it is common to apply rules to VFS subtrees.
However, filtering files in a subtree is not straightforward [1].

One solution to this problem is to start from a path and walk up the VFS
tree (towards the root). Among in-tree LSMs, Landlock uses this solution.

BPF LSM solutions, such like Tetragon [2], also use similar approaches.
However, due to lack of proper helper/kfunc support, BPF LSM solutions
usually do the path walk with probe read, which is racy.

This patchset introduce a reliable helper path_parent, which walks path to
its VFS parent. The helper is use in Landlock.

A new BPF iterator, path iterator, is introduced to do the path walking.
The BPF path iterator uses the new path_parent help to walk the VFS tree
reliably.

[1] https://lpc.events/event/18/contributions/1940/
[2] https://github.com/cilium/tetragon/

Song Liu (4):
  namei: Introduce new helper function path_parent()
  landlock: Use path_parent()
  bpf: Introduce path iterator
  selftests/bpf: Add tests for bpf path iterator

 fs/namei.c                                    |  39 +++++
 include/linux/namei.h                         |   8 ++
 kernel/bpf/Makefile                           |   1 +
 kernel/bpf/helpers.c                          |   3 +
 kernel/bpf/path_iter.c                        |  74 ++++++++++
 kernel/bpf/verifier.c                         |   5 +
 security/landlock/fs.c                        |  34 ++---
 .../testing/selftests/bpf/bpf_experimental.h  |   6 +
 .../selftests/bpf/prog_tests/path_iter.c      |  12 ++
 tools/testing/selftests/bpf/progs/path_iter.c | 134 ++++++++++++++++++
 10 files changed, 299 insertions(+), 17 deletions(-)
 create mode 100644 kernel/bpf/path_iter.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/path_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/path_iter.c

--
2.47.1

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH bpf-next 1/4] namei: Introduce new helper function path_parent()
  2025-05-28 22:26 [PATCH bpf-next 0/4] bpf path iterator Song Liu
@ 2025-05-28 22:26 ` Song Liu
  2025-05-28 22:26 ` [PATCH bpf-next 2/4] landlock: Use path_parent() Song Liu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 45+ messages in thread
From: Song Liu @ 2025-05-28 22:26 UTC (permalink / raw)
  To: bpf, linux-fsdevel, linux-kernel, linux-security-module
  Cc: kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, mic, gnoack, Song Liu

This helper walks an input path to its parent. Logic are added to handle
walking across mount tree.

This will be used by landlock, and BPF LSM.

Signed-off-by: Song Liu <song@kernel.org>
---
 fs/namei.c            | 39 +++++++++++++++++++++++++++++++++++++++
 include/linux/namei.h |  8 ++++++++
 2 files changed, 47 insertions(+)

diff --git a/fs/namei.c b/fs/namei.c
index 84a0e0b0111c..475f4188a0e6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1379,6 +1379,45 @@ int follow_up(struct path *path)
 }
 EXPORT_SYMBOL(follow_up);
 
+/**
+ * path_parent - Find the parent of path
+ * @path: input and output path.
+ *
+ * Given a path, find the parent path. Replace @path with the parent path.
+ * If we were already at the real root or a disconnected root, @path is
+ * not changed.
+ *
+ * Returns:
+ *   PATH_PARENT_SAME_MOUNT         - if we walked up within the same mount;
+ *   PATH_PARENT_CHANGED_MOUNT      - if we walked up a mount point;
+ *   PATH_PARENT_REAL_ROOT          - if we were already at real root;
+ *   PATH_PARENT_DISCONNECTED_ROOT  - if we were at the root of a disconnected
+ *                                    filesystem.
+ */
+enum path_parent_status path_parent(struct path *path)
+{
+	struct dentry *parent;
+
+	if (path->dentry == path->mnt->mnt_root) {
+		if (!follow_up(path))
+			return PATH_PARENT_REAL_ROOT;
+		return PATH_PARENT_CHANGED_MOUNT;
+	}
+	if (unlikely(IS_ROOT(path->dentry)))
+		return PATH_PARENT_DISCONNECTED_ROOT;
+
+	parent = dget_parent(path->dentry);
+	if (unlikely(!path_connected(path->mnt, parent))) {
+		dput(parent);
+		return PATH_PARENT_DISCONNECTED_ROOT;
+	}
+
+	dput(path->dentry);
+	path->dentry = parent;
+	return PATH_PARENT_SAME_MOUNT;
+}
+EXPORT_SYMBOL_GPL(path_parent);
+
 static bool choose_mountpoint_rcu(struct mount *m, const struct path *root,
 				  struct path *path, unsigned *seqp)
 {
diff --git a/include/linux/namei.h b/include/linux/namei.h
index bbaf55fb3101..dc6e9096eb15 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -86,6 +86,14 @@ extern int follow_down_one(struct path *);
 extern int follow_down(struct path *path, unsigned int flags);
 extern int follow_up(struct path *);
 
+enum path_parent_status {
+	PATH_PARENT_SAME_MOUNT = 0,
+	PATH_PARENT_CHANGED_MOUNT,
+	PATH_PARENT_REAL_ROOT,
+	PATH_PARENT_DISCONNECTED_ROOT,
+};
+enum path_parent_status path_parent(struct path *path);
+
 extern struct dentry *lock_rename(struct dentry *, struct dentry *);
 extern struct dentry *lock_rename_child(struct dentry *, struct dentry *);
 extern void unlock_rename(struct dentry *, struct dentry *);
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH bpf-next 2/4] landlock: Use path_parent()
  2025-05-28 22:26 [PATCH bpf-next 0/4] bpf path iterator Song Liu
  2025-05-28 22:26 ` [PATCH bpf-next 1/4] namei: Introduce new helper function path_parent() Song Liu
@ 2025-05-28 22:26 ` Song Liu
  2025-05-31 13:51   ` Tingmao Wang
  2025-05-28 22:26 ` [PATCH bpf-next 3/4] bpf: Introduce path iterator Song Liu
  2025-05-28 22:26 ` [PATCH bpf-next 4/4] selftests/bpf: Add tests for bpf " Song Liu
  3 siblings, 1 reply; 45+ messages in thread
From: Song Liu @ 2025-05-28 22:26 UTC (permalink / raw)
  To: bpf, linux-fsdevel, linux-kernel, linux-security-module
  Cc: kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, mic, gnoack, Song Liu

Use path_parent() to walk a path up to its parent.

While path_parent() has an extra check with path_connected() than existing
code, there is no functional changes intended for landlock.

Signed-off-by: Song Liu <song@kernel.org>
---
 security/landlock/fs.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 6fee7c20f64d..32a24758ad6e 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -837,7 +837,6 @@ static bool is_access_to_paths_allowed(
 	 * restriction.
 	 */
 	while (true) {
-		struct dentry *parent_dentry;
 		const struct landlock_rule *rule;
 
 		/*
@@ -896,19 +895,17 @@ static bool is_access_to_paths_allowed(
 		if (allowed_parent1 && allowed_parent2)
 			break;
 jump_up:
-		if (walker_path.dentry == walker_path.mnt->mnt_root) {
-			if (follow_up(&walker_path)) {
-				/* Ignores hidden mount points. */
-				goto jump_up;
-			} else {
-				/*
-				 * Stops at the real root.  Denies access
-				 * because not all layers have granted access.
-				 */
-				break;
-			}
-		}
-		if (unlikely(IS_ROOT(walker_path.dentry))) {
+		switch (path_parent(&walker_path)) {
+		case PATH_PARENT_CHANGED_MOUNT:
+			/* Ignores hidden mount points. */
+			goto jump_up;
+		case PATH_PARENT_REAL_ROOT:
+			/*
+			 * Stops at the real root.  Denies access
+			 * because not all layers have granted access.
+			 */
+			goto walk_done;
+		case PATH_PARENT_DISCONNECTED_ROOT:
 			/*
 			 * Stops at disconnected root directories.  Only allows
 			 * access to internal filesystems (e.g. nsfs, which is
@@ -918,12 +915,15 @@ static bool is_access_to_paths_allowed(
 				allowed_parent1 = true;
 				allowed_parent2 = true;
 			}
+			goto walk_done;
+		case PATH_PARENT_SAME_MOUNT:
 			break;
+		default:
+			WARN_ON_ONCE(1);
+			goto walk_done;
 		}
-		parent_dentry = dget_parent(walker_path.dentry);
-		dput(walker_path.dentry);
-		walker_path.dentry = parent_dentry;
 	}
+walk_done:
 	path_put(&walker_path);
 
 	if (!allowed_parent1) {
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-28 22:26 [PATCH bpf-next 0/4] bpf path iterator Song Liu
  2025-05-28 22:26 ` [PATCH bpf-next 1/4] namei: Introduce new helper function path_parent() Song Liu
  2025-05-28 22:26 ` [PATCH bpf-next 2/4] landlock: Use path_parent() Song Liu
@ 2025-05-28 22:26 ` Song Liu
  2025-05-28 22:37   ` Al Viro
  2025-05-28 22:26 ` [PATCH bpf-next 4/4] selftests/bpf: Add tests for bpf " Song Liu
  3 siblings, 1 reply; 45+ messages in thread
From: Song Liu @ 2025-05-28 22:26 UTC (permalink / raw)
  To: bpf, linux-fsdevel, linux-kernel, linux-security-module
  Cc: kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, mic, gnoack, Song Liu

Introduce a path iterator, which reliably walk a struct path.

Current version only support walking towards the root, which helper
path_parent. But the path iterator API can be extended to cover other
use cases, for example, walking the mount tree.

Signed-off-by: Song Liu <song@kernel.org>
---
 kernel/bpf/Makefile    |  1 +
 kernel/bpf/helpers.c   |  3 ++
 kernel/bpf/path_iter.c | 74 ++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c  |  5 +++
 4 files changed, 83 insertions(+)
 create mode 100644 kernel/bpf/path_iter.c

diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 70502f038b92..8075a83d5e08 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -53,6 +53,7 @@ obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
 obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
+obj-$(CONFIG_BPF_SYSCALL) += path_iter.o
 
 CFLAGS_REMOVE_percpu_freelist.o = $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_bpf_lru_list.o = $(CC_FLAGS_FTRACE)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index c1113b74e1e2..d77b055092e7 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -3386,6 +3386,9 @@ BTF_ID_FLAGS(func, bpf_copy_from_user_dynptr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_copy_from_user_str_dynptr, KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_copy_from_user_task_dynptr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_copy_from_user_task_str_dynptr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_iter_path_new, KF_ITER_NEW | KF_SLEEPABLE)
+BTF_ID_FLAGS(func, bpf_iter_path_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLEEPABLE)
+BTF_ID_FLAGS(func, bpf_iter_path_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
 BTF_KFUNCS_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/path_iter.c b/kernel/bpf/path_iter.c
new file mode 100644
index 000000000000..838ebbeac6c2
--- /dev/null
+++ b/kernel/bpf/path_iter.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+#include <linux/bpf.h>
+#include <linux/bpf_mem_alloc.h>
+#include <linux/namei.h>
+#include <linux/path.h>
+
+enum bpf_path_iter_mode {
+	BPF_PATH_ITER_MODE_PARENT = 1,
+};
+
+/* open-coded iterator */
+struct bpf_iter_path {
+	__u64 __opaque[3];
+} __aligned(8);
+
+struct bpf_iter_path_kern {
+	struct path path;
+	enum bpf_path_iter_mode mode;
+} __aligned(8);
+
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc int bpf_iter_path_new(struct bpf_iter_path *it,
+				  struct path *start,
+				  enum bpf_path_iter_mode mode)
+{
+	struct bpf_iter_path_kern *kit = (void *)it;
+
+	BUILD_BUG_ON(sizeof(*kit) > sizeof(*it));
+	BUILD_BUG_ON(__alignof__(*kit) != __alignof__(*it));
+
+	kit->mode = mode;
+
+	switch (mode) {
+	case BPF_PATH_ITER_MODE_PARENT:
+		break;
+	default:
+		memset(&kit->path, 0, sizeof(struct path));
+		return -EINVAL;
+	}
+
+	kit->path = *start;
+	path_get(&kit->path);
+	return 0;
+}
+
+__bpf_kfunc struct path *bpf_iter_path_next(struct bpf_iter_path *it)
+{
+	struct bpf_iter_path_kern *kit = (void *)it;
+
+	switch (kit->mode) {
+	case BPF_PATH_ITER_MODE_PARENT:
+		enum path_parent_status status = path_parent(&kit->path);
+
+		/* If already at a root, return NULL */
+		if (status == PATH_PARENT_REAL_ROOT ||
+		    status == PATH_PARENT_DISCONNECTED_ROOT)
+			return NULL;
+		break;
+	default:
+		return NULL;
+	}
+	return &kit->path;
+}
+
+__bpf_kfunc void bpf_iter_path_destroy(struct bpf_iter_path *it)
+{
+	struct bpf_iter_path_kern *kit = (void *)it;
+
+	path_put(&kit->path);
+}
+
+__bpf_kfunc_end_defs();
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d5807d2efc92..734c06809563 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7034,6 +7034,10 @@ BTF_TYPE_SAFE_TRUSTED_OR_NULL(struct socket) {
 	struct sock *sk;
 };
 
+BTF_TYPE_SAFE_TRUSTED_OR_NULL(struct path) {
+	struct dentry *dentry;
+};
+
 static bool type_is_rcu(struct bpf_verifier_env *env,
 			struct bpf_reg_state *reg,
 			const char *field_name, u32 btf_id)
@@ -7074,6 +7078,7 @@ static bool type_is_trusted_or_null(struct bpf_verifier_env *env,
 				    const char *field_name, u32 btf_id)
 {
 	BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED_OR_NULL(struct socket));
+	BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED_OR_NULL(struct path));
 
 	return btf_nested_type_is_trusted(&env->log, reg, field_name, btf_id,
 					  "__safe_trusted_or_null");
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH bpf-next 4/4] selftests/bpf: Add tests for bpf path iterator
  2025-05-28 22:26 [PATCH bpf-next 0/4] bpf path iterator Song Liu
                   ` (2 preceding siblings ...)
  2025-05-28 22:26 ` [PATCH bpf-next 3/4] bpf: Introduce path iterator Song Liu
@ 2025-05-28 22:26 ` Song Liu
  3 siblings, 0 replies; 45+ messages in thread
From: Song Liu @ 2025-05-28 22:26 UTC (permalink / raw)
  To: bpf, linux-fsdevel, linux-kernel, linux-security-module
  Cc: kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, mic, gnoack, Song Liu

Add tests for bpf path iterator, including test cases similar to real
workload (call bpf_path_d_path and bpf_get_dentry_xattr), and test cases
where the verifier rejects invalid use of the iterator.

Signed-off-by: Song Liu <song@kernel.org>
---
 .../testing/selftests/bpf/bpf_experimental.h  |   6 +
 .../selftests/bpf/prog_tests/path_iter.c      |  12 ++
 tools/testing/selftests/bpf/progs/path_iter.c | 134 ++++++++++++++++++
 3 files changed, 152 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/path_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/path_iter.c

diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index 6535c8ae3c46..e9eb2b105eb2 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -591,4 +591,10 @@ extern int bpf_iter_kmem_cache_new(struct bpf_iter_kmem_cache *it) __weak __ksym
 extern struct kmem_cache *bpf_iter_kmem_cache_next(struct bpf_iter_kmem_cache *it) __weak __ksym;
 extern void bpf_iter_kmem_cache_destroy(struct bpf_iter_kmem_cache *it) __weak __ksym;
 
+struct bpf_iter_path;
+extern int bpf_iter_path_new(struct bpf_iter_path *it, struct path *start,
+			     enum bpf_path_iter_mode mode) __weak __ksym;
+extern struct path *bpf_iter_path_next(struct bpf_iter_path *it) __weak __ksym;
+extern void bpf_iter_path_destroy(struct bpf_iter_path *it) __weak __ksym;
+
 #endif
diff --git a/tools/testing/selftests/bpf/prog_tests/path_iter.c b/tools/testing/selftests/bpf/prog_tests/path_iter.c
new file mode 100644
index 000000000000..3c99c24fbd96
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/path_iter.c
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include <bpf/libbpf.h>
+#include <bpf/btf.h>
+#include "path_iter.skel.h"
+
+void test_path_iter(void)
+{
+	RUN_TESTS(path_iter);
+}
diff --git a/tools/testing/selftests/bpf/progs/path_iter.c b/tools/testing/selftests/bpf/progs/path_iter.c
new file mode 100644
index 000000000000..d5733c797a3f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/path_iter.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+
+char _license[] SEC("license") = "GPL";
+
+char path_name[256];
+char xattr_val[64];
+
+static __always_inline void access_path_dentry(struct path *p)
+{
+	struct bpf_dynptr ptr;
+	struct dentry *dentry;
+
+	if (!p)
+		return;
+
+	bpf_dynptr_from_mem(xattr_val, sizeof(xattr_val), 0, &ptr);
+	bpf_path_d_path(p, path_name, sizeof(path_name));
+
+	dentry = p->dentry;
+	if (dentry)
+		bpf_get_dentry_xattr(dentry, "user.xattr", &ptr);
+}
+
+SEC("lsm.s/file_open")
+__success
+int BPF_PROG(open_code, struct file *f)
+{
+	struct bpf_iter_path path_it;
+	struct path *p;
+	int ret;
+
+	ret = bpf_iter_path_new(&path_it, &f->f_path, BPF_PATH_ITER_MODE_PARENT);
+	if (ret) {
+		bpf_iter_path_destroy(&path_it);
+		return 0;
+	}
+
+	p = bpf_iter_path_next(&path_it);
+	access_path_dentry(p);
+	bpf_iter_path_destroy(&path_it);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__success
+int BPF_PROG(for_each, struct file *f)
+{
+	struct path *p;
+
+	bpf_for_each(path, p, &f->f_path, BPF_PATH_ITER_MODE_PARENT)
+		access_path_dentry(p);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("Unreleased reference")
+int BPF_PROG(missing_destroy, struct file *f)
+{
+	struct bpf_iter_path path_it;
+
+	bpf_iter_path_new(&path_it, &f->f_path, BPF_PATH_ITER_MODE_PARENT);
+
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("expected an initialized iter_path")
+int BPF_PROG(missing_new, struct file *f)
+{
+	struct bpf_iter_path path_it;
+
+	bpf_iter_path_destroy(&path_it);
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("expected uninitialized iter_path")
+int BPF_PROG(new_twice, struct file *f)
+{
+	struct bpf_iter_path path_it;
+
+	bpf_iter_path_new(&path_it, &f->f_path, BPF_PATH_ITER_MODE_PARENT);
+	bpf_iter_path_new(&path_it, &f->f_path, BPF_PATH_ITER_MODE_PARENT);
+	bpf_iter_path_destroy(&path_it);
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("expected an initialized iter_path")
+int BPF_PROG(destroy_twice, struct file *f)
+{
+	struct bpf_iter_path path_it;
+
+	bpf_iter_path_new(&path_it, &f->f_path, BPF_PATH_ITER_MODE_PARENT);
+	bpf_iter_path_destroy(&path_it);
+	bpf_iter_path_destroy(&path_it);
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__success
+int BPF_PROG(reuse_path_iter, struct file *f)
+{
+	struct bpf_iter_path path_it;
+
+	bpf_iter_path_new(&path_it, &f->f_path, BPF_PATH_ITER_MODE_PARENT);
+	bpf_iter_path_destroy(&path_it);
+	bpf_iter_path_new(&path_it, &f->f_path, BPF_PATH_ITER_MODE_PARENT);
+	bpf_iter_path_destroy(&path_it);
+	return 0;
+}
+
+SEC("lsm.s/file_open")
+__failure __msg("invalid read from stack off")
+int BPF_PROG(invalid_read_path_iter, struct file *f)
+{
+	struct bpf_iter_path path_it;
+	struct bpf_iter_path path_it_2;
+
+
+	bpf_iter_path_new(&path_it, &f->f_path, BPF_PATH_ITER_MODE_PARENT);
+	path_it_2 = path_it;
+	bpf_iter_path_destroy(&path_it_2);
+	return 0;
+}
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-28 22:26 ` [PATCH bpf-next 3/4] bpf: Introduce path iterator Song Liu
@ 2025-05-28 22:37   ` Al Viro
  2025-05-29 11:58     ` Jan Kara
  0 siblings, 1 reply; 45+ messages in thread
From: Al Viro @ 2025-05-28 22:37 UTC (permalink / raw)
  To: Song Liu
  Cc: bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	jack, kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef,
	mic, gnoack

On Wed, May 28, 2025 at 03:26:22PM -0700, Song Liu wrote:
> Introduce a path iterator, which reliably walk a struct path.

No, it does not.  If you have no external warranty that mount
*and* dentry trees are stable, it's not reliable at all.

And I'm extremely suspicious regarding the validity of anything
that pokes around in mount trees.  There is a very good reason
struct mount is *not* visible in include/linux/*.h

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-28 22:37   ` Al Viro
@ 2025-05-29 11:58     ` Jan Kara
  2025-05-29 16:53       ` Song Liu
  0 siblings, 1 reply; 45+ messages in thread
From: Jan Kara @ 2025-05-29 11:58 UTC (permalink / raw)
  To: Al Viro
  Cc: Song Liu, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	jack, kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef,
	mic, gnoack

On Wed 28-05-25 23:37:24, Al Viro wrote:
> On Wed, May 28, 2025 at 03:26:22PM -0700, Song Liu wrote:
> > Introduce a path iterator, which reliably walk a struct path.
> 
> No, it does not.  If you have no external warranty that mount
> *and* dentry trees are stable, it's not reliable at all.

I agree that advertising this as "reliable walk" is misleading. It is
realiable in the sense that it will not dereference freed memory, leak
references etc. As you say it is also reliable in the sense that without
external modifications to dentry & mount tree, it will crawl the path to
root. But in presence of external modifications the only reliability it
offers is "it will not crash". E.g. malicious parallel modifications can
arbitrarily prolong the duration of the walk.

> And I'm extremely suspicious regarding the validity of anything
> that pokes around in mount trees.  There is a very good reason
> struct mount is *not* visible in include/linux/*.h

Well, but looking at the actual code I don't see anything problematic
there. It does not export any new functionality from the VFS AFAICT. It
just factors out some parent lookup details from Landlock into generic code
and exposes it as a helper to fetch parent dentry. But overall any kernel
module can do what's in the helper already today and exposing the
functionality of looking up dentry parent to BPF as well seems OK to me.

So I have only reservations against calling it "reliable walk". It should
rather come with warnings like "The sequence of dentries may be rather
surprising in presence of parallel directory or mount tree modifications
and the iteration need not ever finish in face of parallel malicious
directory tree manipulations."

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 11:58     ` Jan Kara
@ 2025-05-29 16:53       ` Song Liu
  2025-05-29 16:57         ` Alexei Starovoitov
  2025-05-29 17:38         ` Al Viro
  0 siblings, 2 replies; 45+ messages in thread
From: Song Liu @ 2025-05-29 16:53 UTC (permalink / raw)
  To: Jan Kara
  Cc: Al Viro, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

Hi Al and Jan,

Thanks for your review!

On Thu, May 29, 2025 at 4:58 AM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 28-05-25 23:37:24, Al Viro wrote:
> > On Wed, May 28, 2025 at 03:26:22PM -0700, Song Liu wrote:
> > > Introduce a path iterator, which reliably walk a struct path.
> >
> > No, it does not.  If you have no external warranty that mount
> > *and* dentry trees are stable, it's not reliable at all.
>
> I agree that advertising this as "reliable walk" is misleading. It is
> realiable in the sense that it will not dereference freed memory, leak
> references etc. As you say it is also reliable in the sense that without
> external modifications to dentry & mount tree, it will crawl the path to
> root. But in presence of external modifications the only reliability it
> offers is "it will not crash". E.g. malicious parallel modifications can
> arbitrarily prolong the duration of the walk.

How about we describe this as:

Introduce a path iterator, which safely (no crash) walks a struct path.
Without malicious parallel modifications, the walk is guaranteed to
terminate. The sequence of dentries maybe surprising in presence
of parallel directory or mount tree modifications and the iteration may
not ever finish in face of parallel malicious directory tree manipulations.

Current version of path iterator only supports walking towards the root,
with helper path_parent. But the path iterator API can be extended
to cover other use cases.

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 16:53       ` Song Liu
@ 2025-05-29 16:57         ` Alexei Starovoitov
  2025-05-29 17:05           ` Song Liu
  2025-05-29 17:38         ` Al Viro
  1 sibling, 1 reply; 45+ messages in thread
From: Alexei Starovoitov @ 2025-05-29 16:57 UTC (permalink / raw)
  To: Song Liu
  Cc: Jan Kara, Al Viro, bpf, Linux-Fsdevel, LKML, LSM List,
	Kernel Team, Andrii Nakryiko, Eduard, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Christian Brauner, KP Singh,
	Matt Bobrowski, Amir Goldstein, repnop, Jeff Layton, Josef Bacik,
	Mickaël Salaün, Günther Noack

On Thu, May 29, 2025 at 9:53 AM Song Liu <song@kernel.org> wrote:
>
> Hi Al and Jan,
>
> Thanks for your review!
>
> On Thu, May 29, 2025 at 4:58 AM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 28-05-25 23:37:24, Al Viro wrote:
> > > On Wed, May 28, 2025 at 03:26:22PM -0700, Song Liu wrote:
> > > > Introduce a path iterator, which reliably walk a struct path.
> > >
> > > No, it does not.  If you have no external warranty that mount
> > > *and* dentry trees are stable, it's not reliable at all.
> >
> > I agree that advertising this as "reliable walk" is misleading. It is
> > realiable in the sense that it will not dereference freed memory, leak
> > references etc. As you say it is also reliable in the sense that without
> > external modifications to dentry & mount tree, it will crawl the path to
> > root. But in presence of external modifications the only reliability it
> > offers is "it will not crash". E.g. malicious parallel modifications can
> > arbitrarily prolong the duration of the walk.
>
> How about we describe this as:
>
> Introduce a path iterator, which safely (no crash) walks a struct path.
> Without malicious parallel modifications, the walk is guaranteed to
> terminate. The sequence of dentries maybe surprising in presence
> of parallel directory or mount tree modifications and the iteration may
> not ever finish in face of parallel malicious directory tree manipulations.

Hold on. If it's really the case then is the landlock susceptible
to this type of attack already ?
landlock may infinitely loop in the kernel ?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 16:57         ` Alexei Starovoitov
@ 2025-05-29 17:05           ` Song Liu
  2025-05-30 14:20             ` Mickaël Salaün
  0 siblings, 1 reply; 45+ messages in thread
From: Song Liu @ 2025-05-29 17:05 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jan Kara, Al Viro, bpf, Linux-Fsdevel, LKML, LSM List,
	Kernel Team, Andrii Nakryiko, Eduard, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Christian Brauner, KP Singh,
	Matt Bobrowski, Amir Goldstein, repnop, Jeff Layton, Josef Bacik,
	Mickaël Salaün, Günther Noack

On Thu, May 29, 2025 at 9:57 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
[...]
> >
> > How about we describe this as:
> >
> > Introduce a path iterator, which safely (no crash) walks a struct path.
> > Without malicious parallel modifications, the walk is guaranteed to
> > terminate. The sequence of dentries maybe surprising in presence
> > of parallel directory or mount tree modifications and the iteration may
> > not ever finish in face of parallel malicious directory tree manipulations.
>
> Hold on. If it's really the case then is the landlock susceptible
> to this type of attack already ?
> landlock may infinitely loop in the kernel ?

I think this only happens if the attacker can modify the mount or
directory tree as fast as the walk, which is probably impossible
in reality.

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 16:53       ` Song Liu
  2025-05-29 16:57         ` Alexei Starovoitov
@ 2025-05-29 17:38         ` Al Viro
  2025-05-29 18:00           ` Song Liu
  1 sibling, 1 reply; 45+ messages in thread
From: Al Viro @ 2025-05-29 17:38 UTC (permalink / raw)
  To: Song Liu
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 09:53:21AM -0700, Song Liu wrote:

> Current version of path iterator only supports walking towards the root,
> with helper path_parent. But the path iterator API can be extended
> to cover other use cases.

Clarify the last part, please - call me paranoid, but that sounds like
a beginning of something that really should be discussed upfront.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 17:38         ` Al Viro
@ 2025-05-29 18:00           ` Song Liu
  2025-05-29 18:35             ` Al Viro
  2025-06-02  9:27             ` Christian Brauner
  0 siblings, 2 replies; 45+ messages in thread
From: Song Liu @ 2025-05-29 18:00 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 10:38 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, May 29, 2025 at 09:53:21AM -0700, Song Liu wrote:
>
> > Current version of path iterator only supports walking towards the root,
> > with helper path_parent. But the path iterator API can be extended
> > to cover other use cases.
>
> Clarify the last part, please - call me paranoid, but that sounds like
> a beginning of something that really should be discussed upfront.

We don't have any plan with future use cases yet. The only example
I mentioned in the original version of the commit log is "walk the
mount tree". IOW, it is similar to the current iterator, but skips non
mount point iterations.

Since we call it "path iterator", it might make sense to add ways to
iterate the VFS tree in different patterns. For example, we may
have an iterator that iterates all files within a directory. Again, we
don't see urgent use cases other than the current "walk to root"
iterator.

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 18:00           ` Song Liu
@ 2025-05-29 18:35             ` Al Viro
  2025-05-29 19:46               ` Song Liu
  2025-06-02  9:27             ` Christian Brauner
  1 sibling, 1 reply; 45+ messages in thread
From: Al Viro @ 2025-05-29 18:35 UTC (permalink / raw)
  To: Song Liu
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 11:00:51AM -0700, Song Liu wrote:
> On Thu, May 29, 2025 at 10:38 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Thu, May 29, 2025 at 09:53:21AM -0700, Song Liu wrote:
> >
> > > Current version of path iterator only supports walking towards the root,
> > > with helper path_parent. But the path iterator API can be extended
> > > to cover other use cases.
> >
> > Clarify the last part, please - call me paranoid, but that sounds like
> > a beginning of something that really should be discussed upfront.
> 
> We don't have any plan with future use cases yet. The only example
> I mentioned in the original version of the commit log is "walk the
> mount tree". IOW, it is similar to the current iterator, but skips non
> mount point iterations.
> 
> Since we call it "path iterator", it might make sense to add ways to
> iterate the VFS tree in different patterns. For example, we may
> have an iterator that iterates all files within a directory. Again, we
> don't see urgent use cases other than the current "walk to root"
> iterator.

What kinds of locking environments can that end up used in?

The reason why I'm getting more and more unhappy with this thing is
that it sounds like a massive headache for any correctness analysis in
VFS work.

Going straight to the root starting at a point you already have pinned
is relatively mild - you can't do path_put() in any blocking contexts,
obviously, and you'd better be careful with what you are doing on
mountpoint traversal (e.g. combined with "now let's open that directory
and read it" it's an instant "hell, no" - you could easily bypass MNT_LOCKED
restrictions that way), but if there's a threat of that getting augmented
with other things (iterating through all files in directory would be
a very different beast from the locking POV, if nothing else)... ouch.

Basically, you are creating a spot we will need to watch very carefully
from now on.  And the rationale appears to include "so that we could
expose that to random out-of-tree code that decided to call itself LSM",
so pardon me for being rather suspicious about the details.

PS: one general observation: "some LSM does it" does not imply even
"what that LSM is doing is sane and safe", let along "what that LSM is
doing doesn't happen to avoid breakage only by accident".

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 18:35             ` Al Viro
@ 2025-05-29 19:46               ` Song Liu
  2025-05-29 20:15                 ` Al Viro
  0 siblings, 1 reply; 45+ messages in thread
From: Song Liu @ 2025-05-29 19:46 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 11:35 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, May 29, 2025 at 11:00:51AM -0700, Song Liu wrote:
> > On Thu, May 29, 2025 at 10:38 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > On Thu, May 29, 2025 at 09:53:21AM -0700, Song Liu wrote:
> > >
> > > > Current version of path iterator only supports walking towards the root,
> > > > with helper path_parent. But the path iterator API can be extended
> > > > to cover other use cases.
> > >
> > > Clarify the last part, please - call me paranoid, but that sounds like
> > > a beginning of something that really should be discussed upfront.
> >
> > We don't have any plan with future use cases yet. The only example
> > I mentioned in the original version of the commit log is "walk the
> > mount tree". IOW, it is similar to the current iterator, but skips non
> > mount point iterations.
> >
> > Since we call it "path iterator", it might make sense to add ways to
> > iterate the VFS tree in different patterns. For example, we may
> > have an iterator that iterates all files within a directory. Again, we
> > don't see urgent use cases other than the current "walk to root"
> > iterator.
>
> What kinds of locking environments can that end up used in?

This will start with a referenced "struct path", in a sleepable context.

> The reason why I'm getting more and more unhappy with this thing is
> that it sounds like a massive headache for any correctness analysis in
> VFS work.
>
> Going straight to the root starting at a point you already have pinned
> is relatively mild - you can't do path_put() in any blocking contexts,
> obviously, and you'd better be careful with what you are doing on
> mountpoint traversal (e.g. combined with "now let's open that directory
> and read it" it's an instant "hell, no" - you could easily bypass MNT_LOCKED
> restrictions that way), but if there's a threat of that getting augmented
> with other things (iterating through all files in directory would be
> a very different beast from the locking POV, if nothing else)... ouch.

We are fully aware that a "files in the directory" iterator may need
different locking. This is the exact reason we want to provide this
logic as an iterator in the kernel: to get locking/etc correct in the
first place, so that the users can avoid making mistakes.

> Basically, you are creating a spot we will need to watch very carefully
> from now on.  And the rationale appears to include "so that we could
> expose that to random out-of-tree code that decided to call itself LSM",
> so pardon me for being rather suspicious about the details.

No matter what we call them, these use cases exist, out-of-tree or
in-tree, as BPF programs or kernel modules. We are learning from
Landlock here, simply because it is probably the best way to achieve
this.

This particular set introduces a safer API than combinations of
existing APIs (follow_up(), dget_parent(), etc.). It guarantees all
the memory accesses are to properly referenced kernel objects;
it also guaranteed all the acquired references are released.
Therefore, I don't see it adds risks in any sense.

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 19:46               ` Song Liu
@ 2025-05-29 20:15                 ` Al Viro
  2025-05-29 21:07                   ` Song Liu
  0 siblings, 1 reply; 45+ messages in thread
From: Al Viro @ 2025-05-29 20:15 UTC (permalink / raw)
  To: Song Liu
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 12:46:00PM -0700, Song Liu wrote:

> > Basically, you are creating a spot we will need to watch very carefully
> > from now on.  And the rationale appears to include "so that we could
> > expose that to random out-of-tree code that decided to call itself LSM",
> > so pardon me for being rather suspicious about the details.
> 
> No matter what we call them, these use cases exist, out-of-tree or
> in-tree, as BPF programs or kernel modules. We are learning from
> Landlock here, simply because it is probably the best way to achieve
> this.

If out-of-tree code breaks from something we do kernel-side, it's the
problem of that out-of-tree code.  You are asking for a considerable
buy-in, without even bothering to spell out what it is that we are
supposed to care about supporting.

If you want cooperation, explain what is needed, and do it first, so that
there's no goalpost shifting afterwards.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 20:15                 ` Al Viro
@ 2025-05-29 21:07                   ` Song Liu
  2025-05-29 21:45                     ` Al Viro
  0 siblings, 1 reply; 45+ messages in thread
From: Song Liu @ 2025-05-29 21:07 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 1:15 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, May 29, 2025 at 12:46:00PM -0700, Song Liu wrote:
>
> > > Basically, you are creating a spot we will need to watch very carefully
> > > from now on.  And the rationale appears to include "so that we could
> > > expose that to random out-of-tree code that decided to call itself LSM",
> > > so pardon me for being rather suspicious about the details.
> >
> > No matter what we call them, these use cases exist, out-of-tree or
> > in-tree, as BPF programs or kernel modules. We are learning from
> > Landlock here, simply because it is probably the best way to achieve
> > this.
>
> If out-of-tree code breaks from something we do kernel-side, it's the
> problem of that out-of-tree code.  You are asking for a considerable
> buy-in, without even bothering to spell out what it is that we are
> supposed to care about supporting.
>
> If you want cooperation, explain what is needed, and do it first, so that
> there's no goalpost shifting afterwards.

We have made it very clear what is needed now: an iterator that iterates
towards the root. This has been discussed in LPC [1] and
LSF/MM/BPF [2].

We don't know what might be needed in the future. That's why nothing
is shared. If the problem is that this code looks extendible, we sure can
remove it for now. But we cannot promise there will never be use cases
that could benefit from a slightly different path iterator. Either way, if we
are adding/changing anything to the path iterator, you will always be
CC'ed. You are always welcome to NAK anything if there is real issue
with the code being developed.

Thanks,
Song


[1] https://lpc.events/event/18/contributions/1940/
[2] https://lwn.net/Articles/1018493/

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 21:07                   ` Song Liu
@ 2025-05-29 21:45                     ` Al Viro
  2025-05-29 22:13                       ` Song Liu
  0 siblings, 1 reply; 45+ messages in thread
From: Al Viro @ 2025-05-29 21:45 UTC (permalink / raw)
  To: Song Liu
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 02:07:31PM -0700, Song Liu wrote:

> We have made it very clear what is needed now: an iterator that iterates
> towards the root. This has been discussed in LPC [1] and
> LSF/MM/BPF [2].
> 
> We don't know what might be needed in the future. That's why nothing
> is shared. If the problem is that this code looks extendible, we sure can
> remove it for now. But we cannot promise there will never be use cases
> that could benefit from a slightly different path iterator.

For the record, "use cases that could benefit from X" != "sufficient reason
to accept X".

> Either way, if we
> are adding/changing anything to the path iterator, you will always be
> CC'ed. You are always welcome to NAK anything if there is real issue
> with the code being developed.

Umm...  What about walking into the mountpoint of MNT_LOCKED mount?
That, BTW, is an example of non-trivial implications - at the moment
you *can* check that in path->mnt->mnt_flags before walking rootwards
and repeat the step if you walked into the parent.  Clumsy and easy
to get wrong, but it's doable.

OTOH, there's a good cause for moving some of the flags, MNT_LOCKED
included, out of ->mnt_flags and into a separate field in struct mount.
However, that would conflict with any code using that to deal with
your iterator safely.

What's more, AFAICS in case of a stack of mounts each covering the root
of parent mount, you stop in each of those.  The trouble is, umount(2)
propagation logics assumes that intermediate mounts can be pulled out of
such stack without causing trouble.  For pathname resolution that is
true; it goes through the entire stack atomically wrt that stuff.
For your API that's not the case; somebody who has no idea about an
intermediate mount being there might get caught on it while it's getting
pulled from the stack.

What exactly do you need around the mountpoint crossing?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 21:45                     ` Al Viro
@ 2025-05-29 22:13                       ` Song Liu
  2025-05-29 23:10                         ` Al Viro
  0 siblings, 1 reply; 45+ messages in thread
From: Song Liu @ 2025-05-29 22:13 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 2:45 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, May 29, 2025 at 02:07:31PM -0700, Song Liu wrote:
>
> > We have made it very clear what is needed now: an iterator that iterates
> > towards the root. This has been discussed in LPC [1] and
> > LSF/MM/BPF [2].
> >
> > We don't know what might be needed in the future. That's why nothing
> > is shared. If the problem is that this code looks extendible, we sure can
> > remove it for now. But we cannot promise there will never be use cases
> > that could benefit from a slightly different path iterator.
>
> For the record, "use cases that could benefit from X" != "sufficient reason
> to accept X".

Agreed.

>
> > Either way, if we
> > are adding/changing anything to the path iterator, you will always be
> > CC'ed. You are always welcome to NAK anything if there is real issue
> > with the code being developed.
>
> Umm...  What about walking into the mountpoint of MNT_LOCKED mount?
> That, BTW, is an example of non-trivial implications - at the moment
> you *can* check that in path->mnt->mnt_flags before walking rootwards
> and repeat the step if you walked into the parent.  Clumsy and easy
> to get wrong, but it's doable.

Is it an issue if we only hold a reference to a MNT_LOCKED mount for
short period of time? "Short period" means it may get interrupted, page
faults, or wait for an IO (read xattr), but it won't hold a reference to the
mount and sleep indefinitely.

>
> OTOH, there's a good cause for moving some of the flags, MNT_LOCKED
> included, out of ->mnt_flags and into a separate field in struct mount.
> However, that would conflict with any code using that to deal with
> your iterator safely.
>
> What's more, AFAICS in case of a stack of mounts each covering the root
> of parent mount, you stop in each of those.  The trouble is, umount(2)
> propagation logics assumes that intermediate mounts can be pulled out of
> such stack without causing trouble.  For pathname resolution that is
> true; it goes through the entire stack atomically wrt that stuff.
> For your API that's not the case; somebody who has no idea about an
> intermediate mount being there might get caught on it while it's getting
> pulled from the stack.
>
> What exactly do you need around the mountpoint crossing?

I thought about skipping intermediate mounts (that are hidden by
other mounts). AFAICT, not skipping them will not cause any issue.
So I got the API to also show these mounts.

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 22:13                       ` Song Liu
@ 2025-05-29 23:10                         ` Al Viro
  2025-05-30  0:42                           ` Song Liu
  2025-06-02  9:30                           ` Christian Brauner
  0 siblings, 2 replies; 45+ messages in thread
From: Al Viro @ 2025-05-29 23:10 UTC (permalink / raw)
  To: Song Liu
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 03:13:10PM -0700, Song Liu wrote:

> Is it an issue if we only hold a reference to a MNT_LOCKED mount for
> short period of time? "Short period" means it may get interrupted, page
> faults, or wait for an IO (read xattr), but it won't hold a reference to the
> mount and sleep indefinitely.

MNT_LOCKED mount itself is not a problem.  What shouldn't be done is
looking around in the mountpoint it covers.  It depends upon the things
you are going to do with that, but it's very easy to get an infoleak
that way.

> > OTOH, there's a good cause for moving some of the flags, MNT_LOCKED
> > included, out of ->mnt_flags and into a separate field in struct mount.
> > However, that would conflict with any code using that to deal with
> > your iterator safely.
> >
> > What's more, AFAICS in case of a stack of mounts each covering the root
> > of parent mount, you stop in each of those.  The trouble is, umount(2)
> > propagation logics assumes that intermediate mounts can be pulled out of
> > such stack without causing trouble.  For pathname resolution that is
> > true; it goes through the entire stack atomically wrt that stuff.
> > For your API that's not the case; somebody who has no idea about an
> > intermediate mount being there might get caught on it while it's getting
> > pulled from the stack.
> >
> > What exactly do you need around the mountpoint crossing?
> 
> I thought about skipping intermediate mounts (that are hidden by
> other mounts). AFAICT, not skipping them will not cause any issue.

It can.  Suppose e.g. that /mnt gets propagation from another namespace,
but not the other way round and you mount something on /mnt.

Later, in that another namespace, somebody mounts something on wherever
your /mnt gets propagation to.  A copy will be propagated _between_
your /mnt and whatever you've mounted on top of it; it will be entirely
invisible until you umount your /mnt.  At that point the propagated
copy will show up there, same as if it had appeared just after your
umount.  Prior to that it's entirely invisible.  If its original
counterpart in another namespace gets unmounted first, the copy will
be quietly pulled out.

Note that choose_mountpoint_rcu() callers (including choose_mountpoint())
will have mount_lock seqcount sampled before the traversal _and_ recheck
it after having reached the bottom of stack.  IOW, if you traverse ..
on the way to root, you won't get caught on the sucker being pulled out.

Your iterator, OTOH, would stop in that intermediate mount - and get
an unpleasant surprise when it comes back to do the next step (towards
/mnt on root filesystem, that is) and finds that path->mnt points
to something that is detached from everything - no way to get from
it any further.  That - despite the fact that location you've started
from is still mounted, still has the same pathname, etc. and nothing
had been disrupted for it.

And yes, landlock has a narrow race in the matching place.  Needs to
be fixed.  At least it does ignore those as far as any decisions are
concerned...

Note, BTW, that it might be better off by doing that similar to
d_path.c - without arseloads of dget_parent/dput et.al.; not sure
how feasible it is, but if everything in it can be done under
rcu_read_lock(), that's something to look into.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 23:10                         ` Al Viro
@ 2025-05-30  0:42                           ` Song Liu
  2025-05-30 12:20                             ` Mickaël Salaün
  2025-06-04  0:58                             ` Tingmao Wang
  2025-06-02  9:30                           ` Christian Brauner
  1 sibling, 2 replies; 45+ messages in thread
From: Song Liu @ 2025-05-30  0:42 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On Thu, May 29, 2025 at 4:10 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, May 29, 2025 at 03:13:10PM -0700, Song Liu wrote:
>
> > Is it an issue if we only hold a reference to a MNT_LOCKED mount for
> > short period of time? "Short period" means it may get interrupted, page
> > faults, or wait for an IO (read xattr), but it won't hold a reference to the
> > mount and sleep indefinitely.
>
> MNT_LOCKED mount itself is not a problem.  What shouldn't be done is
> looking around in the mountpoint it covers.  It depends upon the things
> you are going to do with that, but it's very easy to get an infoleak
> that way.
>
> > > OTOH, there's a good cause for moving some of the flags, MNT_LOCKED
> > > included, out of ->mnt_flags and into a separate field in struct mount.
> > > However, that would conflict with any code using that to deal with
> > > your iterator safely.
> > >
> > > What's more, AFAICS in case of a stack of mounts each covering the root
> > > of parent mount, you stop in each of those.  The trouble is, umount(2)
> > > propagation logics assumes that intermediate mounts can be pulled out of
> > > such stack without causing trouble.  For pathname resolution that is
> > > true; it goes through the entire stack atomically wrt that stuff.
> > > For your API that's not the case; somebody who has no idea about an
> > > intermediate mount being there might get caught on it while it's getting
> > > pulled from the stack.
> > >
> > > What exactly do you need around the mountpoint crossing?
> >
> > I thought about skipping intermediate mounts (that are hidden by
> > other mounts). AFAICT, not skipping them will not cause any issue.
>
> It can.  Suppose e.g. that /mnt gets propagation from another namespace,
> but not the other way round and you mount something on /mnt.
>
> Later, in that another namespace, somebody mounts something on wherever
> your /mnt gets propagation to.  A copy will be propagated _between_
> your /mnt and whatever you've mounted on top of it; it will be entirely
> invisible until you umount your /mnt.  At that point the propagated
> copy will show up there, same as if it had appeared just after your
> umount.  Prior to that it's entirely invisible.  If its original
> counterpart in another namespace gets unmounted first, the copy will
> be quietly pulled out.

Thanks for sharing this information!

> Note that choose_mountpoint_rcu() callers (including choose_mountpoint())
> will have mount_lock seqcount sampled before the traversal _and_ recheck
> it after having reached the bottom of stack.  IOW, if you traverse ..
> on the way to root, you won't get caught on the sucker being pulled out.

In some of our internal discussions, we talked about using
choose_mountpoint() instead of follow_up(). I didn't go that direction in this
version because it requires holding "root". But if it makes more sense
to use, choose_mountpoint(), we sure can hold "root".

Alternatively, I think it is also OK to pass a zero'ed root to
choose_mountpoint().

> Your iterator, OTOH, would stop in that intermediate mount - and get
> an unpleasant surprise when it comes back to do the next step (towards
> /mnt on root filesystem, that is) and finds that path->mnt points
> to something that is detached from everything - no way to get from
> it any further.  That - despite the fact that location you've started
> from is still mounted, still has the same pathname, etc. and nothing
> had been disrupted for it.
>
> And yes, landlock has a narrow race in the matching place.  Needs to
> be fixed.  At least it does ignore those as far as any decisions are
> concerned...

If we update path_parent in this patchset with choose_mountpoint(),
and use it in Landlock, we will close this race condition, right?

>
> Note, BTW, that it might be better off by doing that similar to
> d_path.c - without arseloads of dget_parent/dput et.al.; not sure
> how feasible it is, but if everything in it can be done under
> rcu_read_lock(), that's something to look into.

I don't think we can do everything here inside rcu_read_lock().
But d_path.c does have some code we can probably reuse or
learn from. Also, we probably need two variations of iterators,
one walk until absolute root, while the other walk until root of
current->fs, just like d_path() vs. d_absolute_path(). Does this
sound reasonable?

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30  0:42                           ` Song Liu
@ 2025-05-30 12:20                             ` Mickaël Salaün
  2025-05-30 18:43                               ` Al Viro
  2025-05-30 18:55                               ` Song Liu
  2025-06-04  0:58                             ` Tingmao Wang
  1 sibling, 2 replies; 45+ messages in thread
From: Mickaël Salaün @ 2025-05-30 12:20 UTC (permalink / raw)
  To: Song Liu
  Cc: Al Viro, Jan Kara, bpf, linux-fsdevel, linux-kernel,
	linux-security-module, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, brauner, kpsingh, mattbobrowski, amir73il, repnop,
	jlayton, josef, gnoack, Tingmao Wang

On Thu, May 29, 2025 at 05:42:16PM -0700, Song Liu wrote:
> On Thu, May 29, 2025 at 4:10 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Thu, May 29, 2025 at 03:13:10PM -0700, Song Liu wrote:
> >
> > > Is it an issue if we only hold a reference to a MNT_LOCKED mount for
> > > short period of time? "Short period" means it may get interrupted, page
> > > faults, or wait for an IO (read xattr), but it won't hold a reference to the
> > > mount and sleep indefinitely.
> >
> > MNT_LOCKED mount itself is not a problem.  What shouldn't be done is
> > looking around in the mountpoint it covers.  It depends upon the things
> > you are going to do with that, but it's very easy to get an infoleak
> > that way.
> >
> > > > OTOH, there's a good cause for moving some of the flags, MNT_LOCKED
> > > > included, out of ->mnt_flags and into a separate field in struct mount.
> > > > However, that would conflict with any code using that to deal with
> > > > your iterator safely.
> > > >
> > > > What's more, AFAICS in case of a stack of mounts each covering the root
> > > > of parent mount, you stop in each of those.  The trouble is, umount(2)
> > > > propagation logics assumes that intermediate mounts can be pulled out of
> > > > such stack without causing trouble.  For pathname resolution that is
> > > > true; it goes through the entire stack atomically wrt that stuff.
> > > > For your API that's not the case; somebody who has no idea about an
> > > > intermediate mount being there might get caught on it while it's getting
> > > > pulled from the stack.
> > > >
> > > > What exactly do you need around the mountpoint crossing?
> > >
> > > I thought about skipping intermediate mounts (that are hidden by
> > > other mounts). AFAICT, not skipping them will not cause any issue.
> >
> > It can.  Suppose e.g. that /mnt gets propagation from another namespace,
> > but not the other way round and you mount something on /mnt.
> >
> > Later, in that another namespace, somebody mounts something on wherever
> > your /mnt gets propagation to.  A copy will be propagated _between_
> > your /mnt and whatever you've mounted on top of it; it will be entirely
> > invisible until you umount your /mnt.  At that point the propagated
> > copy will show up there, same as if it had appeared just after your
> > umount.  Prior to that it's entirely invisible.  If its original
> > counterpart in another namespace gets unmounted first, the copy will
> > be quietly pulled out.
> 
> Thanks for sharing this information!
> 
> > Note that choose_mountpoint_rcu() callers (including choose_mountpoint())
> > will have mount_lock seqcount sampled before the traversal _and_ recheck
> > it after having reached the bottom of stack.  IOW, if you traverse ..
> > on the way to root, you won't get caught on the sucker being pulled out.
> 
> In some of our internal discussions, we talked about using
> choose_mountpoint() instead of follow_up(). I didn't go that direction in this
> version because it requires holding "root". But if it makes more sense
> to use, choose_mountpoint(), we sure can hold "root".
> 
> Alternatively, I think it is also OK to pass a zero'ed root to
> choose_mountpoint().
> 
> > Your iterator, OTOH, would stop in that intermediate mount - and get
> > an unpleasant surprise when it comes back to do the next step (towards
> > /mnt on root filesystem, that is) and finds that path->mnt points
> > to something that is detached from everything - no way to get from
> > it any further.  That - despite the fact that location you've started
> > from is still mounted, still has the same pathname, etc. and nothing
> > had been disrupted for it.
> >
> > And yes, landlock has a narrow race in the matching place.  Needs to
> > be fixed.  At least it does ignore those as far as any decisions are
> > concerned...

Thanks for pointing this out.  In the case of Landlock, walking to a
disconnected mount point (because of this umount race condition) would
deny the requested access whereas it may be allowed otherwise.  This is
not a security issue but still an issue because an event unrelated to
the request (umount) can abort a path resolution, which should not be
the case.

Without access to mount_lock, what would be the best way to fix this
Landlock issue while making it backportable?

> 
> If we update path_parent in this patchset with choose_mountpoint(),
> and use it in Landlock, we will close this race condition, right?

choose_mountpoint() is currently private, but if we add a new filesystem
helper, I think the right approach would be to expose follow_dotdot(),
updating its arguments with public types.  This way the intermediates
mount points will not be exposed, RCU optimization will be leveraged,
and usage of this new helper will be simplified.

> 
> >
> > Note, BTW, that it might be better off by doing that similar to
> > d_path.c - without arseloads of dget_parent/dput et.al.; not sure
> > how feasible it is, but if everything in it can be done under
> > rcu_read_lock(), that's something to look into.
> 
> I don't think we can do everything here inside rcu_read_lock().
> But d_path.c does have some code we can probably reuse or
> learn from. Also, we probably need two variations of iterators,
> one walk until absolute root, while the other walk until root of
> current->fs, just like d_path() vs. d_absolute_path(). Does this
> sound reasonable?

Passing the root to a public follow_dotdot() helper should do the job.

> 
> Thanks,
> Song
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 17:05           ` Song Liu
@ 2025-05-30 14:20             ` Mickaël Salaün
  2025-06-02  9:41               ` Christian Brauner
  2025-06-03  9:46               ` Jan Kara
  0 siblings, 2 replies; 45+ messages in thread
From: Mickaël Salaün @ 2025-05-30 14:20 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, Jan Kara, Al Viro, bpf, Linux-Fsdevel, LKML,
	LSM List, Kernel Team, Andrii Nakryiko, Eduard,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	Christian Brauner, KP Singh, Matt Bobrowski, Amir Goldstein,
	repnop, Jeff Layton, Josef Bacik, Günther Noack

On Thu, May 29, 2025 at 10:05:59AM -0700, Song Liu wrote:
> On Thu, May 29, 2025 at 9:57 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> [...]
> > >
> > > How about we describe this as:
> > >
> > > Introduce a path iterator, which safely (no crash) walks a struct path.
> > > Without malicious parallel modifications, the walk is guaranteed to
> > > terminate. The sequence of dentries maybe surprising in presence
> > > of parallel directory or mount tree modifications and the iteration may
> > > not ever finish in face of parallel malicious directory tree manipulations.
> >
> > Hold on. If it's really the case then is the landlock susceptible
> > to this type of attack already ?
> > landlock may infinitely loop in the kernel ?
> 
> I think this only happens if the attacker can modify the mount or
> directory tree as fast as the walk, which is probably impossible
> in reality.

Yes, so this is not an infinite loop but an infinite race between the
kernel and a very fast malicious user space process with an infinite
number of available nested writable directories, that would also require
a filesystem (and a kernel) supporting infinite pathname length.

> 
> Thanks,
> Song
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30 12:20                             ` Mickaël Salaün
@ 2025-05-30 18:43                               ` Al Viro
  2025-05-31  8:39                                 ` Mickaël Salaün
  2025-06-02  9:32                                 ` Christian Brauner
  2025-05-30 18:55                               ` Song Liu
  1 sibling, 2 replies; 45+ messages in thread
From: Al Viro @ 2025-05-30 18:43 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Song Liu, Jan Kara, bpf, linux-fsdevel, linux-kernel,
	linux-security-module, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, brauner, kpsingh, mattbobrowski, amir73il, repnop,
	jlayton, josef, gnoack, Tingmao Wang

On Fri, May 30, 2025 at 02:20:39PM +0200, Mickaël Salaün wrote:

> Without access to mount_lock, what would be the best way to fix this
> Landlock issue while making it backportable?
> 
> > 
> > If we update path_parent in this patchset with choose_mountpoint(),
> > and use it in Landlock, we will close this race condition, right?
> 
> choose_mountpoint() is currently private, but if we add a new filesystem
> helper, I think the right approach would be to expose follow_dotdot(),
> updating its arguments with public types.  This way the intermediates
> mount points will not be exposed, RCU optimization will be leveraged,
> and usage of this new helper will be simplified.

IMO anything that involves struct nameidata should remain inside
fs/namei.c - something public might share helpers with it, but that's
it.  We had more than enough pain on changes in there, and I'm pretty
sure that we are not done yet; in the area around atomic_open, but not
only there.  Parts of that are still too subtle, IMO - it got a lot
better over the years, but I would really prefer to avoid the need
to bring more code into analysis for any further massage.

Are you sure that follow_dotdot() behaviour is what you really want?

Note that it's not quite how the pathname resolution works.  There we
have the result of follow_dotdot() fed to step_into(), and that changes
things.  Try this:

mkdir /tmp/foo
mkdir /tmp/foo/bar
cd /tmp/foo/bar
mount -t tmpfs none /tmp/foo
touch /tmp/foo/x
ls -Uldi . .. /tmp/foo ../.. /tmp ../x

and think about the results.  Traversing .. is basically "follow_up as much
as possible, then to parent, then follow_down as much as possible" and
the last part (../x) explains why we do it that way.

Which objects would you want to iterate through when dealing with the
current directory in the experiment above?  Simulation of pathwalk
would have the root of overmounting filesystem as the second object
visited; follow_dotdot() would yield the directory overmounted by
that instead.

I'm not saying that either behaviour is right for your case - just that
they are not identical and it's something that needs to be consciously
chosen.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30 12:20                             ` Mickaël Salaün
  2025-05-30 18:43                               ` Al Viro
@ 2025-05-30 18:55                               ` Song Liu
  2025-05-31  8:40                                 ` Mickaël Salaün
  2025-05-31 14:05                                 ` Tingmao Wang
  1 sibling, 2 replies; 45+ messages in thread
From: Song Liu @ 2025-05-30 18:55 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Al Viro, Jan Kara, bpf, linux-fsdevel, linux-kernel,
	linux-security-module, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, brauner, kpsingh, mattbobrowski, amir73il, repnop,
	jlayton, josef, gnoack, Tingmao Wang

On Fri, May 30, 2025 at 5:20 AM Mickaël Salaün <mic@digikod.net> wrote:
[...]
> >
> > If we update path_parent in this patchset with choose_mountpoint(),
> > and use it in Landlock, we will close this race condition, right?
>
> choose_mountpoint() is currently private, but if we add a new filesystem
> helper, I think the right approach would be to expose follow_dotdot(),
> updating its arguments with public types.  This way the intermediates
> mount points will not be exposed, RCU optimization will be leveraged,
> and usage of this new helper will be simplified.

I think it is easier to add a helper similar to follow_dotdot(), but not with
nameidata. follow_dotdot() touches so many things in nameidata, so it
is better to keep it as-is. I am having the following:

/**
 * path_parent - Find the parent of path
 * @path: input and output path.
 * @root: root of the path walk, do not go beyond this root. If @root is
 *        zero'ed, walk all the way to real root.
 *
 * Given a path, find the parent path. Replace @path with the parent path.
 * If we were already at the real root or a disconnected root, @path is
 * not changed.
 *
 * Returns:
 *  true  - if @path is updated to its parent.
 *  false - if @path is already the root (real root or @root).
 */
bool path_parent(struct path *path, const struct path *root)
{
        struct dentry *parent;

        if (path_equal(path, root))
                return false;

        if (unlikely(path->dentry == path->mnt->mnt_root)) {
                struct path p;

                if (!choose_mountpoint(real_mount(path->mnt), root, &p))
                        return false;
                path_put(path);
                *path = p;
                return true;
        }

        if (unlikely(IS_ROOT(path->dentry)))
                return false;

        parent = dget_parent(path->dentry);
        if (unlikely(!path_connected(path->mnt, parent))) {
                dput(parent);
                return false;
        }
        dput(path->dentry);
        path->dentry = parent;
        return true;
}
EXPORT_SYMBOL_GPL(path_parent);

And for Landlock, it is simply:

                if (path_parent(&walker_path, &root))
                        continue;

                if (unlikely(IS_ROOT(walker_path.dentry))) {
                        /*
                         * Stops at disconnected or real root directories.
                         * Only allows access to internal filesystems
                         * (e.g. nsfs, which is reachable through
                         * /proc/<pid>/ns/<namespace>).
                         */
                        if (walker_path.mnt->mnt_flags & MNT_INTERNAL) {
                                allowed_parent1 = true;
                                allowed_parent2 = true;
                        }
                        break;
                }

Does this look right?

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30 18:43                               ` Al Viro
@ 2025-05-31  8:39                                 ` Mickaël Salaün
  2025-06-02  9:32                                 ` Christian Brauner
  1 sibling, 0 replies; 45+ messages in thread
From: Mickaël Salaün @ 2025-05-31  8:39 UTC (permalink / raw)
  To: Al Viro
  Cc: Song Liu, Jan Kara, bpf, linux-fsdevel, linux-kernel,
	linux-security-module, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, brauner, kpsingh, mattbobrowski, amir73il, repnop,
	jlayton, josef, gnoack, Tingmao Wang

On Fri, May 30, 2025 at 07:43:48PM +0100, Al Viro wrote:
> On Fri, May 30, 2025 at 02:20:39PM +0200, Mickaël Salaün wrote:
> 
> > Without access to mount_lock, what would be the best way to fix this
> > Landlock issue while making it backportable?
> > 
> > > 
> > > If we update path_parent in this patchset with choose_mountpoint(),
> > > and use it in Landlock, we will close this race condition, right?
> > 
> > choose_mountpoint() is currently private, but if we add a new filesystem
> > helper, I think the right approach would be to expose follow_dotdot(),
> > updating its arguments with public types.  This way the intermediates
> > mount points will not be exposed, RCU optimization will be leveraged,
> > and usage of this new helper will be simplified.
> 
> IMO anything that involves struct nameidata should remain inside
> fs/namei.c - something public might share helpers with it, but that's
> it.  We had more than enough pain on changes in there, and I'm pretty
> sure that we are not done yet; in the area around atomic_open, but not
> only there.  Parts of that are still too subtle, IMO - it got a lot
> better over the years, but I would really prefer to avoid the need
> to bring more code into analysis for any further massage.
> 
> Are you sure that follow_dotdot() behaviour is what you really want?
> 
> Note that it's not quite how the pathname resolution works.  There we
> have the result of follow_dotdot() fed to step_into(), and that changes
> things.  Try this:
> 
> mkdir /tmp/foo
> mkdir /tmp/foo/bar
> cd /tmp/foo/bar
> mount -t tmpfs none /tmp/foo
> touch /tmp/foo/x
> ls -Uldi . .. /tmp/foo ../.. /tmp ../x
> 
> and think about the results.  Traversing .. is basically "follow_up as much
> as possible, then to parent, then follow_down as much as possible" and
> the last part (../x) explains why we do it that way.
> 
> Which objects would you want to iterate through when dealing with the
> current directory in the experiment above?  Simulation of pathwalk
> would have the root of overmounting filesystem as the second object
> visited; follow_dotdot() would yield the directory overmounted by
> that instead.
> 
> I'm not saying that either behaviour is right for your case - just that
> they are not identical and it's something that needs to be consciously
> chosen.

Thanks, this helps. I didn't though about this semantic difference.  We
don't want the handle_dots() semantic (which call follow_dotdot() and
step_into()), only the (backward) pathwalk one which is equivalent to
follow_dotdot().  I'll add Landlock tests for this specific scenario.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30 18:55                               ` Song Liu
@ 2025-05-31  8:40                                 ` Mickaël Salaün
  2025-05-31 14:05                                 ` Tingmao Wang
  1 sibling, 0 replies; 45+ messages in thread
From: Mickaël Salaün @ 2025-05-31  8:40 UTC (permalink / raw)
  To: Song Liu
  Cc: Al Viro, Jan Kara, bpf, linux-fsdevel, linux-kernel,
	linux-security-module, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, brauner, kpsingh, mattbobrowski, amir73il, repnop,
	jlayton, josef, gnoack, Tingmao Wang

On Fri, May 30, 2025 at 11:55:22AM -0700, Song Liu wrote:
> On Fri, May 30, 2025 at 5:20 AM Mickaël Salaün <mic@digikod.net> wrote:
> [...]
> > >
> > > If we update path_parent in this patchset with choose_mountpoint(),
> > > and use it in Landlock, we will close this race condition, right?
> >
> > choose_mountpoint() is currently private, but if we add a new filesystem
> > helper, I think the right approach would be to expose follow_dotdot(),
> > updating its arguments with public types.  This way the intermediates
> > mount points will not be exposed, RCU optimization will be leveraged,
> > and usage of this new helper will be simplified.
> 
> I think it is easier to add a helper similar to follow_dotdot(), but not with
> nameidata. follow_dotdot() touches so many things in nameidata, so it
> is better to keep it as-is. I am having the following:

I was not suggesting to expose nameidata (only struct path and int), but
yes, a standalone helper is OK and it will not tie it to the current
follow_dotdot() internals.

> 
> /**
>  * path_parent - Find the parent of path

Because we update @path, I'd suggest a name containing "walk", something
like path_walk_parent().

>  * @path: input and output path.
>  * @root: root of the path walk, do not go beyond this root. If @root is
>  *        zero'ed, walk all the way to real root.
>  *
>  * Given a path, find the parent path. Replace @path with the parent path.
>  * If we were already at the real root or a disconnected root, @path is
>  * not changed.

We should explain that the semantic is the same as follow_dotdot(), but
not follow_dots().

>  *
>  * Returns:
>  *  true  - if @path is updated to its parent.
>  *  false - if @path is already the root (real root or @root).
>  */
> bool path_parent(struct path *path, const struct path *root)
> {
>         struct dentry *parent;
> 
>         if (path_equal(path, root))
>                 return false;
> 
>         if (unlikely(path->dentry == path->mnt->mnt_root)) {
>                 struct path p;
> 
>                 if (!choose_mountpoint(real_mount(path->mnt), root, &p))
>                         return false;
>                 path_put(path);
>                 *path = p;
>                 return true;
>         }
> 
>         if (unlikely(IS_ROOT(path->dentry)))
>                 return false;
> 
>         parent = dget_parent(path->dentry);
>         if (unlikely(!path_connected(path->mnt, parent))) {
>                 dput(parent);
>                 return false;
>         }
>         dput(path->dentry);
>         path->dentry = parent;
>         return true;
> }
> EXPORT_SYMBOL_GPL(path_parent);
> 
> And for Landlock, it is simply:
> 
>                 if (path_parent(&walker_path, &root))
>                         continue;
> 
>                 if (unlikely(IS_ROOT(walker_path.dentry))) {
>                         /*
>                          * Stops at disconnected or real root directories.
>                          * Only allows access to internal filesystems
>                          * (e.g. nsfs, which is reachable through
>                          * /proc/<pid>/ns/<namespace>).
>                          */
>                         if (walker_path.mnt->mnt_flags & MNT_INTERNAL) {
>                                 allowed_parent1 = true;
>                                 allowed_parent2 = true;
>                         }
>                         break;
>                 }
> 
> Does this look right?

Yes, thanks.

Al, Christian, would that be OK to backport this helper to fix the
Landlock issue?  If yes, Song could you please put it in in a place that
could be easily backported down to 5.15 e.g., just after handle_dots()?

> 
> Thanks,
> Song
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 2/4] landlock: Use path_parent()
  2025-05-28 22:26 ` [PATCH bpf-next 2/4] landlock: Use path_parent() Song Liu
@ 2025-05-31 13:51   ` Tingmao Wang
  2025-06-02 13:36     ` Song Liu
  2025-06-02 17:35     ` Mickaël Salaün
  0 siblings, 2 replies; 45+ messages in thread
From: Tingmao Wang @ 2025-05-31 13:51 UTC (permalink / raw)
  To: Song Liu, Mickaël Salaün, linux-fsdevel,
	linux-security-module
  Cc: bpf, linux-kernel, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, viro, brauner, jack, kpsingh, mattbobrowski, amir73il,
	repnop, jlayton, josef, gnoack

On 5/28/25 23:26, Song Liu wrote:
> Use path_parent() to walk a path up to its parent.
> 
> While path_parent() has an extra check with path_connected() than existing
> code, there is no functional changes intended for landlock.
> 
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  security/landlock/fs.c | 34 +++++++++++++++++-----------------
>  1 file changed, 17 insertions(+), 17 deletions(-)
> 
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 6fee7c20f64d..32a24758ad6e 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -837,7 +837,6 @@ static bool is_access_to_paths_allowed(
>  	 * restriction.
>  	 */
>  	while (true) {
> -		struct dentry *parent_dentry;
>  		const struct landlock_rule *rule;
>  
>  		/*
> @@ -896,19 +895,17 @@ static bool is_access_to_paths_allowed(
>  		if (allowed_parent1 && allowed_parent2)
>  			break;
>  jump_up:
> -		if (walker_path.dentry == walker_path.mnt->mnt_root) {
> -			if (follow_up(&walker_path)) {
> -				/* Ignores hidden mount points. */
> -				goto jump_up;
> -			} else {
> -				/*
> -				 * Stops at the real root.  Denies access
> -				 * because not all layers have granted access.
> -				 */
> -				break;
> -			}
> -		}
> -		if (unlikely(IS_ROOT(walker_path.dentry))) {
> +		switch (path_parent(&walker_path)) {
> +		case PATH_PARENT_CHANGED_MOUNT:
> +			/* Ignores hidden mount points. */
> +			goto jump_up;
> +		case PATH_PARENT_REAL_ROOT:
> +			/*
> +			 * Stops at the real root.  Denies access
> +			 * because not all layers have granted access.
> +			 */
> +			goto walk_done;
> +		case PATH_PARENT_DISCONNECTED_ROOT:
>  			/*
>  			 * Stops at disconnected root directories.  Only allows
>  			 * access to internal filesystems (e.g. nsfs, which is

I was looking at the existing handling of disconnected root in Landlock
and I realized that the comment here confused me a bit:

/*
 * Stops at disconnected root directories.  Only allows
 * access to internal filesystems (e.g. nsfs, which is
 * reachable through /proc/<pid>/ns/<namespace>).
 */

In the original code, this was under a

    if (unlikely(IS_ROOT(walker_path.dentry)))

which means that it only stops walking if we found out we're disconnected
after reaching a filesystem boundary.  However if before we got to this
point, we have already collected enough rules to allow access, access
would be allowed, even if we're currently disconnected.  Demo:

/ # cd /
/ # cp /linux/samples/landlock/sandboxer .
/ # mkdir a b
/ # mkdir a/foo
/ # echo baz > a/foo/bar
/ # mount --bind a b
/ # LL_FS_RO=/ LL_FS_RW=/ ./sandboxer bash
Executing the sandboxed command...
/ # cd /b/foo
/b/foo # cat bar
baz
/b/foo # mv /a/foo /foo
/b/foo # cd ..     # <- We're now disconnected
bash: cd: ..: No such file or directory
/b/foo # cat bar
baz                # <- but landlock still lets us read the file

However, I think this patch will change this behavior due to the use of
path_connected

root@10a8fff999ce:/# mkdir a b
root@10a8fff999ce:/# mkdir a/foo
root@10a8fff999ce:/# echo baz > a/foo/bar
root@10a8fff999ce:/# mount --bind a b
root@10a8fff999ce:/# LL_FS_RO=/ LL_FS_RW=/ ./sandboxer bash
Executing the sandboxed command...
bash: cannot set terminal process group (191): Inappropriate ioctl for device
bash: no job control in this shell
root@10a8fff999ce:/# cd /b/foo
root@10a8fff999ce:/b/foo# cat bar
baz
root@10a8fff999ce:/b/foo# mv /a/foo /foo
root@10a8fff999ce:/b/foo# cd ..
bash: cd: ..: No such file or directory
root@10a8fff999ce:/b/foo# cat bar
cat: bar: Permission denied

I'm not sure if the original behavior was intentional, but since this
technically counts as a functional changes, just pointing this out.

Also I'm slightly worried about the performance overhead of doing
path_connected for every hop in the iteration (but ultimately it's
Mickaël's call).  At least for Landlock, I think if we want to block all
access to disconnected files, as long as we eventually realize we have
been disconnected (by doing the "if dentry == path.mnt" check once when we
reach root), and in that case deny access, we should be good.


> @@ -918,12 +915,15 @@ static bool is_access_to_paths_allowed(
>  				allowed_parent1 = true;
>  				allowed_parent2 = true;
>  			}
> +			goto walk_done;
> +		case PATH_PARENT_SAME_MOUNT:
>  			break;
> +		default:
> +			WARN_ON_ONCE(1);
> +			goto walk_done;
>  		}
> -		parent_dentry = dget_parent(walker_path.dentry);
> -		dput(walker_path.dentry);
> -		walker_path.dentry = parent_dentry;
>  	}
> +walk_done:
>  	path_put(&walker_path);
>  
>  	if (!allowed_parent1) {


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30 18:55                               ` Song Liu
  2025-05-31  8:40                                 ` Mickaël Salaün
@ 2025-05-31 14:05                                 ` Tingmao Wang
  2025-06-01 23:33                                   ` Song Liu
  1 sibling, 1 reply; 45+ messages in thread
From: Tingmao Wang @ 2025-05-31 14:05 UTC (permalink / raw)
  To: Song Liu, Mickaël Salaün
  Cc: Al Viro, Jan Kara, bpf, linux-fsdevel, linux-kernel,
	linux-security-module, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, brauner, kpsingh, mattbobrowski, amir73il, repnop,
	jlayton, josef, gnoack

On 5/30/25 19:55, Song Liu wrote:
> On Fri, May 30, 2025 at 5:20 AM Mickaël Salaün <mic@digikod.net> wrote:
> [...]
>>>
>>> If we update path_parent in this patchset with choose_mountpoint(),
>>> and use it in Landlock, we will close this race condition, right?
>>
>> choose_mountpoint() is currently private, but if we add a new filesystem
>> helper, I think the right approach would be to expose follow_dotdot(),
>> updating its arguments with public types.  This way the intermediates
>> mount points will not be exposed, RCU optimization will be leveraged,
>> and usage of this new helper will be simplified.
> 
> I think it is easier to add a helper similar to follow_dotdot(), but not with
> nameidata. follow_dotdot() touches so many things in nameidata, so it
> is better to keep it as-is. I am having the following:
> 
> /**
>  * path_parent - Find the parent of path
>  * @path: input and output path.
>  * @root: root of the path walk, do not go beyond this root. If @root is
>  *        zero'ed, walk all the way to real root.
>  *
>  * Given a path, find the parent path. Replace @path with the parent path.
>  * If we were already at the real root or a disconnected root, @path is
>  * not changed.
>  *
>  * Returns:
>  *  true  - if @path is updated to its parent.
>  *  false - if @path is already the root (real root or @root).
>  */
> bool path_parent(struct path *path, const struct path *root)
> {
>         struct dentry *parent;
> 
>         if (path_equal(path, root))
>                 return false;
> 
>         if (unlikely(path->dentry == path->mnt->mnt_root)) {
>                 struct path p;
> 
>                 if (!choose_mountpoint(real_mount(path->mnt), root, &p))
>                         return false;
>                 path_put(path);
>                 *path = p;
>                 return true;
>         }
> 
>         if (unlikely(IS_ROOT(path->dentry)))
>                 return false;
> 
>         parent = dget_parent(path->dentry);
>         if (unlikely(!path_connected(path->mnt, parent))) {
>                 dput(parent);
>                 return false;
>         }
>         dput(path->dentry);
>         path->dentry = parent;
>         return true;
> }
> EXPORT_SYMBOL_GPL(path_parent);
> 
> And for Landlock, it is simply:
> 
>                 if (path_parent(&walker_path, &root))
>                         continue;
> 
>                 if (unlikely(IS_ROOT(walker_path.dentry))) {
>                         /*
>                          * Stops at disconnected or real root directories.
>                          * Only allows access to internal filesystems
>                          * (e.g. nsfs, which is reachable through
>                          * /proc/<pid>/ns/<namespace>).
>                          */
>                         if (walker_path.mnt->mnt_flags & MNT_INTERNAL) {
>                                 allowed_parent1 = true;
>                                 allowed_parent2 = true;
>                         }
>                         break;


Hi, maybe I'm missing the complete picture of this code, but since
path_parent doesn't change walker_path if it returns false (e.g. if it's
disconnected, or choose_mountpoint fails), I think this `break;` should be
outside the

    if (unlikely(IS_ROOT(walker_path.dentry)))

right? (Assuming this whole thing is under a `while (true)`) Otherwise we
might get stuck at the current path and get infinite loop?

>                 }
> 
> Does this look right?
> 
> Thanks,
> Song


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-31 14:05                                 ` Tingmao Wang
@ 2025-06-01 23:33                                   ` Song Liu
  0 siblings, 0 replies; 45+ messages in thread
From: Song Liu @ 2025-06-01 23:33 UTC (permalink / raw)
  To: Tingmao Wang
  Cc: Mickaël Salaün, Al Viro, Jan Kara, bpf, linux-fsdevel,
	linux-kernel, linux-security-module, kernel-team, andrii, eddyz87,
	ast, daniel, martin.lau, brauner, kpsingh, mattbobrowski,
	amir73il, repnop, jlayton, josef, gnoack

On Sat, May 31, 2025 at 7:05 AM Tingmao Wang <m@maowtm.org> wrote:
>
> On 5/30/25 19:55, Song Liu wrote:
> > On Fri, May 30, 2025 at 5:20 AM Mickaël Salaün <mic@digikod.net> wrote:
> > [...]
> >>>
> >>> If we update path_parent in this patchset with choose_mountpoint(),
> >>> and use it in Landlock, we will close this race condition, right?
> >>
> >> choose_mountpoint() is currently private, but if we add a new filesystem
> >> helper, I think the right approach would be to expose follow_dotdot(),
> >> updating its arguments with public types.  This way the intermediates
> >> mount points will not be exposed, RCU optimization will be leveraged,
> >> and usage of this new helper will be simplified.
> >
> > I think it is easier to add a helper similar to follow_dotdot(), but not with
> > nameidata. follow_dotdot() touches so many things in nameidata, so it
> > is better to keep it as-is. I am having the following:
> >
> > /**
> >  * path_parent - Find the parent of path
> >  * @path: input and output path.
> >  * @root: root of the path walk, do not go beyond this root. If @root is
> >  *        zero'ed, walk all the way to real root.
> >  *
> >  * Given a path, find the parent path. Replace @path with the parent path.
> >  * If we were already at the real root or a disconnected root, @path is
> >  * not changed.
> >  *
> >  * Returns:
> >  *  true  - if @path is updated to its parent.
> >  *  false - if @path is already the root (real root or @root).
> >  */
> > bool path_parent(struct path *path, const struct path *root)
> > {
> >         struct dentry *parent;
> >
> >         if (path_equal(path, root))
> >                 return false;
> >
> >         if (unlikely(path->dentry == path->mnt->mnt_root)) {
> >                 struct path p;
> >
> >                 if (!choose_mountpoint(real_mount(path->mnt), root, &p))
> >                         return false;
> >                 path_put(path);
> >                 *path = p;
> >                 return true;
> >         }
> >
> >         if (unlikely(IS_ROOT(path->dentry)))
> >                 return false;
> >
> >         parent = dget_parent(path->dentry);
> >         if (unlikely(!path_connected(path->mnt, parent))) {
> >                 dput(parent);
> >                 return false;
> >         }
> >         dput(path->dentry);
> >         path->dentry = parent;
> >         return true;
> > }
> > EXPORT_SYMBOL_GPL(path_parent);
> >
> > And for Landlock, it is simply:
> >
> >                 if (path_parent(&walker_path, &root))
> >                         continue;
> >
> >                 if (unlikely(IS_ROOT(walker_path.dentry))) {
> >                         /*
> >                          * Stops at disconnected or real root directories.
> >                          * Only allows access to internal filesystems
> >                          * (e.g. nsfs, which is reachable through
> >                          * /proc/<pid>/ns/<namespace>).
> >                          */
> >                         if (walker_path.mnt->mnt_flags & MNT_INTERNAL) {
> >                                 allowed_parent1 = true;
> >                                 allowed_parent2 = true;
> >                         }
> >                         break;
>
>
> Hi, maybe I'm missing the complete picture of this code, but since
> path_parent doesn't change walker_path if it returns false (e.g. if it's
> disconnected, or choose_mountpoint fails), I think this `break;` should be
> outside the
>
>     if (unlikely(IS_ROOT(walker_path.dentry)))
>
> right? (Assuming this whole thing is under a `while (true)`) Otherwise we
> might get stuck at the current path and get infinite loop?

Right, we need "break" outside the if condition.

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 18:00           ` Song Liu
  2025-05-29 18:35             ` Al Viro
@ 2025-06-02  9:27             ` Christian Brauner
  2025-06-02 13:27               ` Song Liu
  1 sibling, 1 reply; 45+ messages in thread
From: Christian Brauner @ 2025-06-02  9:27 UTC (permalink / raw)
  To: Song Liu
  Cc: Al Viro, Jan Kara, bpf, linux-fsdevel, linux-kernel,
	linux-security-module, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, mic, gnoack

On Thu, May 29, 2025 at 11:00:51AM -0700, Song Liu wrote:
> On Thu, May 29, 2025 at 10:38 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Thu, May 29, 2025 at 09:53:21AM -0700, Song Liu wrote:
> >
> > > Current version of path iterator only supports walking towards the root,
> > > with helper path_parent. But the path iterator API can be extended
> > > to cover other use cases.
> >
> > Clarify the last part, please - call me paranoid, but that sounds like
> > a beginning of something that really should be discussed upfront.
> 
> We don't have any plan with future use cases yet. The only example
> I mentioned in the original version of the commit log is "walk the
> mount tree". IOW, it is similar to the current iterator, but skips non
> mount point iterations.
> 
> Since we call it "path iterator", it might make sense to add ways to
> iterate the VFS tree in different patterns. For example, we may

No, we're not adding a swiss-army knife for consumption by out-of-tree
code. I'm not opposed to adding a sane iterator for targeted use-cases
with a clear scope and internal API behavior as I've said multiple times
already on-list and in-person.

I will not merge anything that will endup exploding into some fancy
"walk subtrees in any order you want".

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-29 23:10                         ` Al Viro
  2025-05-30  0:42                           ` Song Liu
@ 2025-06-02  9:30                           ` Christian Brauner
  1 sibling, 0 replies; 45+ messages in thread
From: Christian Brauner @ 2025-06-02  9:30 UTC (permalink / raw)
  To: Al Viro
  Cc: Song Liu, Jan Kara, bpf, linux-fsdevel, linux-kernel,
	linux-security-module, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, mic, gnoack

On Fri, May 30, 2025 at 12:10:18AM +0100, Al Viro wrote:
> On Thu, May 29, 2025 at 03:13:10PM -0700, Song Liu wrote:
> 
> > Is it an issue if we only hold a reference to a MNT_LOCKED mount for
> > short period of time? "Short period" means it may get interrupted, page
> > faults, or wait for an IO (read xattr), but it won't hold a reference to the
> > mount and sleep indefinitely.
> 
> MNT_LOCKED mount itself is not a problem.  What shouldn't be done is
> looking around in the mountpoint it covers.  It depends upon the things
> you are going to do with that, but it's very easy to get an infoleak
> that way.
> 
> > > OTOH, there's a good cause for moving some of the flags, MNT_LOCKED
> > > included, out of ->mnt_flags and into a separate field in struct mount.
> > > However, that would conflict with any code using that to deal with
> > > your iterator safely.
> > >
> > > What's more, AFAICS in case of a stack of mounts each covering the root
> > > of parent mount, you stop in each of those.  The trouble is, umount(2)
> > > propagation logics assumes that intermediate mounts can be pulled out of
> > > such stack without causing trouble.  For pathname resolution that is
> > > true; it goes through the entire stack atomically wrt that stuff.
> > > For your API that's not the case; somebody who has no idea about an
> > > intermediate mount being there might get caught on it while it's getting
> > > pulled from the stack.
> > >
> > > What exactly do you need around the mountpoint crossing?
> > 
> > I thought about skipping intermediate mounts (that are hidden by
> > other mounts). AFAICT, not skipping them will not cause any issue.
> 
> It can.  Suppose e.g. that /mnt gets propagation from another namespace,
> but not the other way round and you mount something on /mnt.
> 
> Later, in that another namespace, somebody mounts something on wherever
> your /mnt gets propagation to.  A copy will be propagated _between_
> your /mnt and whatever you've mounted on top of it; it will be entirely
> invisible until you umount your /mnt.  At that point the propagated
> copy will show up there, same as if it had appeared just after your
> umount.  Prior to that it's entirely invisible.  If its original
> counterpart in another namespace gets unmounted first, the copy will
> be quietly pulled out.

Fwiw, I have explained these and similar issues at length multiple times.

> 
> Note that choose_mountpoint_rcu() callers (including choose_mountpoint())
> will have mount_lock seqcount sampled before the traversal _and_ recheck
> it after having reached the bottom of stack.  IOW, if you traverse ..
> on the way to root, you won't get caught on the sucker being pulled out.
> 
> Your iterator, OTOH, would stop in that intermediate mount - and get
> an unpleasant surprise when it comes back to do the next step (towards
> /mnt on root filesystem, that is) and finds that path->mnt points
> to something that is detached from everything - no way to get from
> it any further.  That - despite the fact that location you've started
> from is still mounted, still has the same pathname, etc. and nothing
> had been disrupted for it.

Same...

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30 18:43                               ` Al Viro
  2025-05-31  8:39                                 ` Mickaël Salaün
@ 2025-06-02  9:32                                 ` Christian Brauner
  1 sibling, 0 replies; 45+ messages in thread
From: Christian Brauner @ 2025-06-02  9:32 UTC (permalink / raw)
  To: Al Viro
  Cc: Mickaël Salaün, Song Liu, Jan Kara, bpf, linux-fsdevel,
	linux-kernel, linux-security-module, kernel-team, andrii, eddyz87,
	ast, daniel, martin.lau, kpsingh, mattbobrowski, amir73il, repnop,
	jlayton, josef, gnoack, Tingmao Wang

On Fri, May 30, 2025 at 07:43:48PM +0100, Al Viro wrote:
> On Fri, May 30, 2025 at 02:20:39PM +0200, Mickaël Salaün wrote:
> 
> > Without access to mount_lock, what would be the best way to fix this
> > Landlock issue while making it backportable?
> > 
> > > 
> > > If we update path_parent in this patchset with choose_mountpoint(),
> > > and use it in Landlock, we will close this race condition, right?
> > 
> > choose_mountpoint() is currently private, but if we add a new filesystem
> > helper, I think the right approach would be to expose follow_dotdot(),
> > updating its arguments with public types.  This way the intermediates
> > mount points will not be exposed, RCU optimization will be leveraged,
> > and usage of this new helper will be simplified.
> 
> IMO anything that involves struct nameidata should remain inside
> fs/namei.c - something public might share helpers with it, but that's

Strongly agree.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30 14:20             ` Mickaël Salaün
@ 2025-06-02  9:41               ` Christian Brauner
  2025-06-03  9:46               ` Jan Kara
  1 sibling, 0 replies; 45+ messages in thread
From: Christian Brauner @ 2025-06-02  9:41 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Song Liu, Alexei Starovoitov, Jan Kara, Al Viro, bpf,
	Linux-Fsdevel, LKML, LSM List, Kernel Team, Andrii Nakryiko,
	Eduard, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	KP Singh, Matt Bobrowski, Amir Goldstein, repnop, Jeff Layton,
	Josef Bacik, Günther Noack

On Fri, May 30, 2025 at 04:20:39PM +0200, Mickaël Salaün wrote:
> On Thu, May 29, 2025 at 10:05:59AM -0700, Song Liu wrote:
> > On Thu, May 29, 2025 at 9:57 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > [...]
> > > >
> > > > How about we describe this as:
> > > >
> > > > Introduce a path iterator, which safely (no crash) walks a struct path.
> > > > Without malicious parallel modifications, the walk is guaranteed to
> > > > terminate. The sequence of dentries maybe surprising in presence
> > > > of parallel directory or mount tree modifications and the iteration may
> > > > not ever finish in face of parallel malicious directory tree manipulations.
> > >
> > > Hold on. If it's really the case then is the landlock susceptible
> > > to this type of attack already ?
> > > landlock may infinitely loop in the kernel ?
> > 
> > I think this only happens if the attacker can modify the mount or
> > directory tree as fast as the walk, which is probably impossible
> > in reality.
> 
> Yes, so this is not an infinite loop but an infinite race between the
> kernel and a very fast malicious user space process with an infinite
> number of available nested writable directories, that would also require
> a filesystem (and a kernel) supporting infinite pathname length.

Uhm, I'm not so sure. If you have really deep directory chains and
expose them via bind-mounts in multiple location then it was already
easy to trigger livelocks because e.g., is_subdir() did lockless
sequence counter checks and it refired over and over and over again.
We've fixed that since but such issues aren't all that theoretical. IOW,
the bug was caused simply by having too many concurrent tree
modifications and parts of the code need to make sure that they haven't
affected their result. So I would be very careful with asserting that
it's not possible to hit such issues in real-life...

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-06-02  9:27             ` Christian Brauner
@ 2025-06-02 13:27               ` Song Liu
  2025-06-02 15:40                 ` Alexei Starovoitov
  0 siblings, 1 reply; 45+ messages in thread
From: Song Liu @ 2025-06-02 13:27 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Al Viro, Jan Kara, bpf, linux-fsdevel, linux-kernel,
	linux-security-module, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, mic, gnoack

On Mon, Jun 2, 2025 at 2:27 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Thu, May 29, 2025 at 11:00:51AM -0700, Song Liu wrote:
> > On Thu, May 29, 2025 at 10:38 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > On Thu, May 29, 2025 at 09:53:21AM -0700, Song Liu wrote:
> > >
> > > > Current version of path iterator only supports walking towards the root,
> > > > with helper path_parent. But the path iterator API can be extended
> > > > to cover other use cases.
> > >
> > > Clarify the last part, please - call me paranoid, but that sounds like
> > > a beginning of something that really should be discussed upfront.
> >
> > We don't have any plan with future use cases yet. The only example
> > I mentioned in the original version of the commit log is "walk the
> > mount tree". IOW, it is similar to the current iterator, but skips non
> > mount point iterations.
> >
> > Since we call it "path iterator", it might make sense to add ways to
> > iterate the VFS tree in different patterns. For example, we may
>
> No, we're not adding a swiss-army knife for consumption by out-of-tree
> code. I'm not opposed to adding a sane iterator for targeted use-cases
> with a clear scope and internal API behavior as I've said multiple times
> already on-list and in-person.
>
> I will not merge anything that will endup exploding into some fancy
> "walk subtrees in any order you want".

We are not proposing (and AFAICT never proposed) to have a
swiss-army knife that "walk subtrees in any order you want". Instead,
we are proposing a sane iterator that serves exactly one use case
now. I guess the concern is that it looks extensible. However, I made
the API like this so that it can be extended, with thorough reviews, to
cover another sane use case. If there is still concern with this. We
sure can make current code not extensible. In case there is a
different sane use case, we will introduce another iterator after
thorough reviews.

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 2/4] landlock: Use path_parent()
  2025-05-31 13:51   ` Tingmao Wang
@ 2025-06-02 13:36     ` Song Liu
  2025-06-03  0:10       ` Song Liu
  2025-06-02 17:35     ` Mickaël Salaün
  1 sibling, 1 reply; 45+ messages in thread
From: Song Liu @ 2025-06-02 13:36 UTC (permalink / raw)
  To: Tingmao Wang
  Cc: Mickaël Salaün, linux-fsdevel, linux-security-module,
	bpf, linux-kernel, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, viro, brauner, jack, kpsingh, mattbobrowski, amir73il,
	repnop, jlayton, josef, gnoack

On Sat, May 31, 2025 at 6:51 AM Tingmao Wang <m@maowtm.org> wrote:
[...]
> I'm not sure if the original behavior was intentional, but since this
> technically counts as a functional changes, just pointing this out.

Thanks for pointing it out! I think it is possible to keep current
behavior. Or we can change the behavior and state that clearly
in the commit log. Mickaël, WDYT?

>
> Also I'm slightly worried about the performance overhead of doing
> path_connected for every hop in the iteration (but ultimately it's
> Mickaël's call).  At least for Landlock, I think if we want to block all

Maybe we need a flag to path_parent (or path_walk_parent) so
that we only check for path_connected when necessary.

Thanks,
Song

> access to disconnected files, as long as we eventually realize we have
> been disconnected (by doing the "if dentry == path.mnt" check once when we
> reach root), and in that case deny access, we should be good.
>
>
> > @@ -918,12 +915,15 @@ static bool is_access_to_paths_allowed(
> >                               allowed_parent1 = true;
> >                               allowed_parent2 = true;
> >                       }
> > +                     goto walk_done;
> > +             case PATH_PARENT_SAME_MOUNT:
> >                       break;
> > +             default:
> > +                     WARN_ON_ONCE(1);
> > +                     goto walk_done;
> >               }
> > -             parent_dentry = dget_parent(walker_path.dentry);
> > -             dput(walker_path.dentry);
> > -             walker_path.dentry = parent_dentry;
> >       }
> > +walk_done:
> >       path_put(&walker_path);
> >
> >       if (!allowed_parent1) {
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-06-02 13:27               ` Song Liu
@ 2025-06-02 15:40                 ` Alexei Starovoitov
  2025-06-02 21:39                   ` Song Liu
  0 siblings, 1 reply; 45+ messages in thread
From: Alexei Starovoitov @ 2025-06-02 15:40 UTC (permalink / raw)
  To: Song Liu
  Cc: Christian Brauner, Al Viro, Jan Kara, bpf, Linux-Fsdevel, LKML,
	LSM List, Kernel Team, Andrii Nakryiko, Eduard,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, KP Singh,
	Matt Bobrowski, Amir Goldstein, repnop, Jeff Layton, Josef Bacik,
	Mickaël Salaün, Günther Noack

On Mon, Jun 2, 2025 at 6:27 AM Song Liu <song@kernel.org> wrote:
>
> On Mon, Jun 2, 2025 at 2:27 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Thu, May 29, 2025 at 11:00:51AM -0700, Song Liu wrote:
> > > On Thu, May 29, 2025 at 10:38 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > > >
> > > > On Thu, May 29, 2025 at 09:53:21AM -0700, Song Liu wrote:
> > > >
> > > > > Current version of path iterator only supports walking towards the root,
> > > > > with helper path_parent. But the path iterator API can be extended
> > > > > to cover other use cases.
> > > >
> > > > Clarify the last part, please - call me paranoid, but that sounds like
> > > > a beginning of something that really should be discussed upfront.
> > >
> > > We don't have any plan with future use cases yet. The only example
> > > I mentioned in the original version of the commit log is "walk the
> > > mount tree". IOW, it is similar to the current iterator, but skips non
> > > mount point iterations.
> > >
> > > Since we call it "path iterator", it might make sense to add ways to
> > > iterate the VFS tree in different patterns. For example, we may
> >
> > No, we're not adding a swiss-army knife for consumption by out-of-tree
> > code. I'm not opposed to adding a sane iterator for targeted use-cases
> > with a clear scope and internal API behavior as I've said multiple times
> > already on-list and in-person.
> >
> > I will not merge anything that will endup exploding into some fancy
> > "walk subtrees in any order you want".
>
> We are not proposing (and AFAICT never proposed) to have a
> swiss-army knife that "walk subtrees in any order you want". Instead,
> we are proposing a sane iterator that serves exactly one use case
> now. I guess the concern is that it looks extensible. However, I made
> the API like this so that it can be extended, with thorough reviews, to
> cover another sane use case. If there is still concern with this. We
> sure can make current code not extensible. In case there is a
> different sane use case, we will introduce another iterator after
> thorough reviews.

It's good that the iterator is extensible, but to achieve that
there is no need to introduce "enum bpf_path_iter_mode"
which implies some unknown walk patterns.
Just add "u64 flags" to bpf_iter_path_new() and
if (!flags) return -EINVAL;
Then we'll have a way to extend that kfunc if really necessary.
Deleting and introducing new kfuncs/iterators is not a big deal,
but reserving 'flags' as an option for extension is almost
always a good backup.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 2/4] landlock: Use path_parent()
  2025-05-31 13:51   ` Tingmao Wang
  2025-06-02 13:36     ` Song Liu
@ 2025-06-02 17:35     ` Mickaël Salaün
  2025-06-02 22:56       ` Tingmao Wang
  1 sibling, 1 reply; 45+ messages in thread
From: Mickaël Salaün @ 2025-06-02 17:35 UTC (permalink / raw)
  To: Tingmao Wang
  Cc: Song Liu, linux-fsdevel, linux-security-module, bpf, linux-kernel,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, gnoack

On Sat, May 31, 2025 at 02:51:22PM +0100, Tingmao Wang wrote:
> On 5/28/25 23:26, Song Liu wrote:
> > Use path_parent() to walk a path up to its parent.
> > 
> > While path_parent() has an extra check with path_connected() than existing
> > code, there is no functional changes intended for landlock.
> > 
> > Signed-off-by: Song Liu <song@kernel.org>
> > ---
> >  security/landlock/fs.c | 34 +++++++++++++++++-----------------
> >  1 file changed, 17 insertions(+), 17 deletions(-)
> > 
> > diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> > index 6fee7c20f64d..32a24758ad6e 100644
> > --- a/security/landlock/fs.c
> > +++ b/security/landlock/fs.c
> > @@ -837,7 +837,6 @@ static bool is_access_to_paths_allowed(
> >  	 * restriction.
> >  	 */
> >  	while (true) {
> > -		struct dentry *parent_dentry;
> >  		const struct landlock_rule *rule;
> >  
> >  		/*
> > @@ -896,19 +895,17 @@ static bool is_access_to_paths_allowed(
> >  		if (allowed_parent1 && allowed_parent2)
> >  			break;
> >  jump_up:
> > -		if (walker_path.dentry == walker_path.mnt->mnt_root) {
> > -			if (follow_up(&walker_path)) {
> > -				/* Ignores hidden mount points. */
> > -				goto jump_up;
> > -			} else {
> > -				/*
> > -				 * Stops at the real root.  Denies access
> > -				 * because not all layers have granted access.
> > -				 */
> > -				break;
> > -			}
> > -		}
> > -		if (unlikely(IS_ROOT(walker_path.dentry))) {
> > +		switch (path_parent(&walker_path)) {
> > +		case PATH_PARENT_CHANGED_MOUNT:
> > +			/* Ignores hidden mount points. */
> > +			goto jump_up;
> > +		case PATH_PARENT_REAL_ROOT:
> > +			/*
> > +			 * Stops at the real root.  Denies access
> > +			 * because not all layers have granted access.
> > +			 */
> > +			goto walk_done;
> > +		case PATH_PARENT_DISCONNECTED_ROOT:
> >  			/*
> >  			 * Stops at disconnected root directories.  Only allows
> >  			 * access to internal filesystems (e.g. nsfs, which is
> 
> I was looking at the existing handling of disconnected root in Landlock
> and I realized that the comment here confused me a bit:
> 
> /*
>  * Stops at disconnected root directories.  Only allows
>  * access to internal filesystems (e.g. nsfs, which is
>  * reachable through /proc/<pid>/ns/<namespace>).
>  */
> 
> In the original code, this was under a
> 
>     if (unlikely(IS_ROOT(walker_path.dentry)))
> 
> which means that it only stops walking if we found out we're disconnected
> after reaching a filesystem boundary.  However if before we got to this
> point, we have already collected enough rules to allow access, access
> would be allowed, even if we're currently disconnected.  Demo:
> 
> / # cd /
> / # cp /linux/samples/landlock/sandboxer .
> / # mkdir a b
> / # mkdir a/foo
> / # echo baz > a/foo/bar
> / # mount --bind a b
> / # LL_FS_RO=/ LL_FS_RW=/ ./sandboxer bash
> Executing the sandboxed command...
> / # cd /b/foo
> /b/foo # cat bar
> baz
> /b/foo # mv /a/foo /foo
> /b/foo # cd ..     # <- We're now disconnected
> bash: cd: ..: No such file or directory
> /b/foo # cat bar
> baz                # <- but landlock still lets us read the file
> 
> However, I think this patch will change this behavior due to the use of
> path_connected
> 
> root@10a8fff999ce:/# mkdir a b
> root@10a8fff999ce:/# mkdir a/foo
> root@10a8fff999ce:/# echo baz > a/foo/bar
> root@10a8fff999ce:/# mount --bind a b
> root@10a8fff999ce:/# LL_FS_RO=/ LL_FS_RW=/ ./sandboxer bash
> Executing the sandboxed command...
> bash: cannot set terminal process group (191): Inappropriate ioctl for device
> bash: no job control in this shell
> root@10a8fff999ce:/# cd /b/foo
> root@10a8fff999ce:/b/foo# cat bar
> baz
> root@10a8fff999ce:/b/foo# mv /a/foo /foo
> root@10a8fff999ce:/b/foo# cd ..
> bash: cd: ..: No such file or directory
> root@10a8fff999ce:/b/foo# cat bar
> cat: bar: Permission denied

This is a good test case, we should add a test for that.

> 
> I'm not sure if the original behavior was intentional, but since this
> technically counts as a functional changes, just pointing this out.

This is indeed an issue.

> 
> Also I'm slightly worried about the performance overhead of doing
> path_connected for every hop in the iteration (but ultimately it's
> Mickaël's call).

Yes, we need to check with a benchmark.  We might want to keep the
walker_path.dentry == walker_path.mnt->mnt_root check inlined.

> At least for Landlock, I think if we want to block all
> access to disconnected files, as long as we eventually realize we have
> been disconnected (by doing the "if dentry == path.mnt" check once when we
> reach root), and in that case deny access, we should be good.
> 
> 
> > @@ -918,12 +915,15 @@ static bool is_access_to_paths_allowed(
> >  				allowed_parent1 = true;
> >  				allowed_parent2 = true;
> >  			}
> > +			goto walk_done;
> > +		case PATH_PARENT_SAME_MOUNT:
> >  			break;
> > +		default:
> > +			WARN_ON_ONCE(1);
> > +			goto walk_done;
> >  		}
> > -		parent_dentry = dget_parent(walker_path.dentry);
> > -		dput(walker_path.dentry);
> > -		walker_path.dentry = parent_dentry;
> >  	}
> > +walk_done:
> >  	path_put(&walker_path);
> >  
> >  	if (!allowed_parent1) {
> 
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-06-02 15:40                 ` Alexei Starovoitov
@ 2025-06-02 21:39                   ` Song Liu
  0 siblings, 0 replies; 45+ messages in thread
From: Song Liu @ 2025-06-02 21:39 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Christian Brauner, Al Viro, Jan Kara, bpf, Linux-Fsdevel, LKML,
	LSM List, Kernel Team, Andrii Nakryiko, Eduard,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, KP Singh,
	Matt Bobrowski, Amir Goldstein, repnop, Jeff Layton, Josef Bacik,
	Mickaël Salaün, Günther Noack

On Mon, Jun 2, 2025 at 8:40 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Jun 2, 2025 at 6:27 AM Song Liu <song@kernel.org> wrote:
> >
> > On Mon, Jun 2, 2025 at 2:27 AM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > On Thu, May 29, 2025 at 11:00:51AM -0700, Song Liu wrote:
> > > > On Thu, May 29, 2025 at 10:38 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > > > >
> > > > > On Thu, May 29, 2025 at 09:53:21AM -0700, Song Liu wrote:
> > > > >
> > > > > > Current version of path iterator only supports walking towards the root,
> > > > > > with helper path_parent. But the path iterator API can be extended
> > > > > > to cover other use cases.
> > > > >
> > > > > Clarify the last part, please - call me paranoid, but that sounds like
> > > > > a beginning of something that really should be discussed upfront.
> > > >
> > > > We don't have any plan with future use cases yet. The only example
> > > > I mentioned in the original version of the commit log is "walk the
> > > > mount tree". IOW, it is similar to the current iterator, but skips non
> > > > mount point iterations.
> > > >
> > > > Since we call it "path iterator", it might make sense to add ways to
> > > > iterate the VFS tree in different patterns. For example, we may
> > >
> > > No, we're not adding a swiss-army knife for consumption by out-of-tree
> > > code. I'm not opposed to adding a sane iterator for targeted use-cases
> > > with a clear scope and internal API behavior as I've said multiple times
> > > already on-list and in-person.
> > >
> > > I will not merge anything that will endup exploding into some fancy
> > > "walk subtrees in any order you want".
> >
> > We are not proposing (and AFAICT never proposed) to have a
> > swiss-army knife that "walk subtrees in any order you want". Instead,
> > we are proposing a sane iterator that serves exactly one use case
> > now. I guess the concern is that it looks extensible. However, I made
> > the API like this so that it can be extended, with thorough reviews, to
> > cover another sane use case. If there is still concern with this. We
> > sure can make current code not extensible. In case there is a
> > different sane use case, we will introduce another iterator after
> > thorough reviews.
>
> It's good that the iterator is extensible, but to achieve that
> there is no need to introduce "enum bpf_path_iter_mode"
> which implies some unknown walk patterns.
> Just add "u64 flags" to bpf_iter_path_new() and
> if (!flags) return -EINVAL;
> Then we'll have a way to extend that kfunc if really necessary.
> Deleting and introducing new kfuncs/iterators is not a big deal,
> but reserving 'flags' as an option for extension is almost
> always a good backup.

Sounds good! I will prepare v2 with a flags field.

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 2/4] landlock: Use path_parent()
  2025-06-02 17:35     ` Mickaël Salaün
@ 2025-06-02 22:56       ` Tingmao Wang
  0 siblings, 0 replies; 45+ messages in thread
From: Tingmao Wang @ 2025-06-02 22:56 UTC (permalink / raw)
  To: Mickaël Salaün, Song Liu
  Cc: linux-fsdevel, linux-security-module, bpf, linux-kernel,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, amir73il, repnop, jlayton,
	josef, gnoack

On 6/2/25 18:35, Mickaël Salaün wrote:
> On Sat, May 31, 2025 at 02:51:22PM +0100, Tingmao Wang wrote:
>> [...]
>>
>> / # cd /
>> / # cp /linux/samples/landlock/sandboxer .
>> / # mkdir a b
>> / # mkdir a/foo
>> / # echo baz > a/foo/bar
>> / # mount --bind a b
>> / # LL_FS_RO=/ LL_FS_RW=/ ./sandboxer bash
>> Executing the sandboxed command...
>> / # cd /b/foo
>> /b/foo # cat bar
>> baz
>> /b/foo # mv /a/foo /foo
>> /b/foo # cd ..     # <- We're now disconnected
>> bash: cd: ..: No such file or directory
>> /b/foo # cat bar
>> baz                # <- but landlock still lets us read the file
>>
>> However, I think this patch will change this behavior due to the use of
>> path_connected
>>
>> root@10a8fff999ce:/# mkdir a b
>> root@10a8fff999ce:/# mkdir a/foo
>> root@10a8fff999ce:/# echo baz > a/foo/bar
>> root@10a8fff999ce:/# mount --bind a b
>> root@10a8fff999ce:/# LL_FS_RO=/ LL_FS_RW=/ ./sandboxer bash
>> Executing the sandboxed command...
>> bash: cannot set terminal process group (191): Inappropriate ioctl for device
>> bash: no job control in this shell
>> root@10a8fff999ce:/# cd /b/foo
>> root@10a8fff999ce:/b/foo# cat bar
>> baz
>> root@10a8fff999ce:/b/foo# mv /a/foo /foo
>> root@10a8fff999ce:/b/foo# cd ..
>> bash: cd: ..: No such file or directory
>> root@10a8fff999ce:/b/foo# cat bar
>> cat: bar: Permission denied
> 
> This is a good test case, we should add a test for that.

Agreed with Mickaël that I will work on a Landlock selftest for this.

> 
>>
>> I'm not sure if the original behavior was intentional, but since this
>> technically counts as a functional changes, just pointing this out.
> 
> This is indeed an issue.

Because in the non-Landlocked case, applications with a disconnected CWD
still can access files under its CWD, just not above, my thinking is that
it's best to keep the current Landlock behaviour.  Mickaël might reply
here too but he also thought that, from the point of view of the person
creating the policy, the current behaviour is less surprising.

>
>>
>> Also I'm slightly worried about the performance overhead of doing
>> path_connected for every hop in the iteration (but ultimately it's
>> Mickaël's call).
> 
> Yes, we need to check with a benchmark.  We might want to keep the
> walker_path.dentry == walker_path.mnt->mnt_root check inlined.

I think this will depend on how the final implementation goes, but if we
can check it only once at the very end (potentially for free by having
logic that can realize the walk never reached path.mnt?), I would think it
ought to not make a difference.

For Song's benefit, if you want to do it, here are some scripts that might
come in handy in benchmarking (created by Mickaël):
https://github.com/landlock-lsm/landlock-test-tools/pull/16
and comparing results / BPF-based overhead tracing:
https://github.com/landlock-lsm/landlock-test-tools/pull/17

> >> At least for Landlock, I think if we want to block all
>> access to disconnected files, as long as we eventually realize we have
>> been disconnected (by doing the "if dentry == path.mnt" check once when we
>> reach root), and in that case deny access, we should be good.
>>
>>
>>> @@ -918,12 +915,15 @@ static bool is_access_to_paths_allowed(
>>>  				allowed_parent1 = true;
>>>  				allowed_parent2 = true;
>>>  			}
>>> +			goto walk_done;
>>> +		case PATH_PARENT_SAME_MOUNT:
>>>  			break;
>>> +		default:
>>> +			WARN_ON_ONCE(1);
>>> +			goto walk_done;
>>>  		}
>>> -		parent_dentry = dget_parent(walker_path.dentry);
>>> -		dput(walker_path.dentry);
>>> -		walker_path.dentry = parent_dentry;
>>>  	}
>>> +walk_done:
>>>  	path_put(&walker_path);
>>>  
>>>  	if (!allowed_parent1) {
>>
>>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 2/4] landlock: Use path_parent()
  2025-06-02 13:36     ` Song Liu
@ 2025-06-03  0:10       ` Song Liu
  2025-06-03 12:47         ` Mickaël Salaün
  0 siblings, 1 reply; 45+ messages in thread
From: Song Liu @ 2025-06-03  0:10 UTC (permalink / raw)
  To: Tingmao Wang
  Cc: Mickaël Salaün, linux-fsdevel, linux-security-module,
	bpf, linux-kernel, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, viro, brauner, jack, kpsingh, mattbobrowski, amir73il,
	repnop, jlayton, josef, gnoack

On Mon, Jun 2, 2025 at 6:36 AM Song Liu <song@kernel.org> wrote:
>
> On Sat, May 31, 2025 at 6:51 AM Tingmao Wang <m@maowtm.org> wrote:
> [...]
> > I'm not sure if the original behavior was intentional, but since this
> > technically counts as a functional changes, just pointing this out.
>
> Thanks for pointing it out! I think it is possible to keep current
> behavior. Or we can change the behavior and state that clearly
> in the commit log. Mickaël, WDYT?
>
> >
> > Also I'm slightly worried about the performance overhead of doing
> > path_connected for every hop in the iteration (but ultimately it's
> > Mickaël's call).  At least for Landlock, I think if we want to block all
>
> Maybe we need a flag to path_parent (or path_walk_parent) so
> that we only check for path_connected when necessary.

More thoughts on path_connected(). I think it makes sense for
path_parent (or path_walk_parent) to continue walking
with path_connected() == false. This is because for most security
use cases, it makes sense for umounted bind mount to fall back
to the permissions of the original mount OTOH, it also makes sense
for follow_dotdot to reject this access at path lookup time. If the
user of path_walk_parent decided to stop walking at disconnected
path, another check can be added at the caller side.

If there are no objections, I will remove the path_connected check
from path_walk_parent().

Thanks,
Song

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30 14:20             ` Mickaël Salaün
  2025-06-02  9:41               ` Christian Brauner
@ 2025-06-03  9:46               ` Jan Kara
  2025-06-03 12:49                 ` Mickaël Salaün
  1 sibling, 1 reply; 45+ messages in thread
From: Jan Kara @ 2025-06-03  9:46 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Song Liu, Alexei Starovoitov, Jan Kara, Al Viro, bpf,
	Linux-Fsdevel, LKML, LSM List, Kernel Team, Andrii Nakryiko,
	Eduard, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	Christian Brauner, KP Singh, Matt Bobrowski, Amir Goldstein,
	repnop, Jeff Layton, Josef Bacik, Günther Noack

On Fri 30-05-25 16:20:39, Mickaël Salaün wrote:
> On Thu, May 29, 2025 at 10:05:59AM -0700, Song Liu wrote:
> > On Thu, May 29, 2025 at 9:57 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > [...]
> > > >
> > > > How about we describe this as:
> > > >
> > > > Introduce a path iterator, which safely (no crash) walks a struct path.
> > > > Without malicious parallel modifications, the walk is guaranteed to
> > > > terminate. The sequence of dentries maybe surprising in presence
> > > > of parallel directory or mount tree modifications and the iteration may
> > > > not ever finish in face of parallel malicious directory tree manipulations.
> > >
> > > Hold on. If it's really the case then is the landlock susceptible
> > > to this type of attack already ?
> > > landlock may infinitely loop in the kernel ?
> > 
> > I think this only happens if the attacker can modify the mount or
> > directory tree as fast as the walk, which is probably impossible
> > in reality.
> 
> Yes, so this is not an infinite loop but an infinite race between the
> kernel and a very fast malicious user space process with an infinite
> number of available nested writable directories, that would also require
> a filesystem (and a kernel) supporting infinite pathname length.

Well, you definitely don't need infinite pathname length. Example:

Have a dir hierarchy like:

  A
 / \
B   C
|
D

Start iterating from A/B/D, you climb up to A/B. In parallel atacker does:

mv A/B/ A/C/; mkdir A/B

Now by following parent you get to A/C. In parallel attaker does:

mv A/C/ A/B/; mkdir A/C

And now you are essentially where you've started so this can repeat
forever.

As others wrote this particular timing might be hard enough to hit for it
to not be a practical attack but I would not bet much on somebody not being
able to invent some variant that works, in particular with BPF iterator.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 2/4] landlock: Use path_parent()
  2025-06-03  0:10       ` Song Liu
@ 2025-06-03 12:47         ` Mickaël Salaün
  0 siblings, 0 replies; 45+ messages in thread
From: Mickaël Salaün @ 2025-06-03 12:47 UTC (permalink / raw)
  To: Song Liu
  Cc: Tingmao Wang, linux-fsdevel, linux-security-module, bpf,
	linux-kernel, kernel-team, andrii, eddyz87, ast, daniel,
	martin.lau, viro, brauner, jack, kpsingh, mattbobrowski, amir73il,
	repnop, jlayton, josef, gnoack

On Mon, Jun 02, 2025 at 05:10:21PM -0700, Song Liu wrote:
> On Mon, Jun 2, 2025 at 6:36 AM Song Liu <song@kernel.org> wrote:
> >
> > On Sat, May 31, 2025 at 6:51 AM Tingmao Wang <m@maowtm.org> wrote:
> > [...]
> > > I'm not sure if the original behavior was intentional, but since this
> > > technically counts as a functional changes, just pointing this out.
> >
> > Thanks for pointing it out! I think it is possible to keep current
> > behavior. Or we can change the behavior and state that clearly
> > in the commit log. Mickaël, WDYT?
> >
> > >
> > > Also I'm slightly worried about the performance overhead of doing
> > > path_connected for every hop in the iteration (but ultimately it's
> > > Mickaël's call).  At least for Landlock, I think if we want to block all
> >
> > Maybe we need a flag to path_parent (or path_walk_parent) so
> > that we only check for path_connected when necessary.
> 
> More thoughts on path_connected(). I think it makes sense for
> path_parent (or path_walk_parent) to continue walking
> with path_connected() == false. This is because for most security
> use cases, it makes sense for umounted bind mount to fall back
> to the permissions of the original mount OTOH, it also makes sense
> for follow_dotdot to reject this access at path lookup time. If the
> user of path_walk_parent decided to stop walking at disconnected
> path, another check can be added at the caller side.

I agree.

> 
> If there are no objections, I will remove the path_connected check
> from path_walk_parent().

Sounds good.  The documentation should explain this rationale and
highlight the differences with follow_dotdot().

> 
> Thanks,
> Song
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-06-03  9:46               ` Jan Kara
@ 2025-06-03 12:49                 ` Mickaël Salaün
  2025-06-03 21:13                   ` Jan Kara
  0 siblings, 1 reply; 45+ messages in thread
From: Mickaël Salaün @ 2025-06-03 12:49 UTC (permalink / raw)
  To: Jan Kara
  Cc: Song Liu, Alexei Starovoitov, Al Viro, bpf, Linux-Fsdevel, LKML,
	LSM List, Kernel Team, Andrii Nakryiko, Eduard,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	Christian Brauner, KP Singh, Matt Bobrowski, Amir Goldstein,
	repnop, Jeff Layton, Josef Bacik, Günther Noack, Jann Horn

On Tue, Jun 03, 2025 at 11:46:22AM +0200, Jan Kara wrote:
> On Fri 30-05-25 16:20:39, Mickaël Salaün wrote:
> > On Thu, May 29, 2025 at 10:05:59AM -0700, Song Liu wrote:
> > > On Thu, May 29, 2025 at 9:57 AM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > [...]
> > > > >
> > > > > How about we describe this as:
> > > > >
> > > > > Introduce a path iterator, which safely (no crash) walks a struct path.
> > > > > Without malicious parallel modifications, the walk is guaranteed to
> > > > > terminate. The sequence of dentries maybe surprising in presence
> > > > > of parallel directory or mount tree modifications and the iteration may
> > > > > not ever finish in face of parallel malicious directory tree manipulations.
> > > >
> > > > Hold on. If it's really the case then is the landlock susceptible
> > > > to this type of attack already ?
> > > > landlock may infinitely loop in the kernel ?
> > > 
> > > I think this only happens if the attacker can modify the mount or
> > > directory tree as fast as the walk, which is probably impossible
> > > in reality.
> > 
> > Yes, so this is not an infinite loop but an infinite race between the
> > kernel and a very fast malicious user space process with an infinite
> > number of available nested writable directories, that would also require
> > a filesystem (and a kernel) supporting infinite pathname length.
> 
> Well, you definitely don't need infinite pathname length. Example:
> 
> Have a dir hierarchy like:
> 
>   A
>  / \
> B   C
> |
> D
> 
> Start iterating from A/B/D, you climb up to A/B. In parallel atacker does:
> 
> mv A/B/ A/C/; mkdir A/B
> 
> Now by following parent you get to A/C. In parallel attaker does:
> 
> mv A/C/ A/B/; mkdir A/C
> 
> And now you are essentially where you've started so this can repeat
> forever.

Yes, this is the scenario I had in mind talking about "infinite race"
(instead of infinite loop).  For this to work it will require the
filesystem to support an infinite number of nested directories, but I'm
not sure which FS could be eligible.

Anyway, what would would be the threat model for this infinite race?

> 
> As others wrote this particular timing might be hard enough to hit for it
> to not be a practical attack but I would not bet much on somebody not being
> able to invent some variant that works, in particular with BPF iterator.

There might exist corner cases that could be an issue but would the
impact be different than with other kinds of path walk?

What could we do to avoid or limit such issue?

> 
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-06-03 12:49                 ` Mickaël Salaün
@ 2025-06-03 21:13                   ` Jan Kara
  0 siblings, 0 replies; 45+ messages in thread
From: Jan Kara @ 2025-06-03 21:13 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Jan Kara, Song Liu, Alexei Starovoitov, Al Viro, bpf,
	Linux-Fsdevel, LKML, LSM List, Kernel Team, Andrii Nakryiko,
	Eduard, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	Christian Brauner, KP Singh, Matt Bobrowski, Amir Goldstein,
	repnop, Jeff Layton, Josef Bacik, Günther Noack, Jann Horn

On Tue 03-06-25 14:49:09, Mickaël Salaün wrote:
> On Tue, Jun 03, 2025 at 11:46:22AM +0200, Jan Kara wrote:
> > On Fri 30-05-25 16:20:39, Mickaël Salaün wrote:
> > > On Thu, May 29, 2025 at 10:05:59AM -0700, Song Liu wrote:
> > > > On Thu, May 29, 2025 at 9:57 AM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > [...]
> > > > > >
> > > > > > How about we describe this as:
> > > > > >
> > > > > > Introduce a path iterator, which safely (no crash) walks a struct path.
> > > > > > Without malicious parallel modifications, the walk is guaranteed to
> > > > > > terminate. The sequence of dentries maybe surprising in presence
> > > > > > of parallel directory or mount tree modifications and the iteration may
> > > > > > not ever finish in face of parallel malicious directory tree manipulations.
> > > > >
> > > > > Hold on. If it's really the case then is the landlock susceptible
> > > > > to this type of attack already ?
> > > > > landlock may infinitely loop in the kernel ?
> > > > 
> > > > I think this only happens if the attacker can modify the mount or
> > > > directory tree as fast as the walk, which is probably impossible
> > > > in reality.
> > > 
> > > Yes, so this is not an infinite loop but an infinite race between the
> > > kernel and a very fast malicious user space process with an infinite
> > > number of available nested writable directories, that would also require
> > > a filesystem (and a kernel) supporting infinite pathname length.
> > 
> > Well, you definitely don't need infinite pathname length. Example:
> > 
> > Have a dir hierarchy like:
> > 
> >   A
> >  / \
> > B   C
> > |
> > D
> > 
> > Start iterating from A/B/D, you climb up to A/B. In parallel atacker does:
> > 
> > mv A/B/ A/C/; mkdir A/B
> > 
> > Now by following parent you get to A/C. In parallel attaker does:
> > 
> > mv A/C/ A/B/; mkdir A/C
> > 
> > And now you are essentially where you've started so this can repeat
> > forever.
> 
> Yes, this is the scenario I had in mind talking about "infinite race"
> (instead of infinite loop).  For this to work it will require the
> filesystem to support an infinite number of nested directories, but I'm
> not sure which FS could be eligible.

Well, most filesystems don't limit depth of a directory hierarchy in any
particular way. The depth is limited only by available disk space.

> Anyway, what would would be the threat model for this infinite race?

These kinds of problems can usually lead to DoS. That's not too serious
problem as local users with write fs access can cause smaller or larger
troubles to the system anyway but it can be rather unpleasant e.g. for
container hosting systems if you can force e.g. container management tools
to spend attacker-controlled time in the kernel just because Landlock or
the eBPF "security solution" ends up crawling path in attacker controlled
part of filesystem.

> > As others wrote this particular timing might be hard enough to hit for it
> > to not be a practical attack but I would not bet much on somebody not being
> > able to invent some variant that works, in particular with BPF iterator.
> 
> There might exist corner cases that could be an issue but would the
> impact be different than with other kinds of path walk?

Well, a standard path lookup is limited by the length of the path (and
symlink recursion limit). Things like common parent lookup during rename
acquire locks to block parallel directory tree modifications so similar
attacks are impossible there. It is only the "walk from some dentry to the
root without holding any locks" pattern that has these kind of unbounded
looping issues.

> What could we do to avoid or limit such issue?

That's a tough question. You can hold rename_lock which blocks all
directory renames. This obviously makes the walk-to-the-root safe against
any "rename" attacks but it is a system wide lock with all the obvious
scalability implications. So I don't think that's really feasible. You
could also limit the number of steps in your pathwalk and return error if
it gets exceeded. I believe that would be a more practical solution.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
  2025-05-30  0:42                           ` Song Liu
  2025-05-30 12:20                             ` Mickaël Salaün
@ 2025-06-04  0:58                             ` Tingmao Wang
  1 sibling, 0 replies; 45+ messages in thread
From: Tingmao Wang @ 2025-06-04  0:58 UTC (permalink / raw)
  To: Song Liu, Al Viro
  Cc: Jan Kara, bpf, linux-fsdevel, linux-kernel, linux-security-module,
	kernel-team, andrii, eddyz87, ast, daniel, martin.lau, brauner,
	kpsingh, mattbobrowski, amir73il, repnop, jlayton, josef, mic,
	gnoack

On 5/30/25 01:42, Song Liu wrote:
> [...]
> On Thu, May 29, 2025 at 4:10 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>>
>> Note, BTW, that it might be better off by doing that similar to
>> d_path.c - without arseloads of dget_parent/dput et.al.; not sure
>> how feasible it is, but if everything in it can be done under
>> rcu_read_lock(), that's something to look into.
> 
> I don't think we can do everything here inside rcu_read_lock().
> But d_path.c does have some code we can probably reuse or
> learn from.

Note that I've made an RFC patch for this as I've also been looking into
this a bit earlier:

https://lore.kernel.org/all/cover.1748997840.git.m@maowtm.org/

I've CC'd some people here but not all, in the interest of not spamming
like 20 people, but feedback from all is welcome.  Mine is also its own
separate patch that shouldn't block Song's patch here, and I can rebase it
on top of (v2 or a later version of) this series once this is merged.

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2025-06-04  0:58 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-28 22:26 [PATCH bpf-next 0/4] bpf path iterator Song Liu
2025-05-28 22:26 ` [PATCH bpf-next 1/4] namei: Introduce new helper function path_parent() Song Liu
2025-05-28 22:26 ` [PATCH bpf-next 2/4] landlock: Use path_parent() Song Liu
2025-05-31 13:51   ` Tingmao Wang
2025-06-02 13:36     ` Song Liu
2025-06-03  0:10       ` Song Liu
2025-06-03 12:47         ` Mickaël Salaün
2025-06-02 17:35     ` Mickaël Salaün
2025-06-02 22:56       ` Tingmao Wang
2025-05-28 22:26 ` [PATCH bpf-next 3/4] bpf: Introduce path iterator Song Liu
2025-05-28 22:37   ` Al Viro
2025-05-29 11:58     ` Jan Kara
2025-05-29 16:53       ` Song Liu
2025-05-29 16:57         ` Alexei Starovoitov
2025-05-29 17:05           ` Song Liu
2025-05-30 14:20             ` Mickaël Salaün
2025-06-02  9:41               ` Christian Brauner
2025-06-03  9:46               ` Jan Kara
2025-06-03 12:49                 ` Mickaël Salaün
2025-06-03 21:13                   ` Jan Kara
2025-05-29 17:38         ` Al Viro
2025-05-29 18:00           ` Song Liu
2025-05-29 18:35             ` Al Viro
2025-05-29 19:46               ` Song Liu
2025-05-29 20:15                 ` Al Viro
2025-05-29 21:07                   ` Song Liu
2025-05-29 21:45                     ` Al Viro
2025-05-29 22:13                       ` Song Liu
2025-05-29 23:10                         ` Al Viro
2025-05-30  0:42                           ` Song Liu
2025-05-30 12:20                             ` Mickaël Salaün
2025-05-30 18:43                               ` Al Viro
2025-05-31  8:39                                 ` Mickaël Salaün
2025-06-02  9:32                                 ` Christian Brauner
2025-05-30 18:55                               ` Song Liu
2025-05-31  8:40                                 ` Mickaël Salaün
2025-05-31 14:05                                 ` Tingmao Wang
2025-06-01 23:33                                   ` Song Liu
2025-06-04  0:58                             ` Tingmao Wang
2025-06-02  9:30                           ` Christian Brauner
2025-06-02  9:27             ` Christian Brauner
2025-06-02 13:27               ` Song Liu
2025-06-02 15:40                 ` Alexei Starovoitov
2025-06-02 21:39                   ` Song Liu
2025-05-28 22:26 ` [PATCH bpf-next 4/4] selftests/bpf: Add tests for bpf " Song Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).