[PATCH bpf-next 0/2] Add kfuncs to support reading xattr from dentry

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH bpf-next 0/2] Add kfuncs to support reading xattr from dentry
@ 2024-07-25 23:47 Song Liu
  2024-07-25 23:47 ` [PATCH bpf-next 1/2] bpf: Add kfunc bpf_get_dentry_xattr() to read " Song Liu
  2024-07-25 23:47 ` [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr Song Liu
  0 siblings, 2 replies; 25+ messages in thread
From: Song Liu @ 2024-07-25 23:47 UTC (permalink / raw)
  To: bpf, linux-fsdevel, linux-kernel
  Cc: kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, Song Liu

The use case is to tag directories with xattr. For example, "user.allow_x"
tag on /bin would allow execution of all files under /bin. To achieve this,
we start from the security_file_open() hook and read xattr of all the
parent directories.

4 kfuncs are added in this set:

  bpf_get_dentry_xattr
  bpf_file_dentry
  bpf_dget_parent
  bpf_dput

The first is used to read xattr from dentry; while the other 3 are used to
walk the directory tree.

Note that, these functions are only allowed from LSM hooks.

Song Liu (2):
  bpf: Add kfunc bpf_get_dentry_xattr() to read xattr from dentry
  selftests/bpf: Add tests for bpf_get_dentry_xattr

 kernel/trace/bpf_trace.c                      | 60 +++++++++++++++---
 .../selftests/bpf/prog_tests/fs_kfuncs.c      | 61 +++++++++++++++++--
 .../selftests/bpf/progs/test_dentry_xattr.c   | 46 ++++++++++++++
 .../selftests/bpf/progs/test_get_xattr.c      | 16 ++++-
 4 files changed, 168 insertions(+), 15 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_dentry_xattr.c

--
2.43.0

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH bpf-next 1/2] bpf: Add kfunc bpf_get_dentry_xattr() to read xattr from dentry
  2024-07-25 23:47 [PATCH bpf-next 0/2] Add kfuncs to support reading xattr from dentry Song Liu
@ 2024-07-25 23:47 ` Song Liu
  2024-07-26  5:34   ` Al Viro
  2024-07-25 23:47 ` [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr Song Liu
  1 sibling, 1 reply; 25+ messages in thread
From: Song Liu @ 2024-07-25 23:47 UTC (permalink / raw)
  To: bpf, linux-fsdevel, linux-kernel
  Cc: kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, Song Liu

The primary use case here is to read xattr for directories from BPF
programs. This feature enables tagging all the files in a directory
with a xattr on the directory. More specifically, starting from LSM hook
security_file_open(), we can read xattr of the file's parent directories.

To have referenced access to dentry, a few more kfuncs are added:
  - bpf_file_dentry
  - bpf_dget_parent
  - bpf_dput

Both bpf_file_dentry and bpf_dget_parent take a reference to the dentry
(KF_ACQUIRE), which has to be released by bpf_dput (KF_RELEASE). This
makes sure only trusted pointers (KF_TRUSTED_ARGS) can be used for the
dentry.

Note that, file_dentry() doesn't take reference to the dentry, but
bpf_file_dentry() does.

Signed-off-by: Song Liu <song@kernel.org>
---
 kernel/trace/bpf_trace.c | 60 ++++++++++++++++++++++++++++++++++------
 1 file changed, 51 insertions(+), 9 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index cd098846e251..b8e6eeabb773 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1443,22 +1443,21 @@ late_initcall(bpf_key_sig_kfuncs_init);
 __bpf_kfunc_start_defs();
 
 /**
- * bpf_get_file_xattr - get xattr of a file
- * @file: file to get xattr from
+ * bpf_get_dentry_xattr - get xattr of a dentry
+ * @dentry: dentry to get xattr from
  * @name__str: name of the xattr
  * @value_p: output buffer of the xattr value
  *
- * Get xattr *name__str* of *file* and store the output in *value_ptr*.
+ * Get xattr *name__str* of *dentry* and store the output in *value_ptr*.
  *
  * For security reasons, only *name__str* with prefix "user." is allowed.
  *
  * Return: 0 on success, a negative value on error.
  */
-__bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str,
-				   struct bpf_dynptr *value_p)
+__bpf_kfunc int bpf_get_dentry_xattr(struct dentry *dentry, const char *name__str,
+				     struct bpf_dynptr *value_p)
 {
 	struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
-	struct dentry *dentry;
 	u32 value_len;
 	void *value;
 	int ret;
@@ -1471,20 +1470,63 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str,
 	if (!value)
 		return -EINVAL;
 
-	dentry = file_dentry(file);
 	ret = inode_permission(&nop_mnt_idmap, dentry->d_inode, MAY_READ);
 	if (ret)
 		return ret;
 	return __vfs_getxattr(dentry, dentry->d_inode, name__str, value, value_len);
 }
 
+/**
+ * bpf_get_file_xattr - get xattr of a file
+ * @file: file to get xattr from
+ * @name__str: name of the xattr
+ * @value_p: output buffer of the xattr value
+ *
+ * Get xattr *name__str* of *file* and store the output in *value_ptr*.
+ *
+ * For security reasons, only *name__str* with prefix "user." is allowed.
+ *
+ * Return: 0 on success, a negative value on error.
+ */
+__bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str,
+				   struct bpf_dynptr *value_p)
+{
+	struct dentry *dentry;
+
+	dentry = file_dentry(file);
+	return bpf_get_dentry_xattr(dentry, name__str, value_p);
+}
+
+__bpf_kfunc struct dentry *bpf_file_dentry(const struct file *file)
+{
+	/* file_dentry() does not hold reference to the dentry. We add a
+	 * dget() here so that we can add KF_ACQUIRE flag to
+	 * bpf_file_dentry().
+	 */
+	return dget(file_dentry(file));
+}
+
+__bpf_kfunc struct dentry *bpf_dget_parent(struct dentry *dentry)
+{
+	return dget_parent(dentry);
+}
+
+__bpf_kfunc void bpf_dput(struct dentry *dentry)
+{
+	return dput(dentry);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(fs_kfunc_set_ids)
+BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_file_dentry, KF_TRUSTED_ARGS | KF_ACQUIRE)
+BTF_ID_FLAGS(func, bpf_dget_parent, KF_TRUSTED_ARGS | KF_ACQUIRE)
+BTF_ID_FLAGS(func, bpf_dput, KF_RELEASE)
 BTF_KFUNCS_END(fs_kfunc_set_ids)
 
-static int bpf_get_file_xattr_filter(const struct bpf_prog *prog, u32 kfunc_id)
+static int fs_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
 {
 	if (!btf_id_set8_contains(&fs_kfunc_set_ids, kfunc_id))
 		return 0;
@@ -1496,7 +1538,7 @@ static int bpf_get_file_xattr_filter(const struct bpf_prog *prog, u32 kfunc_id)
 static const struct btf_kfunc_id_set bpf_fs_kfunc_set = {
 	.owner = THIS_MODULE,
 	.set = &fs_kfunc_set_ids,
-	.filter = bpf_get_file_xattr_filter,
+	.filter = fs_kfunc_filter,
 };
 
 static int __init bpf_fs_kfuncs_init(void)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf: Add kfunc bpf_get_dentry_xattr() to read xattr from dentry
  2024-07-25 23:47 ` [PATCH bpf-next 1/2] bpf: Add kfunc bpf_get_dentry_xattr() to read " Song Liu
@ 2024-07-26  5:34   ` Al Viro
  2024-07-26  7:01     ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Al Viro @ 2024-07-26  5:34 UTC (permalink / raw)
  To: Song Liu
  Cc: bpf, linux-fsdevel, linux-kernel, kernel-team, andrii, eddyz87,
	ast, daniel, martin.lau, brauner, jack, kpsingh, mattbobrowski

On Thu, Jul 25, 2024 at 04:47:05PM -0700, Song Liu wrote:

> +__bpf_kfunc struct dentry *bpf_file_dentry(const struct file *file)
> +{
> +	/* file_dentry() does not hold reference to the dentry. We add a
> +	 * dget() here so that we can add KF_ACQUIRE flag to
> +	 * bpf_file_dentry().
> +	 */
> +	return dget(file_dentry(file));
> +}
> +
> +__bpf_kfunc struct dentry *bpf_dget_parent(struct dentry *dentry)
> +{
> +	return dget_parent(dentry);
> +}
> +
> +__bpf_kfunc void bpf_dput(struct dentry *dentry)
> +{
> +	return dput(dentry);
> +}

	If you keep a file reference, why bother grabbing dentry one?
If not, you have a very bad trouble if that opened file is the only
thing that keeps the filesystem busy.

	It's almost certainly a wrong interface; please, explain what
exactly are you trying to do here.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf: Add kfunc bpf_get_dentry_xattr() to read xattr from dentry
  2024-07-26  5:34   ` Al Viro
@ 2024-07-26  7:01     ` Song Liu
  0 siblings, 0 replies; 25+ messages in thread
From: Song Liu @ 2024-07-26  7:01 UTC (permalink / raw)
  To: Al Viro
  Cc: Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team, Andrii Nakryiko,
	Eduard Zingerman, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, brauner@kernel.org, Jan Kara, KP Singh,
	mattbobrowski@google.com

Hi Al, 

Thanks for your quick reply. 

> On Jul 25, 2024, at 10:34 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> 
> On Thu, Jul 25, 2024 at 04:47:05PM -0700, Song Liu wrote:
> 
>> +__bpf_kfunc struct dentry *bpf_file_dentry(const struct file *file)
>> +{
>> + /* file_dentry() does not hold reference to the dentry. We add a
>> + * dget() here so that we can add KF_ACQUIRE flag to
>> + * bpf_file_dentry().
>> + */
>> + return dget(file_dentry(file));
>> +}
>> +
>> +__bpf_kfunc struct dentry *bpf_dget_parent(struct dentry *dentry)
>> +{
>> + return dget_parent(dentry);
>> +}
>> +
>> +__bpf_kfunc void bpf_dput(struct dentry *dentry)
>> +{
>> + return dput(dentry);
>> +}
> 
> If you keep a file reference, why bother grabbing dentry one?
> If not, you have a very bad trouble if that opened file is the only
> thing that keeps the filesystem busy.

Yes, we keep a file reference for the duration of the BPF program. 
Therefore, it is technically not necessary to grab a dentry one.
However, we grab a dentry reference to make the dentry pointer 
returned by bpf_file_dentry() a trusted pointer from BPF verifier's 
POV, so that these kfuncs are more robust. 

The following explanation is a bit long. Please let me know if it 
turns out confusing.

==== What is trusted pointer? ====

Trusted point is the mechanism to make sure bpf kfuncs are 
called with valid pointer. The BPF verifier requires certain BPF 
kfuncs (helpers) are called with trusted pointers. A pointer is 
trusted if one of the following two is true:

1. The pointer is passed directly by the tracepoint/kprobe, i.e., 
   no pointer walking, no non-zero offset. For example, 

   int bpf_security_file_open(struct file *file)  /* file is trusted */
   {
       /* mapping is not trusted */
       struct address_space    *mapping = file->f_mapping;

       /* file2 is not trusted */
       struct file *file2 = file + 1;
   }

2. The pointer is returned by a kfunc with KF_ACQUIRE. This pointer 
   has to be released by a kfunc with KF_RELEASE. KF_ACQUIRE and 
   KF_RELEASE kfuncs are like any _get() _put() pairs. 

==== bpf_dget_parent and bpf_dput ====

In this case, bpf_dget_parent() is a KF_ACQUIRE kfunc and 
bpf_dput() is a KF_RELEASE function. They are just like regular
_get() _put() functions. 

The BPF verifier makes sure pointers acquired by bpf_dget_parent() 
is always released by bpf_dput() before the BPF program returns. 
For example, in the following BPF program:

xxxx(struct dentry *d)
{
    struct dentry *parent = bpf_dget_parent(d); 

    /* main logic */

    bpf_dput(parent);
}

If the bpf_dput() call is missing, the verifier will not allow
the program to load. 

==== More on kfunc safety ====

Trusted point makes kfunc calls safe. In this case, we want 
bpf_get_dentry_xattr() to only take trusted dentry pointer. 
For example, in the security_inode_listxattr LSM hook:

bpf_security_inode_listxattr(struct dentry *dentry)
{
       /* This is allowed, dentry is an input and thus
        * is trusted 
        */
       bpf_get_dentry_xattr(dentry); 

       /* This is not allowed, as dentry->d_parent is 
        * not trusted
        */
       bpf_get_dentry_xattr(dentry->d_parent);

       /* This is allowed, as bpf_dget_parent() holds  
        * a reference to d_parent, and returns a trusted
        * pointer
        */
       struct dentry *parent = bpf_dget_parent(dentry);

       /* The following is needed, as we need the release 
        * parent pointer. If this line is missing, this
        * program cannot pass BPF verifier. 
        */
       bpf_dput(parent);
}

==== bpf_file_dentry ====

In this use case, we want to get from file pointer, such as
LSM hook security_file_open() to the dentry and thus walk the
directory tree. However, security_file_open() does not pass
in a dentry pointer, and file->f_path.dentry is not a trusted
pointer. There are two ways to get a trusted dentry pointer
from a file pointer:

1. As what we do here, use bpf_file_dentry() to hold a 
   reference on file->f_path.dentry and return a trusted 
   pointer. 
2. Give the verifier special knowledge that if file pointer
   is trusted, file->f_path.dentry is also trusted. This 
   can be achieve with the following macros:
      BTF_TYPE_SAFE_TRUSTED
      BTF_TYPE_SAFE_RCU
      BTF_TYPE_SAFE_RCU_OR_NULL. 

Using the second method here requires a little more work in the
BPF verifier, as dentry is not a simple pointer in struct file, 
but f_path.dentry. Therefore, I chose current approach that
bpf_file_dentry() holds a reference on dentry pointer, and the 
pointer has to be released with bpf_dput(). 

For more details about trusted pointers in kfuncs, please refer to 
Documentation/bpf/kfuncs.rst. 

Does this answer your question? 

Thanks,
Song

> It's almost certainly a wrong interface; please, explain what
> exactly are you trying to do here.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-07-25 23:47 [PATCH bpf-next 0/2] Add kfuncs to support reading xattr from dentry Song Liu
  2024-07-25 23:47 ` [PATCH bpf-next 1/2] bpf: Add kfunc bpf_get_dentry_xattr() to read " Song Liu
@ 2024-07-25 23:47 ` Song Liu
  2024-07-26  7:06   ` Christian Brauner
  1 sibling, 1 reply; 25+ messages in thread
From: Song Liu @ 2024-07-25 23:47 UTC (permalink / raw)
  To: bpf, linux-fsdevel, linux-kernel
  Cc: kernel-team, andrii, eddyz87, ast, daniel, martin.lau, viro,
	brauner, jack, kpsingh, mattbobrowski, Song Liu

1. Rename fs_kfuncs/xattr to fs_kfuncs/file_xattr and add a call of
   bpf_get_dentry_xattr() to the test.
2. Add a new sub test fs_kfuncs/dentry_xattr, which checks 3 levels of
   parent directories for xattr. This demonstrate the use case that
   a xattr on a directory is used to tag all files in the directory and
   sub directories.

Signed-off-by: Song Liu <song@kernel.org>
---
 .../selftests/bpf/prog_tests/fs_kfuncs.c      | 61 +++++++++++++++++--
 .../selftests/bpf/progs/test_dentry_xattr.c   | 46 ++++++++++++++
 .../selftests/bpf/progs/test_get_xattr.c      | 16 ++++-
 3 files changed, 117 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_dentry_xattr.c

diff --git a/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c b/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
index 37056ba73847..a960cfbe8907 100644
--- a/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
+++ b/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
@@ -2,17 +2,19 @@
 /* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
 
 #include <stdlib.h>
+#include <sys/stat.h>
 #include <sys/types.h>
 #include <sys/xattr.h>
 #include <linux/fsverity.h>
 #include <unistd.h>
 #include <test_progs.h>
 #include "test_get_xattr.skel.h"
+#include "test_dentry_xattr.skel.h"
 #include "test_fsverity.skel.h"
 
 static const char testfile[] = "/tmp/test_progs_fs_kfuncs";
 
-static void test_xattr(void)
+static void test_file_xattr(void)
 {
 	struct test_get_xattr *skel = NULL;
 	int fd = -1, err;
@@ -50,7 +52,8 @@ static void test_xattr(void)
 	if (!ASSERT_GE(fd, 0, "open_file"))
 		goto out;
 
-	ASSERT_EQ(skel->bss->found_xattr, 1, "found_xattr");
+	ASSERT_EQ(skel->bss->found_xattr_from_file, 1, "found_xattr_from_file");
+	ASSERT_EQ(skel->bss->found_xattr_from_dentry, 1, "found_xattr_from_dentry");
 
 out:
 	close(fd);
@@ -58,6 +61,53 @@ static void test_xattr(void)
 	remove(testfile);
 }
 
+static void test_directory_xattr(void)
+{
+	struct test_dentry_xattr *skel = NULL;
+	static const char * const paths[] = {
+		"/tmp/a",
+		"/tmp/a/b",
+		"/tmp/a/b/c",
+	};
+	const char *file = "/tmp/a/b/c/d";
+	int i, j, err, fd;
+
+	for (i = 0; i < sizeof(paths) / sizeof(char *); i++) {
+		err = mkdir(paths[i], 0755);
+		if (!ASSERT_OK(err, "mkdir"))
+			goto out;
+		err = setxattr(paths[i], "user.kfunc", "hello", sizeof("hello"), 0);
+		if (!ASSERT_OK(err, "setxattr")) {
+			i++;
+			goto out;
+		}
+	}
+
+	skel = test_dentry_xattr__open_and_load();
+
+	if (!ASSERT_OK_PTR(skel, "test_dentry_xattr__open_and_load"))
+		goto out;
+
+	skel->bss->monitored_pid = getpid();
+	err = test_dentry_xattr__attach(skel);
+
+	if (!ASSERT_OK(err, "test_dentry__xattr__attach"))
+		goto out;
+
+	fd = open(file, O_CREAT | O_RDONLY, 0644);
+	if (!ASSERT_GE(fd, 0, "open_file"))
+		goto out;
+
+	ASSERT_EQ(skel->bss->number_of_xattr_found, 3, "number_of_xattr_found");
+	close(fd);
+out:
+	test_dentry_xattr__destroy(skel);
+	remove(file);
+	for (j = i - 1; j >= 0; j--)
+		rmdir(paths[j]);
+}
+
+
 #ifndef SHA256_DIGEST_SIZE
 #define SHA256_DIGEST_SIZE      32
 #endif
@@ -134,8 +184,11 @@ static void test_fsverity(void)
 
 void test_fs_kfuncs(void)
 {
-	if (test__start_subtest("xattr"))
-		test_xattr();
+	if (test__start_subtest("file_xattr"))
+		test_file_xattr();
+
+	if (test__start_subtest("dentry_xattr"))
+		test_directory_xattr();
 
 	if (test__start_subtest("fsverity"))
 		test_fsverity();
diff --git a/tools/testing/selftests/bpf/progs/test_dentry_xattr.c b/tools/testing/selftests/bpf/progs/test_dentry_xattr.c
new file mode 100644
index 000000000000..d2e378b2e2d5
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_dentry_xattr.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_tracing.h>
+#include "bpf_kfuncs.h"
+
+char _license[] SEC("license") = "GPL";
+
+__u32 monitored_pid;
+__u32 number_of_xattr_found;
+
+static const char expected_value[] = "hello";
+char value[32];
+
+SEC("lsm.s/file_open")
+int BPF_PROG(test_file_open, struct file *f)
+{
+	struct bpf_dynptr value_ptr;
+	struct dentry *dentry, *prev_dentry;
+	__u32 pid, matches = 0;
+	int i, ret;
+
+	pid = bpf_get_current_pid_tgid() >> 32;
+	if (pid != monitored_pid)
+		return 0;
+
+	bpf_dynptr_from_mem(value, sizeof(value), 0, &value_ptr);
+
+	dentry = bpf_file_dentry(f);
+
+	for (i = 0; i < 10; i++) {
+		ret = bpf_get_dentry_xattr(dentry, "user.kfunc", &value_ptr);
+		if (ret == sizeof(expected_value) &&
+		    !bpf_strncmp(value, ret, expected_value))
+			matches++;
+
+		prev_dentry = dentry;
+		dentry = bpf_dget_parent(prev_dentry);
+		bpf_dput(prev_dentry);
+	}
+
+	bpf_dput(dentry);
+	number_of_xattr_found = matches;
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/test_get_xattr.c b/tools/testing/selftests/bpf/progs/test_get_xattr.c
index 7eb2a4e5a3e5..3b0dc6106ca5 100644
--- a/tools/testing/selftests/bpf/progs/test_get_xattr.c
+++ b/tools/testing/selftests/bpf/progs/test_get_xattr.c
@@ -9,7 +9,8 @@
 char _license[] SEC("license") = "GPL";
 
 __u32 monitored_pid;
-__u32 found_xattr;
+__u32 found_xattr_from_file;
+__u32 found_xattr_from_dentry;
 
 static const char expected_value[] = "hello";
 char value[32];
@@ -18,6 +19,7 @@ SEC("lsm.s/file_open")
 int BPF_PROG(test_file_open, struct file *f)
 {
 	struct bpf_dynptr value_ptr;
+	struct dentry *dentry;
 	__u32 pid;
 	int ret;
 
@@ -32,6 +34,16 @@ int BPF_PROG(test_file_open, struct file *f)
 		return 0;
 	if (bpf_strncmp(value, ret, expected_value))
 		return 0;
-	found_xattr = 1;
+	found_xattr_from_file = 1;
+
+	dentry = bpf_file_dentry(f);
+	ret = bpf_get_dentry_xattr(dentry, "user.kfuncs", &value_ptr);
+	bpf_dput(dentry);
+	if (ret != sizeof(expected_value))
+		return 0;
+	if (bpf_strncmp(value, ret, expected_value))
+		return 0;
+	found_xattr_from_dentry = 1;
+
 	return 0;
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-07-25 23:47 ` [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr Song Liu
@ 2024-07-26  7:06   ` Christian Brauner
  2024-07-26  9:19     ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Brauner @ 2024-07-26  7:06 UTC (permalink / raw)
  To: Song Liu
  Cc: bpf, linux-fsdevel, linux-kernel, kernel-team, andrii, eddyz87,
	ast, daniel, martin.lau, viro, jack, kpsingh, mattbobrowski

On Thu, Jul 25, 2024 at 04:47:06PM GMT, Song Liu wrote:
> 1. Rename fs_kfuncs/xattr to fs_kfuncs/file_xattr and add a call of
>    bpf_get_dentry_xattr() to the test.
> 2. Add a new sub test fs_kfuncs/dentry_xattr, which checks 3 levels of
>    parent directories for xattr. This demonstrate the use case that
>    a xattr on a directory is used to tag all files in the directory and
>    sub directories.
> 
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  .../selftests/bpf/prog_tests/fs_kfuncs.c      | 61 +++++++++++++++++--
>  .../selftests/bpf/progs/test_dentry_xattr.c   | 46 ++++++++++++++
>  .../selftests/bpf/progs/test_get_xattr.c      | 16 ++++-
>  3 files changed, 117 insertions(+), 6 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/progs/test_dentry_xattr.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c b/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
> index 37056ba73847..a960cfbe8907 100644
> --- a/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
> +++ b/tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c
> @@ -2,17 +2,19 @@
>  /* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
>  
>  #include <stdlib.h>
> +#include <sys/stat.h>
>  #include <sys/types.h>
>  #include <sys/xattr.h>
>  #include <linux/fsverity.h>
>  #include <unistd.h>
>  #include <test_progs.h>
>  #include "test_get_xattr.skel.h"
> +#include "test_dentry_xattr.skel.h"
>  #include "test_fsverity.skel.h"
>  
>  static const char testfile[] = "/tmp/test_progs_fs_kfuncs";
>  
> -static void test_xattr(void)
> +static void test_file_xattr(void)
>  {
>  	struct test_get_xattr *skel = NULL;
>  	int fd = -1, err;
> @@ -50,7 +52,8 @@ static void test_xattr(void)
>  	if (!ASSERT_GE(fd, 0, "open_file"))
>  		goto out;
>  
> -	ASSERT_EQ(skel->bss->found_xattr, 1, "found_xattr");
> +	ASSERT_EQ(skel->bss->found_xattr_from_file, 1, "found_xattr_from_file");
> +	ASSERT_EQ(skel->bss->found_xattr_from_dentry, 1, "found_xattr_from_dentry");
>  
>  out:
>  	close(fd);
> @@ -58,6 +61,53 @@ static void test_xattr(void)
>  	remove(testfile);
>  }
>  
> +static void test_directory_xattr(void)
> +{
> +	struct test_dentry_xattr *skel = NULL;
> +	static const char * const paths[] = {
> +		"/tmp/a",
> +		"/tmp/a/b",
> +		"/tmp/a/b/c",
> +	};
> +	const char *file = "/tmp/a/b/c/d";
> +	int i, j, err, fd;
> +
> +	for (i = 0; i < sizeof(paths) / sizeof(char *); i++) {
> +		err = mkdir(paths[i], 0755);
> +		if (!ASSERT_OK(err, "mkdir"))
> +			goto out;
> +		err = setxattr(paths[i], "user.kfunc", "hello", sizeof("hello"), 0);
> +		if (!ASSERT_OK(err, "setxattr")) {
> +			i++;
> +			goto out;
> +		}
> +	}
> +
> +	skel = test_dentry_xattr__open_and_load();
> +
> +	if (!ASSERT_OK_PTR(skel, "test_dentry_xattr__open_and_load"))
> +		goto out;
> +
> +	skel->bss->monitored_pid = getpid();
> +	err = test_dentry_xattr__attach(skel);
> +
> +	if (!ASSERT_OK(err, "test_dentry__xattr__attach"))
> +		goto out;
> +
> +	fd = open(file, O_CREAT | O_RDONLY, 0644);
> +	if (!ASSERT_GE(fd, 0, "open_file"))
> +		goto out;
> +
> +	ASSERT_EQ(skel->bss->number_of_xattr_found, 3, "number_of_xattr_found");
> +	close(fd);
> +out:
> +	test_dentry_xattr__destroy(skel);
> +	remove(file);
> +	for (j = i - 1; j >= 0; j--)
> +		rmdir(paths[j]);
> +}
> +
> +
>  #ifndef SHA256_DIGEST_SIZE
>  #define SHA256_DIGEST_SIZE      32
>  #endif
> @@ -134,8 +184,11 @@ static void test_fsverity(void)
>  
>  void test_fs_kfuncs(void)
>  {
> -	if (test__start_subtest("xattr"))
> -		test_xattr();
> +	if (test__start_subtest("file_xattr"))
> +		test_file_xattr();
> +
> +	if (test__start_subtest("dentry_xattr"))
> +		test_directory_xattr();
>  
>  	if (test__start_subtest("fsverity"))
>  		test_fsverity();
> diff --git a/tools/testing/selftests/bpf/progs/test_dentry_xattr.c b/tools/testing/selftests/bpf/progs/test_dentry_xattr.c
> new file mode 100644
> index 000000000000..d2e378b2e2d5
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_dentry_xattr.c
> @@ -0,0 +1,46 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
> +
> +#include "vmlinux.h"
> +#include <bpf/bpf_tracing.h>
> +#include "bpf_kfuncs.h"
> +
> +char _license[] SEC("license") = "GPL";
> +
> +__u32 monitored_pid;
> +__u32 number_of_xattr_found;
> +
> +static const char expected_value[] = "hello";
> +char value[32];
> +
> +SEC("lsm.s/file_open")
> +int BPF_PROG(test_file_open, struct file *f)
> +{
> +	struct bpf_dynptr value_ptr;
> +	struct dentry *dentry, *prev_dentry;
> +	__u32 pid, matches = 0;
> +	int i, ret;
> +
> +	pid = bpf_get_current_pid_tgid() >> 32;
> +	if (pid != monitored_pid)
> +		return 0;
> +
> +	bpf_dynptr_from_mem(value, sizeof(value), 0, &value_ptr);
> +
> +	dentry = bpf_file_dentry(f);
> +
> +	for (i = 0; i < 10; i++) {
> +		ret = bpf_get_dentry_xattr(dentry, "user.kfunc", &value_ptr);
> +		if (ret == sizeof(expected_value) &&
> +		    !bpf_strncmp(value, ret, expected_value))
> +			matches++;
> +
> +		prev_dentry = dentry;
> +		dentry = bpf_dget_parent(prev_dentry);

Why do you need to walk upwards and instead of reading the xattr values
during security_inode_permission()?

> +		bpf_dput(prev_dentry);
> +	}
> +
> +	bpf_dput(dentry);
> +	number_of_xattr_found = matches;
> +	return 0;
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_get_xattr.c b/tools/testing/selftests/bpf/progs/test_get_xattr.c
> index 7eb2a4e5a3e5..3b0dc6106ca5 100644
> --- a/tools/testing/selftests/bpf/progs/test_get_xattr.c
> +++ b/tools/testing/selftests/bpf/progs/test_get_xattr.c
> @@ -9,7 +9,8 @@
>  char _license[] SEC("license") = "GPL";
>  
>  __u32 monitored_pid;
> -__u32 found_xattr;
> +__u32 found_xattr_from_file;
> +__u32 found_xattr_from_dentry;
>  
>  static const char expected_value[] = "hello";
>  char value[32];
> @@ -18,6 +19,7 @@ SEC("lsm.s/file_open")
>  int BPF_PROG(test_file_open, struct file *f)
>  {
>  	struct bpf_dynptr value_ptr;
> +	struct dentry *dentry;
>  	__u32 pid;
>  	int ret;
>  
> @@ -32,6 +34,16 @@ int BPF_PROG(test_file_open, struct file *f)
>  		return 0;
>  	if (bpf_strncmp(value, ret, expected_value))
>  		return 0;
> -	found_xattr = 1;
> +	found_xattr_from_file = 1;
> +
> +	dentry = bpf_file_dentry(f);
> +	ret = bpf_get_dentry_xattr(dentry, "user.kfuncs", &value_ptr);
> +	bpf_dput(dentry);
> +	if (ret != sizeof(expected_value))
> +		return 0;
> +	if (bpf_strncmp(value, ret, expected_value))
> +		return 0;
> +	found_xattr_from_dentry = 1;
> +
>  	return 0;
>  }
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-07-26  7:06   ` Christian Brauner
@ 2024-07-26  9:19     ` Song Liu
  2024-07-26 11:51       ` Christian Brauner
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2024-07-26  9:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com

Hi Christian, 

> On Jul 26, 2024, at 12:06 AM, Christian Brauner <brauner@kernel.org> wrote:

[...]

>> +
>> + for (i = 0; i < 10; i++) {
>> + ret = bpf_get_dentry_xattr(dentry, "user.kfunc", &value_ptr);
>> + if (ret == sizeof(expected_value) &&
>> +    !bpf_strncmp(value, ret, expected_value))
>> + matches++;
>> +
>> + prev_dentry = dentry;
>> + dentry = bpf_dget_parent(prev_dentry);
> 
> Why do you need to walk upwards and instead of reading the xattr values
> during security_inode_permission()?

In this use case, we would like to add xattr to the directory to cover
all files under it. For example, assume we have the following xattrs:

  /bin  xattr: user.policy_A = value_A
  /bin/gcc-6.9/ xattr: user.policy_A = value_B
  /bin/gcc-6.9/gcc xattr: user.policy_A = value_C

/bin/gcc-6.9/gcc will use value_C;
/bin/gcc-6.9/<other_files> will use value_B;
/bin/<other_folder_or_file> will use value_A;

By walking upwards from security_file_open(), we can finish the logic 
in a single LSM hook:

    repeat:
        if (dentry have user.policy_A) {
            /* make decision based on value */;
        } else {
            dentry = bpf_dget_parent();
            goto repeat;
        }

Does this make sense? Or maybe I misunderstood the suggestion?

Also, we don't have a bpf_get_inode_xattr() yet. I guess we will need
it for the security_inode_permission approach. If we agree that's a 
better approach, I more than happy to implement it that way. In fact,
I think we will eventually need both bpf_get_inode_xattr() and 
bpf_get_dentry_xattr(). 

Thanks,
Song


>> + bpf_dput(prev_dentry);
>> + }
>> +



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-07-26  9:19     ` Song Liu
@ 2024-07-26 11:51       ` Christian Brauner
  2024-07-26 19:43         ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Brauner @ 2024-07-26 11:51 UTC (permalink / raw)
  To: Song Liu
  Cc: Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com

On Fri, Jul 26, 2024 at 09:19:54AM GMT, Song Liu wrote:
> Hi Christian, 
> 
> > On Jul 26, 2024, at 12:06 AM, Christian Brauner <brauner@kernel.org> wrote:
> 
> [...]
> 
> >> +
> >> + for (i = 0; i < 10; i++) {
> >> + ret = bpf_get_dentry_xattr(dentry, "user.kfunc", &value_ptr);
> >> + if (ret == sizeof(expected_value) &&
> >> +    !bpf_strncmp(value, ret, expected_value))
> >> + matches++;
> >> +
> >> + prev_dentry = dentry;
> >> + dentry = bpf_dget_parent(prev_dentry);
> > 
> > Why do you need to walk upwards and instead of reading the xattr values
> > during security_inode_permission()?
> 
> In this use case, we would like to add xattr to the directory to cover
> all files under it. For example, assume we have the following xattrs:
> 
>   /bin  xattr: user.policy_A = value_A
>   /bin/gcc-6.9/ xattr: user.policy_A = value_B
>   /bin/gcc-6.9/gcc xattr: user.policy_A = value_C
> 
> /bin/gcc-6.9/gcc will use value_C;
> /bin/gcc-6.9/<other_files> will use value_B;
> /bin/<other_folder_or_file> will use value_A;
> 
> By walking upwards from security_file_open(), we can finish the logic 
> in a single LSM hook:
> 
>     repeat:
>         if (dentry have user.policy_A) {
>             /* make decision based on value */;
>         } else {
>             dentry = bpf_dget_parent();
>             goto repeat;
>         }
> 
> Does this make sense? Or maybe I misunderstood the suggestion?

Imho, what you're doing belongs into inode_permission() not into
security_file_open(). That's already too late and it's somewhat clear
from the example you're using that you're essentially doing permission
checking during path lookup.

Btw, what you're doing is potentially very heavy-handed because you're
retrieving xattrs for which no VFS cache exists so you might end up
causing a lot of io.

Say you have a 10000 deep directory hierarchy and you open a
file_at_level_10000. With that dget_parent() logic in the worst case you
end up walking up the whole hierarchy reading xattr values from disk
10000 times. You can achieve the same result and cleaner if you do the
checking in inode_permission() where it belongs and you only cause all
of that pain once and you abort path lookup correctly.

Also, I'm not even sure this is always correct because you're
retroactively checking what policy to apply based on the xattr value
walking up the parent chain. But a rename could happen and then the
ancestor chain you're checking is different from the current chain or
there's a bunch of mounts along the way.

Imho, that dget_parent() thing just encourages very badly written bpf
LSM programs. That's certainly not an interface we want to expose.

> Also, we don't have a bpf_get_inode_xattr() yet. I guess we will need
> it for the security_inode_permission approach. If we agree that's a 

Yes, that's fine.

You also need to ensure that you're only reading user.* xattrs. I know
you already do that for bpf_get_file_xattr() but this helper needs the
same treatment.

And you need to force a drop-out of RCU path lookup btw because you're
almost definitely going to block when you check the xattr.

> better approach, I more than happy to implement it that way. In fact,
> I think we will eventually need both bpf_get_inode_xattr() and 
> bpf_get_dentry_xattr(). 

I'm not sure about that because it's royally annoying in the first place
that we have to dentry and inode separately in the xattr handlers
because LSMs sometimes call them from a location when the dentry and
inode aren't yet fused together. The dentry is the wrong data structure
to care about here.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-07-26 11:51       ` Christian Brauner
@ 2024-07-26 19:43         ` Song Liu
  2024-07-29 13:46           ` Christian Brauner
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2024-07-26 19:43 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com

Hi Christian, 

Thanks a lot for your comments.

> On Jul 26, 2024, at 4:51 AM, Christian Brauner <brauner@kernel.org> wrote:
> 
> On Fri, Jul 26, 2024 at 09:19:54AM GMT, Song Liu wrote:
>> Hi Christian, 
>> 
>>> On Jul 26, 2024, at 12:06 AM, Christian Brauner <brauner@kernel.org> wrote:
>> 
>> [...]
>> 
>>>> +
>>>> + for (i = 0; i < 10; i++) {
>>>> + ret = bpf_get_dentry_xattr(dentry, "user.kfunc", &value_ptr);
>>>> + if (ret == sizeof(expected_value) &&
>>>> +    !bpf_strncmp(value, ret, expected_value))
>>>> + matches++;
>>>> +
>>>> + prev_dentry = dentry;
>>>> + dentry = bpf_dget_parent(prev_dentry);
>>> 
>>> Why do you need to walk upwards and instead of reading the xattr values
>>> during security_inode_permission()?
>> 
>> In this use case, we would like to add xattr to the directory to cover
>> all files under it. For example, assume we have the following xattrs:
>> 
>>  /bin  xattr: user.policy_A = value_A
>>  /bin/gcc-6.9/ xattr: user.policy_A = value_B
>>  /bin/gcc-6.9/gcc xattr: user.policy_A = value_C
>> 
>> /bin/gcc-6.9/gcc will use value_C;
>> /bin/gcc-6.9/<other_files> will use value_B;
>> /bin/<other_folder_or_file> will use value_A;
>> 
>> By walking upwards from security_file_open(), we can finish the logic 
>> in a single LSM hook:
>> 
>>    repeat:
>>        if (dentry have user.policy_A) {
>>            /* make decision based on value */;
>>        } else {
>>            dentry = bpf_dget_parent();
>>            goto repeat;
>>        }
>> 
>> Does this make sense? Or maybe I misunderstood the suggestion?
> 
> Imho, what you're doing belongs into inode_permission() not into
> security_file_open(). That's already too late and it's somewhat clear
> from the example you're using that you're essentially doing permission
> checking during path lookup.

I am not sure I follow the suggestion to implement this with 
security_inode_permission()? Could you please share more details about
this idea?

> Btw, what you're doing is potentially very heavy-handed because you're
> retrieving xattrs for which no VFS cache exists so you might end up
> causing a lot of io.
> 
> Say you have a 10000 deep directory hierarchy and you open a
> file_at_level_10000. With that dget_parent() logic in the worst case you
> end up walking up the whole hierarchy reading xattr values from disk
> 10000 times. You can achieve the same result and cleaner if you do the
> checking in inode_permission() where it belongs and you only cause all
> of that pain once and you abort path lookup correctly.

Yes, we need the BPF program to limit the number of parents to walk. 

> Also, I'm not even sure this is always correct because you're
> retroactively checking what policy to apply based on the xattr value
> walking up the parent chain. But a rename could happen and then the
> ancestor chain you're checking is different from the current chain or
> there's a bunch of mounts along the way. 
> 
> Imho, that dget_parent() thing just encourages very badly written bpf
> LSM programs. That's certainly not an interface we want to expose.
> 
>> Also, we don't have a bpf_get_inode_xattr() yet. I guess we will need
>> it for the security_inode_permission approach. If we agree that's a
> 
> Yes, that's fine.
> 
> You also need to ensure that you're only reading user.* xattrs. I know
> you already do that for bpf_get_file_xattr() but this helper needs the
> same treatment.

Sounds good. bpf_get_inode_xattr() would be a very useful kfunc. Let's
add that. 

> 
> And you need to force a drop-out of RCU path lookup btw because you're
> almost definitely going to block when you check the xattr.

We only allow xattr look up from sleepable context. 

> 
>> better approach, I more than happy to implement it that way. In fact,
>> I think we will eventually need both bpf_get_inode_xattr() and 
>> bpf_get_dentry_xattr().
> 
> I'm not sure about that because it's royally annoying in the first place
> that we have to dentry and inode separately in the xattr handlers
> because LSMs sometimes call them from a location when the dentry and
> inode aren't yet fused together. The dentry is the wrong data structure
> to care about here.

Thanks,
Song


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-07-26 19:43         ` Song Liu
@ 2024-07-29 13:46           ` Christian Brauner
  2024-07-30  5:58             ` Song Liu
  2024-08-19  7:18             ` Song Liu
  0 siblings, 2 replies; 25+ messages in thread
From: Christian Brauner @ 2024-07-29 13:46 UTC (permalink / raw)
  To: Song Liu
  Cc: Christian Brauner, Song Liu, bpf, Linux-Fsdevel, LKML,
	Kernel Team, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com

On Fri, Jul 26, 2024 at 07:43:28PM GMT, Song Liu wrote:
> Hi Christian, 
> 
> Thanks a lot for your comments.
> 
> > On Jul 26, 2024, at 4:51 AM, Christian Brauner <brauner@kernel.org> wrote:
> > 
> > On Fri, Jul 26, 2024 at 09:19:54AM GMT, Song Liu wrote:
> >> Hi Christian, 
> >> 
> >>> On Jul 26, 2024, at 12:06 AM, Christian Brauner <brauner@kernel.org> wrote:
> >> 
> >> [...]
> >> 
> >>>> +
> >>>> + for (i = 0; i < 10; i++) {
> >>>> + ret = bpf_get_dentry_xattr(dentry, "user.kfunc", &value_ptr);
> >>>> + if (ret == sizeof(expected_value) &&
> >>>> +    !bpf_strncmp(value, ret, expected_value))
> >>>> + matches++;
> >>>> +
> >>>> + prev_dentry = dentry;
> >>>> + dentry = bpf_dget_parent(prev_dentry);
> >>> 
> >>> Why do you need to walk upwards and instead of reading the xattr values
> >>> during security_inode_permission()?
> >> 
> >> In this use case, we would like to add xattr to the directory to cover
> >> all files under it. For example, assume we have the following xattrs:
> >> 
> >>  /bin  xattr: user.policy_A = value_A
> >>  /bin/gcc-6.9/ xattr: user.policy_A = value_B
> >>  /bin/gcc-6.9/gcc xattr: user.policy_A = value_C
> >> 
> >> /bin/gcc-6.9/gcc will use value_C;
> >> /bin/gcc-6.9/<other_files> will use value_B;
> >> /bin/<other_folder_or_file> will use value_A;
> >> 
> >> By walking upwards from security_file_open(), we can finish the logic 
> >> in a single LSM hook:
> >> 
> >>    repeat:
> >>        if (dentry have user.policy_A) {
> >>            /* make decision based on value */;
> >>        } else {
> >>            dentry = bpf_dget_parent();
> >>            goto repeat;
> >>        }
> >> 
> >> Does this make sense? Or maybe I misunderstood the suggestion?
> > 
> > Imho, what you're doing belongs into inode_permission() not into
> > security_file_open(). That's already too late and it's somewhat clear
> > from the example you're using that you're essentially doing permission
> > checking during path lookup.
> 
> I am not sure I follow the suggestion to implement this with 
> security_inode_permission()? Could you please share more details about
> this idea?

Given a path like /bin/gcc-6.9/gcc what that code currently does is:

* walk down to /bin/gcc-6.9/gcc
* walk up from /bin/gcc-6.9/gcc and then checking xattr labels for:
  gcc
  gcc-6.9/
  bin/
  /

That's broken because someone could've done
mv /bin/gcc-6.9/gcc /attack/ and when this walks back and it checks xattrs on
/attack even though the path lookup was for /bin/gcc-6.9. IOW, the
security_file_open() checks have nothing to do with the permission checks that
were done during path lookup.

Why isn't that logic:

* walk down to /bin/gcc-6.9/gcc and check for each component:

  security_inode_permission(/)
  security_inode_permission(gcc-6.9/)
  security_inode_permission(bin/)
  security_inode_permission(gcc)
  security_file_open(gcc)

I think that dget_parent() logic also wouldn't make sense for relative path
lookups:

	dfd = open("/bin/gcc-6.9", O_RDONLY | O_DIRECTORY | O_CLOEXEC);

This walks down to /bin/gcc-6.9 and then walks back up (subject to the
same problem mentioned earlier) and check xattrs for:

  gcc-6.9
  bin/
  /

then that dfd is passed to openat() to open "gcc":

fd = openat(dfd, "gcc", O_RDONLY);

which again walks up to /bin/gcc-6.9 and checks xattrs for:
  gcc
  gcc-6.9
  bin/
  /

Which means this code ends up charging relative lookups twice. Even if one
irons that out in the program this encourages really bad patterns.
Path lookup is iterative top down. One can't just randomly walk back up and
assume that's equivalent.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-07-29 13:46           ` Christian Brauner
@ 2024-07-30  5:58             ` Song Liu
  2024-07-30  8:59               ` Christian Brauner
  2024-08-19  7:18             ` Song Liu
  1 sibling, 1 reply; 25+ messages in thread
From: Song Liu @ 2024-07-30  5:58 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com

Hi Christian, 

Thanks a lot for your detailed explanation! We will revisit the design 
based on these comments and suggestions. 

One more question about a potential new kfunc bpf_get_inode_xattr(): 
Should it take dentry as input? IOW, should it look like:

__bpf_kfunc int bpf_get_inode_xattr(struct dentry *dentry, const char *name__str,
                                    struct bpf_dynptr *value_p)
{
        struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
        u32 value_len;
        void *value;
        int ret;

        if (strncmp(name__str, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN))
                return -EPERM;

        value_len = __bpf_dynptr_size(value_ptr);
        value = __bpf_dynptr_data_rw(value_ptr, value_len);
        if (!value)
                return -EINVAL;

        ret = inode_permission(&nop_mnt_idmap, dentry->d_inode, MAY_READ);
        if (ret)
                return ret;
        return __vfs_getxattr(dentry, dentry->d_inode, name__str, value, value_len);
}


I am asking because many security_inode_* hooks actually taking dentry as 
argument. So it makes sense to use dentry for kfuncs. Maybe we should
call it bpf_get_dentry_xattr, which is actually the same kfunc in this
set (1/2)?

Thanks,
Song



> On Jul 29, 2024, at 6:46 AM, Christian Brauner <brauner@kernel.org> wrote:
[...]
>>> Imho, what you're doing belongs into inode_permission() not into
>>> security_file_open(). That's already too late and it's somewhat clear
>>> from the example you're using that you're essentially doing permission
>>> checking during path lookup.
>> 
>> I am not sure I follow the suggestion to implement this with 
>> security_inode_permission()? Could you please share more details about
>> this idea?

[...]


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-07-30  5:58             ` Song Liu
@ 2024-07-30  8:59               ` Christian Brauner
  0 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2024-07-30  8:59 UTC (permalink / raw)
  To: Song Liu
  Cc: Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com

On Tue, Jul 30, 2024 at 05:58:31AM GMT, Song Liu wrote:
> Hi Christian, 
> 
> Thanks a lot for your detailed explanation! We will revisit the design 
> based on these comments and suggestions. 
> 
> One more question about a potential new kfunc bpf_get_inode_xattr(): 
> Should it take dentry as input? IOW, should it look like:
> 
> __bpf_kfunc int bpf_get_inode_xattr(struct dentry *dentry, const char *name__str,
>                                     struct bpf_dynptr *value_p)
> {
>         struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
>         u32 value_len;
>         void *value;
>         int ret;
> 
>         if (strncmp(name__str, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN))
>                 return -EPERM;
> 
>         value_len = __bpf_dynptr_size(value_ptr);
>         value = __bpf_dynptr_data_rw(value_ptr, value_len);
>         if (!value)
>                 return -EINVAL;
> 
>         ret = inode_permission(&nop_mnt_idmap, inode, MAY_READ);
>         if (ret)
>                 return ret;
>         return __vfs_getxattr(dentry, inode, name__str, value, value_len);
> }
> 
> 
> I am asking because many security_inode_* hooks actually taking dentry as 
> argument. So it makes sense to use dentry for kfuncs. Maybe we should

Some filesystems (i) require access to the @dentry in their xattr
handlers (e.g. 9p) and (ii) ->get() and ->set() xattr handlers can be
called when @inode hasn't been attached to @dentry yet.

So if you allowed to call bpf_get_*_xattr() from
security_d_instantiate() to somehow retrieve xattrs from there, then you
need to pass @dentry and @inode separately and you cannot use
@dentry->d_inode because it would still be NULL.

However, I doubt you'd call bpf_get_*_xattr() from
security_d_instantiate() so imo just pass the dentry and add a check
like:

struct inode *inode = d_inode(dentry);
if (WARN_ON(!inode))
	return -EINVAL;

in there.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-07-29 13:46           ` Christian Brauner
  2024-07-30  5:58             ` Song Liu
@ 2024-08-19  7:18             ` Song Liu
  2024-08-19 11:16               ` Christian Brauner
  1 sibling, 1 reply; 25+ messages in thread
From: Song Liu @ 2024-08-19  7:18 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

Hi Christian, 

Thanks again for your suggestions here. I have got more questions on
this work. 

> On Jul 29, 2024, at 6:46 AM, Christian Brauner <brauner@kernel.org> wrote:

[...]

>> I am not sure I follow the suggestion to implement this with 
>> security_inode_permission()? Could you please share more details about
>> this idea?
> 
> Given a path like /bin/gcc-6.9/gcc what that code currently does is:
> 
> * walk down to /bin/gcc-6.9/gcc
> * walk up from /bin/gcc-6.9/gcc and then checking xattr labels for:
>  gcc
>  gcc-6.9/
>  bin/
>  /
> 
> That's broken because someone could've done
> mv /bin/gcc-6.9/gcc /attack/ and when this walks back and it checks xattrs on
> /attack even though the path lookup was for /bin/gcc-6.9. IOW, the
> security_file_open() checks have nothing to do with the permission checks that
> were done during path lookup.
> 
> Why isn't that logic:
> 
> * walk down to /bin/gcc-6.9/gcc and check for each component:
> 
>  security_inode_permission(/)
>  security_inode_permission(gcc-6.9/)
>  security_inode_permission(bin/)
>  security_inode_permission(gcc)
>  security_file_open(gcc)

I am trying to implement this approach. The idea, IIUC, is:

1. For each open/openat, as we walk the path in do_filp_open=>path_openat, 
   check xattr for "/", "gcc-6.9/", "bin/" for all given flags.
2. Save the value of the flag somewhere (for BPF, we can use inode local
   storage). This is needed, because openat(dfd, ..) will not start from
   root again. 
3. Propagate these flag to children. All the above are done at 
   security_inode_permission. 
4. Finally, at security_file_open, check the xattr with the file, which 
   is probably propagated from some parents.

Did I get this right? 

IIUC, there are a few issues with this approach. 

1. security_inode_permission takes inode as parameter. However, we need 
   dentry to get the xattr. Shall we change security_inode_permission
   to take dentry instead? 
   PS: Maybe we should change most/all inode hooks to take dentry instead?

2. There is no easy way to propagate data from parent. Assuming we already
   changed security_inode_permission to take dentry, we still need some
   mechanism to look up xattr from the parent, which is probably still 
   something like bpf_dget_parent(). Or maybe we should add another hook 
   with both parent and child dentry as input?

3. Given we save the flag from parents in children's inode local storage, 
   we may consume non-trivial extra memory. BPF inode local storage will 
   be freed as the inode gets freed, so we will not leak any memory or 
   overflow some hash map. However, this will probably increase the 
   memory consumption of inode by a few percents. I think a "walk-up" 
   based approach will not have this problem, as we don't need the extra
   storage. Of course, this means more xattr lookups in some cases. 

> 
> I think that dget_parent() logic also wouldn't make sense for relative path
> lookups:
> 
> dfd = open("/bin/gcc-6.9", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
> 
> This walks down to /bin/gcc-6.9 and then walks back up (subject to the
> same problem mentioned earlier) and check xattrs for:
> 
>  gcc-6.9
>  bin/
>  /
> 
> then that dfd is passed to openat() to open "gcc":
> 
> fd = openat(dfd, "gcc", O_RDONLY);
> 
> which again walks up to /bin/gcc-6.9 and checks xattrs for:
>  gcc
>  gcc-6.9
>  bin/
>  /
> 
> Which means this code ends up charging relative lookups twice. Even if one
> irons that out in the program this encourages really bad patterns.
> Path lookup is iterative top down. One can't just randomly walk back up and
> assume that's equivalent.

I understand that walk-up is not equivalent to walk down. But it is probably
accurate enough for some security policies. For example, LSM LandLock uses
similar logic in the file_open hook (file security/landlock/fs.c, function 
is_access_to_paths_allowed). 

To summary my thoughts here. I think we need:

1. Change security_inode_permission to take dentry instead of inode. 
2. Still add bpf_dget_parent. We will use it with security_inode_permission
   so that we can propagate flags from parents to children. We will need
   a bpf_dput as well. 
3. There are pros and cons with different approaches to implement this
   policy (tags on directory work for all files in it). We probably need 
   the policy writer to decide with one to use. From BPF's POV, dget_parent
   is "safe", because it won't crash the system. It may encourage some bad
   patterns, but it appears to be required in some use cases. 

Does this make sense?

Thanks,
Song

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19  7:18             ` Song Liu
@ 2024-08-19 11:16               ` Christian Brauner
  2024-08-19 13:12                 ` Mickaël Salaün
  2024-08-19 20:25                 ` Song Liu
  0 siblings, 2 replies; 25+ messages in thread
From: Christian Brauner @ 2024-08-19 11:16 UTC (permalink / raw)
  To: Song Liu, Mickaël Salaün
  Cc: Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

On Mon, Aug 19, 2024 at 07:18:40AM GMT, Song Liu wrote:
> Hi Christian, 
> 
> Thanks again for your suggestions here. I have got more questions on
> this work. 
> 
> > On Jul 29, 2024, at 6:46 AM, Christian Brauner <brauner@kernel.org> wrote:
> 
> [...]
> 
> >> I am not sure I follow the suggestion to implement this with 
> >> security_inode_permission()? Could you please share more details about
> >> this idea?
> > 
> > Given a path like /bin/gcc-6.9/gcc what that code currently does is:
> > 
> > * walk down to /bin/gcc-6.9/gcc
> > * walk up from /bin/gcc-6.9/gcc and then checking xattr labels for:
> >  gcc
> >  gcc-6.9/
> >  bin/
> >  /
> > 
> > That's broken because someone could've done
> > mv /bin/gcc-6.9/gcc /attack/ and when this walks back and it checks xattrs on
> > /attack even though the path lookup was for /bin/gcc-6.9. IOW, the
> > security_file_open() checks have nothing to do with the permission checks that
> > were done during path lookup.
> > 
> > Why isn't that logic:
> > 
> > * walk down to /bin/gcc-6.9/gcc and check for each component:
> > 
> >  security_inode_permission(/)
> >  security_inode_permission(gcc-6.9/)
> >  security_inode_permission(bin/)
> >  security_inode_permission(gcc)
> >  security_file_open(gcc)
> 
> I am trying to implement this approach. The idea, IIUC, is:
> 
> 1. For each open/openat, as we walk the path in do_filp_open=>path_openat, 
>    check xattr for "/", "gcc-6.9/", "bin/" for all given flags.
> 2. Save the value of the flag somewhere (for BPF, we can use inode local
>    storage). This is needed, because openat(dfd, ..) will not start from
>    root again. 
> 3. Propagate these flag to children. All the above are done at 
>    security_inode_permission. 
> 4. Finally, at security_file_open, check the xattr with the file, which 
>    is probably propagated from some parents.
> 
> Did I get this right? 
> 
> IIUC, there are a few issues with this approach. 
> 
> 1. security_inode_permission takes inode as parameter. However, we need 
>    dentry to get the xattr. Shall we change security_inode_permission
>    to take dentry instead? 
>    PS: Maybe we should change most/all inode hooks to take dentry instead?

security_inode_permission() is called in generic_permission() which in
turn is called from inode_permission() which in turn is called from
inode->i_op->permission() for various filesystems. So to make
security_inode_permission() take a dentry argument one would need to
change all inode->i_op->permission() to take a dentry argument for all
filesystems. NAK on that.

That's ignoring that it's just plain wrong to pass a dentry to
**inode**_permission() or security_**inode**_permission(). It's
permissions on the inode, not the dentry.

> 
> 2. There is no easy way to propagate data from parent. Assuming we already
>    changed security_inode_permission to take dentry, we still need some
>    mechanism to look up xattr from the parent, which is probably still 
>    something like bpf_dget_parent(). Or maybe we should add another hook 
>    with both parent and child dentry as input?
> 
> 3. Given we save the flag from parents in children's inode local storage, 
>    we may consume non-trivial extra memory. BPF inode local storage will 
>    be freed as the inode gets freed, so we will not leak any memory or 
>    overflow some hash map. However, this will probably increase the 
>    memory consumption of inode by a few percents. I think a "walk-up" 
>    based approach will not have this problem, as we don't need the extra
>    storage. Of course, this means more xattr lookups in some cases. 
> 
> > 
> > I think that dget_parent() logic also wouldn't make sense for relative path
> > lookups:
> > 
> > dfd = open("/bin/gcc-6.9", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
> > 
> > This walks down to /bin/gcc-6.9 and then walks back up (subject to the
> > same problem mentioned earlier) and check xattrs for:
> > 
> >  gcc-6.9
> >  bin/
> >  /
> > 
> > then that dfd is passed to openat() to open "gcc":
> > 
> > fd = openat(dfd, "gcc", O_RDONLY);
> > 
> > which again walks up to /bin/gcc-6.9 and checks xattrs for:
> >  gcc
> >  gcc-6.9
> >  bin/
> >  /
> > 
> > Which means this code ends up charging relative lookups twice. Even if one
> > irons that out in the program this encourages really bad patterns.
> > Path lookup is iterative top down. One can't just randomly walk back up and
> > assume that's equivalent.
> 
> I understand that walk-up is not equivalent to walk down. But it is probably
> accurate enough for some security policies. For example, LSM LandLock uses
> similar logic in the file_open hook (file security/landlock/fs.c, function 
> is_access_to_paths_allowed). 

I'm not well-versed in landlock so I'll let Mickaël comment on this with
more details but there's very important restrictions and differences
here.

Landlock expresses security policies with file hierarchies and
security_inode_permission() doesn't and cannot have access to that.

Landlock is subject to the same problem that the BPF is here. Namely
that the VFS permission checking could have been done on a path walk
completely different from the path walk that is checked when walking
back up from security_file_open().

But because landlock works with a deny-by-default security policy this
is ok and it takes overmounts into account etc.

> 
> To summary my thoughts here. I think we need:
> 
> 1. Change security_inode_permission to take dentry instead of inode. 

Sorry, no.

> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
>    so that we can propagate flags from parents to children. We will need
>    a bpf_dput as well. 
> 3. There are pros and cons with different approaches to implement this
>    policy (tags on directory work for all files in it). We probably need 
>    the policy writer to decide with one to use. From BPF's POV, dget_parent
>    is "safe", because it won't crash the system. It may encourage some bad
>    patterns, but it appears to be required in some use cases. 

You cannot just walk a path upwards and check permissions and assume
that this is safe unless you have a clear idea what makes it safe in
this scenario. Landlock has afaict. But so far you only have a vague
sketch of checking permissions walking upwards and retrieving xattrs
without any notion of the problems involved.

If you provide a bpf_get_parent() api for userspace to consume you'll
end up providing them with an api that is extremly easy to misuse.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 11:16               ` Christian Brauner
@ 2024-08-19 13:12                 ` Mickaël Salaün
  2024-08-19 20:35                   ` Song Liu
  2024-08-19 20:25                 ` Song Liu
  1 sibling, 1 reply; 25+ messages in thread
From: Mickaël Salaün @ 2024-08-19 13:12 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Song Liu, bpf, Linux-Fsdevel, LKML, Kernel Team,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack

On Mon, Aug 19, 2024 at 01:16:04PM +0200, Christian Brauner wrote:
> On Mon, Aug 19, 2024 at 07:18:40AM GMT, Song Liu wrote:
> > Hi Christian, 
> > 
> > Thanks again for your suggestions here. I have got more questions on
> > this work. 
> > 
> > > On Jul 29, 2024, at 6:46 AM, Christian Brauner <brauner@kernel.org> wrote:
> > 
> > [...]
> > 
> > >> I am not sure I follow the suggestion to implement this with 
> > >> security_inode_permission()? Could you please share more details about
> > >> this idea?
> > > 
> > > Given a path like /bin/gcc-6.9/gcc what that code currently does is:
> > > 
> > > * walk down to /bin/gcc-6.9/gcc
> > > * walk up from /bin/gcc-6.9/gcc and then checking xattr labels for:
> > >  gcc
> > >  gcc-6.9/
> > >  bin/
> > >  /
> > > 
> > > That's broken because someone could've done
> > > mv /bin/gcc-6.9/gcc /attack/ and when this walks back and it checks xattrs on
> > > /attack even though the path lookup was for /bin/gcc-6.9. IOW, the
> > > security_file_open() checks have nothing to do with the permission checks that
> > > were done during path lookup.
> > > 
> > > Why isn't that logic:
> > > 
> > > * walk down to /bin/gcc-6.9/gcc and check for each component:
> > > 
> > >  security_inode_permission(/)
> > >  security_inode_permission(gcc-6.9/)
> > >  security_inode_permission(bin/)
> > >  security_inode_permission(gcc)
> > >  security_file_open(gcc)
> > 
> > I am trying to implement this approach. The idea, IIUC, is:
> > 
> > 1. For each open/openat, as we walk the path in do_filp_open=>path_openat, 
> >    check xattr for "/", "gcc-6.9/", "bin/" for all given flags.
> > 2. Save the value of the flag somewhere (for BPF, we can use inode local
> >    storage). This is needed, because openat(dfd, ..) will not start from
> >    root again. 
> > 3. Propagate these flag to children. All the above are done at 
> >    security_inode_permission. 
> > 4. Finally, at security_file_open, check the xattr with the file, which 
> >    is probably propagated from some parents.
> > 
> > Did I get this right? 
> > 
> > IIUC, there are a few issues with this approach. 
> > 
> > 1. security_inode_permission takes inode as parameter. However, we need 
> >    dentry to get the xattr. Shall we change security_inode_permission
> >    to take dentry instead? 
> >    PS: Maybe we should change most/all inode hooks to take dentry instead?
> 
> security_inode_permission() is called in generic_permission() which in
> turn is called from inode_permission() which in turn is called from
> inode->i_op->permission() for various filesystems. So to make
> security_inode_permission() take a dentry argument one would need to
> change all inode->i_op->permission() to take a dentry argument for all
> filesystems. NAK on that.
> 
> That's ignoring that it's just plain wrong to pass a dentry to
> **inode**_permission() or security_**inode**_permission(). It's
> permissions on the inode, not the dentry.
> 
> > 
> > 2. There is no easy way to propagate data from parent. Assuming we already
> >    changed security_inode_permission to take dentry, we still need some
> >    mechanism to look up xattr from the parent, which is probably still 
> >    something like bpf_dget_parent(). Or maybe we should add another hook 
> >    with both parent and child dentry as input?
> > 
> > 3. Given we save the flag from parents in children's inode local storage, 
> >    we may consume non-trivial extra memory. BPF inode local storage will 
> >    be freed as the inode gets freed, so we will not leak any memory or 
> >    overflow some hash map. However, this will probably increase the 
> >    memory consumption of inode by a few percents. I think a "walk-up" 
> >    based approach will not have this problem, as we don't need the extra
> >    storage. Of course, this means more xattr lookups in some cases. 
> > 
> > > 
> > > I think that dget_parent() logic also wouldn't make sense for relative path
> > > lookups:
> > > 
> > > dfd = open("/bin/gcc-6.9", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
> > > 
> > > This walks down to /bin/gcc-6.9 and then walks back up (subject to the
> > > same problem mentioned earlier) and check xattrs for:
> > > 
> > >  gcc-6.9
> > >  bin/
> > >  /
> > > 
> > > then that dfd is passed to openat() to open "gcc":
> > > 
> > > fd = openat(dfd, "gcc", O_RDONLY);
> > > 
> > > which again walks up to /bin/gcc-6.9 and checks xattrs for:
> > >  gcc
> > >  gcc-6.9
> > >  bin/
> > >  /
> > > 
> > > Which means this code ends up charging relative lookups twice. Even if one
> > > irons that out in the program this encourages really bad patterns.
> > > Path lookup is iterative top down. One can't just randomly walk back up and
> > > assume that's equivalent.
> > 
> > I understand that walk-up is not equivalent to walk down. But it is probably
> > accurate enough for some security policies. For example, LSM LandLock uses
> > similar logic in the file_open hook (file security/landlock/fs.c, function 
> > is_access_to_paths_allowed). 
> 
> I'm not well-versed in landlock so I'll let Mickaël comment on this with
> more details but there's very important restrictions and differences
> here.
> 
> Landlock expresses security policies with file hierarchies and
> security_inode_permission() doesn't and cannot have access to that.
> 
> Landlock is subject to the same problem that the BPF is here. Namely
> that the VFS permission checking could have been done on a path walk
> completely different from the path walk that is checked when walking
> back up from security_file_open().
> 
> But because landlock works with a deny-by-default security policy this
> is ok and it takes overmounts into account etc.

Correct. Another point is that Landlock uses the file's path (i.e.
dentry + mnt) to walk down to the parent.  Only using the dentry would
be incorrect for most use cases (i.e. any system with more than one
mount point).

> 
> > 
> > To summary my thoughts here. I think we need:
> > 
> > 1. Change security_inode_permission to take dentry instead of inode. 
> 
> Sorry, no.
> 
> > 2. Still add bpf_dget_parent. We will use it with security_inode_permission
> >    so that we can propagate flags from parents to children. We will need
> >    a bpf_dput as well. 
> > 3. There are pros and cons with different approaches to implement this
> >    policy (tags on directory work for all files in it). We probably need 
> >    the policy writer to decide with one to use. From BPF's POV, dget_parent
> >    is "safe", because it won't crash the system. It may encourage some bad
> >    patterns, but it appears to be required in some use cases. 
> 
> You cannot just walk a path upwards and check permissions and assume
> that this is safe unless you have a clear idea what makes it safe in
> this scenario. Landlock has afaict. But so far you only have a vague
> sketch of checking permissions walking upwards and retrieving xattrs
> without any notion of the problems involved.

Something to keep in mind is that relying on xattr to label files
requires to deny sanboxed processes to change this xattr, otherwise it
would be trivial to bypass such a sandbox.  Sandboxing must be though as
a whole and Landlock's design for file system access control takes into
account all kind of file system operations that could bypass a sandbox
policy (e.g. mount operations), and also protects from impersonations.

What is the use case for this patch series?  Couldn't Landlock be used
for that?

> 
> If you provide a bpf_get_parent() api for userspace to consume you'll
> end up providing them with an api that is extremly easy to misuse.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 13:12                 ` Mickaël Salaün
@ 2024-08-19 20:35                   ` Song Liu
  2024-08-20 12:45                     ` Mickaël Salaün
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2024-08-19 20:35 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Song Liu, Song Liu, bpf, Linux-Fsdevel, LKML,
	Kernel Team, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack

Hi Mickaël, 

> On Aug 19, 2024, at 6:12 AM, Mickaël Salaün <mic@digikod.net> wrote:

[...]

>> But because landlock works with a deny-by-default security policy this
>> is ok and it takes overmounts into account etc.
> 
> Correct. Another point is that Landlock uses the file's path (i.e.
> dentry + mnt) to walk down to the parent.  Only using the dentry would
> be incorrect for most use cases (i.e. any system with more than one
> mount point).

Thanks for highlighting the difference. Let me see whether we can bridge
the gap for this set. 

[...]

>>> 
>>> 1. Change security_inode_permission to take dentry instead of inode.
>> 
>> Sorry, no.
>> 
>>> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
>>>   so that we can propagate flags from parents to children. We will need
>>>   a bpf_dput as well. 
>>> 3. There are pros and cons with different approaches to implement this
>>>   policy (tags on directory work for all files in it). We probably need 
>>>   the policy writer to decide with one to use. From BPF's POV, dget_parent
>>>   is "safe", because it won't crash the system. It may encourage some bad
>>>   patterns, but it appears to be required in some use cases.
>> 
>> You cannot just walk a path upwards and check permissions and assume
>> that this is safe unless you have a clear idea what makes it safe in
>> this scenario. Landlock has afaict. But so far you only have a vague
>> sketch of checking permissions walking upwards and retrieving xattrs
>> without any notion of the problems involved.
> 
> Something to keep in mind is that relying on xattr to label files
> requires to deny sanboxed processes to change this xattr, otherwise it
> would be trivial to bypass such a sandbox.  Sandboxing must be though as
> a whole and Landlock's design for file system access control takes into
> account all kind of file system operations that could bypass a sandbox
> policy (e.g. mount operations), and also protects from impersonations.

Thanks for sharing these experiences! 

> What is the use case for this patch series?  Couldn't Landlock be used
> for that?

We have multiple use cases. We can use Landlock for some of them. The 
primary goal of this patchset is to add useful building blocks to BPF LSM
so that we can build effective and flexible security policies for various
use cases. These building blocks alone won't be very useful. For example,
as you pointed out, to make xattr labels useful, we need some policies 
for xattr read/write.

Does this make sense?

Thanks,
Song



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 20:35                   ` Song Liu
@ 2024-08-20 12:45                     ` Mickaël Salaün
  2024-08-20 17:42                       ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Mickaël Salaün @ 2024-08-20 12:45 UTC (permalink / raw)
  To: Song Liu
  Cc: Christian Brauner, Song Liu, bpf, Linux-Fsdevel, LKML,
	Kernel Team, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack

On Mon, Aug 19, 2024 at 08:35:53PM +0000, Song Liu wrote:
> Hi Mickaël, 
> 
> > On Aug 19, 2024, at 6:12 AM, Mickaël Salaün <mic@digikod.net> wrote:
> 
> [...]
> 
> >> But because landlock works with a deny-by-default security policy this
> >> is ok and it takes overmounts into account etc.
> > 
> > Correct. Another point is that Landlock uses the file's path (i.e.
> > dentry + mnt) to walk down to the parent.  Only using the dentry would
> > be incorrect for most use cases (i.e. any system with more than one
> > mount point).
> 
> Thanks for highlighting the difference. Let me see whether we can bridge
> the gap for this set. 
> 
> [...]
> 
> >>> 
> >>> 1. Change security_inode_permission to take dentry instead of inode.
> >> 
> >> Sorry, no.
> >> 
> >>> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
> >>>   so that we can propagate flags from parents to children. We will need
> >>>   a bpf_dput as well. 
> >>> 3. There are pros and cons with different approaches to implement this
> >>>   policy (tags on directory work for all files in it). We probably need 
> >>>   the policy writer to decide with one to use. From BPF's POV, dget_parent
> >>>   is "safe", because it won't crash the system. It may encourage some bad
> >>>   patterns, but it appears to be required in some use cases.
> >> 
> >> You cannot just walk a path upwards and check permissions and assume
> >> that this is safe unless you have a clear idea what makes it safe in
> >> this scenario. Landlock has afaict. But so far you only have a vague
> >> sketch of checking permissions walking upwards and retrieving xattrs
> >> without any notion of the problems involved.
> > 
> > Something to keep in mind is that relying on xattr to label files
> > requires to deny sanboxed processes to change this xattr, otherwise it
> > would be trivial to bypass such a sandbox.  Sandboxing must be though as
> > a whole and Landlock's design for file system access control takes into
> > account all kind of file system operations that could bypass a sandbox
> > policy (e.g. mount operations), and also protects from impersonations.
> 
> Thanks for sharing these experiences! 
> 
> > What is the use case for this patch series?  Couldn't Landlock be used
> > for that?
> 
> We have multiple use cases. We can use Landlock for some of them. The 
> primary goal of this patchset is to add useful building blocks to BPF LSM
> so that we can build effective and flexible security policies for various
> use cases. These building blocks alone won't be very useful. For example,
> as you pointed out, to make xattr labels useful, we need some policies 
> for xattr read/write.
> 
> Does this make sense?

Yes, but I think you'll end up with a code pretty close to the Landlock
implementation.

What about adding BPF hooks to Landlock?  User space could create
Landlock sandboxes that would delegate the denials to a BPF program,
which could then also allow such access, but without directly handling
nor reimplementing filesystem path walks.  The Landlock user space ABI
changes would mainly be a new landlock_ruleset_attr field to explicitly
ask for a (system-wide) BPF program to handle access requests if no
Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
Landlock domains for consistent sandbox management.  One of the
advantage of this approach is to only run related BPF programs if the
sandbox policy would deny the request.  Another advantage would be to
leverage the Landlock user space interface to let any program partially
define and extend their security policy.

I'm working on implementing audit support for Landlock [1] and I think
these changes could be useful to implement BPF hooks to run a dedicated
BPF program type per event (see landlock_log_denial() and struct
landlock_request).  I'll get back on this patch series in September.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=wip-audit

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-20 12:45                     ` Mickaël Salaün
@ 2024-08-20 17:42                       ` Song Liu
  2024-08-20 21:11                         ` Paul Moore
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2024-08-20 17:42 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Song Liu, Christian Brauner, Song Liu, bpf, Linux-Fsdevel, LKML,
	Kernel Team, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack


> On Aug 20, 2024, at 5:45 AM, Mickaël Salaün <mic@digikod.net> wrote:
> 
> On Mon, Aug 19, 2024 at 08:35:53PM +0000, Song Liu wrote:
>> Hi Mickaël, 
>> 
>>> On Aug 19, 2024, at 6:12 AM, Mickaël Salaün <mic@digikod.net> wrote:
>> 
>> [...]
>> 
>>>> But because landlock works with a deny-by-default security policy this
>>>> is ok and it takes overmounts into account etc.
>>> 
>>> Correct. Another point is that Landlock uses the file's path (i.e.
>>> dentry + mnt) to walk down to the parent.  Only using the dentry would
>>> be incorrect for most use cases (i.e. any system with more than one
>>> mount point).
>> 
>> Thanks for highlighting the difference. Let me see whether we can bridge
>> the gap for this set. 
>> 
>> [...]
>> 
>>>>> 
>>>>> 1. Change security_inode_permission to take dentry instead of inode.
>>>> 
>>>> Sorry, no.
>>>> 
>>>>> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
>>>>>  so that we can propagate flags from parents to children. We will need
>>>>>  a bpf_dput as well. 
>>>>> 3. There are pros and cons with different approaches to implement this
>>>>>  policy (tags on directory work for all files in it). We probably need 
>>>>>  the policy writer to decide with one to use. From BPF's POV, dget_parent
>>>>>  is "safe", because it won't crash the system. It may encourage some bad
>>>>>  patterns, but it appears to be required in some use cases.
>>>> 
>>>> You cannot just walk a path upwards and check permissions and assume
>>>> that this is safe unless you have a clear idea what makes it safe in
>>>> this scenario. Landlock has afaict. But so far you only have a vague
>>>> sketch of checking permissions walking upwards and retrieving xattrs
>>>> without any notion of the problems involved.
>>> 
>>> Something to keep in mind is that relying on xattr to label files
>>> requires to deny sanboxed processes to change this xattr, otherwise it
>>> would be trivial to bypass such a sandbox.  Sandboxing must be though as
>>> a whole and Landlock's design for file system access control takes into
>>> account all kind of file system operations that could bypass a sandbox
>>> policy (e.g. mount operations), and also protects from impersonations.
>> 
>> Thanks for sharing these experiences! 
>> 
>>> What is the use case for this patch series?  Couldn't Landlock be used
>>> for that?
>> 
>> We have multiple use cases. We can use Landlock for some of them. The 
>> primary goal of this patchset is to add useful building blocks to BPF LSM
>> so that we can build effective and flexible security policies for various
>> use cases. These building blocks alone won't be very useful. For example,
>> as you pointed out, to make xattr labels useful, we need some policies 
>> for xattr read/write.
>> 
>> Does this make sense?
> 
> Yes, but I think you'll end up with a code pretty close to the Landlock
> implementation.

At the moment, I think it is not possible to do full Landlock logic in
BPF. We are learning from other LSMs. 

> What about adding BPF hooks to Landlock?  User space could create
> Landlock sandboxes that would delegate the denials to a BPF program,
> which could then also allow such access, but without directly handling
> nor reimplementing filesystem path walks.  The Landlock user space ABI
> changes would mainly be a new landlock_ruleset_attr field to explicitly
> ask for a (system-wide) BPF program to handle access requests if no
> Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
> Landlock domains for consistent sandbox management.  One of the
> advantage of this approach is to only run related BPF programs if the
> sandbox policy would deny the request.  Another advantage would be to
> leverage the Landlock user space interface to let any program partially
> define and extend their security policy.

Given there is BPF LSM, I have never thought about adding BPF hooks to 
Landlock or other LSMs. I personally would prefer to have a common API
to walk the path, maybe something like vma_iterator. But I need to read
more code to understand whether this makes sense?

Thanks,
Song

> I'm working on implementing audit support for Landlock [1] and I think
> these changes could be useful to implement BPF hooks to run a dedicated
> BPF program type per event (see landlock_log_denial() and struct
> landlock_request).  I'll get back on this patch series in September.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=wip-audit



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-20 17:42                       ` Song Liu
@ 2024-08-20 21:11                         ` Paul Moore
  2024-08-21  3:43                           ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Paul Moore @ 2024-08-20 21:11 UTC (permalink / raw)
  To: Song Liu
  Cc: Mickaël Salaün, Christian Brauner, Song Liu, bpf,
	Linux-Fsdevel, LKML, Kernel Team, andrii@kernel.org,
	eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, viro@zeniv.linux.org.uk, jack@suse.cz,
	kpsingh@kernel.org, mattbobrowski@google.com, Liam Wisehart,
	Liang Tang, Shankaran Gnanashanmugam, LSM List,
	Günther Noack

On Tue, Aug 20, 2024 at 1:43 PM Song Liu <songliubraving@meta.com> wrote:
> > On Aug 20, 2024, at 5:45 AM, Mickaël Salaün <mic@digikod.net> wrote:

...

> > What about adding BPF hooks to Landlock?  User space could create
> > Landlock sandboxes that would delegate the denials to a BPF program,
> > which could then also allow such access, but without directly handling
> > nor reimplementing filesystem path walks.  The Landlock user space ABI
> > changes would mainly be a new landlock_ruleset_attr field to explicitly
> > ask for a (system-wide) BPF program to handle access requests if no
> > Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
> > Landlock domains for consistent sandbox management.  One of the
> > advantage of this approach is to only run related BPF programs if the
> > sandbox policy would deny the request.  Another advantage would be to
> > leverage the Landlock user space interface to let any program partially
> > define and extend their security policy.
>
> Given there is BPF LSM, I have never thought about adding BPF hooks to
> Landlock or other LSMs. I personally would prefer to have a common API
> to walk the path, maybe something like vma_iterator. But I need to read
> more code to understand whether this makes sense?

Just so there isn't any confusion, I want to make sure that everyone
is clear that "adding BPF hooks to Landlock" should mean "add a new
Landlock specific BPF hook inside Landlock" and not "reuse existing
BPF LSM hooks inside Landlock".

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-20 21:11                         ` Paul Moore
@ 2024-08-21  3:43                           ` Song Liu
  2024-08-23 10:38                             ` Mickaël Salaün
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2024-08-21  3:43 UTC (permalink / raw)
  To: Paul Moore
  Cc: Song Liu, Mickaël Salaün, Christian Brauner, Song Liu,
	bpf, Linux-Fsdevel, LKML, Kernel Team, andrii@kernel.org,
	eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, viro@zeniv.linux.org.uk, jack@suse.cz,
	kpsingh@kernel.org, mattbobrowski@google.com, Liam Wisehart,
	Liang Tang, Shankaran Gnanashanmugam, LSM List,
	Günther Noack



> On Aug 20, 2024, at 2:11 PM, Paul Moore <paul@paul-moore.com> wrote:
> 
> On Tue, Aug 20, 2024 at 1:43 PM Song Liu <songliubraving@meta.com> wrote:
>>> On Aug 20, 2024, at 5:45 AM, Mickaël Salaün <mic@digikod.net> wrote:
> 
> ...
> 
>>> What about adding BPF hooks to Landlock?  User space could create
>>> Landlock sandboxes that would delegate the denials to a BPF program,
>>> which could then also allow such access, but without directly handling
>>> nor reimplementing filesystem path walks.  The Landlock user space ABI
>>> changes would mainly be a new landlock_ruleset_attr field to explicitly
>>> ask for a (system-wide) BPF program to handle access requests if no
>>> Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
>>> Landlock domains for consistent sandbox management.  One of the
>>> advantage of this approach is to only run related BPF programs if the
>>> sandbox policy would deny the request.  Another advantage would be to
>>> leverage the Landlock user space interface to let any program partially
>>> define and extend their security policy.
>> 
>> Given there is BPF LSM, I have never thought about adding BPF hooks to
>> Landlock or other LSMs. I personally would prefer to have a common API
>> to walk the path, maybe something like vma_iterator. But I need to read
>> more code to understand whether this makes sense?
> 
> Just so there isn't any confusion, I want to make sure that everyone
> is clear that "adding BPF hooks to Landlock" should mean "add a new
> Landlock specific BPF hook inside Landlock" and not "reuse existing
> BPF LSM hooks inside Landlock".

I think we are on the same page. My understanding of Mickaël's idea is
to add some brand new hooks to Landlock code, so that Landlock can
use BPF program to make some decisions. 

Thanks,
Song


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-21  3:43                           ` Song Liu
@ 2024-08-23 10:38                             ` Mickaël Salaün
  0 siblings, 0 replies; 25+ messages in thread
From: Mickaël Salaün @ 2024-08-23 10:38 UTC (permalink / raw)
  To: Song Liu
  Cc: Paul Moore, Christian Brauner, Song Liu, bpf, Linux-Fsdevel, LKML,
	Kernel Team, andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List, Günther Noack

On Wed, Aug 21, 2024 at 03:43:48AM +0000, Song Liu wrote:
> 
> 
> > On Aug 20, 2024, at 2:11 PM, Paul Moore <paul@paul-moore.com> wrote:
> > 
> > On Tue, Aug 20, 2024 at 1:43 PM Song Liu <songliubraving@meta.com> wrote:
> >>> On Aug 20, 2024, at 5:45 AM, Mickaël Salaün <mic@digikod.net> wrote:
> > 
> > ...
> > 
> >>> What about adding BPF hooks to Landlock?  User space could create
> >>> Landlock sandboxes that would delegate the denials to a BPF program,
> >>> which could then also allow such access, but without directly handling
> >>> nor reimplementing filesystem path walks.  The Landlock user space ABI
> >>> changes would mainly be a new landlock_ruleset_attr field to explicitly
> >>> ask for a (system-wide) BPF program to handle access requests if no
> >>> Landlock rule allow them.  We could also tie a BPF data (i.e. blob) to
> >>> Landlock domains for consistent sandbox management.  One of the
> >>> advantage of this approach is to only run related BPF programs if the
> >>> sandbox policy would deny the request.  Another advantage would be to
> >>> leverage the Landlock user space interface to let any program partially
> >>> define and extend their security policy.
> >> 
> >> Given there is BPF LSM, I have never thought about adding BPF hooks to
> >> Landlock or other LSMs. I personally would prefer to have a common API
> >> to walk the path, maybe something like vma_iterator. But I need to read
> >> more code to understand whether this makes sense?

I think it would not be an issue to use BPF Landlock hooks along with
BPF LSM hooks for the same global policy.  This could also use the
Landlock domain concept for your use case, including domain inheritance,
domain identification, cross-domain protections... to avoid
reimplementing the same semantic (and going through the same issues).
Limiting the BPF program calls could also improve performance.

> > 
> > Just so there isn't any confusion, I want to make sure that everyone
> > is clear that "adding BPF hooks to Landlock" should mean "add a new
> > Landlock specific BPF hook inside Landlock" and not "reuse existing
> > BPF LSM hooks inside Landlock".
> 
> I think we are on the same page. My understanding of Mickaël's idea is
> to add some brand new hooks to Landlock code, so that Landlock can
> use BPF program to make some decisions. 

Correct

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 11:16               ` Christian Brauner
  2024-08-19 13:12                 ` Mickaël Salaün
@ 2024-08-19 20:25                 ` Song Liu
  2024-08-20  5:42                   ` Song Liu
  2024-08-20  6:29                   ` Al Viro
  1 sibling, 2 replies; 25+ messages in thread
From: Song Liu @ 2024-08-19 20:25 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Mickaël Salaün, Song Liu, bpf, Linux-Fsdevel,
	LKML, Kernel Team, andrii@kernel.org, eddyz87@gmail.com,
	ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

Hi Christian, 

> On Aug 19, 2024, at 4:16 AM, Christian Brauner <brauner@kernel.org> wrote:

[...]

>> Did I get this right? 
>> 
>> IIUC, there are a few issues with this approach. 
>> 
>> 1. security_inode_permission takes inode as parameter. However, we need 
>>   dentry to get the xattr. Shall we change security_inode_permission
>>   to take dentry instead? 
>>   PS: Maybe we should change most/all inode hooks to take dentry instead?
> 
> security_inode_permission() is called in generic_permission() which in
> turn is called from inode_permission() which in turn is called from
> inode->i_op->permission() for various filesystems. So to make
> security_inode_permission() take a dentry argument one would need to
> change all inode->i_op->permission() to take a dentry argument for all
> filesystems. NAK on that.
> 
> That's ignoring that it's just plain wrong to pass a dentry to
> **inode**_permission() or security_**inode**_permission(). It's
> permissions on the inode, not the dentry.

Agreed. 

[...]

>>> 
>>> Which means this code ends up charging relative lookups twice. Even if one
>>> irons that out in the program this encourages really bad patterns.
>>> Path lookup is iterative top down. One can't just randomly walk back up and
>>> assume that's equivalent.
>> 
>> I understand that walk-up is not equivalent to walk down. But it is probably
>> accurate enough for some security policies. For example, LSM LandLock uses
>> similar logic in the file_open hook (file security/landlock/fs.c, function 
>> is_access_to_paths_allowed).
> 
> I'm not well-versed in landlock so I'll let Mickaël comment on this with
> more details but there's very important restrictions and differences
> here.
> 
> Landlock expresses security policies with file hierarchies and
> security_inode_permission() doesn't and cannot have access to that.
> 
> Landlock is subject to the same problem that the BPF is here. Namely
> that the VFS permission checking could have been done on a path walk
> completely different from the path walk that is checked when walking
> back up from security_file_open().
> 
> But because landlock works with a deny-by-default security policy this
> is ok and it takes overmounts into account etc.
> 
>> 
>> To summary my thoughts here. I think we need:
>> 
>> 1. Change security_inode_permission to take dentry instead of inode.
> 
> Sorry, no.
> 
>> 2. Still add bpf_dget_parent. We will use it with security_inode_permission
>>   so that we can propagate flags from parents to children. We will need
>>   a bpf_dput as well. 
>> 3. There are pros and cons with different approaches to implement this
>>   policy (tags on directory work for all files in it). We probably need 
>>   the policy writer to decide with one to use. From BPF's POV, dget_parent
>>   is "safe", because it won't crash the system. It may encourage some bad
>>   patterns, but it appears to be required in some use cases.
> 
> You cannot just walk a path upwards and check permissions and assume
> that this is safe unless you have a clear idea what makes it safe in
> this scenario. Landlock has afaict. But so far you only have a vague
> sketch of checking permissions walking upwards and retrieving xattrs
> without any notion of the problems involved.

I am sorry for being vague with the use case here. We are trying to 
cover a few different use cases, such as sandboxing, allowlisting 
certain operations to selected binaries, prevent operation errors, etc. 
For this work, we are looking for the right building blocks to enable
these use cases. 

> If you provide a bpf_get_parent() api for userspace to consume you'll
> end up providing them with an api that is extremly easy to misuse.

Does this make sense to have higher level API that walks up the path, 
so that it takes mounts into account. It can probably be something like:

int bpf_get_parent_path(struct path *p) {
again:
    if (p->dentry == p->mnt.mnt_root) {
        follow_up(p);
        goto again;
    }
    if (unlikely(IS_ROOT(p->dentry))) {
        return PARENT_WALK_DONE;  
    }
    parent_dentry = dget_parent(p->dentry);
    dput(p->dentry);
    p->dentry = parent_dentry;
    return PARENT_WALK_NEXT; 
}

This will handle the mount. However, we cannot guarantee deny-by-default
policies like LandLock does, because this is just a building block of 
some security policies. 

Is this something we can give a try with?

Thanks,
Song


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 20:25                 ` Song Liu
@ 2024-08-20  5:42                   ` Song Liu
  2024-08-20  6:29                   ` Al Viro
  1 sibling, 0 replies; 25+ messages in thread
From: Song Liu @ 2024-08-20  5:42 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Song Liu, Mickaël Salaün, Song Liu, bpf, Linux-Fsdevel,
	LKML, Kernel Team, andrii@kernel.org, eddyz87@gmail.com,
	ast@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
	viro@zeniv.linux.org.uk, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

Hi Christian, 

> On Aug 19, 2024, at 1:25 PM, Song Liu <songliubraving@fb.com> wrote:
> 
> Hi Christian, 

[...]

> If you provide a bpf_get_parent() api for userspace to consume you'll
>> end up providing them with an api that is extremly easy to misuse.
> 
> Does this make sense to have higher level API that walks up the path, 
> so that it takes mounts into account. It can probably be something like:
> 
> int bpf_get_parent_path(struct path *p) {
> again:
>    if (p->dentry == p->mnt.mnt_root) {
>        follow_up(p);
>        goto again;
>    }
>    if (unlikely(IS_ROOT(p->dentry))) {
>        return PARENT_WALK_DONE;  
>    }
>    parent_dentry = dget_parent(p->dentry);
>    dput(p->dentry);
>    p->dentry = parent_dentry;
>    return PARENT_WALK_NEXT; 
> }
> 
> This will handle the mount. However, we cannot guarantee deny-by-default
> policies like LandLock does, because this is just a building block of 
> some security policies. 

I guess the above is not really clear. Here is a prototype I got. 
With the kernel diff attached below, we are able to do something
like:

SEC("lsm.s/file_open")
int BPF_PROG(test_file_open, struct file *f)
{
	/* ... */

        bpf_for_each(dentry, dentry, &f->f_path, BPF_DENTRY_ITER_TO_ROOT) {
                ret = bpf_get_dentry_xattr(dentry, "user.kfunc", &value_ptr);
		/* do work with the xattr in value_ptr */
        }

	/* ... */
}

With this helper, the user cannot walk the tree randomly. Instead, 
the walk has to follow some pattern, namely, TO_ROOT, TO_MNT_ROOT, 
etc. And helper makes sure the walk is safe. 

Does this solution make sense to you? 

Thanks,
Song



The kernel diff below. 
============================== 8< ===============================


diff --git c/fs/bpf_fs_kfuncs.c w/fs/bpf_fs_kfuncs.c
index 3fe9f59ef867..4b1400dec984 100644
--- c/fs/bpf_fs_kfuncs.c
+++ w/fs/bpf_fs_kfuncs.c
@@ -8,6 +8,7 @@
 #include <linux/fs.h>
 #include <linux/file.h>
 #include <linux/mm.h>
+#include <linux/namei.h>
 #include <linux/xattr.h>

 __bpf_kfunc_start_defs();
@@ -154,13 +155,91 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str,

 __bpf_kfunc_end_defs();

+struct bpf_iter_dentry {
+       __u64 __opaque[3];
+} __attribute__((aligned(8)));
+
+struct bpf_iter_dentry_kern {
+       struct path path;
+       unsigned int flags;
+} __attribute__((aligned(8)));
+
+enum {
+       /* all the parent paths, until root (/) */
+       BPF_DENTRY_ITER_TO_ROOT,
+       /* all the parent paths, until mnt root */
+       BPF_DENTRY_ITER_TO_MNT_ROOT,
+};
+
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc int bpf_iter_dentry_new(struct bpf_iter_dentry *it,
+                                        struct path *path, unsigned int flags)
+{
+       struct bpf_iter_dentry_kern *kit = (void*)it;
+
+       BUILD_BUG_ON(sizeof(struct bpf_iter_dentry_kern) >
+                    sizeof(struct bpf_iter_dentry));
+       BUILD_BUG_ON(__alignof__(struct bpf_iter_dentry_kern) !=
+                    __alignof__(struct bpf_iter_dentry));
+
+       if (flags)
+               return -EINVAL;
+
+       switch (flags) {
+       case BPF_DENTRY_ITER_TO_ROOT:
+       case BPF_DENTRY_ITER_TO_MNT_ROOT:
+               break;
+       default:
+               return -EINVAL;
+       }
+       kit->path = *path;
+       path_get(&kit->path);
+       kit->flags = flags;
+       return 0;
+}
+
+__bpf_kfunc struct dentry *bpf_iter_dentry_next(struct bpf_iter_dentry *it)
+{
+       struct bpf_iter_dentry_kern *kit = (void*)it;
+       struct dentry *parent_dentry;
+
+       if (unlikely(IS_ROOT(kit->path.dentry)))
+               return NULL;
+
+jump_up:
+       if (kit->path.dentry == kit->path.mnt->mnt_root) {
+               if (kit->flags == BPF_DENTRY_ITER_TO_MNT_ROOT)
+                       return NULL;
+               if (follow_up(&kit->path)) {
+                       goto jump_up;
+               }
+       }
+       parent_dentry = dget_parent(kit->path.dentry);
+       dput(kit->path.dentry);
+       kit->path.dentry = parent_dentry;
+       return parent_dentry;
+}
+
+__bpf_kfunc void bpf_iter_dentry_destroy(struct bpf_iter_dentry *it)
+{
+       struct bpf_iter_dentry_kern *kit = (void*)it;
+
+       path_put(&kit->path);
+}
+
+__bpf_kfunc_end_defs();
+
 BTF_KFUNCS_START(bpf_fs_kfunc_set_ids)
 BTF_ID_FLAGS(func, bpf_get_task_exe_file,
             KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_put_file, KF_RELEASE)
 BTF_ID_FLAGS(func, bpf_path_d_path, KF_TRUSTED_ARGS)
-BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE)    /* Will fix this later */
 BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_iter_dentry_new, KF_ITER_NEW | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_iter_dentry_next, KF_ITER_NEXT | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_dentry_destroy, KF_ITER_DESTROY)
 BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)

 static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-19 20:25                 ` Song Liu
  2024-08-20  5:42                   ` Song Liu
@ 2024-08-20  6:29                   ` Al Viro
  2024-08-20  7:23                     ` Song Liu
  1 sibling, 1 reply; 25+ messages in thread
From: Al Viro @ 2024-08-20  6:29 UTC (permalink / raw)
  To: Song Liu
  Cc: Christian Brauner, Mickaël Salaün, Song Liu, bpf,
	Linux-Fsdevel, LKML, Kernel Team, andrii@kernel.org,
	eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

On Mon, Aug 19, 2024 at 08:25:38PM +0000, Song Liu wrote:

> int bpf_get_parent_path(struct path *p) {
> again:
>     if (p->dentry == p->mnt.mnt_root) {
>         follow_up(p);
>         goto again;
>     }
>     if (unlikely(IS_ROOT(p->dentry))) {
>         return PARENT_WALK_DONE;  
>     }
>     parent_dentry = dget_parent(p->dentry);
>     dput(p->dentry);
>     p->dentry = parent_dentry;
>     return PARENT_WALK_NEXT; 
> }
> 
> This will handle the mount. However, we cannot guarantee deny-by-default
> policies like LandLock does, because this is just a building block of 
> some security policies. 

You do realize that above is racy as hell, right?

Filesystem objects do get moved around.  You can, theoretically, play with
rename_lock, but that is highly antisocial.

What's more, _mounts_ can get moved around.  That is to say, there is no
such thing as stable canonical pathname of a file.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr
  2024-08-20  6:29                   ` Al Viro
@ 2024-08-20  7:23                     ` Song Liu
  0 siblings, 0 replies; 25+ messages in thread
From: Song Liu @ 2024-08-20  7:23 UTC (permalink / raw)
  To: Al Viro
  Cc: Song Liu, Christian Brauner, Mickaël Salaün, Song Liu,
	bpf, Linux-Fsdevel, LKML, Kernel Team, andrii@kernel.org,
	eddyz87@gmail.com, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@linux.dev, jack@suse.cz, kpsingh@kernel.org,
	mattbobrowski@google.com, Liam Wisehart, Liang Tang,
	Shankaran Gnanashanmugam, LSM List

> On Aug 19, 2024, at 11:29 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> 
> On Mon, Aug 19, 2024 at 08:25:38PM +0000, Song Liu wrote:
> 
>> int bpf_get_parent_path(struct path *p) {
>> again:
>>    if (p->dentry == p->mnt.mnt_root) {
>>        follow_up(p);
>>        goto again;
>>    }
>>    if (unlikely(IS_ROOT(p->dentry))) {
>>        return PARENT_WALK_DONE;  
>>    }
>>    parent_dentry = dget_parent(p->dentry);
>>    dput(p->dentry);
>>    p->dentry = parent_dentry;
>>    return PARENT_WALK_NEXT; 
>> }
>> 
>> This will handle the mount. However, we cannot guarantee deny-by-default
>> policies like LandLock does, because this is just a building block of 
>> some security policies.
> 
> You do realize that above is racy as hell, right?
> 
> Filesystem objects do get moved around.  You can, theoretically, play with
> rename_lock, but that is highly antisocial.

I do understand filesystem objects may get moved around. However, I am not
sure whether we have to avoid all the race conditions (and whether it is
really possible to avoid all race conditions). 

> What's more, _mounts_ can get moved around.  That is to say, there is no
> such thing as stable canonical pathname of a file.

Maybe I should really step back and ask for high level suggestions. 

We are hoping to tag all files in a directory with xattr (or something
else) on the directory. For example, a xattr "Do_not_rename" on /usr 
should block rename of all files inside /usr. 

Our original idea is to start from security_file_open() hook, and walk 
up the tree (/usr/bin/gcc => /usr/bin => /usr). However, this appears
to be wasteful and unreliable, and Christian suggested we should use a 
combination of security_inode_permission and security_file_open. I 
tried to build something on this direction, and hits a few issues:

1. Getting xattr from security_inode_permission() is not easy. Some
   FS requires dentry to get xattr. 
2. Finding parent from security_inode_permission() is also tricky.
   (maybe as trick as doing dget_parent() from security_file_open?)
   We need tag on /usr to work on /usr/bin. But how do we know /usr
   is /usr/bin's parent?

For the original goal… tag all files in a directory with xattr on 
the directory, is it possible at all? If not, we will go back and
implement something to tag all the files one at a time. If it is
indeed possible, what's the right way to do it, and what are the
race conditions we need to avoid and to accept?

Comments and suggestions are highly appreciated. 

Thanks,
Song

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2024-08-23 10:38 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-25 23:47 [PATCH bpf-next 0/2] Add kfuncs to support reading xattr from dentry Song Liu
2024-07-25 23:47 ` [PATCH bpf-next 1/2] bpf: Add kfunc bpf_get_dentry_xattr() to read " Song Liu
2024-07-26  5:34   ` Al Viro
2024-07-26  7:01     ` Song Liu
2024-07-25 23:47 ` [PATCH bpf-next 2/2] selftests/bpf: Add tests for bpf_get_dentry_xattr Song Liu
2024-07-26  7:06   ` Christian Brauner
2024-07-26  9:19     ` Song Liu
2024-07-26 11:51       ` Christian Brauner
2024-07-26 19:43         ` Song Liu
2024-07-29 13:46           ` Christian Brauner
2024-07-30  5:58             ` Song Liu
2024-07-30  8:59               ` Christian Brauner
2024-08-19  7:18             ` Song Liu
2024-08-19 11:16               ` Christian Brauner
2024-08-19 13:12                 ` Mickaël Salaün
2024-08-19 20:35                   ` Song Liu
2024-08-20 12:45                     ` Mickaël Salaün
2024-08-20 17:42                       ` Song Liu
2024-08-20 21:11                         ` Paul Moore
2024-08-21  3:43                           ` Song Liu
2024-08-23 10:38                             ` Mickaël Salaün
2024-08-19 20:25                 ` Song Liu
2024-08-20  5:42                   ` Song Liu
2024-08-20  6:29                   ` Al Viro
2024-08-20  7:23                     ` Song Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).