* [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories
@ 2025-07-01 18:38 Mickaël Salaün
2025-07-01 18:38 ` [RFC PATCH v1 2/2] selftests/landlock: Add layout4_disconnected test suite Mickaël Salaün
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Mickaël Salaün @ 2025-07-01 18:38 UTC (permalink / raw)
To: Günther Noack
Cc: Mickaël Salaün, linux-security-module, linux-fsdevel,
NeilBrown, Christian Brauner, Al Viro, Jeff Xu, Ben Scarlato,
Paul Moore, Daniel Burgener, Song Liu, Tingmao Wang
We can get disconnected files or directories when they are visible and
opened from a bind mount, before being renamed/moved from the source of
the bind mount in a way that makes them inaccessible from the mount
point (i.e. out of scope).
Until now, access rights tied to files or directories opened through a
disconnected directory were collected by walking the related hierarchy
down to the root of this filesystem because the mount point couldn't be
found. This could lead to inconsistent access results, and
hard-to-debug renames, especially because such paths cannot be printed.
For a sandboxed task to create a disconnected directory, it needs to
have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to
the underlying source of the bind mount, and read access to the related
mount point. Because a sandboxed task cannot get more access than those
defined by its Landlock domain, this could only lead to inconsistent
access rights because of missing those that should be inherited from the
mount point hierarchy and inheriting from the hierarchy of the mounted
filesystem instead.
Landlock now handles files/directories opened from disconnected
directories like the mount point these disconnected directories were
opened from. This gives the guarantee that access rights on a
file/directory cannot be more than those at open time. The rationale is
that disconnected hierarchies might not be visible nor accessible to a
sandboxed task, and relying on the collected access rights from them
could introduce unexpected results, especially for rename actions
because of the access right comparison between the source and the
destination (see LANDLOCK_ACCESS_FS_REFER). This new behavior is much
less surprising to users and safer from an access point of view.
Unlike follow_dotdot(), we don't need to check for each directory if it
is part of the mount's root, but instead this is only checked when we
reached a root dentry (not a mount point), or when the access
request is about to be allowed. This limits the number of calls to
is_subdir() which walks down the hierarchy (again). This also avoids
checking path connection at the beginning of the walk for each mount
point, which would be racy.
Make path_connected() public to stay consistent with the VFS. This
helper is used when we are about to allowed an access.
This change increases the stack size with two Landlock layer masks
backups that are needed to reset the collected access rights to the
latest mount point.
Because opened files have their access rights stored in the related file
security properties, their is no impact for disconnected or unlinked
files.
A following commit will document handling of disconnected files and
directories.
Cc: Günther Noack <gnoack@google.com>
Cc: Song Liu <song@kernel.org>
Reported-by: Tingmao Wang <m@maowtm.org>
Closes: https://lore.kernel.org/r/027d5190-b37a-40a8-84e9-4ccbc352bcdf@maowtm.org
Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER")
Fixes: cb2c7d1a1776 ("landlock: Support filesystem access-control")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---
This replaces this patch:
landlock: Remove warning in collect_domain_accesses()
https://lore.kernel.org/r/20250618134734.1673254-1-mic@digikod.net
I'll probably split this commit into two to ease backport (same for
tests).
This patch series applies on top of my next branch:
https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next
TODO: Add documentation
TODO: Add Landlock erratum
---
fs/namei.c | 2 +-
include/linux/fs.h | 1 +
security/landlock/fs.c | 121 +++++++++++++++++++++++++++++++++++------
3 files changed, 105 insertions(+), 19 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 4bb889fc980b..7853a876fc1c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -716,7 +716,7 @@ static bool nd_alloc_stack(struct nameidata *nd)
* Rename can sometimes move a file or directory outside of a bind
* mount, path_connected allows those cases to be detected.
*/
-static bool path_connected(struct vfsmount *mnt, struct dentry *dentry)
+bool path_connected(struct vfsmount *mnt, struct dentry *dentry)
{
struct super_block *sb = mnt->mnt_sb;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4ec77da65f14..3c0e324a9272 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3252,6 +3252,7 @@ extern struct file * open_exec(const char *);
/* fs/dcache.c -- generic fs support functions */
extern bool is_subdir(struct dentry *, struct dentry *);
extern bool path_is_under(const struct path *, const struct path *);
+extern bool path_connected(struct vfsmount *mnt, struct dentry *dentry);
extern char *file_path(struct file *, char *, int);
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 1d6c4e728f92..51f03eb82069 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -768,7 +768,9 @@ static bool is_access_to_paths_allowed(
struct path walker_path;
access_mask_t access_masked_parent1, access_masked_parent2;
layer_mask_t _layer_masks_child1[LANDLOCK_NUM_ACCESS_FS],
- _layer_masks_child2[LANDLOCK_NUM_ACCESS_FS];
+ _layer_masks_child2[LANDLOCK_NUM_ACCESS_FS],
+ _layer_masks_parent1_bkp[LANDLOCK_NUM_ACCESS_FS],
+ _layer_masks_parent2_bkp[LANDLOCK_NUM_ACCESS_FS];
layer_mask_t(*layer_masks_child1)[LANDLOCK_NUM_ACCESS_FS] = NULL,
(*layer_masks_child2)[LANDLOCK_NUM_ACCESS_FS] = NULL;
@@ -800,6 +802,8 @@ static bool is_access_to_paths_allowed(
access_masked_parent1 = access_masked_parent2 =
landlock_union_access_masks(domain).fs;
is_dom_check = true;
+ memcpy(&_layer_masks_parent2_bkp, layer_masks_parent2,
+ sizeof(_layer_masks_parent2_bkp));
} else {
if (WARN_ON_ONCE(dentry_child1 || dentry_child2))
return false;
@@ -807,6 +811,8 @@ static bool is_access_to_paths_allowed(
access_masked_parent1 = access_request_parent1;
access_masked_parent2 = access_request_parent2;
is_dom_check = false;
+ memcpy(&_layer_masks_parent1_bkp, layer_masks_parent1,
+ sizeof(_layer_masks_parent1_bkp));
}
if (unlikely(dentry_child1)) {
@@ -858,6 +864,14 @@ static bool is_access_to_paths_allowed(
child1_is_directory, layer_masks_parent2,
layer_masks_child2,
child2_is_directory))) {
+ /*
+ * Rewinds walk for disconnected directories before any other state
+ * change.
+ */
+ if (unlikely(!path_connected(walker_path.mnt,
+ walker_path.dentry)))
+ goto reset_to_mount_root;
+
/*
* Now, downgrades the remaining checks from domain
* handled accesses to requested accesses.
@@ -893,14 +907,42 @@ static bool is_access_to_paths_allowed(
ARRAY_SIZE(*layer_masks_parent2));
/* Stops when a rule from each layer grants access. */
- if (allowed_parent1 && allowed_parent2)
+ if (allowed_parent1 && allowed_parent2) {
+ /*
+ * Rewinds walk for disconnected directories before any other state
+ * change.
+ */
+ if (unlikely(!path_connected(walker_path.mnt,
+ walker_path.dentry)))
+ goto reset_to_mount_root;
+
break;
+ }
+
jump_up:
if (walker_path.dentry == walker_path.mnt->mnt_root) {
if (follow_up(&walker_path)) {
+ /* Saves known good values. */
+ memcpy(&_layer_masks_parent1_bkp,
+ layer_masks_parent1,
+ sizeof(_layer_masks_parent1_bkp));
+ if (layer_masks_parent2)
+ memcpy(&_layer_masks_parent2_bkp,
+ layer_masks_parent2,
+ sizeof(_layer_masks_parent2_bkp));
+
/* Ignores hidden mount points. */
goto jump_up;
} else {
+ /*
+ * Rewinds walk for disconnected directories before any other
+ * state change.
+ */
+ if (unlikely(!path_connected(
+ walker_path.mnt,
+ walker_path.dentry)))
+ goto reset_to_mount_root;
+
/*
* Stops at the real root. Denies access
* because not all layers have granted access.
@@ -909,20 +951,51 @@ static bool is_access_to_paths_allowed(
}
}
if (unlikely(IS_ROOT(walker_path.dentry))) {
- /*
- * Stops at disconnected root directories. Only allows
- * access to internal filesystems (e.g. nsfs, which is
- * reachable through /proc/<pid>/ns/<namespace>).
- */
if (walker_path.mnt->mnt_flags & MNT_INTERNAL) {
+ /*
+ * Stops and allows access when reaching disconnected root
+ * directories that are part of internal filesystems (e.g. nsfs,
+ * which is reachable through /proc/<pid>/ns/<namespace>).
+ */
allowed_parent1 = true;
allowed_parent2 = true;
+ break;
+ } else {
+ /*
+ * Ignores current walk in walker_path.mnt when reaching
+ * disconnected root directories from bind mounts. Reset the
+ * collected access rights to the latest mount point (or @path)
+ * we walked through, and start again from the current root of
+ * the mount point. The newly collected access rights will be
+ * less than or equal to those at open time.
+ */
+ goto reset_to_mount_root;
}
- break;
}
parent_dentry = dget_parent(walker_path.dentry);
dput(walker_path.dentry);
walker_path.dentry = parent_dentry;
+ continue;
+
+reset_to_mount_root:
+ /* Restores latest known good values. */
+ memcpy(layer_masks_parent1, &_layer_masks_parent1_bkp,
+ sizeof(_layer_masks_parent1_bkp));
+ if (layer_masks_parent2)
+ memcpy(layer_masks_parent2, &_layer_masks_parent2_bkp,
+ sizeof(_layer_masks_parent2_bkp));
+
+ /*
+ * Ignores previous results. They will be computed again with the next
+ * iteration.
+ */
+ allowed_parent1 = false;
+ allowed_parent2 = false;
+
+ /* Restarts with the current mount point. */
+ dput(walker_path.dentry);
+ walker_path.dentry = walker_path.mnt->mnt_root;
+ dget(walker_path.dentry);
}
path_put(&walker_path);
@@ -1030,13 +1103,13 @@ static access_mask_t maybe_remove(const struct dentry *const dentry)
*/
static bool collect_domain_accesses(
const struct landlock_ruleset *const domain,
- const struct dentry *const mnt_root, struct dentry *dir,
+ const struct path *const mnt_dir, struct dentry *dir,
layer_mask_t (*const layer_masks_dom)[LANDLOCK_NUM_ACCESS_FS])
{
- unsigned long access_dom;
+ access_mask_t access_dom;
bool ret = false;
- if (WARN_ON_ONCE(!domain || !mnt_root || !dir || !layer_masks_dom))
+ if (WARN_ON_ONCE(!domain || !mnt_dir || !dir || !layer_masks_dom))
return true;
if (is_nouser_or_private(dir))
return true;
@@ -1053,6 +1126,10 @@ static bool collect_domain_accesses(
if (landlock_unmask_layers(find_rule(domain, dir), access_dom,
layer_masks_dom,
ARRAY_SIZE(*layer_masks_dom))) {
+ /* Ignores this walk if we end up in a disconnected directory. */
+ if (unlikely(!path_connected(mnt_dir->mnt, dir)))
+ goto cancel_walk;
+
/*
* Stops when all handled accesses are allowed by at
* least one rule in each layer.
@@ -1061,13 +1138,23 @@ static bool collect_domain_accesses(
break;
}
- /* Stops at the mount point or disconnected root directories. */
- if (dir == mnt_root || IS_ROOT(dir))
+ /* Stops at the mount point. */
+ if (dir == mnt_dir->dentry)
break;
+ /* Ignores this walk if we end up in a disconnected root directory. */
+ if (unlikely(IS_ROOT(dir)))
+ goto cancel_walk;
+
parent_dentry = dget_parent(dir);
dput(dir);
dir = parent_dentry;
+ continue;
+
+cancel_walk:
+ landlock_init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS,
+ layer_masks_dom, LANDLOCK_KEY_INODE);
+ break;
}
dput(dir);
return ret;
@@ -1198,13 +1285,11 @@ static int current_check_refer_path(struct dentry *const old_dentry,
old_dentry->d_parent;
/* new_dir->dentry is equal to new_dentry->d_parent */
- allow_parent1 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
- old_parent,
- &layer_masks_parent1);
- allow_parent2 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
+ allow_parent1 = collect_domain_accesses(
+ subject->domain, &mnt_dir, old_parent, &layer_masks_parent1);
+ allow_parent2 = collect_domain_accesses(subject->domain, &mnt_dir,
new_dir->dentry,
&layer_masks_parent2);
-
if (allow_parent1 && allow_parent2)
return 0;
--
2.50.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [RFC PATCH v1 2/2] selftests/landlock: Add layout4_disconnected test suite
2025-07-01 18:38 [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories Mickaël Salaün
@ 2025-07-01 18:38 ` Mickaël Salaün
2025-07-07 0:22 ` Tingmao Wang
2025-07-07 10:30 ` [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories Christian Brauner
2025-07-08 17:36 ` Mickaël Salaün
2 siblings, 1 reply; 5+ messages in thread
From: Mickaël Salaün @ 2025-07-01 18:38 UTC (permalink / raw)
To: Günther Noack
Cc: Mickaël Salaün, linux-security-module, linux-fsdevel,
NeilBrown, Christian Brauner, Al Viro, Jeff Xu, Ben Scarlato,
Paul Moore, Daniel Burgener, Song Liu, Tingmao Wang
Test disconnected directories with 12 variants.
Test coverage for security/landlock is 92.0% of 1956 lines.
Cc: Günther Noack <gnoack@google.com>
Cc: Song Liu <song@kernel.org>
Cc: Tingmao Wang <m@maowtm.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---
Tingmao, I initially started from your patch [1] (it changed a lot), and
I'll probably squash with [2], at which point you'll be co-developer if
that's OK with you.
Link: https://lore.kernel.org/r/f7e3d874-6088-4f70-8222-c4a8547d213e@maowtm.org [1]
Link: https://lore.kernel.org/r/09b24128f86973a6022e6aa8338945fcfb9a33e4.1749925391.git.m@maowtm.org [2]
TODO: Add more tests
---
tools/testing/selftests/landlock/fs_test.c | 458 ++++++++++++++++++++-
1 file changed, 452 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index d8f9259fffe4..a99e9ea22f98 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -2267,6 +2267,22 @@ static int test_exchange(const char *const oldpath, const char *const newpath)
return 0;
}
+static int test_renameat(int olddirfd, const char *oldpath, int newdirfd,
+ const char *newpath)
+{
+ if (renameat2(olddirfd, oldpath, newdirfd, newpath, 0))
+ return errno;
+ return 0;
+}
+
+static int test_exchangeat(int olddirfd, const char *oldpath, int newdirfd,
+ const char *newpath)
+{
+ if (renameat2(olddirfd, oldpath, newdirfd, newpath, RENAME_EXCHANGE))
+ return errno;
+ return 0;
+}
+
TEST_F_FORK(layout1, rename_file)
{
const struct rule rules[] = {
@@ -4895,7 +4911,14 @@ TEST_F_FORK(layout1_bind, path_disconnected)
EXPECT_EQ(0, close(ruleset_fd_l2));
EXPECT_EQ(0, test_open(file1_s4d1, O_RDONLY));
- EXPECT_EQ(0, test_open_rel(bind_s1d3_fd, file1_name, O_RDONLY));
+ /*
+ * Accessing a file through a disconnected file descriptor is not allowed
+ * when it is no longer visible in its mount point.
+ *
+ * TODO: Is this the right behavior? We could add an exception for access
+ * rights tied to files, but it would be a bit inconsistent.
+ */
+ EXPECT_EQ(EACCES, test_open_rel(bind_s1d3_fd, file1_name, O_RDONLY));
EXPECT_EQ(EACCES, test_open_rel(bind_s1d3_fd, file2_name, O_RDONLY));
/*
@@ -4977,17 +5000,17 @@ TEST_F_FORK(layout1_bind, path_disconnected_rename)
EXPECT_EQ(ENOENT, test_open_rel(bind_s1d3_fd, "..", O_DIRECTORY));
/*
- * Since file is no longer under s1d2, we should not be able to access it if
- * we enforced layer 2. Do a fork to test this so we don't prevent
- * ourselves from renaming it back later.
+ * The file is no longer under s1d2 but we should still be able to access it
+ * with layer 2 because its mount point is evaluated as the first valid
+ * directory because it was initially a parent. Do a fork to test this so
+ * we don't prevent ourselves from renaming it back later.
*/
child_pid = fork();
ASSERT_LE(0, child_pid);
if (child_pid == 0) {
enforce_ruleset(_metadata, ruleset_fd_l2);
EXPECT_EQ(0, close(ruleset_fd_l2));
- EXPECT_EQ(EACCES,
- test_open_rel(bind_s1d3_fd, file1_name, O_RDONLY));
+ EXPECT_EQ(0, test_open_rel(bind_s1d3_fd, file1_name, O_RDONLY));
EXPECT_EQ(EACCES, test_open(file1_s4d2, O_RDONLY));
/*
@@ -5201,6 +5224,429 @@ TEST_F_FORK(layout1_bind, path_disconnected_link)
}
}
+/* clang-format off */
+FIXTURE(layout4_disconnected) {
+ int s2d2_fd;
+};
+/* clang-format on */
+
+FIXTURE_SETUP(layout4_disconnected)
+{
+ prepare_layout(_metadata);
+
+ create_file(_metadata, TMP_DIR "/s1d1/s1d2/s1d31/s1d41/f1");
+ create_file(_metadata, TMP_DIR "/s1d1/s1d2/s1d31/s1d41/f2");
+ create_file(_metadata, TMP_DIR "/s1d1/s1d2/s1d32/s1d42/f3");
+ create_file(_metadata, TMP_DIR "/s1d1/s1d2/s1d32/s1d42/f4");
+ create_directory(_metadata, TMP_DIR "/s2d1/s2d2");
+ create_directory(_metadata, TMP_DIR "/s3d1/s3d2");
+ create_directory(_metadata, TMP_DIR "/s4d1/s4d2");
+
+ self->s2d2_fd =
+ open(TMP_DIR "/s2d1/s2d2", O_DIRECTORY | O_PATH | O_CLOEXEC);
+ ASSERT_LE(0, self->s2d2_fd);
+
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, mount(TMP_DIR "/s1d1/s1d2", TMP_DIR "/s2d1/s2d2", NULL,
+ MS_BIND, NULL));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+}
+
+FIXTURE_TEARDOWN_PARENT(layout4_disconnected)
+{
+ /* umount(TMP_DIR "/s2d1") is handled by namespace lifetime. */
+
+ /* Removes files after renames. */
+ remove_path(TMP_DIR "/s3d1/s1d41/f1");
+ remove_path(TMP_DIR "/s3d1/s1d41/f2");
+ remove_path(TMP_DIR "/s4d1/s1d42/f1");
+ remove_path(TMP_DIR "/s4d1/s1d42/f3");
+ remove_path(TMP_DIR "/s4d1/s1d42/f4");
+ remove_path(TMP_DIR "/s4d1/s1d42/f5");
+
+ cleanup_layout(_metadata);
+}
+
+/*
+ * layout4_disconnected with bind mount and renames:
+ *
+ * tmp
+ * ├── s1d1
+ * │ └── s1d2 [source of the bind mount]
+ * │ ├── s1d31
+ * │ │ └── s1d41 [now renamed]
+ * │ │ ├── f1
+ * │ │ └── f2
+ * │ └── s1d32
+ * │ └── s1d42 [now renamed]
+ * │ ├── f3
+ * │ └── f4
+ * ├── s2d1
+ * │ └── s2d2 [bind mount of s1d2]
+ * │ ├── s1d31
+ * │ │ └── s1d41 [opened and now renamed]
+ * │ │ ├── f1
+ * │ │ └── f2
+ * │ └── s1d32
+ * │ └── s1d42 [opened and now renamed]
+ * │ ├── f3
+ * │ └── f4
+ * ├── s3d1
+ * │ └── s1d41 [renamed here]
+ * │ ├── f1
+ * │ └── f2
+ * └── s4d1
+ * └── s1d42 [renamed here]
+ * ├── f3
+ * └── f4
+ */
+FIXTURE_VARIANT(layout4_disconnected)
+{
+ /*
+ * Parent of the bind mount source. It should always be ignored when
+ * testing against files under the s1d41 or s1d42 disconnected directories.
+ */
+ const __u64 allowed_s1d1;
+ /*
+ * Source of bind mount (to s2d2). It should always be enforced when
+ * testing against files under the s1d41 or s1d42 disconnected directories.
+ */
+ const __u64 allowed_s1d2;
+ /*
+ * Original parent of s1d41. It should always be ignored when testing
+ * against files under the s1d41 disconnected directory.
+ */
+ const __u64 allowed_s1d31;
+ /*
+ * Original parent of s1d42. It should always be ignored when testing
+ * against files under the s1d42 disconnected directory.
+ */
+ const __u64 allowed_s1d32;
+ /*
+ * Opened and disconnected source directory. It should always be enforced
+ * when testing against files under the s1d41 disconnected directory.
+ */
+ const __u64 allowed_s1d41;
+ /*
+ * Opened and disconnected source directory. It should always be enforced
+ * when testing against files under the s1d42 disconnected directory.
+ */
+ const __u64 allowed_s1d42;
+ /*
+ * File in the s1d41 disconnected directory. It should always be enforced
+ * when testing against itself under the s1d41 disconnected directory.
+ */
+ const __u64 allowed_f1;
+ /*
+ * File in the s1d41 disconnected directory. It should always be enforced
+ * when testing against itself under the s1d41 disconnected directory.
+ */
+ const __u64 allowed_f2;
+ /*
+ * File in the s1d42 disconnected directory. It should always be enforced
+ * when testing against itself under the s1d42 disconnected directory.
+ */
+ const __u64 allowed_f3;
+ /*
+ * Parent of the bind mount destination. It should always be enforced when
+ * testing against files under the s1d41 or s1d42 disconnected directories.
+ */
+ const __u64 allowed_s2d1;
+ /*
+ * Directory covered by the bind mount. It should always be ignored when
+ * testing against files under the s1d41 or s1d42 disconnected directories.
+ */
+ const __u64 allowed_s2d2;
+ /*
+ * New parent of the renamed s1d41. It should always be ignored when
+ * testing against files under the s1d41 disconnected directory.
+ */
+ const __u64 allowed_s3d1;
+ /*
+ * New parent of the renamed s1d42. It should always be ignored when
+ * testing against files under the s1d42 disconnected directory.
+ */
+ const __u64 allowed_s4d1;
+
+ /* Expected result of the call to open([fd:s1d41]/f1, O_RDONLY). */
+ const int expected_read_result;
+ /* Expected result of the call to renameat([fd:s1d42]/f4, [fd:s1d42]/f5). */
+ const int expected_same_dir_rename_result;
+ /* Expected result of the call to renameat([fd:s1d41]/f1, [fd:s1d42]/f1). */
+ const int expected_rename_result;
+ /*
+ * Expected result of the call to renameat([fd:s1d41]/f2, [fd:s1d42]/f2,
+ * RENAME_EXCHANGE).
+ */
+ const int expected_exchange_result;
+};
+
+/* XXX: Fails without the fix. */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s1d1_mount_src_parent) {
+ /* clang-format on */
+ .allowed_s1d1 = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .expected_read_result = EACCES,
+ .expected_same_dir_rename_result = EACCES,
+ .expected_rename_result = EACCES,
+ .expected_exchange_result = EACCES,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s1d2_mount_src_refer) {
+ /* clang-format on */
+ .allowed_s1d2 = LANDLOCK_ACCESS_FS_REFER | LANDLOCK_ACCESS_FS_READ_FILE,
+ .expected_read_result = 0,
+ .expected_same_dir_rename_result = EACCES,
+ .expected_rename_result = EACCES,
+ .expected_exchange_result = EACCES,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s1d2_mount_src_create) {
+ /* clang-format on */
+ .allowed_s1d2 = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .expected_read_result = 0,
+ .expected_same_dir_rename_result = 0,
+ .expected_rename_result = EXDEV,
+ .expected_exchange_result = EXDEV,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s1d2_mount_src_rename) {
+ /* clang-format on */
+ .allowed_s1d2 = LANDLOCK_ACCESS_FS_REFER | LANDLOCK_ACCESS_FS_MAKE_REG,
+ .expected_read_result = EACCES,
+ .expected_same_dir_rename_result = 0,
+ .expected_rename_result = 0,
+ .expected_exchange_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s1d31_s1d32_old_parent) {
+ /* clang-format on */
+ .allowed_s1d31 = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .allowed_s1d32 = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .expected_read_result = EACCES,
+ .expected_same_dir_rename_result = EACCES,
+ .expected_rename_result = EACCES,
+ .expected_exchange_result = EACCES,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s1d41_s1d42_disconnected) {
+ /* clang-format on */
+ .allowed_s1d41 = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .allowed_s1d42 = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .expected_read_result = EACCES,
+ .expected_same_dir_rename_result = EACCES,
+ .expected_rename_result = EACCES,
+ .expected_exchange_result = EACCES,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s2d1_mount_dst_parent_create) {
+ /* clang-format on */
+ .allowed_s2d1 = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .expected_read_result = 0,
+ .expected_same_dir_rename_result = 0,
+ .expected_rename_result = EXDEV,
+ .expected_exchange_result = EXDEV,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s2d1_mount_dst_parent_refer) {
+ /* clang-format on */
+ .allowed_s2d1 = LANDLOCK_ACCESS_FS_REFER | LANDLOCK_ACCESS_FS_READ_FILE,
+ .expected_read_result = 0,
+ .expected_same_dir_rename_result = EACCES,
+ .expected_rename_result = EACCES,
+ .expected_exchange_result = EACCES,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s2d1_mount_dst_parent_mini) {
+ /* clang-format on */
+ .allowed_s2d1 = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .expected_read_result = 0,
+ .expected_same_dir_rename_result = 0,
+ .expected_rename_result = 0,
+ .expected_exchange_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s2d2_covered_by_mount) {
+ /* clang-format on */
+ .allowed_s2d2 = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .expected_read_result = EACCES,
+ .expected_same_dir_rename_result = EACCES,
+ .expected_rename_result = EACCES,
+ .expected_exchange_result = EACCES,
+};
+
+/* Fails without the fix. */
+/* Tests collect_domain_accesses(). */
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, s3d1_s4d1_new_parent) {
+ /* clang-format on */
+ .allowed_s3d1 = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_REG |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .allowed_s4d1 = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_REG |
+ LANDLOCK_ACCESS_FS_MAKE_REG,
+ .expected_read_result = EACCES,
+ .expected_same_dir_rename_result = EACCES,
+ .expected_rename_result = EACCES,
+ .expected_exchange_result = EACCES,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(layout4_disconnected, f1_f2_f3) {
+ /* clang-format on */
+ .allowed_f1 = LANDLOCK_ACCESS_FS_READ_FILE,
+ .allowed_f2 = LANDLOCK_ACCESS_FS_READ_FILE,
+ .allowed_f3 = LANDLOCK_ACCESS_FS_READ_FILE,
+ .expected_read_result = EACCES,
+ .expected_same_dir_rename_result = EACCES,
+ .expected_rename_result = EACCES,
+ .expected_exchange_result = EACCES,
+};
+
+TEST_F_FORK(layout4_disconnected, read_rename_exchange)
+{
+ const __u64 handled_access =
+ LANDLOCK_ACCESS_FS_REFER | LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_EXECUTE | LANDLOCK_ACCESS_FS_MAKE_REG;
+ const struct rule rules[] = {
+ {
+ .path = TMP_DIR "/s1d1",
+ .access = variant->allowed_s1d1,
+ },
+ {
+ .path = TMP_DIR "/s1d1/s1d2",
+ .access = variant->allowed_s1d2,
+ },
+ {
+ .path = TMP_DIR "/s1d1/s1d2/s1d31",
+ .access = variant->allowed_s1d31,
+ },
+ {
+ .path = TMP_DIR "/s1d1/s1d2/s1d32",
+ .access = variant->allowed_s1d32,
+ },
+ {
+ .path = TMP_DIR "/s1d1/s1d2/s1d31/s1d41",
+ .access = variant->allowed_s1d41,
+ },
+ {
+ .path = TMP_DIR "/s1d1/s1d2/s1d32/s1d42",
+ .access = variant->allowed_s1d42,
+ },
+ {
+ .path = TMP_DIR "/s1d1/s1d2/s1d31/s1d41/f1",
+ .access = variant->allowed_f1,
+ },
+ {
+ .path = TMP_DIR "/s1d1/s1d2/s1d31/s1d41/f2",
+ .access = variant->allowed_f2,
+ },
+ {
+ .path = TMP_DIR "/s1d1/s1d2/s1d32/s1d42/f3",
+ .access = variant->allowed_f3,
+ },
+ {
+ .path = TMP_DIR "/s2d1",
+ .access = variant->allowed_s2d1,
+ },
+ {
+ .path = TMP_DIR "/s3d1",
+ .access = variant->allowed_s3d1,
+ },
+ {
+ .path = TMP_DIR "/s4d1",
+ .access = variant->allowed_s4d1,
+ },
+ {},
+ };
+ int ruleset_fd, s1d31_bind_fd, s1d32_bind_fd;
+
+ ruleset_fd = create_ruleset(_metadata, handled_access, rules);
+ ASSERT_LE(0, ruleset_fd);
+
+ /* Adds rule for the covered directory. */
+ if (variant->allowed_s2d2) {
+ ASSERT_EQ(0, landlock_add_rule(
+ ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &(struct landlock_path_beneath_attr){
+ .parent_fd = self->s2d2_fd,
+ .allowed_access =
+ variant->allowed_s2d2,
+ },
+ 0));
+ }
+ EXPECT_EQ(0, close(self->s2d2_fd));
+
+ s1d31_bind_fd = open(TMP_DIR "/s2d1/s2d2/s1d31/s1d41",
+ O_DIRECTORY | O_PATH | O_CLOEXEC);
+ ASSERT_LE(0, s1d31_bind_fd);
+ s1d32_bind_fd = open(TMP_DIR "/s2d1/s2d2/s1d32/s1d42",
+ O_DIRECTORY | O_PATH | O_CLOEXEC);
+ ASSERT_LE(0, s1d32_bind_fd);
+
+ /* Disconnects and checks source and destination directories. */
+ EXPECT_EQ(0, test_open_rel(s1d31_bind_fd, "..", O_DIRECTORY));
+ EXPECT_EQ(0, test_open_rel(s1d32_bind_fd, "..", O_DIRECTORY));
+ /* Renames to make it accessible through s3d1/s1d41 */
+ ASSERT_EQ(0, test_renameat(AT_FDCWD, TMP_DIR "/s1d1/s1d2/s1d31/s1d41",
+ AT_FDCWD, TMP_DIR "/s3d1/s1d41"));
+ /* Renames to make it accessible through s4d1/s1d42 */
+ ASSERT_EQ(0, test_renameat(AT_FDCWD, TMP_DIR "/s1d1/s1d2/s1d32/s1d42",
+ AT_FDCWD, TMP_DIR "/s4d1/s1d42"));
+ EXPECT_EQ(ENOENT, test_open_rel(s1d31_bind_fd, "..", O_DIRECTORY));
+ EXPECT_EQ(ENOENT, test_open_rel(s1d32_bind_fd, "..", O_DIRECTORY));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ EXPECT_EQ(0, close(ruleset_fd));
+
+ EXPECT_EQ(variant->expected_read_result,
+ test_open_rel(s1d31_bind_fd, "f1", O_RDONLY));
+
+ EXPECT_EQ(variant->expected_rename_result,
+ test_renameat(s1d31_bind_fd, "f1", s1d32_bind_fd, "f1"));
+ EXPECT_EQ(variant->expected_exchange_result,
+ test_exchangeat(s1d31_bind_fd, "f2", s1d32_bind_fd, "f3"));
+
+ EXPECT_EQ(variant->expected_same_dir_rename_result,
+ test_renameat(s1d32_bind_fd, "f4", s1d32_bind_fd, "f5"));
+}
+
#define LOWER_BASE TMP_DIR "/lower"
#define LOWER_DATA LOWER_BASE "/data"
static const char lower_fl1[] = LOWER_DATA "/fl1";
--
2.50.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC PATCH v1 2/2] selftests/landlock: Add layout4_disconnected test suite
2025-07-01 18:38 ` [RFC PATCH v1 2/2] selftests/landlock: Add layout4_disconnected test suite Mickaël Salaün
@ 2025-07-07 0:22 ` Tingmao Wang
0 siblings, 0 replies; 5+ messages in thread
From: Tingmao Wang @ 2025-07-07 0:22 UTC (permalink / raw)
To: Mickaël Salaün, Günther Noack
Cc: linux-security-module, linux-fsdevel, NeilBrown,
Christian Brauner, Al Viro, Jeff Xu, Ben Scarlato, Paul Moore,
Daniel Burgener, Song Liu, Jann Horn
On 7/1/25 19:38, Mickaël Salaün wrote:
> [...]
> Tingmao, I initially started from your patch [1] (it changed a lot), and
> I'll probably squash with [2], at which point you'll be co-developer if
> that's OK with you.
Sounds good to me. I've not had a chance yet but I will try to give this
a read and reply if I have any questions.
(Added Jann to Cc per request.)
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories
2025-07-01 18:38 [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories Mickaël Salaün
2025-07-01 18:38 ` [RFC PATCH v1 2/2] selftests/landlock: Add layout4_disconnected test suite Mickaël Salaün
@ 2025-07-07 10:30 ` Christian Brauner
2025-07-08 17:36 ` Mickaël Salaün
2 siblings, 0 replies; 5+ messages in thread
From: Christian Brauner @ 2025-07-07 10:30 UTC (permalink / raw)
To: Mickaël Salaün
Cc: Günther Noack, linux-security-module, linux-fsdevel,
NeilBrown, Al Viro, Jeff Xu, Ben Scarlato, Paul Moore,
Daniel Burgener, Song Liu, Tingmao Wang
On Tue, Jul 01, 2025 at 08:38:07PM +0200, Mickaël Salaün wrote:
> We can get disconnected files or directories when they are visible and
> opened from a bind mount, before being renamed/moved from the source of
> the bind mount in a way that makes them inaccessible from the mount
> point (i.e. out of scope).
>
> Until now, access rights tied to files or directories opened through a
> disconnected directory were collected by walking the related hierarchy
> down to the root of this filesystem because the mount point couldn't be
> found. This could lead to inconsistent access results, and
> hard-to-debug renames, especially because such paths cannot be printed.
>
> For a sandboxed task to create a disconnected directory, it needs to
> have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to
> the underlying source of the bind mount, and read access to the related
> mount point. Because a sandboxed task cannot get more access than those
> defined by its Landlock domain, this could only lead to inconsistent
> access rights because of missing those that should be inherited from the
> mount point hierarchy and inheriting from the hierarchy of the mounted
> filesystem instead.
>
> Landlock now handles files/directories opened from disconnected
> directories like the mount point these disconnected directories were
> opened from. This gives the guarantee that access rights on a
> file/directory cannot be more than those at open time. The rationale is
> that disconnected hierarchies might not be visible nor accessible to a
> sandboxed task, and relying on the collected access rights from them
> could introduce unexpected results, especially for rename actions
> because of the access right comparison between the source and the
> destination (see LANDLOCK_ACCESS_FS_REFER). This new behavior is much
> less surprising to users and safer from an access point of view.
>
> Unlike follow_dotdot(), we don't need to check for each directory if it
> is part of the mount's root, but instead this is only checked when we
> reached a root dentry (not a mount point), or when the access
> request is about to be allowed. This limits the number of calls to
> is_subdir() which walks down the hierarchy (again). This also avoids
> checking path connection at the beginning of the walk for each mount
> point, which would be racy.
>
> Make path_connected() public to stay consistent with the VFS. This
> helper is used when we are about to allowed an access.
>
> This change increases the stack size with two Landlock layer masks
> backups that are needed to reset the collected access rights to the
> latest mount point.
>
> Because opened files have their access rights stored in the related file
> security properties, their is no impact for disconnected or unlinked
> files.
>
> A following commit will document handling of disconnected files and
> directories.
>
> Cc: Günther Noack <gnoack@google.com>
> Cc: Song Liu <song@kernel.org>
> Reported-by: Tingmao Wang <m@maowtm.org>
> Closes: https://lore.kernel.org/r/027d5190-b37a-40a8-84e9-4ccbc352bcdf@maowtm.org
> Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER")
> Fixes: cb2c7d1a1776 ("landlock: Support filesystem access-control")
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
>
> This replaces this patch:
> landlock: Remove warning in collect_domain_accesses()
> https://lore.kernel.org/r/20250618134734.1673254-1-mic@digikod.net
>
> I'll probably split this commit into two to ease backport (same for
> tests).
>
> This patch series applies on top of my next branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next
>
> TODO: Add documentation
>
> TODO: Add Landlock erratum
> ---
> fs/namei.c | 2 +-
> include/linux/fs.h | 1 +
> security/landlock/fs.c | 121 +++++++++++++++++++++++++++++++++++------
> 3 files changed, 105 insertions(+), 19 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 4bb889fc980b..7853a876fc1c 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -716,7 +716,7 @@ static bool nd_alloc_stack(struct nameidata *nd)
> * Rename can sometimes move a file or directory outside of a bind
> * mount, path_connected allows those cases to be detected.
> */
> -static bool path_connected(struct vfsmount *mnt, struct dentry *dentry)
> +bool path_connected(struct vfsmount *mnt, struct dentry *dentry)
> {
> struct super_block *sb = mnt->mnt_sb;
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 4ec77da65f14..3c0e324a9272 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3252,6 +3252,7 @@ extern struct file * open_exec(const char *);
> /* fs/dcache.c -- generic fs support functions */
> extern bool is_subdir(struct dentry *, struct dentry *);
> extern bool path_is_under(const struct path *, const struct path *);
> +extern bool path_connected(struct vfsmount *mnt, struct dentry *dentry);
Drop the "extern" please. We generally don't do that anymore for new
additions. Otherwise,
Acked-by: Christian Brauner <brauner@kernel.org>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories
2025-07-01 18:38 [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories Mickaël Salaün
2025-07-01 18:38 ` [RFC PATCH v1 2/2] selftests/landlock: Add layout4_disconnected test suite Mickaël Salaün
2025-07-07 10:30 ` [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories Christian Brauner
@ 2025-07-08 17:36 ` Mickaël Salaün
2 siblings, 0 replies; 5+ messages in thread
From: Mickaël Salaün @ 2025-07-08 17:36 UTC (permalink / raw)
To: Günther Noack
Cc: linux-security-module, linux-fsdevel, NeilBrown,
Christian Brauner, Al Viro, Jeff Xu, Ben Scarlato, Paul Moore,
Daniel Burgener, Song Liu, Tingmao Wang, Jann Horn
On Tue, Jul 01, 2025 at 08:38:07PM +0200, Mickaël Salaün wrote:
> We can get disconnected files or directories when they are visible and
> opened from a bind mount, before being renamed/moved from the source of
> the bind mount in a way that makes them inaccessible from the mount
> point (i.e. out of scope).
>
> Until now, access rights tied to files or directories opened through a
> disconnected directory were collected by walking the related hierarchy
> down to the root of this filesystem because the mount point couldn't be
> found. This could lead to inconsistent access results, and
> hard-to-debug renames, especially because such paths cannot be printed.
>
> For a sandboxed task to create a disconnected directory, it needs to
> have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to
> the underlying source of the bind mount, and read access to the related
> mount point. Because a sandboxed task cannot get more access than those
> defined by its Landlock domain, this could only lead to inconsistent
> access rights because of missing those that should be inherited from the
> mount point hierarchy and inheriting from the hierarchy of the mounted
> filesystem instead.
>
> Landlock now handles files/directories opened from disconnected
> directories like the mount point these disconnected directories were
> opened from. This gives the guarantee that access rights on a
> file/directory cannot be more than those at open time. The rationale is
> that disconnected hierarchies might not be visible nor accessible to a
> sandboxed task, and relying on the collected access rights from them
> could introduce unexpected results, especially for rename actions
> because of the access right comparison between the source and the
> destination (see LANDLOCK_ACCESS_FS_REFER). This new behavior is much
> less surprising to users and safer from an access point of view.
>
> Unlike follow_dotdot(), we don't need to check for each directory if it
> is part of the mount's root, but instead this is only checked when we
> reached a root dentry (not a mount point), or when the access
> request is about to be allowed. This limits the number of calls to
> is_subdir() which walks down the hierarchy (again). This also avoids
> checking path connection at the beginning of the walk for each mount
> point, which would be racy.
>
> Make path_connected() public to stay consistent with the VFS. This
> helper is used when we are about to allowed an access.
>
> This change increases the stack size with two Landlock layer masks
> backups that are needed to reset the collected access rights to the
> latest mount point.
>
> Because opened files have their access rights stored in the related file
> security properties, their is no impact for disconnected or unlinked
> files.
>
> A following commit will document handling of disconnected files and
> directories.
>
> Cc: Günther Noack <gnoack@google.com>
> Cc: Song Liu <song@kernel.org>
> Reported-by: Tingmao Wang <m@maowtm.org>
> Closes: https://lore.kernel.org/r/027d5190-b37a-40a8-84e9-4ccbc352bcdf@maowtm.org
> Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER")
> Fixes: cb2c7d1a1776 ("landlock: Support filesystem access-control")
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
>
> This replaces this patch:
> landlock: Remove warning in collect_domain_accesses()
> https://lore.kernel.org/r/20250618134734.1673254-1-mic@digikod.net
>
> I'll probably split this commit into two to ease backport (same for
> tests).
>
> This patch series applies on top of my next branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next
>
> TODO: Add documentation
>
> TODO: Add Landlock erratum
> ---
> fs/namei.c | 2 +-
> include/linux/fs.h | 1 +
> security/landlock/fs.c | 121 +++++++++++++++++++++++++++++++++++------
> 3 files changed, 105 insertions(+), 19 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 4bb889fc980b..7853a876fc1c 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -716,7 +716,7 @@ static bool nd_alloc_stack(struct nameidata *nd)
> * Rename can sometimes move a file or directory outside of a bind
> * mount, path_connected allows those cases to be detected.
> */
> -static bool path_connected(struct vfsmount *mnt, struct dentry *dentry)
> +bool path_connected(struct vfsmount *mnt, struct dentry *dentry)
> {
> struct super_block *sb = mnt->mnt_sb;
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 4ec77da65f14..3c0e324a9272 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3252,6 +3252,7 @@ extern struct file * open_exec(const char *);
> /* fs/dcache.c -- generic fs support functions */
> extern bool is_subdir(struct dentry *, struct dentry *);
> extern bool path_is_under(const struct path *, const struct path *);
> +extern bool path_connected(struct vfsmount *mnt, struct dentry *dentry);
>
> extern char *file_path(struct file *, char *, int);
>
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 1d6c4e728f92..51f03eb82069 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -768,7 +768,9 @@ static bool is_access_to_paths_allowed(
> struct path walker_path;
> access_mask_t access_masked_parent1, access_masked_parent2;
> layer_mask_t _layer_masks_child1[LANDLOCK_NUM_ACCESS_FS],
> - _layer_masks_child2[LANDLOCK_NUM_ACCESS_FS];
> + _layer_masks_child2[LANDLOCK_NUM_ACCESS_FS],
> + _layer_masks_parent1_bkp[LANDLOCK_NUM_ACCESS_FS],
> + _layer_masks_parent2_bkp[LANDLOCK_NUM_ACCESS_FS];
> layer_mask_t(*layer_masks_child1)[LANDLOCK_NUM_ACCESS_FS] = NULL,
> (*layer_masks_child2)[LANDLOCK_NUM_ACCESS_FS] = NULL;
>
> @@ -800,6 +802,8 @@ static bool is_access_to_paths_allowed(
> access_masked_parent1 = access_masked_parent2 =
> landlock_union_access_masks(domain).fs;
> is_dom_check = true;
> + memcpy(&_layer_masks_parent2_bkp, layer_masks_parent2,
> + sizeof(_layer_masks_parent2_bkp));
> } else {
> if (WARN_ON_ONCE(dentry_child1 || dentry_child2))
> return false;
> @@ -807,6 +811,8 @@ static bool is_access_to_paths_allowed(
> access_masked_parent1 = access_request_parent1;
> access_masked_parent2 = access_request_parent2;
> is_dom_check = false;
> + memcpy(&_layer_masks_parent1_bkp, layer_masks_parent1,
> + sizeof(_layer_masks_parent1_bkp));
> }
>
> if (unlikely(dentry_child1)) {
> @@ -858,6 +864,14 @@ static bool is_access_to_paths_allowed(
> child1_is_directory, layer_masks_parent2,
> layer_masks_child2,
> child2_is_directory))) {
> + /*
> + * Rewinds walk for disconnected directories before any other state
> + * change.
> + */
> + if (unlikely(!path_connected(walker_path.mnt,
> + walker_path.dentry)))
> + goto reset_to_mount_root;
> +
> /*
> * Now, downgrades the remaining checks from domain
> * handled accesses to requested accesses.
> @@ -893,14 +907,42 @@ static bool is_access_to_paths_allowed(
> ARRAY_SIZE(*layer_masks_parent2));
>
> /* Stops when a rule from each layer grants access. */
> - if (allowed_parent1 && allowed_parent2)
> + if (allowed_parent1 && allowed_parent2) {
> + /*
> + * Rewinds walk for disconnected directories before any other state
> + * change.
> + */
> + if (unlikely(!path_connected(walker_path.mnt,
> + walker_path.dentry)))
> + goto reset_to_mount_root;
> +
> break;
> + }
> +
> jump_up:
> if (walker_path.dentry == walker_path.mnt->mnt_root) {
> if (follow_up(&walker_path)) {
> + /* Saves known good values. */
> + memcpy(&_layer_masks_parent1_bkp,
> + layer_masks_parent1,
> + sizeof(_layer_masks_parent1_bkp));
> + if (layer_masks_parent2)
> + memcpy(&_layer_masks_parent2_bkp,
> + layer_masks_parent2,
> + sizeof(_layer_masks_parent2_bkp));
> +
> /* Ignores hidden mount points. */
> goto jump_up;
> } else {
> + /*
> + * Rewinds walk for disconnected directories before any other
> + * state change.
> + */
> + if (unlikely(!path_connected(
> + walker_path.mnt,
> + walker_path.dentry)))
> + goto reset_to_mount_root;
> +
This hunk is useless, I'll remove it.
> /*
> * Stops at the real root. Denies access
> * because not all layers have granted access.
> @@ -909,20 +951,51 @@ static bool is_access_to_paths_allowed(
> }
> }
> if (unlikely(IS_ROOT(walker_path.dentry))) {
> - /*
> - * Stops at disconnected root directories. Only allows
> - * access to internal filesystems (e.g. nsfs, which is
> - * reachable through /proc/<pid>/ns/<namespace>).
> - */
> if (walker_path.mnt->mnt_flags & MNT_INTERNAL) {
> + /*
> + * Stops and allows access when reaching disconnected root
> + * directories that are part of internal filesystems (e.g. nsfs,
> + * which is reachable through /proc/<pid>/ns/<namespace>).
> + */
> allowed_parent1 = true;
> allowed_parent2 = true;
> + break;
> + } else {
> + /*
> + * Ignores current walk in walker_path.mnt when reaching
> + * disconnected root directories from bind mounts. Reset the
> + * collected access rights to the latest mount point (or @path)
> + * we walked through, and start again from the current root of
> + * the mount point. The newly collected access rights will be
> + * less than or equal to those at open time.
> + */
> + goto reset_to_mount_root;
> }
> - break;
> }
> parent_dentry = dget_parent(walker_path.dentry);
> dput(walker_path.dentry);
> walker_path.dentry = parent_dentry;
> + continue;
> +
> +reset_to_mount_root:
> + /* Restores latest known good values. */
> + memcpy(layer_masks_parent1, &_layer_masks_parent1_bkp,
> + sizeof(_layer_masks_parent1_bkp));
> + if (layer_masks_parent2)
> + memcpy(layer_masks_parent2, &_layer_masks_parent2_bkp,
> + sizeof(_layer_masks_parent2_bkp));
> +
> + /*
> + * Ignores previous results. They will be computed again with the next
> + * iteration.
> + */
> + allowed_parent1 = false;
> + allowed_parent2 = false;
> +
> + /* Restarts with the current mount point. */
> + dput(walker_path.dentry);
> + walker_path.dentry = walker_path.mnt->mnt_root;
> + dget(walker_path.dentry);
> }
> path_put(&walker_path);
>
> @@ -1030,13 +1103,13 @@ static access_mask_t maybe_remove(const struct dentry *const dentry)
> */
> static bool collect_domain_accesses(
> const struct landlock_ruleset *const domain,
> - const struct dentry *const mnt_root, struct dentry *dir,
> + const struct path *const mnt_dir, struct dentry *dir,
> layer_mask_t (*const layer_masks_dom)[LANDLOCK_NUM_ACCESS_FS])
> {
> - unsigned long access_dom;
> + access_mask_t access_dom;
> bool ret = false;
>
> - if (WARN_ON_ONCE(!domain || !mnt_root || !dir || !layer_masks_dom))
> + if (WARN_ON_ONCE(!domain || !mnt_dir || !dir || !layer_masks_dom))
> return true;
> if (is_nouser_or_private(dir))
> return true;
> @@ -1053,6 +1126,10 @@ static bool collect_domain_accesses(
> if (landlock_unmask_layers(find_rule(domain, dir), access_dom,
> layer_masks_dom,
> ARRAY_SIZE(*layer_masks_dom))) {
> + /* Ignores this walk if we end up in a disconnected directory. */
> + if (unlikely(!path_connected(mnt_dir->mnt, dir)))
> + goto cancel_walk;
> +
> /*
> * Stops when all handled accesses are allowed by at
> * least one rule in each layer.
> @@ -1061,13 +1138,23 @@ static bool collect_domain_accesses(
> break;
> }
>
> - /* Stops at the mount point or disconnected root directories. */
> - if (dir == mnt_root || IS_ROOT(dir))
> + /* Stops at the mount point. */
> + if (dir == mnt_dir->dentry)
> break;
>
> + /* Ignores this walk if we end up in a disconnected root directory. */
> + if (unlikely(IS_ROOT(dir)))
> + goto cancel_walk;
> +
> parent_dentry = dget_parent(dir);
> dput(dir);
> dir = parent_dentry;
> + continue;
> +
> +cancel_walk:
> + landlock_init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS,
> + layer_masks_dom, LANDLOCK_KEY_INODE);
> + break;
> }
> dput(dir);
> return ret;
> @@ -1198,13 +1285,11 @@ static int current_check_refer_path(struct dentry *const old_dentry,
> old_dentry->d_parent;
>
> /* new_dir->dentry is equal to new_dentry->d_parent */
> - allow_parent1 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
> - old_parent,
> - &layer_masks_parent1);
> - allow_parent2 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
> + allow_parent1 = collect_domain_accesses(
> + subject->domain, &mnt_dir, old_parent, &layer_masks_parent1);
> + allow_parent2 = collect_domain_accesses(subject->domain, &mnt_dir,
> new_dir->dentry,
> &layer_masks_parent2);
> -
> if (allow_parent1 && allow_parent2)
> return 0;
>
> --
> 2.50.0
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-07-08 17:45 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-01 18:38 [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories Mickaël Salaün
2025-07-01 18:38 ` [RFC PATCH v1 2/2] selftests/landlock: Add layout4_disconnected test suite Mickaël Salaün
2025-07-07 0:22 ` Tingmao Wang
2025-07-07 10:30 ` [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories Christian Brauner
2025-07-08 17:36 ` Mickaël Salaün
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).