From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp-bc0b.mail.infomaniak.ch (smtp-bc0b.mail.infomaniak.ch [45.157.188.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D62F42E6D2A for ; Tue, 8 Jul 2025 17:37:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.157.188.11 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751996234; cv=none; b=ZWyhxsLqk+HhaWBIqSk05RMbDNWWpGao3kmwqYQ21IfsckNnRgTuJvGyzX7rXmEMXBKl9XBNq63BlQIMvQY386bVBpzs44LmzoAmxnDFcmS0GA4/R2Se18HwrV5yeuU4Ixrra7YyOarY4B6Ljtuq6TSmYLQ+RKRM5MrtBKlxxds= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751996234; c=relaxed/simple; bh=lceaWOs9hk3V4yzJb6xjhKOfDkznCmQ7z9nUp9joaAE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=ozsGADnSKHq78lmZXeE7XcTqZTTCzJOcQY6SUzF4J66Abq0iyFqhFtJaBS07Xe9soIEQyHW6+9cdj3ZEMIWuIrFAV1tRwhQF0Q495AUNIA4e2ByOqb6Sj7j2OkCmKzxIeFWFv5j3BFB8WoFsfXWLDeL73ukaVzxBWj9OLK/CO3E= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=digikod.net; spf=pass smtp.mailfrom=digikod.net; dkim=pass (1024-bit key) header.d=digikod.net header.i=@digikod.net header.b=FZj9966U; arc=none smtp.client-ip=45.157.188.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=digikod.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=digikod.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=digikod.net header.i=@digikod.net header.b="FZj9966U" Received: from smtp-4-0000.mail.infomaniak.ch (unknown [IPv6:2001:1600:7:10::a6b]) by smtp-4-3000.mail.infomaniak.ch (Postfix) with ESMTPS id 4bc7Zz5DhgzlQR; Tue, 8 Jul 2025 19:36:59 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=digikod.net; s=20191114; t=1751996219; bh=UYan0adPDKbkAsXZN2hQ3hR3txzw6/bynQO0IGUmMB4=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=FZj9966UvtDyAfty6BcEy5huvNCnsP7kqbEePsUn+ZFx1Nn3k43+nT2Z5WloVB47L 45U43D820NDUvWVGHQdc2WBllO+TdmOP5GZ3/SA+XajxCXFsCxs+f7lHvrfQgfsH5I QIUnrV5mMQF0vr2LxdeSWlVJfCVjfOggsneR3OsI= Received: from unknown by smtp-4-0000.mail.infomaniak.ch (Postfix) with ESMTPA id 4bc7Zy5NMcz9lh; Tue, 8 Jul 2025 19:36:58 +0200 (CEST) Date: Tue, 8 Jul 2025 19:36:57 +0200 From: =?utf-8?Q?Micka=C3=ABl_Sala=C3=BCn?= To: =?utf-8?Q?G=C3=BCnther?= Noack Cc: linux-security-module@vger.kernel.org, linux-fsdevel@vger.kernel.org, NeilBrown , Christian Brauner , Al Viro , Jeff Xu , Ben Scarlato , Paul Moore , Daniel Burgener , Song Liu , Tingmao Wang , Jann Horn Subject: Re: [RFC PATCH v1 1/2] landlock: Fix handling of disconnected directories Message-ID: <20250708.puW1Kegh9voo@digikod.net> References: <20250701183812.3201231-1-mic@digikod.net> Precedence: bulk X-Mailing-List: linux-security-module@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20250701183812.3201231-1-mic@digikod.net> X-Infomaniak-Routing: alpha On Tue, Jul 01, 2025 at 08:38:07PM +0200, Mickaël Salaün wrote: > We can get disconnected files or directories when they are visible and > opened from a bind mount, before being renamed/moved from the source of > the bind mount in a way that makes them inaccessible from the mount > point (i.e. out of scope). > > Until now, access rights tied to files or directories opened through a > disconnected directory were collected by walking the related hierarchy > down to the root of this filesystem because the mount point couldn't be > found. This could lead to inconsistent access results, and > hard-to-debug renames, especially because such paths cannot be printed. > > For a sandboxed task to create a disconnected directory, it needs to > have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to > the underlying source of the bind mount, and read access to the related > mount point. Because a sandboxed task cannot get more access than those > defined by its Landlock domain, this could only lead to inconsistent > access rights because of missing those that should be inherited from the > mount point hierarchy and inheriting from the hierarchy of the mounted > filesystem instead. > > Landlock now handles files/directories opened from disconnected > directories like the mount point these disconnected directories were > opened from. This gives the guarantee that access rights on a > file/directory cannot be more than those at open time. The rationale is > that disconnected hierarchies might not be visible nor accessible to a > sandboxed task, and relying on the collected access rights from them > could introduce unexpected results, especially for rename actions > because of the access right comparison between the source and the > destination (see LANDLOCK_ACCESS_FS_REFER). This new behavior is much > less surprising to users and safer from an access point of view. > > Unlike follow_dotdot(), we don't need to check for each directory if it > is part of the mount's root, but instead this is only checked when we > reached a root dentry (not a mount point), or when the access > request is about to be allowed. This limits the number of calls to > is_subdir() which walks down the hierarchy (again). This also avoids > checking path connection at the beginning of the walk for each mount > point, which would be racy. > > Make path_connected() public to stay consistent with the VFS. This > helper is used when we are about to allowed an access. > > This change increases the stack size with two Landlock layer masks > backups that are needed to reset the collected access rights to the > latest mount point. > > Because opened files have their access rights stored in the related file > security properties, their is no impact for disconnected or unlinked > files. > > A following commit will document handling of disconnected files and > directories. > > Cc: Günther Noack > Cc: Song Liu > Reported-by: Tingmao Wang > Closes: https://lore.kernel.org/r/027d5190-b37a-40a8-84e9-4ccbc352bcdf@maowtm.org > Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") > Fixes: cb2c7d1a1776 ("landlock: Support filesystem access-control") > Signed-off-by: Mickaël Salaün > --- > > This replaces this patch: > landlock: Remove warning in collect_domain_accesses() > https://lore.kernel.org/r/20250618134734.1673254-1-mic@digikod.net > > I'll probably split this commit into two to ease backport (same for > tests). > > This patch series applies on top of my next branch: > https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next > > TODO: Add documentation > > TODO: Add Landlock erratum > --- > fs/namei.c | 2 +- > include/linux/fs.h | 1 + > security/landlock/fs.c | 121 +++++++++++++++++++++++++++++++++++------ > 3 files changed, 105 insertions(+), 19 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index 4bb889fc980b..7853a876fc1c 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -716,7 +716,7 @@ static bool nd_alloc_stack(struct nameidata *nd) > * Rename can sometimes move a file or directory outside of a bind > * mount, path_connected allows those cases to be detected. > */ > -static bool path_connected(struct vfsmount *mnt, struct dentry *dentry) > +bool path_connected(struct vfsmount *mnt, struct dentry *dentry) > { > struct super_block *sb = mnt->mnt_sb; > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 4ec77da65f14..3c0e324a9272 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -3252,6 +3252,7 @@ extern struct file * open_exec(const char *); > /* fs/dcache.c -- generic fs support functions */ > extern bool is_subdir(struct dentry *, struct dentry *); > extern bool path_is_under(const struct path *, const struct path *); > +extern bool path_connected(struct vfsmount *mnt, struct dentry *dentry); > > extern char *file_path(struct file *, char *, int); > > diff --git a/security/landlock/fs.c b/security/landlock/fs.c > index 1d6c4e728f92..51f03eb82069 100644 > --- a/security/landlock/fs.c > +++ b/security/landlock/fs.c > @@ -768,7 +768,9 @@ static bool is_access_to_paths_allowed( > struct path walker_path; > access_mask_t access_masked_parent1, access_masked_parent2; > layer_mask_t _layer_masks_child1[LANDLOCK_NUM_ACCESS_FS], > - _layer_masks_child2[LANDLOCK_NUM_ACCESS_FS]; > + _layer_masks_child2[LANDLOCK_NUM_ACCESS_FS], > + _layer_masks_parent1_bkp[LANDLOCK_NUM_ACCESS_FS], > + _layer_masks_parent2_bkp[LANDLOCK_NUM_ACCESS_FS]; > layer_mask_t(*layer_masks_child1)[LANDLOCK_NUM_ACCESS_FS] = NULL, > (*layer_masks_child2)[LANDLOCK_NUM_ACCESS_FS] = NULL; > > @@ -800,6 +802,8 @@ static bool is_access_to_paths_allowed( > access_masked_parent1 = access_masked_parent2 = > landlock_union_access_masks(domain).fs; > is_dom_check = true; > + memcpy(&_layer_masks_parent2_bkp, layer_masks_parent2, > + sizeof(_layer_masks_parent2_bkp)); > } else { > if (WARN_ON_ONCE(dentry_child1 || dentry_child2)) > return false; > @@ -807,6 +811,8 @@ static bool is_access_to_paths_allowed( > access_masked_parent1 = access_request_parent1; > access_masked_parent2 = access_request_parent2; > is_dom_check = false; > + memcpy(&_layer_masks_parent1_bkp, layer_masks_parent1, > + sizeof(_layer_masks_parent1_bkp)); > } > > if (unlikely(dentry_child1)) { > @@ -858,6 +864,14 @@ static bool is_access_to_paths_allowed( > child1_is_directory, layer_masks_parent2, > layer_masks_child2, > child2_is_directory))) { > + /* > + * Rewinds walk for disconnected directories before any other state > + * change. > + */ > + if (unlikely(!path_connected(walker_path.mnt, > + walker_path.dentry))) > + goto reset_to_mount_root; > + > /* > * Now, downgrades the remaining checks from domain > * handled accesses to requested accesses. > @@ -893,14 +907,42 @@ static bool is_access_to_paths_allowed( > ARRAY_SIZE(*layer_masks_parent2)); > > /* Stops when a rule from each layer grants access. */ > - if (allowed_parent1 && allowed_parent2) > + if (allowed_parent1 && allowed_parent2) { > + /* > + * Rewinds walk for disconnected directories before any other state > + * change. > + */ > + if (unlikely(!path_connected(walker_path.mnt, > + walker_path.dentry))) > + goto reset_to_mount_root; > + > break; > + } > + > jump_up: > if (walker_path.dentry == walker_path.mnt->mnt_root) { > if (follow_up(&walker_path)) { > + /* Saves known good values. */ > + memcpy(&_layer_masks_parent1_bkp, > + layer_masks_parent1, > + sizeof(_layer_masks_parent1_bkp)); > + if (layer_masks_parent2) > + memcpy(&_layer_masks_parent2_bkp, > + layer_masks_parent2, > + sizeof(_layer_masks_parent2_bkp)); > + > /* Ignores hidden mount points. */ > goto jump_up; > } else { > + /* > + * Rewinds walk for disconnected directories before any other > + * state change. > + */ > + if (unlikely(!path_connected( > + walker_path.mnt, > + walker_path.dentry))) > + goto reset_to_mount_root; > + This hunk is useless, I'll remove it. > /* > * Stops at the real root. Denies access > * because not all layers have granted access. > @@ -909,20 +951,51 @@ static bool is_access_to_paths_allowed( > } > } > if (unlikely(IS_ROOT(walker_path.dentry))) { > - /* > - * Stops at disconnected root directories. Only allows > - * access to internal filesystems (e.g. nsfs, which is > - * reachable through /proc//ns/). > - */ > if (walker_path.mnt->mnt_flags & MNT_INTERNAL) { > + /* > + * Stops and allows access when reaching disconnected root > + * directories that are part of internal filesystems (e.g. nsfs, > + * which is reachable through /proc//ns/). > + */ > allowed_parent1 = true; > allowed_parent2 = true; > + break; > + } else { > + /* > + * Ignores current walk in walker_path.mnt when reaching > + * disconnected root directories from bind mounts. Reset the > + * collected access rights to the latest mount point (or @path) > + * we walked through, and start again from the current root of > + * the mount point. The newly collected access rights will be > + * less than or equal to those at open time. > + */ > + goto reset_to_mount_root; > } > - break; > } > parent_dentry = dget_parent(walker_path.dentry); > dput(walker_path.dentry); > walker_path.dentry = parent_dentry; > + continue; > + > +reset_to_mount_root: > + /* Restores latest known good values. */ > + memcpy(layer_masks_parent1, &_layer_masks_parent1_bkp, > + sizeof(_layer_masks_parent1_bkp)); > + if (layer_masks_parent2) > + memcpy(layer_masks_parent2, &_layer_masks_parent2_bkp, > + sizeof(_layer_masks_parent2_bkp)); > + > + /* > + * Ignores previous results. They will be computed again with the next > + * iteration. > + */ > + allowed_parent1 = false; > + allowed_parent2 = false; > + > + /* Restarts with the current mount point. */ > + dput(walker_path.dentry); > + walker_path.dentry = walker_path.mnt->mnt_root; > + dget(walker_path.dentry); > } > path_put(&walker_path); > > @@ -1030,13 +1103,13 @@ static access_mask_t maybe_remove(const struct dentry *const dentry) > */ > static bool collect_domain_accesses( > const struct landlock_ruleset *const domain, > - const struct dentry *const mnt_root, struct dentry *dir, > + const struct path *const mnt_dir, struct dentry *dir, > layer_mask_t (*const layer_masks_dom)[LANDLOCK_NUM_ACCESS_FS]) > { > - unsigned long access_dom; > + access_mask_t access_dom; > bool ret = false; > > - if (WARN_ON_ONCE(!domain || !mnt_root || !dir || !layer_masks_dom)) > + if (WARN_ON_ONCE(!domain || !mnt_dir || !dir || !layer_masks_dom)) > return true; > if (is_nouser_or_private(dir)) > return true; > @@ -1053,6 +1126,10 @@ static bool collect_domain_accesses( > if (landlock_unmask_layers(find_rule(domain, dir), access_dom, > layer_masks_dom, > ARRAY_SIZE(*layer_masks_dom))) { > + /* Ignores this walk if we end up in a disconnected directory. */ > + if (unlikely(!path_connected(mnt_dir->mnt, dir))) > + goto cancel_walk; > + > /* > * Stops when all handled accesses are allowed by at > * least one rule in each layer. > @@ -1061,13 +1138,23 @@ static bool collect_domain_accesses( > break; > } > > - /* Stops at the mount point or disconnected root directories. */ > - if (dir == mnt_root || IS_ROOT(dir)) > + /* Stops at the mount point. */ > + if (dir == mnt_dir->dentry) > break; > > + /* Ignores this walk if we end up in a disconnected root directory. */ > + if (unlikely(IS_ROOT(dir))) > + goto cancel_walk; > + > parent_dentry = dget_parent(dir); > dput(dir); > dir = parent_dentry; > + continue; > + > +cancel_walk: > + landlock_init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS, > + layer_masks_dom, LANDLOCK_KEY_INODE); > + break; > } > dput(dir); > return ret; > @@ -1198,13 +1285,11 @@ static int current_check_refer_path(struct dentry *const old_dentry, > old_dentry->d_parent; > > /* new_dir->dentry is equal to new_dentry->d_parent */ > - allow_parent1 = collect_domain_accesses(subject->domain, mnt_dir.dentry, > - old_parent, > - &layer_masks_parent1); > - allow_parent2 = collect_domain_accesses(subject->domain, mnt_dir.dentry, > + allow_parent1 = collect_domain_accesses( > + subject->domain, &mnt_dir, old_parent, &layer_masks_parent1); > + allow_parent2 = collect_domain_accesses(subject->domain, &mnt_dir, > new_dir->dentry, > &layer_masks_parent2); > - > if (allow_parent1 && allow_parent2) > return 0; > > -- > 2.50.0 > >