From: Christian Brauner <brauner@kernel.org>
To: linux-fsdevel@vger.kernel.org
Cc: Al Viro <viro@zeniv.linux.org.uk>,
    Miklos Szeredi <miklos@szeredi.hu>,
    Jeff Layton <jlayton@kernel.org>,
    Josef Bacik <josef@toxicpanda.com>,
    Seth Forshee <sforshee@kernel.org>,
    Christian Brauner <brauner@kernel.org>
Subject: [PATCH RFC 06/16] fs: create detached mounts from detached mounts
Date: Fri, 21 Feb 2025 14:13:05 +0100
Message-ID: <20250221-brauner-open_tree-v1-6-dbcfcb98c676@kernel.org>
In-Reply-To: <20250221-brauner-open_tree-v1-0-dbcfcb98c676@kernel.org>

Add the ability to create detached mounts from detached mounts.

Currently, detached mounts can only be created from attached mounts.
This limitation prevents various use-cases, for example mounting a
subdirectory without ever having to make the whole filesystem visible
first.
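
As a rough illustration of that use-case, the userspace sketch below
clones a subdirectory of an already detached mount. It assumes a kernel
with this series applied and a file descriptor referring to a detached
mount (for example one returned by fsmount() or by an earlier
open_tree(OPEN_TREE_CLONE) call). The helper name
clone_detached_subtree() is hypothetical, and the fallback syscall
number is the generic one, which may differ on some architectures.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older userspace headers; values follow the kernel UAPI. */
#ifndef OPEN_TREE_CLONE
#define OPEN_TREE_CLONE 1
#endif
#ifndef OPEN_TREE_CLOEXEC
#define OPEN_TREE_CLOEXEC O_CLOEXEC
#endif
#ifndef AT_RECURSIVE
#define AT_RECURSIVE 0x8000
#endif
#ifndef SYS_open_tree
#define SYS_open_tree 428 /* assumption: generic syscall number */
#endif

/*
 * Hypothetical helper: create a new detached copy of @subdir, resolved
 * relative to @detached_fd, without attaching anything to the caller's
 * mount namespace. Returns a new fd, or -1 with errno set.
 */
static int clone_detached_subtree(int detached_fd, const char *subdir)
{
	return (int)syscall(SYS_open_tree, detached_fd, subdir,
			    OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC | AT_RECURSIVE);
}
```

The returned fd could then be attached somewhere visible with
move_mount() and MOVE_MOUNT_F_EMPTY_PATH, so that only the subdirectory
ever shows up in the caller's mount namespace.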

The current permission model for the OPEN_TREE_CLONE flag of the
open_tree() system call is:

(1) Check that the caller is privileged over the owning user namespace
    of its current mount namespace.

(2) Check that the caller is located in the mount namespace of the mount
    it wants to create a detached copy of.

While it is not strictly necessary to do it this way, it is consistently
applied in the new mount api. This model will also be used when allowing
the creation of a detached mount from another detached mount.

The (1) requirement can simply be met by performing the same check as
for the non-detached case, i.e., verify that the caller is privileged
over its current mount namespace.

To meet the (2) requirement it must be possible to infer the origin
mount namespace that the anonymous mount namespace of the detached mount
was created from.

The origin mount namespace of an anonymous mount is the mount namespace
that the mounts that were copied into the anonymous mount namespace
originate from.

The origin mount namespace of the anonymous mount namespace must be the
same as the caller's mount namespace. To establish this, the sequence
number of the caller's mount namespace and the origin sequence number of
the anonymous mount namespace are compared.
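
The comparison can be modelled in plain userspace C. This is only a
sketch under stated assumptions: struct mnt_ns_model and the model_*
helpers are hypothetical stand-ins, and representing an anonymous mount
namespace by a sequence number of 0 is a modelling assumption rather
than something this message states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified userspace stand-in for the kernel's struct mnt_namespace. */
struct mnt_ns_model {
	uint64_t seq;        /* assumption: 0 models an anonymous namespace */
	uint64_t seq_origin; /* seq of the namespace the mounts were copied from */
};

static bool model_is_anon_ns(const struct mnt_ns_model *ns)
{
	return ns->seq == 0;
}

/* Same shape as check_anonymous_mnt() in the diff below. */
static bool model_check_anonymous(const struct mnt_ns_model *mnt_ns,
				  const struct mnt_ns_model *caller_ns)
{
	return model_is_anon_ns(mnt_ns) &&
	       mnt_ns->seq_origin == caller_ns->seq;
}
```

With a caller namespace of seq 7, an anonymous namespace with
seq_origin 7 passes the check, while one with seq_origin 9 or any
non-anonymous namespace does not.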

The caller is always located in a non-anonymous mount namespace since
anonymous mount namespaces cannot be setns()ed into. The caller's mount
namespace will thus always have a valid sequence number.

The owning namespace of any mount namespace, anonymous or non-anonymous,
can never change. A mount attached to a non-anonymous mount namespace
can never change mount namespace.

If the sequence number of the non-anonymous mount namespace and the
origin sequence number of the anonymous mount namespace match, the
owning namespaces must match as well.

Hence, the capability check on the owning namespace of the caller's
mount namespace ensures that the caller has the ability to copy the
mount tree.
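
Putting requirements (1) and (2) together, the overall decision might be
sketched as follows. Again this is a userspace model, not kernel code:
struct user_ns_model, struct ns_model and model_may_clone() are
hypothetical names, the privilege check is reduced to a boolean, and the
nsfs and pidfs special cases handled by may_copy_tree() are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: whether the caller is privileged over this userns. */
struct user_ns_model {
	bool caller_privileged;
};

struct ns_model {
	uint64_t seq;        /* assumption: 0 models an anonymous namespace */
	uint64_t seq_origin; /* seq of the origin mount namespace */
	const struct user_ns_model *owner; /* owning user namespace */
};

static bool model_may_clone(const struct ns_model *caller_ns,
			    const struct ns_model *mnt_ns)
{
	/* (1) Caller must be privileged over the owner of its mount ns. */
	if (!caller_ns->owner->caller_privileged)
		return false;

	/* (2) Attached mount: it must live in the caller's mount ns. */
	if (mnt_ns == caller_ns)
		return true;

	/* (2) Detached mount: its origin must be the caller's mount ns. */
	return mnt_ns->seq == 0 && mnt_ns->seq_origin == caller_ns->seq;
}
```

Only the caller's owning user namespace is consulted here, which mirrors
the argument above: if the sequence numbers match, the owning namespaces
match as well, so a separate check on the anonymous namespace's owner
adds nothing.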
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/namespace.c | 38 +++++++++++++++++++++++++++++++++++++-
1 file changed, 37 insertions(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index c61b9704499a..66b9cea1cf66 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -998,6 +998,12 @@ static inline int check_mnt(struct mount *mnt)
 	return mnt->mnt_ns == current->nsproxy->mnt_ns;
 }
 
+static inline bool check_anonymous_mnt(struct mount *mnt)
+{
+	return is_anon_ns(mnt->mnt_ns) &&
+	       mnt->mnt_ns->seq_origin == current->nsproxy->mnt_ns->seq;
+}
+
 /*
  * vfsmount lock must be held for write
  */
@@ -2822,6 +2828,32 @@ static int do_change_type(struct path *path, int ms_flags)
  * namespace, i.e., the caller is trying to copy a mount namespace
  * entry from nsfs.
  * (3) The caller tries to copy a pidfs mount referring to a pidfd.
+ * (4) The caller is trying to copy a mount tree that belongs to an
+ * anonymous mount namespace.
+ *
+ * For that to be safe, this helper enforces that the origin mount
+ * namespace the anonymous mount namespace was created from is the
+ * same as the caller's mount namespace by comparing the sequence
+ * numbers.
+ *
+ * This is not strictly necessary. The current semantics of the new
+ * mount api enforce that the caller must be located in the same
+ * mount namespace as the mount tree it interacts with. Using the
+ * origin sequence number preserves these semantics even for
+ * anonymous mount namespaces. However, one could envision extending
+ * the api to directly operate across mount namespace if needed.
+ *
+ * The ownership of a non-anonymous mount namespace such as the
+ * caller's cannot change.
+ * => We know that the caller's mount namespace is stable.
+ *
+ * If the origin sequence number of the anonymous mount namespace is
+ * the same as the sequence number of the caller's mount namespace.
+ * => The owning namespaces are the same.
+ *
+ * ==> The earlier capability check on the owning namespace of the
+ * caller's mount namespace ensures that the caller has the
+ * ability to copy the mount tree.
  *
  * Returns true if the mount tree can be copied, false otherwise.
  */
@@ -2840,9 +2872,13 @@ static inline bool may_copy_tree(struct path *path)
 	if (d_op == &pidfs_dentry_operations)
 		return true;
 
-	return false;
+	if (!is_mounted(path->mnt))
+		return false;
+
+	return check_anonymous_mnt(mnt);
 }
+
 
 static struct mount *__do_loopback(struct path *old_path, int recurse)
 {
 	struct mount *mnt = ERR_PTR(-EINVAL), *old = real_mount(old_path->mnt);
--
2.47.2

Thread overview:
2025-02-21 13:12 [PATCH RFC 00/16] fs: expand abilities of anonymous mount namespaces Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 01/16] fs: record sequence number of origin mount namespace Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 02/16] fs: add mnt_ns_empty() helper Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 03/16] fs: add assert for move_mount() Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 04/16] fs: add fastpath for dissolve_on_fput() Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 05/16] fs: add may_copy_tree() Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 06/16] fs: create detached mounts from detached mounts Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 07/16] selftests: create detached mounts from detached mounts Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 08/16] fs: support getname_maybe_null() in move_mount() Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 09/16] fs: mount detached mounts onto detached mounts Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 10/16] selftests: first test for mounting " Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 11/16] selftests: second " Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 12/16] selftests: third " Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 13/16] selftests: fourth " Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 14/16] selftests: fifth " Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 15/16] selftests: sixth " Christian Brauner
2025-02-21 13:13 ` [PATCH RFC 16/16] selftests: seventh " Christian Brauner