* [PATCH 0/3] fs: add immutable rootfs and support pivot_root() in the initramfs
@ 2026-01-02 14:36 Christian Brauner
2026-01-02 14:36 ` [PATCH 1/3] fs: ensure that internal tmpfs mount gets mount id zero Christian Brauner
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Christian Brauner @ 2026-01-02 14:36 UTC (permalink / raw)
To: linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik,
Christian Brauner, stable
Currently pivot_root() doesn't work on the real rootfs because it
cannot be unmounted. Userspace has to do a recursive removal of the
initramfs contents manually before continuing the boot.
Really all we want from the real rootfs is to serve as the parent mount
for anything that is actually useful such as the tmpfs or ramfs for
initramfs unpacking or the rootfs itself. There's no need for the real
rootfs to actually be anything meaningful or useful. Add an immutable
rootfs that can be selected via the "immutable_rootfs" kernel command
line option.
The kernel will mount a tmpfs/ramfs on top of it, unpack the initramfs
and fire up userspace which mounts the rootfs and can then just do:
chdir(rootfs);
pivot_root(".", ".");
umount2(".", MNT_DETACH);
and be done with it. (Ofc, userspace can also choose to retain the
initramfs contents by using something like pivot_root(".", "/initramfs")
without unmounting it.)
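For illustration, the three calls above could be wrapped in a small userspace helper. This is a minimal sketch and not code from this series: switch_to_rootfs() is a hypothetical name, and the raw syscall is used on the assumption that the C library provides no pivot_root() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of this series): switch into the new root
 * mounted at @rootfs and detach the old root.
 * Returns 0 on success, -1 with errno set on failure.
 */
int switch_to_rootfs(const char *rootfs)
{
	if (chdir(rootfs) < 0)
		return -1;

	/* Stack the old root on top of the new root at the same location. */
	if (syscall(SYS_pivot_root, ".", ".") < 0)
		return -1;

	/*
	 * Resolving "." walks up the mounts stacked at "/", so this lazily
	 * detaches the old root that was stacked on top.
	 */
	if (umount2(".", MNT_DETACH) < 0)
		return -1;

	return 0;
}
```

A caller that wants to keep the initramfs contents would skip the umount2() step and pass a real directory as the second pivot_root() argument instead.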
Technically this also means that the rootfs mount in unprivileged
namespaces doesn't need to become MNT_LOCKED anymore as it's guaranteed
that the immutable rootfs remains permanently empty so there cannot be
anything revealed by unmounting the covering mount.
In the future this will also allow us to create completely empty mount
namespaces without the risk of leaking anything.
systemd already handles this all correctly as it tries to pivot_root()
first and falls back to MS_MOVE only when that fails.
This goes back to various discussions in previous years and an LPC 2024
presentation about this very topic.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Christian Brauner (3):
fs: ensure that internal tmpfs mount gets mount id zero
fs: add init_pivot_root()
fs: add immutable rootfs
fs/Makefile | 2 +-
fs/init.c | 17 ++++
fs/internal.h | 1 +
fs/mount.h | 1 +
fs/namespace.c | 181 +++++++++++++++++++++++++++++-------------
fs/rootfs.c | 65 +++++++++++++++
include/linux/init_syscalls.h | 1 +
include/uapi/linux/magic.h | 1 +
init/do_mounts.c | 13 ++-
init/do_mounts.h | 1 +
10 files changed, 223 insertions(+), 60 deletions(-)
---
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
change-id: 20260102-work-immutable-rootfs-b5f23e0f5a27
* [PATCH 1/3] fs: ensure that internal tmpfs mount gets mount id zero
2026-01-02 14:36 [PATCH 0/3] fs: add immutable rootfs and support pivot_root() in the initramfs Christian Brauner
@ 2026-01-02 14:36 ` Christian Brauner
2026-01-02 14:36 ` [PATCH 2/3] fs: add init_pivot_root() Christian Brauner
2026-01-02 14:36 ` [PATCH 3/3] fs: add immutable rootfs Christian Brauner
2 siblings, 0 replies; 16+ messages in thread
From: Christian Brauner @ 2026-01-02 14:36 UTC (permalink / raw)
To: linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik,
Christian Brauner, stable
Ensure that the internal tmpfs mount gets mount id zero and the rootfs
gets mount id one, as it always has. Before we actually mount the rootfs
we create an internal tmpfs mount which has mount id zero but is never
exposed anywhere. Continue that "tradition".
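The effect of the lower bound can be seen in a toy userspace model. The sketch below is not the kernel xarray API; toy_id_space and toy_alloc_id() are hypothetical names, and the only assumption carried over is that the allocator hands out the lowest free id in the requested range.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_IDS 8

/* Toy stand-in for an allocating xarray: used[i] is true if id i is taken. */
struct toy_id_space {
	bool used[TOY_MAX_IDS];
};

/* Return the lowest free id in [min, max], or -1 if none is available. */
int toy_alloc_id(struct toy_id_space *ids, int min, int max)
{
	if (min < 0)
		return -1;

	for (int id = min; id <= max && id < TOY_MAX_IDS; id++) {
		if (!ids->used[id]) {
			ids->used[id] = true;
			return id;
		}
	}
	return -1;
}
```

With a lower bound of 1 the first two allocations (internal tmpfs mount, then rootfs) receive ids 1 and 2; with a lower bound of 0 they receive 0 and 1, which is the numbering the patch restores.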
Fixes: 7f9bfafc5f49 ("fs: use xarray for old mount id")
Cc: <stable@vger.kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/namespace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index c58674a20cad..8b082b1de7f3 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -221,7 +221,7 @@ static int mnt_alloc_id(struct mount *mnt)
int res;
xa_lock(&mnt_id_xa);
- res = __xa_alloc(&mnt_id_xa, &mnt->mnt_id, mnt, XA_LIMIT(1, INT_MAX), GFP_KERNEL);
+ res = __xa_alloc(&mnt_id_xa, &mnt->mnt_id, mnt, xa_limit_31b, GFP_KERNEL);
if (!res)
mnt->mnt_id_unique = ++mnt_id_ctr;
xa_unlock(&mnt_id_xa);
--
2.47.3
* [PATCH 2/3] fs: add init_pivot_root()
2026-01-02 14:36 [PATCH 0/3] fs: add immutable rootfs and support pivot_root() in the initramfs Christian Brauner
2026-01-02 14:36 ` [PATCH 1/3] fs: ensure that internal tmpfs mount gets mount id zero Christian Brauner
@ 2026-01-02 14:36 ` Christian Brauner
2026-01-02 14:36 ` [PATCH 3/3] fs: add immutable rootfs Christian Brauner
2 siblings, 0 replies; 16+ messages in thread
From: Christian Brauner @ 2026-01-02 14:36 UTC (permalink / raw)
To: linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik,
Christian Brauner
We will soon be able to pivot_root() with the introduction of the
immutable rootfs. Add a wrapper for kernel internal usage.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/init.c | 17 +++++++
fs/internal.h | 1 +
fs/namespace.c | 101 ++++++++++++++++++++++--------------------
include/linux/init_syscalls.h | 1 +
4 files changed, 73 insertions(+), 47 deletions(-)
diff --git a/fs/init.c b/fs/init.c
index e0f5429c0a49..e33b2690d851 100644
--- a/fs/init.c
+++ b/fs/init.c
@@ -13,6 +13,23 @@
#include <linux/security.h>
#include "internal.h"
+int __init init_pivot_root(const char *new_root, const char *put_old)
+{
+ struct path new_path __free(path_put) = {};
+ struct path old_path __free(path_put) = {};
+ int ret;
+
+ ret = kern_path(new_root, LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &new_path);
+ if (ret)
+ return ret;
+
+ ret = kern_path(put_old, LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &old_path);
+ if (ret)
+ return ret;
+
+ return path_pivot_root(&new_path, &old_path);
+}
+
int __init init_mount(const char *dev_name, const char *dir_name,
const char *type_page, unsigned long flags, void *data_page)
{
diff --git a/fs/internal.h b/fs/internal.h
index ab638d41ab81..4b27a4b0fdef 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -90,6 +90,7 @@ extern bool may_mount(void);
int path_mount(const char *dev_name, const struct path *path,
const char *type_page, unsigned long flags, void *data_page);
int path_umount(const struct path *path, int flags);
+int path_pivot_root(struct path *new, struct path *old);
int show_path(struct seq_file *m, struct dentry *root);
diff --git a/fs/namespace.c b/fs/namespace.c
index 8b082b1de7f3..9261f56ccc81 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4498,36 +4498,8 @@ bool path_is_under(const struct path *path1, const struct path *path2)
}
EXPORT_SYMBOL(path_is_under);
-/*
- * pivot_root Semantics:
- * Moves the root file system of the current process to the directory put_old,
- * makes new_root as the new root file system of the current process, and sets
- * root/cwd of all processes which had them on the current root to new_root.
- *
- * Restrictions:
- * The new_root and put_old must be directories, and must not be on the
- * same file system as the current process root. The put_old must be
- * underneath new_root, i.e. adding a non-zero number of /.. to the string
- * pointed to by put_old must yield the same directory as new_root. No other
- * file system may be mounted on put_old. After all, new_root is a mountpoint.
- *
- * Also, the current root cannot be on the 'rootfs' (initial ramfs) filesystem.
- * See Documentation/filesystems/ramfs-rootfs-initramfs.rst for alternatives
- * in this situation.
- *
- * Notes:
- * - we don't move root/cwd if they are not at the root (reason: if something
- * cared enough to change them, it's probably wrong to force them elsewhere)
- * - it's okay to pick a root that isn't the root of a file system, e.g.
- * /nfs/my_root where /nfs is the mount point. It must be a mountpoint,
- * though, so you may need to say mount --bind /nfs/my_root /nfs/my_root
- * first.
- */
-SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
- const char __user *, put_old)
+int path_pivot_root(struct path *new, struct path *old)
{
- struct path new __free(path_put) = {};
- struct path old __free(path_put) = {};
struct path root __free(path_put) = {};
struct mount *new_mnt, *root_mnt, *old_mnt, *root_parent, *ex_parent;
int error;
@@ -4535,28 +4507,18 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
if (!may_mount())
return -EPERM;
- error = user_path_at(AT_FDCWD, new_root,
- LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &new);
- if (error)
- return error;
-
- error = user_path_at(AT_FDCWD, put_old,
- LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &old);
- if (error)
- return error;
-
- error = security_sb_pivotroot(&old, &new);
+ error = security_sb_pivotroot(old, new);
if (error)
return error;
get_fs_root(current->fs, &root);
- LOCK_MOUNT(old_mp, &old);
+ LOCK_MOUNT(old_mp, old);
old_mnt = old_mp.parent;
if (IS_ERR(old_mnt))
return PTR_ERR(old_mnt);
- new_mnt = real_mount(new.mnt);
+ new_mnt = real_mount(new->mnt);
root_mnt = real_mount(root.mnt);
ex_parent = new_mnt->mnt_parent;
root_parent = root_mnt->mnt_parent;
@@ -4568,7 +4530,7 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
return -EINVAL;
if (new_mnt->mnt.mnt_flags & MNT_LOCKED)
return -EINVAL;
- if (d_unlinked(new.dentry))
+ if (d_unlinked(new->dentry))
return -ENOENT;
if (new_mnt == root_mnt || old_mnt == root_mnt)
return -EBUSY; /* loop, on the same file system */
@@ -4576,15 +4538,15 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
return -EINVAL; /* not a mountpoint */
if (!mnt_has_parent(root_mnt))
return -EINVAL; /* absolute root */
- if (!path_mounted(&new))
+ if (!path_mounted(new))
return -EINVAL; /* not a mountpoint */
if (!mnt_has_parent(new_mnt))
return -EINVAL; /* absolute root */
/* make sure we can reach put_old from new_root */
- if (!is_path_reachable(old_mnt, old_mp.mp->m_dentry, &new))
+ if (!is_path_reachable(old_mnt, old_mp.mp->m_dentry, new))
return -EINVAL;
/* make certain new is below the root */
- if (!is_path_reachable(new_mnt, new.dentry, &root))
+ if (!is_path_reachable(new_mnt, new->dentry, &root))
return -EINVAL;
lock_mount_hash();
umount_mnt(new_mnt);
@@ -4603,10 +4565,55 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
unlock_mount_hash();
mnt_notify_add(root_mnt);
mnt_notify_add(new_mnt);
- chroot_fs_refs(&root, &new);
+ chroot_fs_refs(&root, new);
return 0;
}
+/*
+ * pivot_root Semantics:
+ * Moves the root file system of the current process to the directory put_old,
+ * makes new_root as the new root file system of the current process, and sets
+ * root/cwd of all processes which had them on the current root to new_root.
+ *
+ * Restrictions:
+ * The new_root and put_old must be directories, and must not be on the
+ * same file system as the current process root. The put_old must be
+ * underneath new_root, i.e. adding a non-zero number of /.. to the string
+ * pointed to by put_old must yield the same directory as new_root. No other
+ * file system may be mounted on put_old. After all, new_root is a mountpoint.
+ *
+ * Also, the current root cannot be on the 'rootfs' (initial ramfs) filesystem.
+ * See Documentation/filesystems/ramfs-rootfs-initramfs.rst for alternatives
+ * in this situation.
+ *
+ * Notes:
+ * - we don't move root/cwd if they are not at the root (reason: if something
+ * cared enough to change them, it's probably wrong to force them elsewhere)
+ * - it's okay to pick a root that isn't the root of a file system, e.g.
+ * /nfs/my_root where /nfs is the mount point. It must be a mountpoint,
+ * though, so you may need to say mount --bind /nfs/my_root /nfs/my_root
+ * first.
+ */
+SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
+ const char __user *, put_old)
+{
+ struct path new __free(path_put) = {};
+ struct path old __free(path_put) = {};
+ int error;
+
+ error = user_path_at(AT_FDCWD, new_root,
+ LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &new);
+ if (error)
+ return error;
+
+ error = user_path_at(AT_FDCWD, put_old,
+ LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &old);
+ if (error)
+ return error;
+
+ return path_pivot_root(&new, &old);
+}
+
static unsigned int recalc_flags(struct mount_kattr *kattr, struct mount *mnt)
{
unsigned int flags = mnt->mnt.mnt_flags;
diff --git a/include/linux/init_syscalls.h b/include/linux/init_syscalls.h
index 92045d18cbfc..28776ee28d8e 100644
--- a/include/linux/init_syscalls.h
+++ b/include/linux/init_syscalls.h
@@ -17,3 +17,4 @@ int __init init_mkdir(const char *pathname, umode_t mode);
int __init init_rmdir(const char *pathname);
int __init init_utimes(char *filename, struct timespec64 *ts);
int __init init_dup(struct file *file);
+int __init init_pivot_root(const char *new_root, const char *put_old);
--
2.47.3
* [PATCH 3/3] fs: add immutable rootfs
2026-01-02 14:36 [PATCH 0/3] fs: add immutable rootfs and support pivot_root() in the initramfs Christian Brauner
2026-01-02 14:36 ` [PATCH 1/3] fs: ensure that internal tmpfs mount gets mount id zero Christian Brauner
2026-01-02 14:36 ` [PATCH 2/3] fs: add init_pivot_root() Christian Brauner
@ 2026-01-02 14:36 ` Christian Brauner
2026-01-04 7:27 ` Al Viro
2026-01-07 2:28 ` Gao Xiang
2 siblings, 2 replies; 16+ messages in thread
From: Christian Brauner @ 2026-01-02 14:36 UTC (permalink / raw)
To: linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik,
Christian Brauner
Currently pivot_root() doesn't work on the real rootfs because it
cannot be unmounted. Userspace has to do a recursive removal of the
initramfs contents manually before continuing the boot.
Really all we want from the real rootfs is to serve as the parent mount
for anything that is actually useful such as the tmpfs or ramfs for
initramfs unpacking or the rootfs itself. There's no need for the real
rootfs to actually be anything meaningful or useful. Add an immutable
rootfs that can be selected via the "immutable_rootfs" kernel command
line option.
The kernel will mount a tmpfs/ramfs on top of it, unpack the initramfs
and fire up userspace which mounts the rootfs and can then just do:
chdir(rootfs);
pivot_root(".", ".");
umount2(".", MNT_DETACH);
and be done with it. (Ofc, userspace can also choose to retain the
initramfs contents by using something like pivot_root(".", "/initramfs")
without unmounting it.)
Technically this also means that the rootfs mount in unprivileged
namespaces doesn't need to become MNT_LOCKED anymore as it's guaranteed
that the immutable rootfs remains permanently empty so there cannot be
anything revealed by unmounting the covering mount.
In the future this will also allow us to create completely empty mount
namespaces without the risk of leaking anything.
systemd already handles this all correctly as it tries to pivot_root()
first and falls back to MS_MOVE only when that fails.
This goes back to various discussions in previous years and an LPC 2024
presentation about this very topic.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/Makefile | 2 +-
fs/mount.h | 1 +
fs/namespace.c | 78 ++++++++++++++++++++++++++++++++++++++++------
fs/rootfs.c | 65 ++++++++++++++++++++++++++++++++++++++
include/uapi/linux/magic.h | 1 +
init/do_mounts.c | 13 ++++++--
init/do_mounts.h | 1 +
7 files changed, 149 insertions(+), 12 deletions(-)
diff --git a/fs/Makefile b/fs/Makefile
index a04274a3c854..d31b56b7c4d5 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -16,7 +16,7 @@ obj-y := open.o read_write.o file_table.o super.o \
stack.o fs_struct.o statfs.o fs_pin.o nsfs.o \
fs_dirent.o fs_context.o fs_parser.o fsopen.o init.o \
kernel_read_file.o mnt_idmapping.o remap_range.o pidfs.o \
- file_attr.o
+ file_attr.o rootfs.o
obj-$(CONFIG_BUFFER_HEAD) += buffer.o mpage.o
obj-$(CONFIG_PROC_FS) += proc_namespace.o
diff --git a/fs/mount.h b/fs/mount.h
index 2d28ef2a3aed..c3e0d9dbfaa4 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -5,6 +5,7 @@
#include <linux/ns_common.h>
#include <linux/fs_pin.h>
+extern struct file_system_type immutable_rootfs_fs_type;
extern struct list_head notify_list;
struct mnt_namespace {
diff --git a/fs/namespace.c b/fs/namespace.c
index 9261f56ccc81..30597f4610fd 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -75,6 +75,17 @@ static int __init initramfs_options_setup(char *str)
__setup("initramfs_options=", initramfs_options_setup);
+bool immutable_rootfs = false;
+
+static int __init immutable_rootfs_setup(char *str)
+{
+ if (*str)
+ return 0;
+ immutable_rootfs = true;
+ return 1;
+}
+__setup("immutable_rootfs", immutable_rootfs_setup);
+
static u64 event;
static DEFINE_XARRAY_FLAGS(mnt_id_xa, XA_FLAGS_ALLOC);
static DEFINE_IDA(mnt_group_ida);
@@ -5976,24 +5987,73 @@ struct mnt_namespace init_mnt_ns = {
static void __init init_mount_tree(void)
{
- struct vfsmount *mnt;
- struct mount *m;
+ struct vfsmount *mnt, *immutable_mnt;
+ struct mount *mnt_root;
struct path root;
+ /*
+ * When the immutable rootfs is used, we create two mounts:
+ *
+ * (1) immutable rootfs with mount id 1
+ * (2) mutable rootfs with mount id 2
+ *
+ * with (2) mounted on top of (1).
+ */
+ if (immutable_rootfs) {
+ immutable_mnt = vfs_kern_mount(&immutable_rootfs_fs_type, 0,
+ "rootfs", NULL);
+ if (IS_ERR(immutable_mnt))
+ panic("VFS: Failed to create immutable rootfs");
+ }
+
mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", initramfs_options);
if (IS_ERR(mnt))
panic("Can't create rootfs");
- m = real_mount(mnt);
- init_mnt_ns.root = m;
- init_mnt_ns.nr_mounts = 1;
- mnt_add_to_ns(&init_mnt_ns, m);
+ if (immutable_rootfs) {
+ VFS_WARN_ON_ONCE(real_mount(immutable_mnt)->mnt_id != 1);
+ VFS_WARN_ON_ONCE(real_mount(mnt)->mnt_id != 2);
+
+ /* The namespace root is the immutable rootfs. */
+ mnt_root = real_mount(immutable_mnt);
+ init_mnt_ns.root = mnt_root;
+
+ /* Mount mutable rootfs on top of the immutable rootfs. */
+ root.mnt = immutable_mnt;
+ root.dentry = immutable_mnt->mnt_root;
+
+ LOCK_MOUNT_EXACT(mp, &root);
+ if (unlikely(IS_ERR(mp.parent)))
+ panic("VFS: Failed to setup immutable rootfs");
+ scoped_guard(mount_writer)
+ attach_mnt(real_mount(mnt), mp.parent, mp.mp);
+
+ pr_info("VFS: Finished setting up immutable rootfs\n");
+ } else {
+ VFS_WARN_ON_ONCE(real_mount(mnt)->mnt_id != 1);
+
+ /* The namespace root is the mutable rootfs. */
+ mnt_root = real_mount(mnt);
+ init_mnt_ns.root = mnt_root;
+ }
+
+ /*
+ * We've dropped all locks here but that's fine. Not just are we
+ * the only task that's running, there's no other mount
+ * namespace in existence and the initial mount namespace is
+ * completely empty until we add the mounts we just created.
+ */
+ for (struct mount *p = mnt_root; p; p = next_mnt(p, mnt_root)) {
+ mnt_add_to_ns(&init_mnt_ns, p);
+ init_mnt_ns.nr_mounts++;
+ }
+
init_task.nsproxy->mnt_ns = &init_mnt_ns;
get_mnt_ns(&init_mnt_ns);
- root.mnt = mnt;
- root.dentry = mnt->mnt_root;
-
+ /* The root and pwd always point to the mutable rootfs. */
+ root.mnt = mnt;
+ root.dentry = mnt->mnt_root;
set_fs_pwd(current->fs, &root);
set_fs_root(current->fs, &root);
diff --git a/fs/rootfs.c b/fs/rootfs.c
new file mode 100644
index 000000000000..b82b73bb8bb2
--- /dev/null
+++ b/fs/rootfs.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
+#include <linux/fs/super_types.h>
+#include <linux/fs_context.h>
+#include <linux/magic.h>
+
+static const struct super_operations rootfs_super_operations = {
+ .statfs = simple_statfs,
+};
+
+static int rootfs_fs_fill_super(struct super_block *s, struct fs_context *fc)
+{
+ struct inode *inode;
+
+ s->s_maxbytes = MAX_LFS_FILESIZE;
+ s->s_blocksize = PAGE_SIZE;
+ s->s_blocksize_bits = PAGE_SHIFT;
+ s->s_magic = ROOT_FS_MAGIC;
+ s->s_op = &rootfs_super_operations;
+ s->s_export_op = NULL;
+ s->s_xattr = NULL;
+ s->s_time_gran = 1;
+ s->s_d_flags = 0;
+
+ inode = new_inode(s);
+ if (!inode)
+ return -ENOMEM;
+
+ /* The real rootfs is permanently empty... */
+ make_empty_dir_inode(inode);
+ simple_inode_init_ts(inode);
+ inode->i_ino = 1;
+ /* ... and immutable. */
+ inode->i_flags |= S_IMMUTABLE;
+
+ s->s_root = d_make_root(inode);
+ if (!s->s_root)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int rootfs_fs_get_tree(struct fs_context *fc)
+{
+ return get_tree_single(fc, rootfs_fs_fill_super);
+}
+
+static const struct fs_context_operations rootfs_fs_context_ops = {
+ .get_tree = rootfs_fs_get_tree,
+};
+
+static int rootfs_init_fs_context(struct fs_context *fc)
+{
+ fc->ops = &rootfs_fs_context_ops;
+ fc->global = true;
+ fc->sb_flags = SB_NOUSER;
+ fc->s_iflags = SB_I_NOEXEC | SB_I_NODEV;
+ return 0;
+}
+
+struct file_system_type immutable_rootfs_fs_type = {
+ .name = "rootfs",
+ .init_fs_context = rootfs_init_fs_context,
+ .kill_sb = kill_anon_super,
+};
diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 638ca21b7a90..1a3a5a5b785a 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -104,5 +104,6 @@
#define SECRETMEM_MAGIC 0x5345434d /* "SECM" */
#define PID_FS_MAGIC 0x50494446 /* "PIDF" */
#define GUEST_MEMFD_MAGIC 0x474d454d /* "GMEM" */
+#define ROOT_FS_MAGIC 0x524F4F54 /* "ROOT" */
#endif /* __LINUX_MAGIC_H__ */
diff --git a/init/do_mounts.c b/init/do_mounts.c
index defbbf1d55f7..e245e5e4e954 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -492,8 +492,17 @@ void __init prepare_namespace(void)
mount_root(saved_root_name);
out:
devtmpfs_mount();
- init_mount(".", "/", NULL, MS_MOVE, NULL);
- init_chroot(".");
+
+ if (immutable_rootfs) {
+ if (init_pivot_root(".", "."))
+ pr_err("VFS: Failed to pivot into new rootfs\n");
+ if (init_umount(".", MNT_DETACH))
+ pr_err("VFS: Failed to unmount old rootfs\n");
+ pr_info("VFS: Pivoted into new rootfs\n");
+ } else {
+ init_mount(".", "/", NULL, MS_MOVE, NULL);
+ init_chroot(".");
+ }
}
static bool is_tmpfs;
diff --git a/init/do_mounts.h b/init/do_mounts.h
index 6069ea3eb80d..d05870fcb662 100644
--- a/init/do_mounts.h
+++ b/init/do_mounts.h
@@ -15,6 +15,7 @@
void mount_root_generic(char *name, char *pretty_name, int flags);
void mount_root(char *root_device_name);
extern int root_mountflags;
+extern bool immutable_rootfs;
static inline __init int create_dev(char *name, dev_t dev)
{
--
2.47.3
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-02 14:36 ` [PATCH 3/3] fs: add immutable rootfs Christian Brauner
@ 2026-01-04 7:27 ` Al Viro
2026-01-04 7:41 ` Al Viro
2026-01-07 2:28 ` Gao Xiang
1 sibling, 1 reply; 16+ messages in thread
From: Al Viro @ 2026-01-04 7:27 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik
On Fri, Jan 02, 2026 at 03:36:24PM +0100, Christian Brauner wrote:
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
> +#include <linux/fs/super_types.h>
> +#include <linux/fs_context.h>
> +#include <linux/magic.h>
[snip]
What does it give you compared to an empty ramfs? Or tmpfs, for that
matter...
Why bother with a separate fs type?
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-04 7:27 ` Al Viro
@ 2026-01-04 7:41 ` Al Viro
2026-01-06 22:07 ` Christian Brauner
0 siblings, 1 reply; 16+ messages in thread
From: Al Viro @ 2026-01-04 7:41 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik
On Sun, Jan 04, 2026 at 07:27:43AM +0000, Al Viro wrote:
> On Fri, Jan 02, 2026 at 03:36:24PM +0100, Christian Brauner wrote:
>
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
> > +#include <linux/fs/super_types.h>
> > +#include <linux/fs_context.h>
> > +#include <linux/magic.h>
>
> [snip]
>
> What does it give you compared to an empty ramfs? Or tmpfs, for that
> matter...
>
> Why bother with a separate fs type?
Make that "empty ramfs" and as soon as you've got the mount have
mnt->mnt_root->d_inode->i_flags |= S_IMMUTABLE;
done. No concurrent accesses at that point, no way to clear that
flag for ramfs inodes afterwards and ramfs is always built in...
What am I missing here?
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-04 7:41 ` Al Viro
@ 2026-01-06 22:07 ` Christian Brauner
2026-01-06 22:59 ` Al Viro
0 siblings, 1 reply; 16+ messages in thread
From: Christian Brauner @ 2026-01-06 22:07 UTC (permalink / raw)
To: Al Viro
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik
On Sun, Jan 04, 2026 at 07:41:45AM +0000, Al Viro wrote:
> On Sun, Jan 04, 2026 at 07:27:43AM +0000, Al Viro wrote:
> > On Fri, Jan 02, 2026 at 03:36:24PM +0100, Christian Brauner wrote:
> >
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
> > > +#include <linux/fs/super_types.h>
> > > +#include <linux/fs_context.h>
> > > +#include <linux/magic.h>
> >
> > [snip]
> >
> > What does it give you compared to an empty ramfs? Or tmpfs, for that
> > matter...
> >
> > Why bother with a separate fs type?
>
> Make that "empty ramfs" and as soon as you've got the mount have
> mnt->mnt_root->d_inode->i_flags |= S_IMMUTABLE;
> done. No concurrent accesses at that point, no way to clear that
> flag for ramfs inodes afterwards and ramfs is always built in...
>
> What am I missing here?
Good point.
Afaict, FS_IMMUTABLE_FL can be cleared by a sufficiently privileged
process breaking the promise that this is a permanently immutable
rootfs. The guarantee is that nothing will ever exist in it and all it
ever does is to serve as a parent mount to the point where we can just
hand out an empty namespace to unprivileged namespaces without ever
having to worry that anything sensitive is exposed.
I also dislike that the real rootfs should be a tmpfs or ramfs in the
first place. It just shouldn't serve any purpose other than as a marker
that we reached a dead-end. Userspace has terrible code to parse for
"rootfs" and then stat whether it's a ramfs or tmpfs to figure out
whether it is the real rootfs. By making it a really dead-simple
filesystem and giving it its own magic number userspace can just stat
for it.
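As a rough illustration of what that check could look like from userspace, here is a minimal sketch using statfs(). It assumes the ROOT_FS_MAGIC value proposed in patch 3/3; the constant is defined locally because released headers do not carry it, and is_immutable_rootfs() is a hypothetical helper name.

```c
#include <errno.h>
#include <sys/vfs.h>

/* Value proposed in patch 3/3; defined locally as released headers lack it. */
#ifndef ROOT_FS_MAGIC
#define ROOT_FS_MAGIC 0x524F4F54 /* "ROOT" */
#endif

/*
 * Hypothetical helper: returns 1 if @path resolves onto the proposed
 * immutable rootfs, 0 if it is on some other filesystem, and -1 with
 * errno set if statfs() fails.
 */
int is_immutable_rootfs(const char *path)
{
	struct statfs st;

	if (statfs(path, &st) < 0)
		return -1;

	return st.f_type == ROOT_FS_MAGIC;
}
```

This replaces the two-step "parse for rootfs, then check for ramfs or tmpfs" approach with a single magic-number comparison.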
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-06 22:07 ` Christian Brauner
@ 2026-01-06 22:59 ` Al Viro
2026-01-07 10:53 ` Christian Brauner
0 siblings, 1 reply; 16+ messages in thread
From: Al Viro @ 2026-01-06 22:59 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik
On Tue, Jan 06, 2026 at 11:07:32PM +0100, Christian Brauner wrote:
>
> Afaict, FS_IMMUTABLE_FL can be cleared by a sufficiently privileged
> process breaking the promise that this is a permanently immutable
> rootfs.
Not on ramfs:
int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
struct file_kattr *fa)
{
struct inode *inode = d_inode(dentry);
struct file_kattr old_ma = {};
int err;
if (!inode->i_op->fileattr_set)
return -ENOIOCTLCMD;
and that's it, privileges do not matter.
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-02 14:36 ` [PATCH 3/3] fs: add immutable rootfs Christian Brauner
2026-01-04 7:27 ` Al Viro
@ 2026-01-07 2:28 ` Gao Xiang
2026-01-07 2:47 ` Al Viro
1 sibling, 1 reply; 16+ messages in thread
From: Gao Xiang @ 2026-01-07 2:28 UTC (permalink / raw)
To: Christian Brauner, linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik
On 2026/1/2 22:36, Christian Brauner wrote:
> Currently pivot_root() doesn't work on the real rootfs because it
> cannot be unmounted. Userspace has to do a recursive removal of the
> initramfs contents manually before continuing the boot.
>
> Really all we want from the real rootfs is to serve as the parent mount
> for anything that is actually useful such as the tmpfs or ramfs for
> initramfs unpacking or the rootfs itself. There's no need for the real
> rootfs to actually be anything meaningful or useful. Add an immutable
> rootfs that can be selected via the "immutable_rootfs" kernel command
> line option.
>
> The kernel will mount a tmpfs/ramfs on top of it, unpack the initramfs
> and fire up userspace which mounts the rootfs and can then just do:
>
> chdir(rootfs);
> pivot_root(".", ".");
> umount2(".", MNT_DETACH);
>
> and be done with it. (Ofc, userspace can also choose to retain the
> initramfs contents by using something like pivot_root(".", "/initramfs")
> without unmounting it.)
>
> Technically this also means that the rootfs mount in unprivileged
> namespaces doesn't need to become MNT_LOCKED anymore as it's guaranteed
> that the immutable rootfs remains permanently empty so there cannot be
> anything revealed by unmounting the covering mount.
>
> In the future this will also allow us to create completely empty mount
> namespaces without the risk of leaking anything.
>
> systemd already handles this all correctly as it tries to pivot_root()
> first and falls back to MS_MOVE only when that fails.
>
> This goes back to various discussions in previous years and an LPC 2024
> presentation about this very topic.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> fs/Makefile | 2 +-
> fs/mount.h | 1 +
> fs/namespace.c | 78 ++++++++++++++++++++++++++++++++++++++++------
> fs/rootfs.c | 65 ++++++++++++++++++++++++++++++++++++++
> include/uapi/linux/magic.h | 1 +
> init/do_mounts.c | 13 ++++++--
> init/do_mounts.h | 1 +
> 7 files changed, 149 insertions(+), 12 deletions(-)
>
> diff --git a/fs/Makefile b/fs/Makefile
> index a04274a3c854..d31b56b7c4d5 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -16,7 +16,7 @@ obj-y := open.o read_write.o file_table.o super.o \
> stack.o fs_struct.o statfs.o fs_pin.o nsfs.o \
> fs_dirent.o fs_context.o fs_parser.o fsopen.o init.o \
> kernel_read_file.o mnt_idmapping.o remap_range.o pidfs.o \
> - file_attr.o
> + file_attr.o rootfs.o
>
> obj-$(CONFIG_BUFFER_HEAD) += buffer.o mpage.o
> obj-$(CONFIG_PROC_FS) += proc_namespace.o
> diff --git a/fs/mount.h b/fs/mount.h
> index 2d28ef2a3aed..c3e0d9dbfaa4 100644
> --- a/fs/mount.h
> +++ b/fs/mount.h
> @@ -5,6 +5,7 @@
> #include <linux/ns_common.h>
> #include <linux/fs_pin.h>
>
> +extern struct file_system_type immutable_rootfs_fs_type;
> extern struct list_head notify_list;
>
> struct mnt_namespace {
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 9261f56ccc81..30597f4610fd 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -75,6 +75,17 @@ static int __init initramfs_options_setup(char *str)
>
> __setup("initramfs_options=", initramfs_options_setup);
>
> +bool immutable_rootfs = false;
> +
> +static int __init immutable_rootfs_setup(char *str)
> +{
> + if (*str)
> + return 0;
> + immutable_rootfs = true;
> + return 1;
> +}
> +__setup("immutable_rootfs", immutable_rootfs_setup);
> +
> static u64 event;
> static DEFINE_XARRAY_FLAGS(mnt_id_xa, XA_FLAGS_ALLOC);
> static DEFINE_IDA(mnt_group_ida);
> @@ -5976,24 +5987,73 @@ struct mnt_namespace init_mnt_ns = {
>
> static void __init init_mount_tree(void)
> {
> -	struct vfsmount *mnt;
> -	struct mount *m;
> +	struct vfsmount *mnt, *immutable_mnt;
> +	struct mount *mnt_root;
> 	struct path root;
>
> +	/*
> +	 * When the immutable rootfs is used, we create two mounts:
> +	 *
> +	 * (1) immutable rootfs with mount id 1
> +	 * (2) mutable rootfs with mount id 2
> +	 *
> +	 * with (2) mounted on top of (1).
> +	 */
> +	if (immutable_rootfs) {
> +		immutable_mnt = vfs_kern_mount(&immutable_rootfs_fs_type, 0,
> +					       "rootfs", NULL);
> +		if (IS_ERR(immutable_mnt))
> +			panic("VFS: Failed to create immutable rootfs");
> +	}
> +
> 	mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", initramfs_options);
> 	if (IS_ERR(mnt))
> 		panic("Can't create rootfs");
>
> -	m = real_mount(mnt);
> -	init_mnt_ns.root = m;
> -	init_mnt_ns.nr_mounts = 1;
> -	mnt_add_to_ns(&init_mnt_ns, m);
> +	if (immutable_rootfs) {
> +		VFS_WARN_ON_ONCE(real_mount(immutable_mnt)->mnt_id != 1);
> +		VFS_WARN_ON_ONCE(real_mount(mnt)->mnt_id != 2);
> +
> +		/* The namespace root is the immutable rootfs. */
> +		mnt_root = real_mount(immutable_mnt);
> +		init_mnt_ns.root = mnt_root;
> +
> +		/* Mount mutable rootfs on top of the immutable rootfs. */
> +		root.mnt = immutable_mnt;
> +		root.dentry = immutable_mnt->mnt_root;
> +
> +		LOCK_MOUNT_EXACT(mp, &root);
> +		if (unlikely(IS_ERR(mp.parent)))
> +			panic("VFS: Failed to setup immutable rootfs");
> +		scoped_guard(mount_writer)
> +			attach_mnt(real_mount(mnt), mp.parent, mp.mp);
> +
> +		pr_info("VFS: Finished setting up immutable rootfs\n");
> +	} else {
> +		VFS_WARN_ON_ONCE(real_mount(mnt)->mnt_id != 1);
> +
> +		/* The namespace root is the mutable rootfs. */
> +		mnt_root = real_mount(mnt);
> +		init_mnt_ns.root = mnt_root;
> +	}
> +
> +	/*
> +	 * We've dropped all locks here but that's fine. Not just are we
> +	 * the only task that's running, there's no other mount
> +	 * namespace in existence and the initial mount namespace is
> +	 * completely empty until we add the mounts we just created.
> +	 */
> +	for (struct mount *p = mnt_root; p; p = next_mnt(p, mnt_root)) {
> +		mnt_add_to_ns(&init_mnt_ns, p);
> +		init_mnt_ns.nr_mounts++;
> +	}
> +
> 	init_task.nsproxy->mnt_ns = &init_mnt_ns;
> 	get_mnt_ns(&init_mnt_ns);
>
> -	root.mnt = mnt;
> -	root.dentry = mnt->mnt_root;
> -
> +	/* The root and pwd always point to the mutable rootfs. */
> +	root.mnt = mnt;
> +	root.dentry = mnt->mnt_root;
> 	set_fs_pwd(current->fs, &root);
> 	set_fs_root(current->fs, &root);
>
> diff --git a/fs/rootfs.c b/fs/rootfs.c
> new file mode 100644
> index 000000000000..b82b73bb8bb2
> --- /dev/null
> +++ b/fs/rootfs.c
> @@ -0,0 +1,65 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
> +#include <linux/fs/super_types.h>
> +#include <linux/fs_context.h>
> +#include <linux/magic.h>
> +
> +static const struct super_operations rootfs_super_operations = {
> +	.statfs = simple_statfs,
> +};
> +
> +static int rootfs_fs_fill_super(struct super_block *s, struct fs_context *fc)
> +{
> +	struct inode *inode;
> +
> +	s->s_maxbytes = MAX_LFS_FILESIZE;
> +	s->s_blocksize = PAGE_SIZE;
> +	s->s_blocksize_bits = PAGE_SHIFT;
> +	s->s_magic = ROOT_FS_MAGIC;
Just one random suggestion. Regardless of Al's comments,
if we really would like to expose a new visible type to
userspace, how about giving it a meaningful name like
emptyfs or nullfs (I know it could have other meanings
in other OSes) from its tree hierarchy to avoid the
ambiguous "rootfs" naming, especially if it may be
considered for mounting by users in future potential use
cases?
Thanks,
Gao Xiang
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 2:28 ` Gao Xiang
@ 2026-01-07 2:47 ` Al Viro
2026-01-07 2:55 ` Gao Xiang
2026-01-07 10:52 ` Christian Brauner
0 siblings, 2 replies; 16+ messages in thread
From: Al Viro @ 2026-01-07 2:47 UTC (permalink / raw)
To: Gao Xiang
Cc: Christian Brauner, linux-fsdevel, Jan Kara, Jeff Layton,
Amir Goldstein, Lennart Poettering,
Zbigniew Jędrzejewski-Szmek, Josef Bacik
On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:
> Just one random suggestion. Regardless of Al's comments,
> if we really would like to expose a new visible type to
> userspace, how about giving it a meaningful name like
> emptyfs or nullfs (I know it could have other meanings
> in other OSes) from its tree hierarchy to avoid the
> ambiguous "rootfs" naming, especially if it may be
> considered for mounting by users in future potential use
> cases?
*boggle*
_what_ potential use cases? "This here directory is empty and
it'll stay empty and anyone trying to create stuff in it will
get an error; oh, and we want it to be a mount boundary, for
some reason"?
IDGI...
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 2:47 ` Al Viro
@ 2026-01-07 2:55 ` Gao Xiang
2026-01-07 10:52 ` Christian Brauner
1 sibling, 0 replies; 16+ messages in thread
From: Gao Xiang @ 2026-01-07 2:55 UTC (permalink / raw)
To: Al Viro
Cc: Christian Brauner, linux-fsdevel, Jan Kara, Jeff Layton,
Amir Goldstein, Lennart Poettering,
Zbigniew Jędrzejewski-Szmek, Josef Bacik
On 2026/1/7 10:47, Al Viro wrote:
> On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:
>
>> Just one random suggestion. Regardless of Al's comments,
>> if we really would like to expose a new visible type to
>> userspace, how about giving it a meaningful name like
>> emptyfs or nullfs (I know it could have other meanings
>> in other OSes) from its tree hierarchy to avoid the
>> ambiguous "rootfs" naming, especially if it may be
>> considered for mounting by users in future potential use
>> cases?
>
> *boggle*
>
> _what_ potential use cases? "This here directory is empty and
> it'll stay empty and anyone trying to create stuff in it will
> get an error; oh, and we want it to be a mount boundary, for
> some reason"?
>
> IDGI...
My concern is that the "rootfs" name is already (ab)used in various
ways. Kernel folks can find out what it means here by reading the
kernel code, but once it is visible to users, I'm afraid userspace
folks already associate various concepts with the word "root" (here
it is definitely not "chroot", but the root of a mount namespace?).
Thanks,
Gao Xiang
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 2:47 ` Al Viro
2026-01-07 2:55 ` Gao Xiang
@ 2026-01-07 10:52 ` Christian Brauner
2026-01-07 16:33 ` Colin Walters
1 sibling, 1 reply; 16+ messages in thread
From: Christian Brauner @ 2026-01-07 10:52 UTC (permalink / raw)
To: Al Viro
Cc: Gao Xiang, linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik
On Wed, Jan 07, 2026 at 02:47:27AM +0000, Al Viro wrote:
> On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:
>
> > Just one random suggestion. Regardless of Al's comments,
> > if we really would like to expose a new visible type to
> > userspace, how about giving it a meaningful name like
> > emptyfs or nullfs (I know it could have other meanings
> > in other OSes) from its tree hierarchy to avoid the
> > ambiguous "rootfs" naming, especially if it may be
> > considered for mounting by users in future potential use
> > cases?
>
> *boggle*
>
> _what_ potential use cases? "This here directory is empty and
> it'll stay empty and anyone trying to create stuff in it will
> get an error; oh, and we want it to be a mount boundary, for
> some reason"?
>
> IDGI...
It's not a completely crazy idea. I thought about this as well. You
could e.g. use it to overmount and hide other directories - like procfs
overmounting or sysfs overmounting or hiding stuff in /etc where
currently tmpfs is used. But tmpfs is not ideal because you don't get
the reliable immutability guarantees.
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-06 22:59 ` Al Viro
@ 2026-01-07 10:53 ` Christian Brauner
0 siblings, 0 replies; 16+ messages in thread
From: Christian Brauner @ 2026-01-07 10:53 UTC (permalink / raw)
To: Al Viro
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik
On Tue, Jan 06, 2026 at 10:59:17PM +0000, Al Viro wrote:
> On Tue, Jan 06, 2026 at 11:07:32PM +0100, Christian Brauner wrote:
>
> >
> > Afaict, FS_IMMUTABLE_FL can be cleared by a sufficiently privileged
> > process breaking the promise that this is a permanently immutable
> > rootfs.
>
> Not on ramfs:
>
> int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
> struct file_kattr *fa)
> {
> struct inode *inode = d_inode(dentry);
> struct file_kattr old_ma = {};
> int err;
>
> if (!inode->i_op->fileattr_set)
> return -ENOIOCTLCMD;
>
> and that's it, privileges do not matter.
Ugh, that relies on a current implementation detail of ramfs. I'm not
super convinced that this is great. Like, if you can swallow it I think
having a "nullfs" or "immutablefs" type with a separate magic number
does provide some value and frankly is just a lot cleaner than abusing
ramfs.
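
[Editor's note] The "separate magic number" argument is about what
userspace can observe through statfs(2). A minimal sketch, assuming the
caller supplies the magic value to look for: the helper name
path_has_fs_magic() is hypothetical, and the numeric value of the
patch's ROOT_FS_MAGIC is not given anywhere in this thread, so it is
deliberately not hard-coded here.

```c
#include <sys/vfs.h>

/*
 * Hypothetical helper: report whether the filesystem that contains
 * `path` identifies itself with the magic number `magic`.
 *
 * Returns 1 on a match, 0 on a mismatch, and -1 if statfs() fails
 * (errno is left as set by statfs()).
 */
int path_has_fs_magic(const char *path, unsigned long magic)
{
	struct statfs st;

	if (statfs(path, &st) != 0)
		return -1;

	/* f_type is a signed word on Linux; compare as unsigned. */
	return (unsigned long)st.f_type == magic;
}
```

A caller would pass whatever constant the final patch exports for the
new filesystem type. With ramfs reused instead, the same check could
only ever report the ramfs magic, which is the cleanliness point made
above.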
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 10:52 ` Christian Brauner
@ 2026-01-07 16:33 ` Colin Walters
2026-01-08 11:02 ` Christian Brauner
0 siblings, 1 reply; 16+ messages in thread
From: Colin Walters @ 2026-01-07 16:33 UTC (permalink / raw)
To: Christian Brauner, Al Viro
Cc: Gao Xiang, linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik
On Wed, Jan 7, 2026, at 5:52 AM, Christian Brauner wrote:
> On Wed, Jan 07, 2026 at 02:47:27AM +0000, Al Viro wrote:
>> On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:
>>
>> > Just one random suggestion. Regardless of Al's comments,
>> > if we really would like to expose a new visible type to
>> > userspace, how about giving it a meaningful name like
>> > emptyfs or nullfs (I know it could have other meanings
>> > in other OSes) from its tree hierarchy to avoid the
>> > ambiguous "rootfs" naming, especially if it may be
>> > considered for mounting by users in future potential use
>> > cases?
>>
>> *boggle*
>>
>> _what_ potential use cases? "This here directory is empty and
>> it'll stay empty and anyone trying to create stuff in it will
>> get an error; oh, and we want it to be a mount boundary, for
>> some reason"?
>>
>> IDGI...
>
> It's not a completely crazy idea. I thought about this as well. You
> could e.g. use it to overmount and hide other directories - like procfs
> overmounting or sysfs overmounting or hiding stuff in /etc where
> currently tmpfs is used. But tmpfs is not ideal because you don't get
> the reliable immutability guarantees.
Yeah, there's e.g. `/usr/share/empty` that is intended for things like that as a canonical bind mount source.
I also like this idea (though bikeshed I'd call it "emptyfs") but if we generalize it beyond just the current case, it probably needs to support configuring things like permissions (some cases may want 0700, others 0755 etc.)
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 16:33 ` Colin Walters
@ 2026-01-08 11:02 ` Christian Brauner
2026-01-25 20:47 ` Askar Safin
0 siblings, 1 reply; 16+ messages in thread
From: Christian Brauner @ 2026-01-08 11:02 UTC (permalink / raw)
To: Colin Walters
Cc: Al Viro, Gao Xiang, linux-fsdevel, Jan Kara, Jeff Layton,
Amir Goldstein, Lennart Poettering,
Zbigniew Jędrzejewski-Szmek, Josef Bacik
On Wed, Jan 07, 2026 at 11:33:29AM -0500, Colin Walters wrote:
>
>
> On Wed, Jan 7, 2026, at 5:52 AM, Christian Brauner wrote:
> > On Wed, Jan 07, 2026 at 02:47:27AM +0000, Al Viro wrote:
> >> On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:
> >>
> >> > Just one random suggestion. Regardless of Al's comments,
> >> > if we really would like to expose a new visible type to
> >> > userspace, how about giving it a meaningful name like
> >> > emptyfs or nullfs (I know it could have other meanings
> >> > in other OSes) from its tree hierarchy to avoid the
> >> > ambiguous "rootfs" naming, especially if it may be
> >> > considered for mounting by users in future potential use
> >> > cases?
> >>
> >> *boggle*
> >>
> >> _what_ potential use cases? "This here directory is empty and
> >> it'll stay empty and anyone trying to create stuff in it will
> >> get an error; oh, and we want it to be a mount boundary, for
> >> some reason"?
> >>
> >> IDGI...
> >
> > It's not a completely crazy idea. I thought about this as well. You
> > could e.g. use it to overmount and hide other directories - like procfs
> > overmounting or sysfs overmounting or hiding stuff in /etc where
> > currently tmpfs is used. But tmpfs is not ideal because you don't get
> > the reliable immutability guarantees.
>
> Yeah, there's e.g. `/usr/share/empty` that is intended for things like that as a canonical bind mount source.
>
> I also like this idea (though bikeshed I'd call it "emptyfs") but if we generalize it beyond just the current case, it probably needs to support configuring things like permissions (some cases may want 0700, others 0755 etc.)
We can start with the basics right now where it's not mountable from
userspace and then make it mountable from userspace later.
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-08 11:02 ` Christian Brauner
@ 2026-01-25 20:47 ` Askar Safin
0 siblings, 0 replies; 16+ messages in thread
From: Askar Safin @ 2026-01-25 20:47 UTC (permalink / raw)
To: brauner
Cc: amir73il, hsiangkao, jack, jlayton, josef, lennart, linux-fsdevel,
viro, walters, zbyszek
Christian Brauner <brauner@kernel.org>:
> We can start with the basics right now where it's not mountable from
> userspace and then make it mountable from userspace later.
For your information: if we make it mountable by userspace, then the
magic number will not be a reliable indicator of whether this is the
actual root of the hierarchy.
But this is okay, because we can use listmount()/statmount() and check
whether the mount id is equal to the parent mount id.
--
Askar Safin
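
[Editor's note] The check described in the message above can be
sketched with statmount(2). This is a rough illustration, not code from
the series: the helper name mount_is_tree_root() is hypothetical, the
caller is assumed to already hold a 64-bit unique mount id (for example
from listmount(), or from statx() with STATX_MNT_ID_UNIQUE), and the
function falls back to -ENOSYS when the installed UAPI headers predate
statmount().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mount.h>

/*
 * Hypothetical helper: returns 1 if the mount with the given unique
 * mount id is the root of its mount tree (its parent id equals its own
 * id), 0 if it has a distinct parent, and a negative errno on failure.
 */
int mount_is_tree_root(uint64_t unique_mnt_id)
{
#if defined(__NR_statmount) && defined(STATMOUNT_MNT_BASIC) && \
    defined(MNT_ID_REQ_SIZE_VER0)
	struct mnt_id_req req;
	struct statmount sm;

	memset(&req, 0, sizeof(req));
	memset(&sm, 0, sizeof(sm));
	req.size = MNT_ID_REQ_SIZE_VER0;
	req.mnt_id = unique_mnt_id;
	req.param = STATMOUNT_MNT_BASIC;	/* only ask for the id fields */

	if (syscall(__NR_statmount, &req, &sm, sizeof(sm), 0) < 0)
		return -errno;

	/* The kernel reports which field groups it actually filled in. */
	if (!(sm.mask & STATMOUNT_MNT_BASIC))
		return -ENODATA;

	return sm.mnt_id == sm.mnt_parent_id;
#else
	/* Headers too old to describe statmount(); nothing to query. */
	(void)unique_mnt_id;
	return -ENOSYS;
#endif
}
```

Unlike a magic-number comparison, this keeps working if the filesystem
type later becomes mountable from userspace, because it tests the
position of the mount in the tree rather than its type.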
Thread overview: 16+ messages
2026-01-02 14:36 [PATCH 0/3] fs: add immutable rootfs and support pivot_root() in the initramfs Christian Brauner
2026-01-02 14:36 ` [PATCH 1/3] fs: ensure that internal tmpfs mount gets mount id zero Christian Brauner
2026-01-02 14:36 ` [PATCH 2/3] fs: add init_pivot_root() Christian Brauner
2026-01-02 14:36 ` [PATCH 3/3] fs: add immutable rootfs Christian Brauner
2026-01-04 7:27 ` Al Viro
2026-01-04 7:41 ` Al Viro
2026-01-06 22:07 ` Christian Brauner
2026-01-06 22:59 ` Al Viro
2026-01-07 10:53 ` Christian Brauner
2026-01-07 2:28 ` Gao Xiang
2026-01-07 2:47 ` Al Viro
2026-01-07 2:55 ` Gao Xiang
2026-01-07 10:52 ` Christian Brauner
2026-01-07 16:33 ` Colin Walters
2026-01-08 11:02 ` Christian Brauner
2026-01-25 20:47 ` Askar Safin