* [PATCH 0/3] fs: add immutable rootfs and support pivot_root() in the initramfs
@ 2026-01-02 14:36 Christian Brauner
From: Christian Brauner @ 2026-01-02 14:36 UTC (permalink / raw)
To: linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik,
Christian Brauner, stable

Currently pivot_root() doesn't work on the real rootfs because it
cannot be unmounted. Userspace has to do a recursive removal of the
initramfs contents manually before continuing the boot.

Really all we want from the real rootfs is to serve as the parent mount
for anything that is actually useful such as the tmpfs or ramfs for
initramfs unpacking or the rootfs itself. There's no need for the real
rootfs to actually be anything meaningful or useful. Add an immutable
rootfs that can be selected via the "immutable_rootfs" kernel command
line option.

The kernel will mount a tmpfs/ramfs on top of it, unpack the initramfs
and fire up userspace which mounts the rootfs and can then just do:

    chdir(rootfs);
    pivot_root(".", ".");
    umount2(".", MNT_DETACH);

and be done with it. (Ofc, userspace can also choose to retain the
initramfs contents by using something like pivot_root(".", "/initramfs")
without unmounting it.)
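
As a rough userspace illustration of the three calls above, here is a minimal C sketch. The helper name switch_to_rootfs() is hypothetical and not part of the series; the sketch assumes the new rootfs is already mounted at new_root, that the caller has CAP_SYS_ADMIN, and that the kernel was booted with the proposed "immutable_rootfs" option (without it the pivot_root() call is expected to fail on the initial rootfs).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (name is an assumption, not from the patches):
 * switch from the initramfs to an already mounted rootfs at new_root.
 * Returns 0 on success, -1 with errno set on failure.
 */
int switch_to_rootfs(const char *new_root)
{
	/* Make the new root the current working directory. */
	if (chdir(new_root) < 0)
		return -1;

	/* glibc has no pivot_root() wrapper, so go through syscall(2). */
	if (syscall(SYS_pivot_root, ".", ".") < 0)
		return -1;

	/* The old root is now stacked on "."; lazily detach it. */
	if (umount2(".", MNT_DETACH) < 0)
		return -1;

	return 0;
}
```

A caller would invoke something like switch_to_rootfs("/sysroot") and fall back to the MS_MOVE path if it returns -1, which mirrors the fallback order the cover letter attributes to systemd.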

Technically this also means that the rootfs mount in unprivileged
namespaces doesn't need to become MNT_LOCKED anymore as it's guaranteed
that the immutable rootfs remains permanently empty so there cannot be
anything revealed by unmounting the covering mount.

In the future this will also allow us to create completely empty mount
namespaces without risking leaking anything.

systemd already handles this all correctly as it tries to pivot_root()
first and falls back to MS_MOVE only when that fails.

This goes back to various discussions in previous years and an LPC 2024
presentation about this very topic.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Christian Brauner (3):
      fs: ensure that internal tmpfs mount gets mount id zero
      fs: add init_pivot_root()
      fs: add immutable rootfs

 fs/Makefile                   |   2 +-
 fs/init.c                     |  17 ++++
 fs/internal.h                 |   1 +
 fs/mount.h                    |   1 +
 fs/namespace.c                | 181 +++++++++++++++++++++++++++++-------------
 fs/rootfs.c                   |  65 +++++++++++++++
 include/linux/init_syscalls.h |   1 +
 include/uapi/linux/magic.h    |   1 +
 init/do_mounts.c              |  13 ++-
 init/do_mounts.h              |   1 +
 10 files changed, 223 insertions(+), 60 deletions(-)
---
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
change-id: 20260102-work-immutable-rootfs-b5f23e0f5a27

* [PATCH 1/3] fs: ensure that internal tmpfs mount gets mount id zero
From: Christian Brauner @ 2026-01-02 14:36 UTC
To: linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik,
Christian Brauner, stable

and the rootfs get mount id one as it always has. Before we actually
mount the rootfs we create an internal tmpfs mount which has mount id
zero but is never exposed anywhere. Continue that "tradition".

Fixes: 7f9bfafc5f49 ("fs: use xarray for old mount id")
Cc: <stable@vger.kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index c58674a20cad..8b082b1de7f3 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -221,7 +221,7 @@ static int mnt_alloc_id(struct mount *mnt)
 	int res;
 
 	xa_lock(&mnt_id_xa);
-	res = __xa_alloc(&mnt_id_xa, &mnt->mnt_id, mnt, XA_LIMIT(1, INT_MAX), GFP_KERNEL);
+	res = __xa_alloc(&mnt_id_xa, &mnt->mnt_id, mnt, xa_limit_31b, GFP_KERNEL);
 	if (!res)
 		mnt->mnt_id_unique = ++mnt_id_ctr;
 	xa_unlock(&mnt_id_xa);
-- 
2.47.3
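
To see why the lower bound of the allocation range matters, here is a small userspace model in C. It is only a toy: struct id_space and model_alloc_id() are hypothetical names, and the model assumes that an allocating xarray hands out the lowest free index inside the requested limit and that xa_limit_31b covers the range 0..INT_MAX.

```c
#include <limits.h>
#include <stdbool.h>

#define MODEL_MAX_IDS 8	/* tiny ID space, for illustration only */

/* Toy stand-in for mnt_id_xa: used[i] is true once ID i has been handed out. */
struct id_space {
	bool used[MODEL_MAX_IDS];
};

/*
 * Return the lowest free ID in [min, max], or -1 if none is left.
 * This models the assumed "lowest free index" allocation behaviour;
 * it is not the kernel implementation.
 */
int model_alloc_id(struct id_space *ids, int min, int max)
{
	if (min < 0)
		return -1;

	for (int id = min; id <= max && id < MODEL_MAX_IDS; id++) {
		if (!ids->used[id]) {
			ids->used[id] = true;
			return id;
		}
	}
	return -1;
}
```

With a lower bound of 0 the first two allocations in this model return 0 and 1, matching the internal tmpfs mount followed by the rootfs; with a lower bound of 1 they return 1 and 2, which is the shift the patch undoes.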

* [PATCH 2/3] fs: add init_pivot_root()
From: Christian Brauner @ 2026-01-02 14:36 UTC
To: linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik,
Christian Brauner

We will soon be able to pivot_root() with the introduction of the
immutable rootfs. Add a wrapper for kernel internal usage.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/init.c                     |  17 +++++++
 fs/internal.h                 |   1 +
 fs/namespace.c                | 101 ++++++++++++++++++++++--------------------
 include/linux/init_syscalls.h |   1 +
 4 files changed, 73 insertions(+), 47 deletions(-)

diff --git a/fs/init.c b/fs/init.c
index e0f5429c0a49..e33b2690d851 100644
--- a/fs/init.c
+++ b/fs/init.c
@@ -13,6 +13,23 @@
 #include <linux/security.h>
 #include "internal.h"
 
+int __init init_pivot_root(const char *new_root, const char *put_old)
+{
+	struct path new_path __free(path_put) = {};
+	struct path old_path __free(path_put) = {};
+	int ret;
+
+	ret = kern_path(new_root, LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &new_path);
+	if (ret)
+		return ret;
+
+	ret = kern_path(put_old, LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &old_path);
+	if (ret)
+		return ret;
+
+	return path_pivot_root(&new_path, &old_path);
+}
+
 int __init init_mount(const char *dev_name, const char *dir_name,
 		const char *type_page, unsigned long flags, void *data_page)
 {
diff --git a/fs/internal.h b/fs/internal.h
index ab638d41ab81..4b27a4b0fdef 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -90,6 +90,7 @@ extern bool may_mount(void);
 int path_mount(const char *dev_name, const struct path *path,
 		const char *type_page, unsigned long flags, void *data_page);
 int path_umount(const struct path *path, int flags);
+int path_pivot_root(struct path *new, struct path *old);
 
 int show_path(struct seq_file *m, struct dentry *root);
 
diff --git a/fs/namespace.c b/fs/namespace.c
index 8b082b1de7f3..9261f56ccc81 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4498,36 +4498,8 @@ bool path_is_under(const struct path *path1, const struct path *path2)
 }
 EXPORT_SYMBOL(path_is_under);
 
-/*
- * pivot_root Semantics:
- * Moves the root file system of the current process to the directory put_old,
- * makes new_root as the new root file system of the current process, and sets
- * root/cwd of all processes which had them on the current root to new_root.
- *
- * Restrictions:
- * The new_root and put_old must be directories, and must not be on the
- * same file system as the current process root. The put_old must be
- * underneath new_root, i.e. adding a non-zero number of /.. to the string
- * pointed to by put_old must yield the same directory as new_root. No other
- * file system may be mounted on put_old. After all, new_root is a mountpoint.
- *
- * Also, the current root cannot be on the 'rootfs' (initial ramfs) filesystem.
- * See Documentation/filesystems/ramfs-rootfs-initramfs.rst for alternatives
- * in this situation.
- *
- * Notes:
- *  - we don't move root/cwd if they are not at the root (reason: if something
- *    cared enough to change them, it's probably wrong to force them elsewhere)
- *  - it's okay to pick a root that isn't the root of a file system, e.g.
- *    /nfs/my_root where /nfs is the mount point. It must be a mountpoint,
- *    though, so you may need to say mount --bind /nfs/my_root /nfs/my_root
- *    first.
- */
-SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
-		const char __user *, put_old)
+int path_pivot_root(struct path *new, struct path *old)
 {
-	struct path new __free(path_put) = {};
-	struct path old __free(path_put) = {};
 	struct path root __free(path_put) = {};
 	struct mount *new_mnt, *root_mnt, *old_mnt, *root_parent, *ex_parent;
 	int error;
@@ -4535,28 +4507,18 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
 	if (!may_mount())
 		return -EPERM;
 
-	error = user_path_at(AT_FDCWD, new_root,
-			     LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &new);
-	if (error)
-		return error;
-
-	error = user_path_at(AT_FDCWD, put_old,
-			     LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &old);
-	if (error)
-		return error;
-
-	error = security_sb_pivotroot(&old, &new);
+	error = security_sb_pivotroot(old, new);
 	if (error)
 		return error;
 
 	get_fs_root(current->fs, &root);
 
-	LOCK_MOUNT(old_mp, &old);
+	LOCK_MOUNT(old_mp, old);
 	old_mnt = old_mp.parent;
 	if (IS_ERR(old_mnt))
 		return PTR_ERR(old_mnt);
 
-	new_mnt = real_mount(new.mnt);
+	new_mnt = real_mount(new->mnt);
 	root_mnt = real_mount(root.mnt);
 	ex_parent = new_mnt->mnt_parent;
 	root_parent = root_mnt->mnt_parent;
@@ -4568,7 +4530,7 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
 		return -EINVAL;
 	if (new_mnt->mnt.mnt_flags & MNT_LOCKED)
 		return -EINVAL;
-	if (d_unlinked(new.dentry))
+	if (d_unlinked(new->dentry))
 		return -ENOENT;
 	if (new_mnt == root_mnt || old_mnt == root_mnt)
 		return -EBUSY; /* loop, on the same file system */
@@ -4576,15 +4538,15 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
 		return -EINVAL; /* not a mountpoint */
 	if (!mnt_has_parent(root_mnt))
 		return -EINVAL; /* absolute root */
-	if (!path_mounted(&new))
+	if (!path_mounted(new))
 		return -EINVAL; /* not a mountpoint */
 	if (!mnt_has_parent(new_mnt))
 		return -EINVAL; /* absolute root */
 	/* make sure we can reach put_old from new_root */
-	if (!is_path_reachable(old_mnt, old_mp.mp->m_dentry, &new))
+	if (!is_path_reachable(old_mnt, old_mp.mp->m_dentry, new))
 		return -EINVAL;
 	/* make certain new is below the root */
-	if (!is_path_reachable(new_mnt, new.dentry, &root))
+	if (!is_path_reachable(new_mnt, new->dentry, &root))
 		return -EINVAL;
 	lock_mount_hash();
 	umount_mnt(new_mnt);
@@ -4603,10 +4565,55 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
 	unlock_mount_hash();
 	mnt_notify_add(root_mnt);
 	mnt_notify_add(new_mnt);
-	chroot_fs_refs(&root, &new);
+	chroot_fs_refs(&root, new);
 	return 0;
 }
 
+/*
+ * pivot_root Semantics:
+ * Moves the root file system of the current process to the directory put_old,
+ * makes new_root as the new root file system of the current process, and sets
+ * root/cwd of all processes which had them on the current root to new_root.
+ *
+ * Restrictions:
+ * The new_root and put_old must be directories, and must not be on the
+ * same file system as the current process root. The put_old must be
+ * underneath new_root, i.e. adding a non-zero number of /.. to the string
+ * pointed to by put_old must yield the same directory as new_root. No other
+ * file system may be mounted on put_old. After all, new_root is a mountpoint.
+ *
+ * Also, the current root cannot be on the 'rootfs' (initial ramfs) filesystem.
+ * See Documentation/filesystems/ramfs-rootfs-initramfs.rst for alternatives
+ * in this situation.
+ *
+ * Notes:
+ *  - we don't move root/cwd if they are not at the root (reason: if something
+ *    cared enough to change them, it's probably wrong to force them elsewhere)
+ *  - it's okay to pick a root that isn't the root of a file system, e.g.
+ *    /nfs/my_root where /nfs is the mount point. It must be a mountpoint,
+ *    though, so you may need to say mount --bind /nfs/my_root /nfs/my_root
+ *    first.
+ */
+SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
+		const char __user *, put_old)
+{
+	struct path new __free(path_put) = {};
+	struct path old __free(path_put) = {};
+	int error;
+
+	error = user_path_at(AT_FDCWD, new_root,
+			     LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &new);
+	if (error)
+		return error;
+
+	error = user_path_at(AT_FDCWD, put_old,
+			     LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &old);
+	if (error)
+		return error;
+
+	return path_pivot_root(&new, &old);
+}
+
 static unsigned int recalc_flags(struct mount_kattr *kattr, struct mount *mnt)
 {
 	unsigned int flags = mnt->mnt.mnt_flags;
diff --git a/include/linux/init_syscalls.h b/include/linux/init_syscalls.h
index 92045d18cbfc..28776ee28d8e 100644
--- a/include/linux/init_syscalls.h
+++ b/include/linux/init_syscalls.h
@@ -17,3 +17,4 @@ int __init init_mkdir(const char *pathname, umode_t mode);
 int __init init_rmdir(const char *pathname);
 int __init init_utimes(char *filename, struct timespec64 *ts);
 int __init init_dup(struct file *file);
+int __init init_pivot_root(const char *new_root, const char *put_old);
-- 
2.47.3

* [PATCH 3/3] fs: add immutable rootfs
From: Christian Brauner @ 2026-01-02 14:36 UTC
To: linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik,
Christian Brauner

Currently pivot_root() doesn't work on the real rootfs because it
cannot be unmounted. Userspace has to do a recursive removal of the
initramfs contents manually before continuing the boot.

Really all we want from the real rootfs is to serve as the parent mount
for anything that is actually useful such as the tmpfs or ramfs for
initramfs unpacking or the rootfs itself. There's no need for the real
rootfs to actually be anything meaningful or useful. Add an immutable
rootfs that can be selected via the "immutable_rootfs" kernel command
line option.

The kernel will mount a tmpfs/ramfs on top of it, unpack the initramfs
and fire up userspace which mounts the rootfs and can then just do:

    chdir(rootfs);
    pivot_root(".", ".");
    umount2(".", MNT_DETACH);

and be done with it. (Ofc, userspace can also choose to retain the
initramfs contents by using something like pivot_root(".", "/initramfs")
without unmounting it.)

Technically this also means that the rootfs mount in unprivileged
namespaces doesn't need to become MNT_LOCKED anymore as it's guaranteed
that the immutable rootfs remains permanently empty so there cannot be
anything revealed by unmounting the covering mount.

In the future this will also allow us to create completely empty mount
namespaces without risking leaking anything.

systemd already handles this all correctly as it tries to pivot_root()
first and falls back to MS_MOVE only when that fails.

This goes back to various discussions in previous years and an LPC 2024
presentation about this very topic.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/Makefile                |  2 +-
 fs/mount.h                 |  1 +
 fs/namespace.c             | 78 ++++++++++++++++++++++++++++++++++++++++------
 fs/rootfs.c                | 65 ++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/magic.h |  1 +
 init/do_mounts.c           | 13 ++++++--
 init/do_mounts.h           |  1 +
 7 files changed, 149 insertions(+), 12 deletions(-)

diff --git a/fs/Makefile b/fs/Makefile
index a04274a3c854..d31b56b7c4d5 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -16,7 +16,7 @@ obj-y :=	open.o read_write.o file_table.o super.o \
 		stack.o fs_struct.o statfs.o fs_pin.o nsfs.o \
 		fs_dirent.o fs_context.o fs_parser.o fsopen.o init.o \
 		kernel_read_file.o mnt_idmapping.o remap_range.o pidfs.o \
-		file_attr.o
+		file_attr.o rootfs.o
 
 obj-$(CONFIG_BUFFER_HEAD)	+= buffer.o mpage.o
 obj-$(CONFIG_PROC_FS)		+= proc_namespace.o
diff --git a/fs/mount.h b/fs/mount.h
index 2d28ef2a3aed..c3e0d9dbfaa4 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -5,6 +5,7 @@
 #include <linux/ns_common.h>
 #include <linux/fs_pin.h>
 
+extern struct file_system_type immutable_rootfs_fs_type;
 extern struct list_head notify_list;
 
 struct mnt_namespace {
diff --git a/fs/namespace.c b/fs/namespace.c
index 9261f56ccc81..30597f4610fd 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -75,6 +75,17 @@ static int __init initramfs_options_setup(char *str)
 
 __setup("initramfs_options=", initramfs_options_setup);
 
+bool immutable_rootfs = false;
+
+static int __init immutable_rootfs_setup(char *str)
+{
+	if (*str)
+		return 0;
+	immutable_rootfs = true;
+	return 1;
+}
+__setup("immutable_rootfs", immutable_rootfs_setup);
+
 static u64 event;
 static DEFINE_XARRAY_FLAGS(mnt_id_xa, XA_FLAGS_ALLOC);
 static DEFINE_IDA(mnt_group_ida);
@@ -5976,24 +5987,73 @@ struct mnt_namespace init_mnt_ns = {
 
 static void __init init_mount_tree(void)
 {
-	struct vfsmount *mnt;
-	struct mount *m;
+	struct vfsmount *mnt, *immutable_mnt;
+	struct mount *mnt_root;
 	struct path root;
 
+	/*
+	 * When the immutable rootfs is used, we create two mounts:
+	 *
+	 * (1) immutable rootfs with mount id 1
+	 * (2) mutable rootfs with mount id 2
+	 *
+	 * with (2) mounted on top of (1).
+	 */
+	if (immutable_rootfs) {
+		immutable_mnt = vfs_kern_mount(&immutable_rootfs_fs_type, 0,
+					       "rootfs", NULL);
+		if (IS_ERR(immutable_mnt))
+			panic("VFS: Failed to create immutable rootfs");
+	}
+
 	mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", initramfs_options);
 	if (IS_ERR(mnt))
 		panic("Can't create rootfs");
 
-	m = real_mount(mnt);
-	init_mnt_ns.root = m;
-	init_mnt_ns.nr_mounts = 1;
-	mnt_add_to_ns(&init_mnt_ns, m);
+	if (immutable_rootfs) {
+		VFS_WARN_ON_ONCE(real_mount(immutable_mnt)->mnt_id != 1);
+		VFS_WARN_ON_ONCE(real_mount(mnt)->mnt_id != 2);
+
+		/* The namespace root is the immutable rootfs. */
+		mnt_root = real_mount(immutable_mnt);
+		init_mnt_ns.root = mnt_root;
+
+		/* Mount mutable rootfs on top of the immutable rootfs. */
+		root.mnt = immutable_mnt;
+		root.dentry = immutable_mnt->mnt_root;
+
+		LOCK_MOUNT_EXACT(mp, &root);
+		if (unlikely(IS_ERR(mp.parent)))
+			panic("VFS: Failed to setup immutable rootfs");
+		scoped_guard(mount_writer)
+			attach_mnt(real_mount(mnt), mp.parent, mp.mp);
+
+		pr_info("VFS: Finished setting up immutable rootfs\n");
+	} else {
+		VFS_WARN_ON_ONCE(real_mount(mnt)->mnt_id != 1);
+
+		/* The namespace root is the mutable rootfs. */
+		mnt_root = real_mount(mnt);
+		init_mnt_ns.root = mnt_root;
+	}
+
+	/*
+	 * We've dropped all locks here but that's fine. Not just are we
+	 * the only task that's running, there's no other mount
+	 * namespace in existence and the initial mount namespace is
+	 * completely empty until we add the mounts we just created.
+	 */
+	for (struct mount *p = mnt_root; p; p = next_mnt(p, mnt_root)) {
+		mnt_add_to_ns(&init_mnt_ns, p);
+		init_mnt_ns.nr_mounts++;
+	}
+
 	init_task.nsproxy->mnt_ns = &init_mnt_ns;
 	get_mnt_ns(&init_mnt_ns);
 
-	root.mnt = mnt;
-	root.dentry = mnt->mnt_root;
-
+	/* The root and pwd always point to the mutable rootfs. */
+	root.mnt = mnt;
+	root.dentry = mnt->mnt_root;
 	set_fs_pwd(current->fs, &root);
 	set_fs_root(current->fs, &root);
 
diff --git a/fs/rootfs.c b/fs/rootfs.c
new file mode 100644
index 000000000000..b82b73bb8bb2
--- /dev/null
+++ b/fs/rootfs.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
+#include <linux/fs/super_types.h>
+#include <linux/fs_context.h>
+#include <linux/magic.h>
+
+static const struct super_operations rootfs_super_operations = {
+	.statfs = simple_statfs,
+};
+
+static int rootfs_fs_fill_super(struct super_block *s, struct fs_context *fc)
+{
+	struct inode *inode;
+
+	s->s_maxbytes = MAX_LFS_FILESIZE;
+	s->s_blocksize = PAGE_SIZE;
+	s->s_blocksize_bits = PAGE_SHIFT;
+	s->s_magic = ROOT_FS_MAGIC;
+	s->s_op = &rootfs_super_operations;
+	s->s_export_op = NULL;
+	s->s_xattr = NULL;
+	s->s_time_gran = 1;
+	s->s_d_flags = 0;
+
+	inode = new_inode(s);
+	if (!inode)
+		return -ENOMEM;
+
+	/* The real rootfs is permanently empty... */
+	make_empty_dir_inode(inode);
+	simple_inode_init_ts(inode);
+	inode->i_ino = 1;
+	/* ... and immutable. */
+	inode->i_flags |= S_IMMUTABLE;
+
+	s->s_root = d_make_root(inode);
+	if (!s->s_root)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int rootfs_fs_get_tree(struct fs_context *fc)
+{
+	return get_tree_single(fc, rootfs_fs_fill_super);
+}
+
+static const struct fs_context_operations rootfs_fs_context_ops = {
+	.get_tree = rootfs_fs_get_tree,
+};
+
+static int rootfs_init_fs_context(struct fs_context *fc)
+{
+	fc->ops = &rootfs_fs_context_ops;
+	fc->global = true;
+	fc->sb_flags = SB_NOUSER;
+	fc->s_iflags = SB_I_NOEXEC | SB_I_NODEV;
+	return 0;
+}
+
+struct file_system_type immutable_rootfs_fs_type = {
+	.name = "rootfs",
+	.init_fs_context = rootfs_init_fs_context,
+	.kill_sb = kill_anon_super,
+};
diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 638ca21b7a90..1a3a5a5b785a 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -104,5 +104,6 @@
 #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
 #define PID_FS_MAGIC		0x50494446	/* "PIDF" */
 #define GUEST_MEMFD_MAGIC	0x474d454d	/* "GMEM" */
+#define ROOT_FS_MAGIC		0x524F4F54	/* "ROOT" */
 
 #endif /* __LINUX_MAGIC_H__ */
diff --git a/init/do_mounts.c b/init/do_mounts.c
index defbbf1d55f7..e245e5e4e954 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -492,8 +492,17 @@ void __init prepare_namespace(void)
 	mount_root(saved_root_name);
 out:
 	devtmpfs_mount();
-	init_mount(".", "/", NULL, MS_MOVE, NULL);
-	init_chroot(".");
+
+	if (immutable_rootfs) {
+		if (init_pivot_root(".", "."))
+			pr_err("VFS: Failed to pivot into new rootfs\n");
+		if (init_umount(".", MNT_DETACH))
+			pr_err("VFS: Failed to unmount old rootfs\n");
+		pr_info("VFS: Pivoted into new rootfs\n");
+	} else {
+		init_mount(".", "/", NULL, MS_MOVE, NULL);
+		init_chroot(".");
+	}
 }
 
 static bool is_tmpfs;
diff --git a/init/do_mounts.h b/init/do_mounts.h
index 6069ea3eb80d..d05870fcb662 100644
--- a/init/do_mounts.h
+++ b/init/do_mounts.h
@@ -15,6 +15,7 @@
 void mount_root_generic(char *name, char *pretty_name, int flags);
 void mount_root(char *root_device_name);
 extern int root_mountflags;
+extern bool immutable_rootfs;
 
 static inline __init int create_dev(char *name, dev_t dev)
 {
-- 
2.47.3

* Re: [PATCH 3/3] fs: add immutable rootfs
From: Al Viro @ 2026-01-04 7:27 UTC
To: Christian Brauner
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On Fri, Jan 02, 2026 at 03:36:24PM +0100, Christian Brauner wrote:

> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
> +#include <linux/fs/super_types.h>
> +#include <linux/fs_context.h>
> +#include <linux/magic.h>

[snip]

What does it give you compared to an empty ramfs? Or tmpfs, for that
matter...

Why bother with a separate fs type?

* Re: [PATCH 3/3] fs: add immutable rootfs
From: Al Viro @ 2026-01-04 7:41 UTC
To: Christian Brauner
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On Sun, Jan 04, 2026 at 07:27:43AM +0000, Al Viro wrote:
> On Fri, Jan 02, 2026 at 03:36:24PM +0100, Christian Brauner wrote:
>
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
> > +#include <linux/fs/super_types.h>
> > +#include <linux/fs_context.h>
> > +#include <linux/magic.h>
>
> [snip]
>
> What does it give you compared to an empty ramfs? Or tmpfs, for that
> matter...
>
> Why bother with a separate fs type?

Make that "empty ramfs" and as soon as you've got the mount have
	mnt->mnt_root->d_inode->i_flags |= S_IMMUTABLE;
done. No concurrent accesses at that point, no way to clear that
flag for ramfs inodes afterwards and ramfs is always built in...

What am I missing here?

* Re: [PATCH 3/3] fs: add immutable rootfs
From: Christian Brauner @ 2026-01-06 22:07 UTC
To: Al Viro
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On Sun, Jan 04, 2026 at 07:41:45AM +0000, Al Viro wrote:
> On Sun, Jan 04, 2026 at 07:27:43AM +0000, Al Viro wrote:
> > On Fri, Jan 02, 2026 at 03:36:24PM +0100, Christian Brauner wrote:
> >
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
> > > +#include <linux/fs/super_types.h>
> > > +#include <linux/fs_context.h>
> > > +#include <linux/magic.h>
> >
> > [snip]
> >
> > What does it give you compared to an empty ramfs? Or tmpfs, for that
> > matter...
> >
> > Why bother with a separate fs type?
>
> Make that "empty ramfs" and as soon as you've got the mount have
> 	mnt->mnt_root->d_inode->i_flags |= S_IMMUTABLE;
> done. No concurrent accesses at that point, no way to clear that
> flag for ramfs inodes afterwards and ramfs is always built in...
>
> What am I missing here?

Good point. Afaict, FS_IMMUTABLE_FL can be cleared by a sufficiently
privileged process breaking the promise that this is a permanently
immutable rootfs.

The guarantee is that nothing will ever exist in it and all it ever
does is to serve as a parent mount to the point where we can just hand
out an empty namespace to unprivileged namespaces without ever having
to worry that anything sensitive is exposed.

I also dislike that the real rootfs should be a tmpfs or ramfs in the
first place. It just shouldn't serve any purpose other than as a marker
that we reached a dead-end.

Userspace has terrible code to parse for "rootfs" and then stat whether
it's a ramfs or tmpfs to figure out whether it is the real rootfs. By
making it a really dead-simple filesystem and giving it its own magic
number userspace can just stat for it.
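
A rough idea of what "just stat for it" might look like from userspace, as a hedged C sketch. ROOT_FS_MAGIC is the value proposed in patch 3/3 and is defined locally here because released uapi headers do not carry it (the name could still change, given the naming discussion in this thread); path_has_fs_magic() and fs_magic_to_tag() are hypothetical helper names.

```c
#include <sys/vfs.h>

/* Value proposed in patch 3/3; assumed here, not in released headers. */
#ifndef ROOT_FS_MAGIC
#define ROOT_FS_MAGIC 0x524F4F54 /* "ROOT" */
#endif

/*
 * Hypothetical helper: returns 1 if path lives on a filesystem whose
 * statfs() type equals magic, 0 if it does not, -1 if statfs() fails.
 */
int path_has_fs_magic(const char *path, unsigned long magic)
{
	struct statfs st;

	if (statfs(path, &st) < 0)
		return -1;
	return (unsigned long)st.f_type == magic;
}

/* Decode a 32-bit magic into its four ASCII bytes, most significant first. */
void fs_magic_to_tag(unsigned long magic, char tag[5])
{
	for (int i = 0; i < 4; i++)
		tag[i] = (char)((magic >> (24 - 8 * i)) & 0xff);
	tag[4] = '\0';
}
```

A single comparison against the magic number would replace the string match on "rootfs" plus the ramfs/tmpfs check described above. The path to probe depends on how a caller reaches the bottom mount, which this sketch leaves open.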

* Re: [PATCH 3/3] fs: add immutable rootfs
From: Al Viro @ 2026-01-06 22:59 UTC
To: Christian Brauner
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On Tue, Jan 06, 2026 at 11:07:32PM +0100, Christian Brauner wrote:

> Afaict, FS_IMMUTABLE_FL can be cleared by a sufficiently privileged
> process breaking the promise that this is a permanently immutable
> rootfs.

Not on ramfs:

int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
		     struct file_kattr *fa)
{
	struct inode *inode = d_inode(dentry);
	struct file_kattr old_ma = {};
	int err;

	if (!inode->i_op->fileattr_set)
		return -ENOIOCTLCMD;

and that's it, privileges do not matter.

* Re: [PATCH 3/3] fs: add immutable rootfs
From: Christian Brauner @ 2026-01-07 10:53 UTC
To: Al Viro
Cc: linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On Tue, Jan 06, 2026 at 10:59:17PM +0000, Al Viro wrote:
> On Tue, Jan 06, 2026 at 11:07:32PM +0100, Christian Brauner wrote:
>
> > Afaict, FS_IMMUTABLE_FL can be cleared by a sufficiently privileged
> > process breaking the promise that this is a permanently immutable
> > rootfs.
>
> Not on ramfs:
>
> int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
> 		     struct file_kattr *fa)
> {
> 	struct inode *inode = d_inode(dentry);
> 	struct file_kattr old_ma = {};
> 	int err;
>
> 	if (!inode->i_op->fileattr_set)
> 		return -ENOIOCTLCMD;
>
> and that's it, privileges do not matter.

Ugh, that relies on a current implementation detail of ramfs. I'm not
super convinced that this is great. Like, if you can swallow it I think
having a "nullfs" or "immutablefs" type with a separate magic number
does provide some value and frankly is just a lot cleaner than abusing
ramfs.
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-02 14:36 ` [PATCH 3/3] fs: add immutable rootfs Christian Brauner
2026-01-04 7:27 ` Al Viro
@ 2026-01-07 2:28 ` Gao Xiang
2026-01-07 2:47 ` Al Viro
1 sibling, 1 reply; 16+ messages in thread
From: Gao Xiang @ 2026-01-07 2:28 UTC (permalink / raw)
To: Christian Brauner, linux-fsdevel
Cc: Alexander Viro, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On 2026/1/2 22:36, Christian Brauner wrote:
> Currently pivot_root() doesn't work on the real rootfs because it
> cannot be unmounted. Userspace has to do a recursive removal of the
> initramfs contents manually before continuing the boot.
>
> Really all we want from the real rootfs is to serve as the parent mount
> for anything that is actually useful such as the tmpfs or ramfs for
> initramfs unpacking or the rootfs itself. There's no need for the real
> rootfs to actually be anything meaningful or useful. Add an immutable
> rootfs that can be selected via the "immutable_rootfs" kernel command
> line option.
>
> The kernel will mount a tmpfs/ramfs on top of it, unpack the initramfs
> and fire up userspace which mounts the rootfs and can then just do:
>
>   chdir(rootfs);
>   pivot_root(".", ".");
>   umount2(".", MNT_DETACH);
>
> and be done with it. (Ofc, userspace can also choose to retain the
> initramfs contents by using something like pivot_root(".", "/initramfs")
> without unmounting it.)
>
> Technically this also means that the rootfs mount in unprivileged
> namespaces doesn't need to become MNT_LOCKED anymore as it's guaranteed
> that the immutable rootfs remains permanently empty so there cannot be
> anything revealed by unmounting the covering mount.
>
> In the future this will also allow us to create completely empty mount
> namespaces without risking to leak anything.
>
> systemd already handles this all correctly as it tries to pivot_root()
> first and falls back to MS_MOVE only when that fails.
>
> This goes back to various discussions in previous years and an LPC 2024
> presentation about this very topic.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
>  fs/Makefile                |  2 +-
>  fs/mount.h                 |  1 +
>  fs/namespace.c             | 78 ++++++++++++++++++++++++++++++++++++++++------
>  fs/rootfs.c                | 65 ++++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/magic.h |  1 +
>  init/do_mounts.c           | 13 ++++++--
>  init/do_mounts.h           |  1 +
>  7 files changed, 149 insertions(+), 12 deletions(-)
>
> diff --git a/fs/Makefile b/fs/Makefile
> index a04274a3c854..d31b56b7c4d5 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -16,7 +16,7 @@ obj-y := open.o read_write.o file_table.o super.o \
>  		stack.o fs_struct.o statfs.o fs_pin.o nsfs.o \
>  		fs_dirent.o fs_context.o fs_parser.o fsopen.o init.o \
>  		kernel_read_file.o mnt_idmapping.o remap_range.o pidfs.o \
> -		file_attr.o
> +		file_attr.o rootfs.o
>
>  obj-$(CONFIG_BUFFER_HEAD) += buffer.o mpage.o
>  obj-$(CONFIG_PROC_FS) += proc_namespace.o
> diff --git a/fs/mount.h b/fs/mount.h
> index 2d28ef2a3aed..c3e0d9dbfaa4 100644
> --- a/fs/mount.h
> +++ b/fs/mount.h
> @@ -5,6 +5,7 @@
>  #include <linux/ns_common.h>
>  #include <linux/fs_pin.h>
>
> +extern struct file_system_type immutable_rootfs_fs_type;
>  extern struct list_head notify_list;
>
>  struct mnt_namespace {
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 9261f56ccc81..30597f4610fd 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -75,6 +75,17 @@ static int __init initramfs_options_setup(char *str)
>
>  __setup("initramfs_options=", initramfs_options_setup);
>
> +bool immutable_rootfs = false;
> +
> +static int __init immutable_rootfs_setup(char *str)
> +{
> +	if (*str)
> +		return 0;
> +	immutable_rootfs = true;
> +	return 1;
> +}
> +__setup("immutable_rootfs", immutable_rootfs_setup);
> +
>  static u64 event;
>  static DEFINE_XARRAY_FLAGS(mnt_id_xa, XA_FLAGS_ALLOC);
>  static DEFINE_IDA(mnt_group_ida);
> @@ -5976,24 +5987,73 @@ struct mnt_namespace init_mnt_ns = {
>
>  static void __init init_mount_tree(void)
>  {
> -	struct vfsmount *mnt;
> -	struct mount *m;
> +	struct vfsmount *mnt, *immutable_mnt;
> +	struct mount *mnt_root;
>  	struct path root;
>
> +	/*
> +	 * When the immutable rootfs is used, we create two mounts:
> +	 *
> +	 * (1) immutable rootfs with mount id 1
> +	 * (2) mutable rootfs with mount id 2
> +	 *
> +	 * with (2) mounted on top of (1).
> +	 */
> +	if (immutable_rootfs) {
> +		immutable_mnt = vfs_kern_mount(&immutable_rootfs_fs_type, 0,
> +					       "rootfs", NULL);
> +		if (IS_ERR(immutable_mnt))
> +			panic("VFS: Failed to create immutable rootfs");
> +	}
> +
>  	mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", initramfs_options);
>  	if (IS_ERR(mnt))
>  		panic("Can't create rootfs");
>
> -	m = real_mount(mnt);
> -	init_mnt_ns.root = m;
> -	init_mnt_ns.nr_mounts = 1;
> -	mnt_add_to_ns(&init_mnt_ns, m);
> +	if (immutable_rootfs) {
> +		VFS_WARN_ON_ONCE(real_mount(immutable_mnt)->mnt_id != 1);
> +		VFS_WARN_ON_ONCE(real_mount(mnt)->mnt_id != 2);
> +
> +		/* The namespace root is the immutable rootfs. */
> +		mnt_root = real_mount(immutable_mnt);
> +		init_mnt_ns.root = mnt_root;
> +
> +		/* Mount mutable rootfs on top of the immutable rootfs. */
> +		root.mnt = immutable_mnt;
> +		root.dentry = immutable_mnt->mnt_root;
> +
> +		LOCK_MOUNT_EXACT(mp, &root);
> +		if (unlikely(IS_ERR(mp.parent)))
> +			panic("VFS: Failed to setup immutable rootfs");
> +		scoped_guard(mount_writer)
> +			attach_mnt(real_mount(mnt), mp.parent, mp.mp);
> +
> +		pr_info("VFS: Finished setting up immutable rootfs\n");
> +	} else {
> +		VFS_WARN_ON_ONCE(real_mount(mnt)->mnt_id != 1);
> +
> +		/* The namespace root is the mutable rootfs. */
> +		mnt_root = real_mount(mnt);
> +		init_mnt_ns.root = mnt_root;
> +	}
> +
> +	/*
> +	 * We've dropped all locks here but that's fine. Not just are we
> +	 * the only task that's running, there's no other mount
> +	 * namespace in existence and the initial mount namespace is
> +	 * completely empty until we add the mounts we just created.
> +	 */
> +	for (struct mount *p = mnt_root; p; p = next_mnt(p, mnt_root)) {
> +		mnt_add_to_ns(&init_mnt_ns, p);
> +		init_mnt_ns.nr_mounts++;
> +	}
> +
>  	init_task.nsproxy->mnt_ns = &init_mnt_ns;
>  	get_mnt_ns(&init_mnt_ns);
>
> -	root.mnt = mnt;
> -	root.dentry = mnt->mnt_root;
> -
> +	/* The root and pwd always point to the mutable rootfs. */
> +	root.mnt = mnt;
> +	root.dentry = mnt->mnt_root;
>  	set_fs_pwd(current->fs, &root);
>  	set_fs_root(current->fs, &root);
>
> diff --git a/fs/rootfs.c b/fs/rootfs.c
> new file mode 100644
> index 000000000000..b82b73bb8bb2
> --- /dev/null
> +++ b/fs/rootfs.c
> @@ -0,0 +1,65 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
> +#include <linux/fs/super_types.h>
> +#include <linux/fs_context.h>
> +#include <linux/magic.h>
> +
> +static const struct super_operations rootfs_super_operations = {
> +	.statfs = simple_statfs,
> +};
> +
> +static int rootfs_fs_fill_super(struct super_block *s, struct fs_context *fc)
> +{
> +	struct inode *inode;
> +
> +	s->s_maxbytes = MAX_LFS_FILESIZE;
> +	s->s_blocksize = PAGE_SIZE;
> +	s->s_blocksize_bits = PAGE_SHIFT;
> +	s->s_magic = ROOT_FS_MAGIC;

Just one random suggestion. Regardless of Al's comments,
if we really would like to expose a new visible type to
userspace, how about giving it a meaningful name like
emptyfs or nullfs (I know it could have other meanings
in other OSes) from its tree hierarchy to avoid the
ambiguous "rootfs" naming, especially if it may be
considered for mounting by users in future potential use
cases?

Thanks,
Gao Xiang
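For reference, the three-call userspace sequence quoted from the cover letter can be sketched in C roughly as follows. This is a minimal sketch, not code from the series: the helper name switch_to_new_root() is made up for illustration, and it assumes the new root is already mounted at the given path and that the caller has the privileges pivot_root() requires.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): switch to an already
 * mounted new root using the sequence from the cover letter.
 * Returns 0 on success or a negative errno value on failure.
 */
static int switch_to_new_root(const char *new_root)
{
	if (chdir(new_root) < 0)
		return -errno;

	/* glibc has no pivot_root() wrapper, so go through syscall(2). */
	if (syscall(SYS_pivot_root, ".", ".") < 0)
		return -errno;

	/*
	 * pivot_root(".", ".") stacks the old root on top of the new
	 * root at "/", so unmounting "." detaches the old root.
	 */
	if (umount2(".", MNT_DETACH) < 0)
		return -errno;

	return 0;
}
```

A caller that wants the fallback the cover letter attributes to systemd would try this first and only move the mount with MS_MOVE if it fails.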
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 2:28 ` Gao Xiang
@ 2026-01-07 2:47 ` Al Viro
2026-01-07 2:55 ` Gao Xiang
2026-01-07 10:52 ` Christian Brauner
0 siblings, 2 replies; 16+ messages in thread
From: Al Viro @ 2026-01-07 2:47 UTC (permalink / raw)
To: Gao Xiang
Cc: Christian Brauner, linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:

> Just one random suggestion. Regardless of Al's comments,
> if we really would like to expose a new visible type to
> userspace, how about giving it a meaningful name like
> emptyfs or nullfs (I know it could have other meanings
> in other OSes) from its tree hierarchy to avoid the
> ambiguous "rootfs" naming, especially if it may be
> considered for mounting by users in future potential use
> cases?

*boggle*

_what_ potential use cases? "This here directory is empty and
it'll stay empty and anyone trying to create stuff in it will
get an error; oh, and we want it to be a mount boundary, for
some reason"?

IDGI...
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 2:47 ` Al Viro
@ 2026-01-07 2:55 ` Gao Xiang
2026-01-07 10:52 ` Christian Brauner
1 sibling, 0 replies; 16+ messages in thread
From: Gao Xiang @ 2026-01-07 2:55 UTC (permalink / raw)
To: Al Viro
Cc: Christian Brauner, linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On 2026/1/7 10:47, Al Viro wrote:
> On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:
>
>> Just one random suggestion. Regardless of Al's comments,
>> if we really would like to expose a new visible type to
>> userspace, how about giving it a meaningful name like
>> emptyfs or nullfs (I know it could have other meanings
>> in other OSes) from its tree hierarchy to avoid the
>> ambiguous "rootfs" naming, especially if it may be
>> considered for mounting by users in future potential use
>> cases?
>
> *boggle*
>
> _what_ potential use cases? "This here directory is empty and
> it'll stay empty and anyone trying to create stuff in it will
> get an error; oh, and we want it to be a mount boundary, for
> some reason"?
>
> IDGI...

My concern is that the "rootfs" name is already (ab)used in
various ways. Kernel folks can tell what happens here, for
example by checking the kernel code, but once it is made
visible to users, I'm afraid userspace folks already attach
various concepts to the word "root" (it's absolutely not
"chroot" but for a mount namespace?).

Thanks,
Gao Xiang
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 2:47 ` Al Viro
2026-01-07 2:55 ` Gao Xiang
@ 2026-01-07 10:52 ` Christian Brauner
2026-01-07 16:33 ` Colin Walters
1 sibling, 1 reply; 16+ messages in thread
From: Christian Brauner @ 2026-01-07 10:52 UTC (permalink / raw)
To: Al Viro
Cc: Gao Xiang, linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On Wed, Jan 07, 2026 at 02:47:27AM +0000, Al Viro wrote:
> On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:
>
> > Just one random suggestion. Regardless of Al's comments,
> > if we really would like to expose a new visible type to
> > userspace, how about giving it a meaningful name like
> > emptyfs or nullfs (I know it could have other meanings
> > in other OSes) from its tree hierarchy to avoid the
> > ambiguous "rootfs" naming, especially if it may be
> > considered for mounting by users in future potential use
> > cases?
>
> *boggle*
>
> _what_ potential use cases? "This here directory is empty and
> it'll stay empty and anyone trying to create stuff in it will
> get an error; oh, and we want it to be a mount boundary, for
> some reason"?
>
> IDGI...

It's not a completely crazy idea. I thought about this as well. You
could e.g. use it to overmount and hide other directories - like procfs
overmounting or sysfs overmounting or hiding stuff in /etc where
currently tmpfs is used. But tmpfs is not ideal because you don't get
the reliable immutability guarantees.
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 10:52 ` Christian Brauner
@ 2026-01-07 16:33 ` Colin Walters
2026-01-08 11:02 ` Christian Brauner
0 siblings, 1 reply; 16+ messages in thread
From: Colin Walters @ 2026-01-07 16:33 UTC (permalink / raw)
To: Christian Brauner, Al Viro
Cc: Gao Xiang, linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On Wed, Jan 7, 2026, at 5:52 AM, Christian Brauner wrote:
> On Wed, Jan 07, 2026 at 02:47:27AM +0000, Al Viro wrote:
>> On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:
>>
>> > Just one random suggestion. Regardless of Al's comments,
>> > if we really would like to expose a new visible type to
>> > userspace, how about giving it a meaningful name like
>> > emptyfs or nullfs (I know it could have other meanings
>> > in other OSes) from its tree hierarchy to avoid the
>> > ambiguous "rootfs" naming, especially if it may be
>> > considered for mounting by users in future potential use
>> > cases?
>>
>> *boggle*
>>
>> _what_ potential use cases? "This here directory is empty and
>> it'll stay empty and anyone trying to create stuff in it will
>> get an error; oh, and we want it to be a mount boundary, for
>> some reason"?
>>
>> IDGI...
>
> It's not a completely crazy idea. I thought about this as well. You
> could e.g. use it to overmount and hide other directories - like procfs
> overmounting or sysfs overmounting or hiding stuff in /etc where
> currently tmpfs is used. But tmpfs is not ideal because you don't get
> the reliable immutability guarantees.

Yeah, there's e.g. `/usr/share/empty` that is intended for things like that as a canonical bind mount source.

I also like this idea (though bikeshed I'd call it "emptyfs") but if we generalize it beyond just the current case, it probably needs to support configuring things like permissions (some cases may want 0700, others 0755 etc.)
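To make the tmpfs-based hiding discussed above concrete, here is a minimal sketch of overmounting a directory with an empty, read-only tmpfs whose root directory mode is configurable (0700, 0755, and so on). Both helper names are hypothetical and nothing here comes from the patch series; it only uses mount(2) and the tmpfs "mode=" option. Note that read-only is just a mount setting that a privileged remount can change, which is presumably the kind of weak guarantee the thread has in mind.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/types.h>

/* Hypothetical helper: format a tmpfs "mode=" option, e.g. "mode=0700". */
static int format_mode_option(char *buf, size_t len, mode_t mode)
{
	int n = snprintf(buf, len, "mode=%04o", (unsigned int)(mode & 07777));

	/* Treat truncation as an error instead of mounting with a bad option. */
	return (n < 0 || (size_t)n >= len) ? -ENAMETOOLONG : 0;
}

/*
 * Hypothetical helper: hide `target` behind an empty read-only tmpfs.
 * Returns 0 on success or a negative errno value on failure.
 */
static int hide_with_tmpfs(const char *target, mode_t mode)
{
	char opts[32];
	int ret = format_mode_option(opts, sizeof(opts), mode);

	if (ret < 0)
		return ret;

	if (mount("tmpfs", target, "tmpfs",
		  MS_RDONLY | MS_NOSUID | MS_NODEV | MS_NOEXEC, opts) < 0)
		return -errno;

	return 0;
}
```

Calling hide_with_tmpfs() on a directory requires mount privileges in the current namespace; a dedicated empty filesystem as proposed in the thread would replace the tmpfs instance here.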
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-07 16:33 ` Colin Walters
@ 2026-01-08 11:02 ` Christian Brauner
2026-01-25 20:47 ` Askar Safin
0 siblings, 1 reply; 16+ messages in thread
From: Christian Brauner @ 2026-01-08 11:02 UTC (permalink / raw)
To: Colin Walters
Cc: Al Viro, Gao Xiang, linux-fsdevel, Jan Kara, Jeff Layton, Amir Goldstein,
Lennart Poettering, Zbigniew Jędrzejewski-Szmek, Josef Bacik

On Wed, Jan 07, 2026 at 11:33:29AM -0500, Colin Walters wrote:
>
>
> On Wed, Jan 7, 2026, at 5:52 AM, Christian Brauner wrote:
> > On Wed, Jan 07, 2026 at 02:47:27AM +0000, Al Viro wrote:
> >> On Wed, Jan 07, 2026 at 10:28:23AM +0800, Gao Xiang wrote:
> >>
> >> > Just one random suggestion. Regardless of Al's comments,
> >> > if we really would like to expose a new visible type to
> >> > userspace, how about giving it a meaningful name like
> >> > emptyfs or nullfs (I know it could have other meanings
> >> > in other OSes) from its tree hierarchy to avoid the
> >> > ambiguous "rootfs" naming, especially if it may be
> >> > considered for mounting by users in future potential use
> >> > cases?
> >>
> >> *boggle*
> >>
> >> _what_ potential use cases? "This here directory is empty and
> >> it'll stay empty and anyone trying to create stuff in it will
> >> get an error; oh, and we want it to be a mount boundary, for
> >> some reason"?
> >>
> >> IDGI...
> >
> > It's not a completely crazy idea. I thought about this as well. You
> > could e.g. use it to overmount and hide other directories - like procfs
> > overmounting or sysfs overmounting or hiding stuff in /etc where
> > currently tmpfs is used. But tmpfs is not ideal because you don't get
> > the reliable immutability guarantees.
>
> Yeah, there's e.g. `/usr/share/empty` that is intended for things like that as a canonical bind mount source.
>
> I also like this idea (though bikeshed I'd call it "emptyfs") but if we generalize it beyond just the current case, it probably needs to support configuring things like permissions (some cases may want 0700, others 0755 etc.)

We can start with the basics right now, where it's not mountable from
userspace, and then make it mountable from userspace later.
* Re: [PATCH 3/3] fs: add immutable rootfs
2026-01-08 11:02 ` Christian Brauner
@ 2026-01-25 20:47 ` Askar Safin
0 siblings, 0 replies; 16+ messages in thread
From: Askar Safin @ 2026-01-25 20:47 UTC (permalink / raw)
To: brauner
Cc: amir73il, hsiangkao, jack, jlayton, josef, lennart, linux-fsdevel,
viro, walters, zbyszek

Christian Brauner <brauner@kernel.org>:
> We can start with the basics right now where it's not mountable from
> userspace and then make it mountable from userspace later.

For your information: if we make it mountable by userspace, then the
magic number will not be a reliable indicator of whether this is the
actual root of the hierarchy. But this is okay, because we can do
listmount/statmount and check whether the mount id is equal to the
parent mount id.

--
Askar Safin
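The "mount id equals parent mount id" check can be illustrated without the newer syscalls. The sketch below is a stand-in, not the listmount()/statmount() approach the message names: it assumes the same comparison is applied to the first two fields of a /proc/<pid>/mountinfo line (mount ID, then parent mount ID), and both helper names are hypothetical. With statmount(), the equivalent would presumably compare the mnt_id and mnt_parent_id fields of the returned structure.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse the first two fields of a mountinfo line,
 * the mount ID and the parent mount ID. Returns true on success.
 */
static bool parse_mount_ids(const char *line, int *mnt_id, int *parent_id)
{
	return sscanf(line, "%d %d", mnt_id, parent_id) == 2;
}

/* The check from the message above: a hierarchy root is its own parent. */
static bool is_hierarchy_root(const char *line)
{
	int mnt_id, parent_id;

	return parse_mount_ids(line, &mnt_id, &parent_id) &&
	       mnt_id == parent_id;
}
```

Unlike a magic-number comparison, this keeps working if the same filesystem type later becomes mountable elsewhere in the tree, since only the mount at the top of the hierarchy has itself as parent.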
end of thread, other threads:[~2026-01-25 20:47 UTC | newest]

Thread overview: 16+ messages
2026-01-02 14:36 [PATCH 0/3] fs: add immutable rootfs and support pivot_root() in the initramfs Christian Brauner
2026-01-02 14:36 ` [PATCH 1/3] fs: ensure that internal tmpfs mount gets mount id zero Christian Brauner
2026-01-02 14:36 ` [PATCH 2/3] fs: add init_pivot_root() Christian Brauner
2026-01-02 14:36 ` [PATCH 3/3] fs: add immutable rootfs Christian Brauner
2026-01-04 7:27 ` Al Viro
2026-01-04 7:41 ` Al Viro
2026-01-06 22:07 ` Christian Brauner
2026-01-06 22:59 ` Al Viro
2026-01-07 10:53 ` Christian Brauner
2026-01-07 2:28 ` Gao Xiang
2026-01-07 2:47 ` Al Viro
2026-01-07 2:55 ` Gao Xiang
2026-01-07 10:52 ` Christian Brauner
2026-01-07 16:33 ` Colin Walters
2026-01-08 11:02 ` Christian Brauner
2026-01-25 20:47 ` Askar Safin