* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE [not found] <20251229-work-empty-namespace-v1-0-bfb24c7b061f@kernel.org> @ 2026-01-19 17:11 ` Askar Safin 2026-01-19 19:05 ` Andy Lutomirski [not found] ` <20251229-work-empty-namespace-v1-1-bfb24c7b061f@kernel.org> 1 sibling, 1 reply; 15+ messages in thread From: Askar Safin @ 2026-01-19 17:11 UTC (permalink / raw) To: brauner Cc: amir73il, cyphar, jack, jlayton, josef, linux-fsdevel, viro, Lennart Poettering, David Howells, Zhang Yunkai, cgel.zte, Menglong Dong, linux-kernel, initramfs, containers, linux-api, news, lwn, Jonathan Corbet, Rob Landley, emily, Christoph Hellwig Christian Brauner <brauner@kernel.org>: > Extend open_tree() with a new OPEN_TREE_NAMESPACE flag. Similar to > OPEN_TREE_CLONE only the indicated mount tree is copied. Instead of > returning a file descriptor referring to that mount tree > OPEN_TREE_NAMESPACE will cause open_tree() to return a file descriptor > to a new mount namespace. In that new mount namespace the copied mount > tree has been mounted on top of a copy of the real rootfs. I want to point at security benefits of this. [[ TL;DR: [1] and [2] are very big changes to how mount namespaces work. I like them, and I think they should get wider exposure. ]] If this patchset ([1]) and [2] both land (they are both in "next" now and likely will be submitted to mainline soon) and "nullfs_rootfs" is passed on command line, then mount namespace created by open_tree(OPEN_TREE_NAMESPACE) will usually contain exactly 2 mounts: nullfs and whatever was passed to open_tree(OPEN_TREE_NAMESPACE). This means that even if attacker somehow is able to unmount its root and get access to underlying mounts, then the only underlying thing they will get is nullfs. Also this means that other mounts are not only hidden in new namespace, they are fully absent. This prevents attacks discussed here: [3], [4]. 
Also this means that (assuming we have both [1] and [2] and "nullfs_rootfs" is passed), there is no anymore hidden writable mount shared by all containers, potentially available to attackers. This is concern raised in [5]: > You want rootfs to be a NULLFS instead of ramfs. You don't seem to want it to > actually _be_ a filesystem. Even with your "fix", containers could communicate > with each _other_ through it if it becomes accessible. If a container can get > access to an empty initramfs and write into it, it can ask/answer the question > "Are there any other containers on this machine running stux24" and then coordinate. Note: as well as I understand all actual security bugs are already fixed in kernel, runc and similar tools. But still [1] and [2] reduce chances of similar bugs in the future, and this is very good thing. Also: [1] and [2] are pretty big changes to how mount namespaces work, so I added more people and lists to CC. This mail is answer to [1]. [1] https://lore.kernel.org/all/20251229-work-empty-namespace-v1-0-bfb24c7b061f@kernel.org/ [2] https://lore.kernel.org/all/20260112-work-immutable-rootfs-v2-0-88dd1c34a204@kernel.org/ [3] https://lore.kernel.org/all/rxh6knvencwjajhgvdgzmrkwmyxwotu3itqyreun3h2pmaujhr@snhuqoq44kkf/ [4] https://github.com/opencontainers/runc/pull/1962 [5] https://lore.kernel.org/all/cec90924-e7ec-377c-fb02-e0f25ab9db73@landley.net/ -- Askar Safin ^ permalink raw reply [flat|nested] 15+ messages in thread
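For readers new to the interface, the call described in the quoted cover letter could be exercised from userspace roughly as follows. This is only a sketch under stated assumptions, not code from the patchset: the numeric value given to OPEN_TREE_NAMESPACE is an assumption (use the definition from <linux/mount.h> once it is available), both helper names are hypothetical, and entering the namespace with setns() assumes the returned descriptor behaves like other mount namespace file descriptors.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/syscall.h>
#include <unistd.h>

/* ASSUMPTION: flag value chosen for illustration; prefer the uapi header. */
#ifndef OPEN_TREE_NAMESPACE
#define OPEN_TREE_NAMESPACE (1 << 1)
#endif

/*
 * Hypothetical helper: ask the kernel for a new mount namespace that
 * contains a copy of the mount tree found at `path`.
 * Returns a file descriptor, or -1 with errno set (for example EINVAL
 * on kernels that do not know the flag).
 */
int open_tree_namespace(const char *path)
{
#ifdef SYS_open_tree
	/* O_CLOEXEC doubles as OPEN_TREE_CLOEXEC for this syscall. */
	return (int)syscall(SYS_open_tree, AT_FDCWD, path,
			    OPEN_TREE_NAMESPACE | O_CLOEXEC);
#else
	(void)path;
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Hypothetical helper: move the calling thread into the namespace.
 * ASSUMPTION: the descriptor is accepted by setns() with CLONE_NEWNS.
 */
int enter_mount_namespace(int nsfd)
{
	assert(nsfd >= 0); /* must come from open_tree_namespace() */
	return setns(nsfd, CLONE_NEWNS);
}
```

A caller would check the result of open_tree_namespace() for -1 before passing it to enter_mount_namespace(), and would typically need sufficient privileges in its user namespace for either call to succeed.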
* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE 2026-01-19 17:11 ` [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE Askar Safin @ 2026-01-19 19:05 ` Andy Lutomirski 2026-01-19 22:21 ` Jeff Layton 0 siblings, 1 reply; 15+ messages in thread From: Andy Lutomirski @ 2026-01-19 19:05 UTC (permalink / raw) To: Askar Safin Cc: brauner, amir73il, cyphar, jack, jlayton, josef, linux-fsdevel, viro, Lennart Poettering, David Howells, Zhang Yunkai, cgel.zte, Menglong Dong, linux-kernel, initramfs, containers, linux-api, news, lwn, Jonathan Corbet, Rob Landley, emily, Christoph Hellwig On Mon, Jan 19, 2026 at 10:56 AM Askar Safin <safinaskar@gmail.com> wrote: > > Christian Brauner <brauner@kernel.org>: > > Extend open_tree() with a new OPEN_TREE_NAMESPACE flag. Similar to > > OPEN_TREE_CLONE only the indicated mount tree is copied. Instead of > > returning a file descriptor referring to that mount tree > > OPEN_TREE_NAMESPACE will cause open_tree() to return a file descriptor > > to a new mount namespace. In that new mount namespace the copied mount > > tree has been mounted on top of a copy of the real rootfs. > > I want to point at security benefits of this. > > [[ TL;DR: [1] and [2] are very big changes to how mount namespaces work. > I like them, and I think they should get wider exposure. ]] > > If this patchset ([1]) and [2] both land (they are both in "next" now and > likely will be submitted to mainline soon) and "nullfs_rootfs" is passed on > command line, then mount namespace created by open_tree(OPEN_TREE_NAMESPACE) will > usually contain exactly 2 mounts: nullfs and whatever was passed to > open_tree(OPEN_TREE_NAMESPACE). > > This means that even if attacker somehow is able to unmount its root and > get access to underlying mounts, then the only underlying thing they will > get is nullfs. > > Also this means that other mounts are not only hidden in new namespace, they > are fully absent. This prevents attacks discussed here: [3], [4]. 
> > Also this means that (assuming we have both [1] and [2] and "nullfs_rootfs" > is passed), there is no anymore hidden writable mount shared by all containers, > potentially available to attackers. This is concern raised in [5]: > > > You want rootfs to be a NULLFS instead of ramfs. You don't seem to want it to > > actually _be_ a filesystem. Even with your "fix", containers could communicate > > with each _other_ through it if it becomes accessible. If a container can get > > access to an empty initramfs and write into it, it can ask/answer the question > > "Are there any other containers on this machine running stux24" and then coordinate. I think this new OPEN_TREE_NAMESPACE is nifty, but I don't think the path that gives it sensible behavior should be conditional like this. Either make it *always* mount on top of nullfs (regardless of boot options) or find some way to have it actually be the root. I assume the latter is challenging for some reason. --Andy ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE 2026-01-19 19:05 ` Andy Lutomirski @ 2026-01-19 22:21 ` Jeff Layton 2026-01-21 10:20 ` Christian Brauner ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Jeff Layton @ 2026-01-19 22:21 UTC (permalink / raw) To: Andy Lutomirski, Askar Safin Cc: brauner, amir73il, cyphar, jack, josef, linux-fsdevel, viro, Lennart Poettering, David Howells, Zhang Yunkai, cgel.zte, Menglong Dong, linux-kernel, initramfs, containers, linux-api, news, lwn, Jonathan Corbet, Rob Landley, emily, Christoph Hellwig On Mon, 2026-01-19 at 11:05 -0800, Andy Lutomirski wrote: > On Mon, Jan 19, 2026 at 10:56 AM Askar Safin <safinaskar@gmail.com> wrote: > > > > Christian Brauner <brauner@kernel.org>: > > > Extend open_tree() with a new OPEN_TREE_NAMESPACE flag. Similar to > > > OPEN_TREE_CLONE only the indicated mount tree is copied. Instead of > > > returning a file descriptor referring to that mount tree > > > OPEN_TREE_NAMESPACE will cause open_tree() to return a file descriptor > > > to a new mount namespace. In that new mount namespace the copied mount > > > tree has been mounted on top of a copy of the real rootfs. > > > > I want to point at security benefits of this. > > > > [[ TL;DR: [1] and [2] are very big changes to how mount namespaces work. > > I like them, and I think they should get wider exposure. ]] > > > > If this patchset ([1]) and [2] both land (they are both in "next" now and > > likely will be submitted to mainline soon) and "nullfs_rootfs" is passed on > > command line, then mount namespace created by open_tree(OPEN_TREE_NAMESPACE) will > > usually contain exactly 2 mounts: nullfs and whatever was passed to > > open_tree(OPEN_TREE_NAMESPACE). > > > > This means that even if attacker somehow is able to unmount its root and > > get access to underlying mounts, then the only underlying thing they will > > get is nullfs. 
> > > > Also this means that other mounts are not only hidden in new namespace, they > > are fully absent. This prevents attacks discussed here: [3], [4]. > > > > Also this means that (assuming we have both [1] and [2] and "nullfs_rootfs" > > is passed), there is no anymore hidden writable mount shared by all containers, > > potentially available to attackers. This is concern raised in [5]: > > > > > You want rootfs to be a NULLFS instead of ramfs. You don't seem to want it to > > > actually _be_ a filesystem. Even with your "fix", containers could communicate > > > with each _other_ through it if it becomes accessible. If a container can get > > > access to an empty initramfs and write into it, it can ask/answer the question > > > "Are there any other containers on this machine running stux24" and then coordinate. > > I think this new OPEN_TREE_NAMESPACE is nifty, but I don't think the > path that gives it sensible behavior should be conditional like this. > Either make it *always* mount on top of nullfs (regardless of boot > options) or find some way to have it actually be the root. I assume > the latter is challenging for some reason. > I think that's the plan. I suggested the same to Christian last week, and he was amenable to removing the option and just always doing a nullfs_rootfs mount. We think that older runtimes should still "just work" with this scheme. Out of an abundance of caution, we _might_ want a command-line option to make it go back to old way, in case we find some userland stuff that doesn't like this for some reason, but hopefully we won't even need that. -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE 2026-01-19 22:21 ` Jeff Layton @ 2026-01-21 10:20 ` Christian Brauner 2026-01-21 18:00 ` Andy Lutomirski 2026-01-21 19:56 ` Rob Landley 2 siblings, 0 replies; 15+ messages in thread From: Christian Brauner @ 2026-01-21 10:20 UTC (permalink / raw) To: Jeff Layton Cc: Andy Lutomirski, Askar Safin, amir73il, cyphar, jack, josef, linux-fsdevel, viro, Lennart Poettering, David Howells, Zhang Yunkai, cgel.zte, Menglong Dong, linux-kernel, initramfs, containers, linux-api, news, lwn, Jonathan Corbet, Rob Landley, emily, Christoph Hellwig On Mon, Jan 19, 2026 at 05:21:30PM -0500, Jeff Layton wrote: > On Mon, 2026-01-19 at 11:05 -0800, Andy Lutomirski wrote: > > On Mon, Jan 19, 2026 at 10:56 AM Askar Safin <safinaskar@gmail.com> wrote: > > > > > > Christian Brauner <brauner@kernel.org>: > > > > Extend open_tree() with a new OPEN_TREE_NAMESPACE flag. Similar to > > > > OPEN_TREE_CLONE only the indicated mount tree is copied. Instead of > > > > returning a file descriptor referring to that mount tree > > > > OPEN_TREE_NAMESPACE will cause open_tree() to return a file descriptor > > > > to a new mount namespace. In that new mount namespace the copied mount > > > > tree has been mounted on top of a copy of the real rootfs. > > > > > > I want to point at security benefits of this. > > > > > > [[ TL;DR: [1] and [2] are very big changes to how mount namespaces work. > > > I like them, and I think they should get wider exposure. ]] > > > > > > If this patchset ([1]) and [2] both land (they are both in "next" now and > > > likely will be submitted to mainline soon) and "nullfs_rootfs" is passed on > > > command line, then mount namespace created by open_tree(OPEN_TREE_NAMESPACE) will > > > usually contain exactly 2 mounts: nullfs and whatever was passed to > > > open_tree(OPEN_TREE_NAMESPACE). 
> > >
> > > This means that even if attacker somehow is able to unmount its root and
> > > get access to underlying mounts, then the only underlying thing they will
> > > get is nullfs.
> > >
> > > Also this means that other mounts are not only hidden in new namespace, they
> > > are fully absent. This prevents attacks discussed here: [3], [4].
> > >
> > > Also this means that (assuming we have both [1] and [2] and "nullfs_rootfs"
> > > is passed), there is no anymore hidden writable mount shared by all containers,
> > > potentially available to attackers. This is concern raised in [5]:
> > >
> > > > You want rootfs to be a NULLFS instead of ramfs. You don't seem to want it to
> > > > actually _be_ a filesystem. Even with your "fix", containers could communicate
> > > > with each _other_ through it if it becomes accessible. If a container can get
> > > > access to an empty initramfs and write into it, it can ask/answer the question
> > > > "Are there any other containers on this machine running stux24" and then coordinate.
> >
> > I think this new OPEN_TREE_NAMESPACE is nifty, but I don't think the
> > path that gives it sensible behavior should be conditional like this.
> > Either make it *always* mount on top of nullfs (regardless of boot
> > options) or find some way to have it actually be the root. I assume
> > the latter is challenging for some reason.
> >
>
> I think that's the plan. I suggested the same to Christian last week,
> and he was amenable to removing the option and just always doing a
> nullfs_rootfs mount.

Whether the underlying mount is nullfs or not is irrelevant. If it's not
nullfs but a regular tmpfs, it works just as well. If it has any locked
overmounts, the new rootfs will become locked as well; the same applies if
it will be owned by a new userns.
* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE 2026-01-19 22:21 ` Jeff Layton 2026-01-21 10:20 ` Christian Brauner @ 2026-01-21 18:00 ` Andy Lutomirski 2026-01-23 10:23 ` Christian Brauner 2026-01-21 19:56 ` Rob Landley 2 siblings, 1 reply; 15+ messages in thread From: Andy Lutomirski @ 2026-01-21 18:00 UTC (permalink / raw) To: Jeff Layton Cc: Askar Safin, brauner, amir73il, cyphar, jack, josef, linux-fsdevel, viro, Lennart Poettering, David Howells, Yunkai Zhang, cgel.zte, Menglong Dong, linux-kernel, initramfs, containers, linux-api, news, lwn, Jonathan Corbet, Rob Landley, emily, Christoph Hellwig > On Jan 19, 2026, at 2:21 PM, Jeff Layton <jlayton@kernel.org> wrote: > > On Mon, 2026-01-19 at 11:05 -0800, Andy Lutomirski wrote: >>> On Mon, Jan 19, 2026 at 10:56 AM Askar Safin <safinaskar@gmail.com> wrote: >>> >>> Christian Brauner <brauner@kernel.org>: >>>> Extend open_tree() with a new OPEN_TREE_NAMESPACE flag. Similar to >>>> OPEN_TREE_CLONE only the indicated mount tree is copied. Instead of >>>> returning a file descriptor referring to that mount tree >>>> OPEN_TREE_NAMESPACE will cause open_tree() to return a file descriptor >>>> to a new mount namespace. In that new mount namespace the copied mount >>>> tree has been mounted on top of a copy of the real rootfs. >>> >>> I want to point at security benefits of this. >>> >>> [[ TL;DR: [1] and [2] are very big changes to how mount namespaces work. >>> I like them, and I think they should get wider exposure. ]] >>> >>> If this patchset ([1]) and [2] both land (they are both in "next" now and >>> likely will be submitted to mainline soon) and "nullfs_rootfs" is passed on >>> command line, then mount namespace created by open_tree(OPEN_TREE_NAMESPACE) will >>> usually contain exactly 2 mounts: nullfs and whatever was passed to >>> open_tree(OPEN_TREE_NAMESPACE). 
>>> >>> This means that even if attacker somehow is able to unmount its root and >>> get access to underlying mounts, then the only underlying thing they will >>> get is nullfs. >>> >>> Also this means that other mounts are not only hidden in new namespace, they >>> are fully absent. This prevents attacks discussed here: [3], [4]. >>> >>> Also this means that (assuming we have both [1] and [2] and "nullfs_rootfs" >>> is passed), there is no anymore hidden writable mount shared by all containers, >>> potentially available to attackers. This is concern raised in [5]: >>> >>>> You want rootfs to be a NULLFS instead of ramfs. You don't seem to want it to >>>> actually _be_ a filesystem. Even with your "fix", containers could communicate >>>> with each _other_ through it if it becomes accessible. If a container can get >>>> access to an empty initramfs and write into it, it can ask/answer the question >>>> "Are there any other containers on this machine running stux24" and then coordinate. >> >> I think this new OPEN_TREE_NAMESPACE is nifty, but I don't think the >> path that gives it sensible behavior should be conditional like this. >> Either make it *always* mount on top of nullfs (regardless of boot >> options) or find some way to have it actually be the root. I assume >> the latter is challenging for some reason. >> > > I think that's the plan. I suggested the same to Christian last week, > and he was amenable to removing the option and just always doing a > nullfs_rootfs mount. > > We think that older runtimes should still "just work" with this scheme. > Out of an abundance of caution, we _might_ want a command-line option > to make it go back to old way, in case we find some userland stuff that > doesn't like this for some reason, but hopefully we won't even need > that. What I mean is: even if for some reason the kernel is running in a mode where the *initial* rootfs is a real fs, I think it would be nice for OPEN_TREE_NAMESPACE to use nullfs. 
* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE 2026-01-21 18:00 ` Andy Lutomirski @ 2026-01-23 10:23 ` Christian Brauner 2026-01-24 10:13 ` Askar Safin 0 siblings, 1 reply; 15+ messages in thread From: Christian Brauner @ 2026-01-23 10:23 UTC (permalink / raw) To: Andy Lutomirski Cc: Jeff Layton, Askar Safin, amir73il, cyphar, jack, josef, linux-fsdevel, viro, Lennart Poettering, David Howells, Yunkai Zhang, cgel.zte, Menglong Dong, linux-kernel, initramfs, containers, linux-api, news, lwn, Jonathan Corbet, Rob Landley, emily, Christoph Hellwig On Wed, Jan 21, 2026 at 10:00:19AM -0800, Andy Lutomirski wrote: > > On Jan 19, 2026, at 2:21 PM, Jeff Layton <jlayton@kernel.org> wrote: > > > > On Mon, 2026-01-19 at 11:05 -0800, Andy Lutomirski wrote: > >>> On Mon, Jan 19, 2026 at 10:56 AM Askar Safin <safinaskar@gmail.com> wrote: > >>> > >>> Christian Brauner <brauner@kernel.org>: > >>>> Extend open_tree() with a new OPEN_TREE_NAMESPACE flag. Similar to > >>>> OPEN_TREE_CLONE only the indicated mount tree is copied. Instead of > >>>> returning a file descriptor referring to that mount tree > >>>> OPEN_TREE_NAMESPACE will cause open_tree() to return a file descriptor > >>>> to a new mount namespace. In that new mount namespace the copied mount > >>>> tree has been mounted on top of a copy of the real rootfs. > >>> > >>> I want to point at security benefits of this. > >>> > >>> [[ TL;DR: [1] and [2] are very big changes to how mount namespaces work. > >>> I like them, and I think they should get wider exposure. ]] > >>> > >>> If this patchset ([1]) and [2] both land (they are both in "next" now and > >>> likely will be submitted to mainline soon) and "nullfs_rootfs" is passed on > >>> command line, then mount namespace created by open_tree(OPEN_TREE_NAMESPACE) will > >>> usually contain exactly 2 mounts: nullfs and whatever was passed to > >>> open_tree(OPEN_TREE_NAMESPACE). 
> >>> > >>> This means that even if attacker somehow is able to unmount its root and > >>> get access to underlying mounts, then the only underlying thing they will > >>> get is nullfs. > >>> > >>> Also this means that other mounts are not only hidden in new namespace, they > >>> are fully absent. This prevents attacks discussed here: [3], [4]. > >>> > >>> Also this means that (assuming we have both [1] and [2] and "nullfs_rootfs" > >>> is passed), there is no anymore hidden writable mount shared by all containers, > >>> potentially available to attackers. This is concern raised in [5]: > >>> > >>>> You want rootfs to be a NULLFS instead of ramfs. You don't seem to want it to > >>>> actually _be_ a filesystem. Even with your "fix", containers could communicate > >>>> with each _other_ through it if it becomes accessible. If a container can get > >>>> access to an empty initramfs and write into it, it can ask/answer the question > >>>> "Are there any other containers on this machine running stux24" and then coordinate. > >> > >> I think this new OPEN_TREE_NAMESPACE is nifty, but I don't think the > >> path that gives it sensible behavior should be conditional like this. > >> Either make it *always* mount on top of nullfs (regardless of boot > >> options) or find some way to have it actually be the root. I assume > >> the latter is challenging for some reason. > >> > > > > I think that's the plan. I suggested the same to Christian last week, > > and he was amenable to removing the option and just always doing a > > nullfs_rootfs mount. > > > > We think that older runtimes should still "just work" with this scheme. > > Out of an abundance of caution, we _might_ want a command-line option > > to make it go back to old way, in case we find some userland stuff that > > doesn't like this for some reason, but hopefully we won't even need > > that. 
> >
> What I mean is: even if for some reason the kernel is running in a
> mode where the *initial* rootfs is a real fs, I think it would be nice
> for OPEN_TREE_NAMESPACE to use nullfs.

The current patchset makes nullfs unconditional. As each mount namespace
creates a new copy of the namespace root of the namespace it was created
from, all mount namespaces have nullfs as their namespace root. So every
OPEN_TREE_NAMESPACE/FSMOUNT_NAMESPACE tree will be mounted on top of nullfs,
as we always take the namespace root.

If we have to make nullfs conditional then yes, we could still do that,
although it would be ugly in various ways. I would love to keep nullfs
unconditional because it means I can wipe a whole class of MNT_LOCKED
nonsense from the face of the earth afterwards.
* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE 2026-01-23 10:23 ` Christian Brauner @ 2026-01-24 10:13 ` Askar Safin 0 siblings, 0 replies; 15+ messages in thread From: Askar Safin @ 2026-01-24 10:13 UTC (permalink / raw) To: Christian Brauner Cc: Andy Lutomirski, Jeff Layton, amir73il, cyphar, jack, josef, linux-fsdevel, viro, Lennart Poettering, David Howells, Yunkai Zhang, cgel.zte, Menglong Dong, linux-kernel, initramfs, containers, linux-api, news, lwn, Jonathan Corbet, Rob Landley, Christoph Hellwig On Fri, Jan 23, 2026 at 1:23 PM Christian Brauner <brauner@kernel.org> wrote: > The current patchset makes nullfs unconditional. As each mount Oops, I missed that "fs: use nullfs unconditionally as the real rootfs" is present in vfs.all. -- Askar Safin ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE 2026-01-19 22:21 ` Jeff Layton 2026-01-21 10:20 ` Christian Brauner 2026-01-21 18:00 ` Andy Lutomirski @ 2026-01-21 19:56 ` Rob Landley 2026-02-19 23:42 ` Askar Safin 2 siblings, 1 reply; 15+ messages in thread From: Rob Landley @ 2026-01-21 19:56 UTC (permalink / raw) To: Jeff Layton, Andy Lutomirski, Askar Safin Cc: amir73il, cyphar, jack, josef, linux-fsdevel, viro, Lennart Poettering, David Howells, Zhang Yunkai, cgel.zte, Menglong Dong, linux-kernel, initramfs, containers, linux-api, news, lwn, Jonathan Corbet, emily, Christoph Hellwig >>>> You want rootfs to be a NULLFS instead of ramfs. You don't seem to want it to >>>> actually _be_ a filesystem. Even with your "fix", containers could communicate >>>> with each _other_ through it if it becomes accessible. If a container can get >>>> access to an empty initramfs and write into it, it can ask/answer the question >>>> "Are there any other containers on this machine running stux24" and then coordinate. Or you could just make the ROOT= codepath remount the empty initramfs -o ro like some switch_root implementations do. If the PID 1 you launch isn't in initramfs, don't leave initramfs writeable. That seems unlikely to break userspace. (Having permissions to remount initramfs but _not_ having already "cracked root" seems... a bit funky? You have waaaaay more faith in security modules than I do...) >> I think this new OPEN_TREE_NAMESPACE is nifty, but I don't think the >> path that gives it sensible behavior should be conditional like this. >> Either make it *always* mount on top of nullfs (regardless of boot >> options) or find some way to have it actually be the root. I assume >> the latter is challenging for some reason. > > I think that's the plan. I suggested the same to Christian last week, > and he was amenable to removing the option and just always doing a > nullfs_rootfs mount. Since 2013, initramfs might be ramfs or tmpfs depending on circumstances. 
Adding a third option for it to be nullfs when there's no cpio.gz extracted
into it seems reasonable. (You can always mount a tmpfs _over_ it if you need
that later; it's writeable, so a PID 1 launched in it has workspace.)

That said, if you are changing the semantics, right now we switch_root from
initramfs instead of pivot_root because initramfs couldn't be unmounted. With
this change would pivot_root become the mechanism for initramfs too? (If the
cpio.gz recipient wasn't actually rootfs but was an overmount the way ROOT=
does it.)

Aside: it would be nice if inaccessible mount points could automatically be
garbage collected. There's already some "lazy umount" plumbing that does that
when explicitly requested to, but last I checked there were cases that didn't
get caught. It's been a while though, might already have been fixed.
Presumably initramfs would always get pinned because it's PID 0's / reference...

Also, could you guys make CONFIG_DEVTMPFS_MOUNT work with initramfs? I've
posted patches for that on and off since 2017, most recent one's probably
https://landley.net/bin/mkroot/0.8.13/linux-patches/0003-Wire-up-CONFIG_DEVTMPFS_MOUNT-to-initramfs.patch
(tested on a 6.17 kernel).

> We think that older runtimes should still "just work" with this scheme.
> Out of an abundance of caution, we _might_ want a command-line option
> to make it go back to old way, in case we find some userland stuff that
> doesn't like this for some reason, but hopefully we won't even need
> that.

I assume it will break stuff, but I also assume the systems it breaks will
never upgrade to a 7.x kernel because the kernel itself would consume all
available memory before launching PID 1.

Rob
* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE 2026-01-21 19:56 ` Rob Landley @ 2026-02-19 23:42 ` Askar Safin 0 siblings, 0 replies; 15+ messages in thread From: Askar Safin @ 2026-02-19 23:42 UTC (permalink / raw) To: rob; +Cc: containers, initramfs, linux-api, linux-fsdevel, linux-kernel Rob Landley <rob@landley.net>: > Also, could you guys make CONFIG_DEVTMPFS_MOUNT work with initramfs? I did something similar: https://lore.kernel.org/initramfs/20260219210312.3468980-1-safinaskar@gmail.com/T/#u Does this solve your problem? -- Askar Safin ^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <20251229-work-empty-namespace-v1-1-bfb24c7b061f@kernel.org>]
* Re: [PATCH 1/2] mount: add OPEN_TREE_NAMESPACE [not found] ` <20251229-work-empty-namespace-v1-1-bfb24c7b061f@kernel.org> @ 2026-02-24 11:23 ` Florian Weimer 2026-02-24 12:05 ` Christian Brauner 0 siblings, 1 reply; 15+ messages in thread From: Florian Weimer @ 2026-02-24 11:23 UTC (permalink / raw) To: Christian Brauner Cc: linux-fsdevel, Jeff Layton, Alexander Viro, Amir Goldstein, Josef Bacik, Jan Kara, Aleksa Sarai, linux-api, rudi * Christian Brauner: > diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h > index 5d3f8c9e3a62..acbc22241c9c 100644 > --- a/include/uapi/linux/mount.h > +++ b/include/uapi/linux/mount.h > @@ -61,7 +61,8 @@ > /* > * open_tree() flags. > */ > -#define OPEN_TREE_CLONE 1 /* Clone the target tree and attach the clone */ > +#define OPEN_TREE_CLONE (1 << 0) /* Clone the target tree and attach the clone */ This change causes pointless -Werror=undef errors in projects that have settled on the old definition. Reported here: Bug 33921 - Building with Linux-7.0-rc1 errors on OPEN_TREE_CLONE <https://sourceware.org/bugzilla/show_bug.cgi?id=33921> Thanks, Florian ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] mount: add OPEN_TREE_NAMESPACE 2026-02-24 11:23 ` [PATCH 1/2] " Florian Weimer @ 2026-02-24 12:05 ` Christian Brauner 2026-02-24 13:30 ` Florian Weimer 0 siblings, 1 reply; 15+ messages in thread From: Christian Brauner @ 2026-02-24 12:05 UTC (permalink / raw) To: Florian Weimer Cc: linux-fsdevel, Jeff Layton, Alexander Viro, Amir Goldstein, Josef Bacik, Jan Kara, Aleksa Sarai, linux-api, rudi On Tue, Feb 24, 2026 at 12:23:33PM +0100, Florian Weimer wrote: > * Christian Brauner: > > > diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h > > index 5d3f8c9e3a62..acbc22241c9c 100644 > > --- a/include/uapi/linux/mount.h > > +++ b/include/uapi/linux/mount.h > > @@ -61,7 +61,8 @@ > > /* > > * open_tree() flags. > > */ > > -#define OPEN_TREE_CLONE 1 /* Clone the target tree and attach the clone */ > > +#define OPEN_TREE_CLONE (1 << 0) /* Clone the target tree and attach the clone */ > > This change causes pointless -Werror=undef errors in projects that have > settled on the old definition. > > Reported here: > > Bug 33921 - Building with Linux-7.0-rc1 errors on OPEN_TREE_CLONE > <https://sourceware.org/bugzilla/show_bug.cgi?id=33921> Send a patch to change it back, please. Otherwise it might take a few days until I get around to it. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] mount: add OPEN_TREE_NAMESPACE 2026-02-24 12:05 ` Christian Brauner @ 2026-02-24 13:30 ` Florian Weimer 2026-02-24 14:33 ` Christian Brauner 0 siblings, 1 reply; 15+ messages in thread From: Florian Weimer @ 2026-02-24 13:30 UTC (permalink / raw) To: Christian Brauner Cc: linux-fsdevel, Jeff Layton, Alexander Viro, Amir Goldstein, Josef Bacik, Jan Kara, Aleksa Sarai, linux-api, rudi * Christian Brauner: > On Tue, Feb 24, 2026 at 12:23:33PM +0100, Florian Weimer wrote: >> * Christian Brauner: >> >> > diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h >> > index 5d3f8c9e3a62..acbc22241c9c 100644 >> > --- a/include/uapi/linux/mount.h >> > +++ b/include/uapi/linux/mount.h >> > @@ -61,7 +61,8 @@ >> > /* >> > * open_tree() flags. >> > */ >> > -#define OPEN_TREE_CLONE 1 /* Clone the target tree and attach the clone */ >> > +#define OPEN_TREE_CLONE (1 << 0) /* Clone the target tree and attach the clone */ >> >> This change causes pointless -Werror=undef errors in projects that have >> settled on the old definition. >> >> Reported here: >> >> Bug 33921 - Building with Linux-7.0-rc1 errors on OPEN_TREE_CLONE >> <https://sourceware.org/bugzilla/show_bug.cgi?id=33921> > > Send a patch to change it back, please. > Otherwise it might take a few days until I get around to it. Rudi, could you post a patch? Thanks, Florian ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] mount: add OPEN_TREE_NAMESPACE 2026-02-24 13:30 ` Florian Weimer @ 2026-02-24 14:33 ` Christian Brauner 2026-02-26 11:54 ` Jan Kara 2026-03-02 10:15 ` Florian Weimer 0 siblings, 2 replies; 15+ messages in thread From: Christian Brauner @ 2026-02-24 14:33 UTC (permalink / raw) To: Florian Weimer Cc: linux-fsdevel, Jeff Layton, Alexander Viro, Amir Goldstein, Josef Bacik, Jan Kara, Aleksa Sarai, linux-api, rudi On Tue, Feb 24, 2026 at 02:30:37PM +0100, Florian Weimer wrote: > * Christian Brauner: > > > On Tue, Feb 24, 2026 at 12:23:33PM +0100, Florian Weimer wrote: > >> * Christian Brauner: > >> > >> > diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h > >> > index 5d3f8c9e3a62..acbc22241c9c 100644 > >> > --- a/include/uapi/linux/mount.h > >> > +++ b/include/uapi/linux/mount.h > >> > @@ -61,7 +61,8 @@ > >> > /* > >> > * open_tree() flags. > >> > */ > >> > -#define OPEN_TREE_CLONE 1 /* Clone the target tree and attach the clone */ > >> > +#define OPEN_TREE_CLONE (1 << 0) /* Clone the target tree and attach the clone */ > >> > >> This change causes pointless -Werror=undef errors in projects that have > >> settled on the old definition. > >> > >> Reported here: > >> > >> Bug 33921 - Building with Linux-7.0-rc1 errors on OPEN_TREE_CLONE > >> <https://sourceware.org/bugzilla/show_bug.cgi?id=33921> > > > > Send a patch to change it back, please. > > Otherwise it might take a few days until I get around to it. > > Rudi, could you post a patch? I'm a bit confused though and not super happy that you're basically asking us to be so constrained that we aren't even allowed to change 1 to 1 - just syntactically different. ^ permalink raw reply [flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mount: add OPEN_TREE_NAMESPACE
From: Jan Kara @ 2026-02-26 11:54 UTC
To: Christian Brauner
Cc: Florian Weimer, linux-fsdevel, Jeff Layton, Alexander Viro, Amir Goldstein, Josef Bacik, Jan Kara, Aleksa Sarai, linux-api, rudi

On Tue 24-02-26 15:33:13, Christian Brauner wrote:
> On Tue, Feb 24, 2026 at 02:30:37PM +0100, Florian Weimer wrote:
> > * Christian Brauner:
> >
> > > On Tue, Feb 24, 2026 at 12:23:33PM +0100, Florian Weimer wrote:
> > >> * Christian Brauner:
> > >>
> > >> > diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
> > >> > index 5d3f8c9e3a62..acbc22241c9c 100644
> > >> > --- a/include/uapi/linux/mount.h
> > >> > +++ b/include/uapi/linux/mount.h
> > >> > @@ -61,7 +61,8 @@
> > >> >  /*
> > >> >   * open_tree() flags.
> > >> >   */
> > >> > -#define OPEN_TREE_CLONE 1 /* Clone the target tree and attach the clone */
> > >> > +#define OPEN_TREE_CLONE (1 << 0) /* Clone the target tree and attach the clone */
> > >>
> > >> This change causes pointless -Werror=undef errors in projects that have
> > >> settled on the old definition.
> > >>
> > >> Reported here:
> > >>
> > >> Bug 33921 - Building with Linux-7.0-rc1 errors on OPEN_TREE_CLONE
> > >> <https://sourceware.org/bugzilla/show_bug.cgi?id=33921>
> > >
> > > Send a patch to change it back, please.
> > > Otherwise it might take a few days until I get around to it.
> >
> > Rudi, could you post a patch?
>
> I'm a bit confused though and not super happy that you're basically
> asking us to be so constrained that we aren't even allowed to change 1
> to 1 - just syntactically different.

Agreed, this looks more like a tooling bug than anything else...

Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH 1/2] mount: add OPEN_TREE_NAMESPACE
From: Florian Weimer @ 2026-03-02 10:15 UTC
To: Christian Brauner
Cc: linux-fsdevel, Jeff Layton, Alexander Viro, Amir Goldstein, Josef Bacik, Jan Kara, Aleksa Sarai, linux-api, rudi

* Christian Brauner:

> On Tue, Feb 24, 2026 at 02:30:37PM +0100, Florian Weimer wrote:
>> * Christian Brauner:
>>
>> > On Tue, Feb 24, 2026 at 12:23:33PM +0100, Florian Weimer wrote:
>> >> * Christian Brauner:
>> >>
>> >> > diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
>> >> > index 5d3f8c9e3a62..acbc22241c9c 100644
>> >> > --- a/include/uapi/linux/mount.h
>> >> > +++ b/include/uapi/linux/mount.h
>> >> > @@ -61,7 +61,8 @@
>> >> >  /*
>> >> >   * open_tree() flags.
>> >> >   */
>> >> > -#define OPEN_TREE_CLONE 1 /* Clone the target tree and attach the clone */
>> >> > +#define OPEN_TREE_CLONE (1 << 0) /* Clone the target tree and attach the clone */
>> >>
>> >> This change causes pointless -Werror=undef errors in projects that have
>> >> settled on the old definition.
>> >>
>> >> Reported here:
>> >>
>> >> Bug 33921 - Building with Linux-7.0-rc1 errors on OPEN_TREE_CLONE
>> >> <https://sourceware.org/bugzilla/show_bug.cgi?id=33921>
>> >
>> > Send a patch to change it back, please.
>> > Otherwise it might take a few days until I get around to it.
>>
>> Rudi, could you post a patch?
>
> I'm a bit confused though and not super happy that you're basically
> asking us to be so constrained that we aren't even allowed to change 1
> to 1 - just syntactically different.

I'm not happy about it, either.  But it has happened before, for the
RENAME_* constants I believe.

We are already including <linux/mount.h> from <sys/mount.h>, so we can
work around this reliably on the glibc side, regardless of header
inclusion order.

Thanks,
Florian
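
Editorial sketch of the conflict discussed above: in C, a macro may only be redefined with an identical token sequence, so `1` and `(1 << 0)` clash even though they have the same value. The snippet below imitates both headers in one file and shows one possible way a userspace header could keep the old spelling. It is an illustration under assumptions, not the actual glibc change: the header contents are simplified stand-ins, and `open_tree_clone_value()` is a hypothetical helper that exists only so the value can be checked.

```c
#include <assert.h>

/* Stand-in for <linux/mount.h> after the patch (simplified; assumption). */
#define OPEN_TREE_CLONE (1 << 0)

/*
 * Stand-in for a userspace header that settled on the old spelling.
 * Writing "#define OPEN_TREE_CLONE 1" directly here would redefine the
 * macro with a different token sequence, which the compiler diagnoses.
 * Dropping the earlier definition first avoids that.
 */
#ifdef OPEN_TREE_CLONE
# undef OPEN_TREE_CLONE
#endif
#define OPEN_TREE_CLONE 1

/* Both spellings denote the same flag value. */
static_assert(OPEN_TREE_CLONE == (1 << 0), "flag value must stay 1");

/* Hypothetical helper so the value can be checked at run time. */
int open_tree_clone_value(void)
{
	return OPEN_TREE_CLONE;
}
```

Because the userspace definition comes after the kernel one in this sketch, it matches the situation described in the message above, where <sys/mount.h> includes <linux/mount.h> itself and so controls the order.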
Thread overview: 15+ messages
[not found] <20251229-work-empty-namespace-v1-0-bfb24c7b061f@kernel.org>
2026-01-19 17:11 ` [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE Askar Safin
2026-01-19 19:05 ` Andy Lutomirski
2026-01-19 22:21 ` Jeff Layton
2026-01-21 10:20 ` Christian Brauner
2026-01-21 18:00 ` Andy Lutomirski
2026-01-23 10:23 ` Christian Brauner
2026-01-24 10:13 ` Askar Safin
2026-01-21 19:56 ` Rob Landley
2026-02-19 23:42 ` Askar Safin
[not found] ` <20251229-work-empty-namespace-v1-1-bfb24c7b061f@kernel.org>
2026-02-24 11:23 ` [PATCH 1/2] " Florian Weimer
2026-02-24 12:05 ` Christian Brauner
2026-02-24 13:30 ` Florian Weimer
2026-02-24 14:33 ` Christian Brauner
2026-02-26 11:54 ` Jan Kara
2026-03-02 10:15 ` Florian Weimer