* Apparent mount behaviour change in 6.15 @ 2025-05-15 4:06 Allison Karlitskaya 2025-05-15 11:25 ` Christian Brauner 0 siblings, 1 reply; 11+ messages in thread From: Allison Karlitskaya @ 2025-05-15 4:06 UTC (permalink / raw) To: Christian Brauner; +Cc: linux-fsdevel hi, The CI in the composefs-rs project picked up an interesting issue with the recent mount API changes in the latest 6.15-rc kernels. I've managed to produce a minimal reproducer. ==> test.sh <== #!/bin/sh set -eux uname -a umount --recursive tmp/mnt || true rm -rf tmp/mnt tmp/new mkdir -p tmp/mnt tmp/new tmp/new/old touch tmp/mnt/this-is-old touch tmp/new/this-is-new ./swapmnt tmp/new tmp/mnt find tmp/mnt ==> swapmnt.c <== // Replace [old] with a clone of [new], moving [old] to [new]/"old" #define _GNU_SOURCE #include <err.h> #include <fcntl.h> #include <linux/mount.h> #include <stdio.h> #include <stdlib.h> #include <sys/stat.h> #include <sys/syscall.h> #include <unistd.h> int main (int argc, char **argv) { if (argc != 3) { fprintf(stderr, "usage: %s [new] [old]", argv[0]); return 1; } const char *new = argv[1]; const char *old = argv[2]; int oldfd = syscall(SYS_open_tree, AT_FDCWD, old, OPEN_TREE_CLONE|OPEN_TREE_CLOEXEC); if (oldfd == -1) err(EXIT_FAILURE, "open_tree('%s', OPEN_TREE_CLONE)", old); int newfd = syscall(SYS_open_tree, AT_FDCWD, new, OPEN_TREE_CLONE|OPEN_TREE_CLOEXEC); if (newfd == -1) err(EXIT_FAILURE, "open_tree('%s', OPEN_TREE_CLONE)", new); if (syscall(SYS_move_mount, newfd, "", AT_FDCWD, old, MOVE_MOUNT_F_EMPTY_PATH)) err(EXIT_FAILURE, "move_mount('%s' -> '%s')", new, old); if (syscall(SYS_move_mount, oldfd, "", newfd, "old", MOVE_MOUNT_F_EMPTY_PATH)) err(EXIT_FAILURE, "move_mount('%s' -> (new)'%s'/old)", old, old); return 0; } On 6.14 (Fedora 42) this looks like: [root@fedora-bls-efi-127-0-0-2-2201 tmp]# sh test.sh + uname -a Linux fedora-bls-efi-127-0-0-2-2201 6.14.5-300.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Fri May 2 14:16:46 UTC 2025 x86_64 GNU/Linux + umount --recursive tmp/mnt umount: tmp/mnt: not found + true + rm -rf tmp/mnt tmp/new + mkdir -p tmp/mnt tmp/new tmp/new/old + touch tmp/mnt/this-is-old + touch tmp/new/this-is-new + ./swapmnt tmp/new tmp/mnt + find tmp/mnt tmp/mnt tmp/mnt/this-is-new tmp/mnt/old tmp/mnt/old/this-is-old [root@fedora-bls-efi-127-0-0-2-2201 tmp]# On 6.15 from yesterday (9f35e33144ae, via @kernel-vanilla/mainline copr, on rawhide): [root@fedora tmp]# sh test.sh + uname -a Linux fedora 6.15.0-0.rc6.20250514.9f35e331.450.vanilla.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Wed May 14 04:18:35 UTC 2025 x86_64 GNU/Linux + umount --recursive tmp/mnt umount: tmp/mnt: not mounted + true + rm -rf tmp/mnt tmp/new + mkdir -p tmp/mnt tmp/new tmp/new/old + touch tmp/mnt/this-is-old + touch tmp/new/this-is-new + ./swapmnt tmp/new tmp/mnt + find tmp/mnt tmp/mnt tmp/mnt/this-is-new find: File system loop detected; ‘tmp/mnt/old’ is part of the same file system loop as ‘tmp/mnt’. [root@fedora tmp]# Otherwise, I gotta say I'm loving all of the new mount work this cycle! Thanks, lis ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-15 4:06 Apparent mount behaviour change in 6.15 Allison Karlitskaya @ 2025-05-15 11:25 ` Christian Brauner 2025-05-23 6:32 ` Al Viro 0 siblings, 1 reply; 11+ messages in thread From: Christian Brauner @ 2025-05-15 11:25 UTC (permalink / raw) To: Allison Karlitskaya, Alexander Viro; +Cc: linux-fsdevel On Thu, May 15, 2025 at 06:06:13AM +0200, Allison Karlitskaya wrote: > hi, > > The CI in the composefs-rs project picked up an interesting issue with > the recent mount API changes in the latest 6.15-rc kernels. I've > managed to produce a minimal reproducer. > > ==> test.sh <== > #!/bin/sh > > set -eux > > uname -a > umount --recursive tmp/mnt || true > rm -rf tmp/mnt tmp/new > mkdir -p tmp/mnt tmp/new tmp/new/old > touch tmp/mnt/this-is-old > touch tmp/new/this-is-new > ./swapmnt tmp/new tmp/mnt > find tmp/mnt > > ==> swapmnt.c <== > // Replace [old] with a clone of [new], moving [old] to [new]/"old" > > #define _GNU_SOURCE > #include <err.h> > #include <fcntl.h> > #include <linux/mount.h> > #include <stdio.h> > #include <stdlib.h> > #include <sys/stat.h> > #include <sys/syscall.h> > #include <unistd.h> > > int > main (int argc, char **argv) { > if (argc != 3) { > fprintf(stderr, "usage: %s [new] [old]", argv[0]); > return 1; > } > > const char *new = argv[1]; > const char *old = argv[2]; > > int oldfd = syscall(SYS_open_tree, AT_FDCWD, old, > OPEN_TREE_CLONE|OPEN_TREE_CLOEXEC); > if (oldfd == -1) > err(EXIT_FAILURE, "open_tree('%s', OPEN_TREE_CLONE)", old); > > int newfd = syscall(SYS_open_tree, AT_FDCWD, new, > OPEN_TREE_CLONE|OPEN_TREE_CLOEXEC); > if (newfd == -1) > err(EXIT_FAILURE, "open_tree('%s', OPEN_TREE_CLONE)", new); > > if (syscall(SYS_move_mount, newfd, "", AT_FDCWD, old, > MOVE_MOUNT_F_EMPTY_PATH)) > err(EXIT_FAILURE, "move_mount('%s' -> '%s')", new, old); > > if (syscall(SYS_move_mount, oldfd, "", newfd, "old", MOVE_MOUNT_F_EMPTY_PATH)) > err(EXIT_FAILURE, "move_mount('%s' -> (new)'%s'/old)", old, old); > > return 0; > } > > On 6.14 (Fedora 42) this looks like: > > [root@fedora-bls-efi-127-0-0-2-2201 tmp]# sh test.sh > + uname -a > Linux fedora-bls-efi-127-0-0-2-2201 6.14.5-300.fc42.x86_64 #1 SMP > PREEMPT_DYNAMIC Fri May 2 14:16:46 UTC 2025 x86_64 GNU/Linux > + umount --recursive tmp/mnt > umount: tmp/mnt: not found > + true > + rm -rf tmp/mnt tmp/new > + mkdir -p tmp/mnt tmp/new tmp/new/old > + touch tmp/mnt/this-is-old > + touch tmp/new/this-is-new > + ./swapmnt tmp/new tmp/mnt > + find tmp/mnt > tmp/mnt > tmp/mnt/this-is-new > tmp/mnt/old > tmp/mnt/old/this-is-old > [root@fedora-bls-efi-127-0-0-2-2201 tmp]# > > On 6.15 from yesterday (9f35e33144ae, via @kernel-vanilla/mainline > copr, on rawhide): > > [root@fedora tmp]# sh test.sh > + uname -a > Linux fedora 6.15.0-0.rc6.20250514.9f35e331.450.vanilla.fc43.x86_64 #1 > SMP PREEMPT_DYNAMIC Wed May 14 04:18:35 UTC 2025 x86_64 GNU/Linux > + umount --recursive tmp/mnt > umount: tmp/mnt: not mounted > + true > + rm -rf tmp/mnt tmp/new > + mkdir -p tmp/mnt tmp/new tmp/new/old > + touch tmp/mnt/this-is-old > + touch tmp/new/this-is-new > + ./swapmnt tmp/new tmp/mnt > + find tmp/mnt > tmp/mnt > tmp/mnt/this-is-new > find: File system loop detected; ‘tmp/mnt/old’ is part of the same > file system loop as ‘tmp/mnt’. > [root@fedora tmp]# > > Otherwise, I gotta say I'm loving all of the new mount work this cycle! Thank you! I'm happy it's useful and that it's picked up so quickly! :) This is caused by my patch to allow mount propagation into detached mount trees. That had been disabled forever until v6.15 to give strong guarantees to userspace that detached mount trees are really detached. I thought it might be useful for them to partake in propagation via the MNTNS_PROPAGATION flag. But I've alredy expressed that I regret this in another thread we had. Al, I want to kill this again and restore the pre v6.15 behavior. Allowing mount propagation for detached trees was a crazy idea on my part. It's a pain and it regresses userspace. If composefs is broken by this then systemd will absolutely get broken by my change as well. Something like this will allow to restore the status-quo: fs/mount.h | 5 ----- fs/namespace.c | 15 ++------------- fs/pnode.c | 3 --- fs/pnode.h | 2 +- 4 files changed, 3 insertions(+), 22 deletions(-) diff --git a/fs/mount.h b/fs/mount.h index 7aecf2a60472..ad7173037924 100644 --- a/fs/mount.h +++ b/fs/mount.h @@ -7,10 +7,6 @@ extern struct list_head notify_list; -typedef __u32 __bitwise mntns_flags_t; - -#define MNTNS_PROPAGATING ((__force mntns_flags_t)(1 << 0)) - struct mnt_namespace { struct ns_common ns; struct mount * root; @@ -37,7 +33,6 @@ struct mnt_namespace { struct rb_node mnt_ns_tree_node; /* node in the mnt_ns_tree */ struct list_head mnt_ns_list; /* entry in the sequential list of mounts namespace */ refcount_t passive; /* number references not pinning @mounts */ - mntns_flags_t mntns_flags; } __randomize_layout; struct mnt_pcp { diff --git a/fs/namespace.c b/fs/namespace.c index 04a9bb9f31fa..900ffaeff7c6 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3647,7 +3647,7 @@ static int do_move_mount(struct path *old_path, if (!(attached ? check_mnt(old) : is_anon_ns(ns))) goto out; - if (is_anon_ns(ns)) { + if (is_anon_ns(ns) && ns == p->mnt_ns) { /* * Ending up with two files referring to the root of the * same anonymous mount namespace would cause an error @@ -3655,16 +3655,7 @@ static int do_move_mount(struct path *old_path, * twice into the mount tree which would be rejected * later. But be explicit about it right here. */ - if ((is_anon_ns(p->mnt_ns) && ns == p->mnt_ns)) - goto out; - - /* - * If this is an anonymous mount tree ensure that mount - * propagation can detect mounts that were just - * propagated to the target mount tree so we don't - * propagate onto them. - */ - ns->mntns_flags |= MNTNS_PROPAGATING; + goto out; } else if (is_anon_ns(p->mnt_ns)) { /* * Don't allow moving an attached mount tree to an @@ -3721,8 +3712,6 @@ static int do_move_mount(struct path *old_path, if (attached) put_mountpoint(old_mp); out: - if (is_anon_ns(ns)) - ns->mntns_flags &= ~MNTNS_PROPAGATING; unlock_mount(mp); if (!err) { if (attached) { diff --git a/fs/pnode.c b/fs/pnode.c index fb77427df39e..d80944d66237 100644 --- a/fs/pnode.c +++ b/fs/pnode.c @@ -231,9 +231,6 @@ static int propagate_one(struct mount *m, struct mountpoint *dest_mp) /* skip if mountpoint isn't visible in m */ if (!is_subdir(dest_mp->m_dentry, m->mnt.mnt_root)) return 0; - /* skip if m is in the anon_ns we are emptying */ - if (m->mnt_ns->mntns_flags & MNTNS_PROPAGATING) - return 0; if (peers(m, last_dest)) { type = CL_MAKE_SHARED; diff --git a/fs/pnode.h b/fs/pnode.h index 34b6247af01d..077028814e8d 100644 --- a/fs/pnode.h +++ b/fs/pnode.h @@ -12,7 +12,7 @@ #define IS_MNT_SHARED(m) ((m)->mnt.mnt_flags & MNT_SHARED) #define IS_MNT_SLAVE(m) ((m)->mnt_master) -#define IS_MNT_NEW(m) (!(m)->mnt_ns) +#define IS_MNT_NEW(m) (!(m)->mnt_ns || is_anon_ns((m)->mnt_ns)) #define CLEAR_MNT_SHARED(m) ((m)->mnt.mnt_flags &= ~MNT_SHARED) #define IS_MNT_UNBINDABLE(m) ((m)->mnt.mnt_flags & MNT_UNBINDABLE) #define IS_MNT_MARKED(m) ((m)->mnt.mnt_flags & MNT_MARKED) -- 2.47.2 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-15 11:25 ` Christian Brauner @ 2025-05-23 6:32 ` Al Viro 2025-05-23 8:41 ` Christian Brauner 0 siblings, 1 reply; 11+ messages in thread From: Al Viro @ 2025-05-23 6:32 UTC (permalink / raw) To: Christian Brauner; +Cc: Allison Karlitskaya, linux-fsdevel On Thu, May 15, 2025 at 01:25:27PM +0200, Christian Brauner wrote: > Al, I want to kill this again and restore the pre v6.15 behavior. > Allowing mount propagation for detached trees was a crazy > idea on my part. It's a pain and it regresses userspace. If composefs is > broken by this then systemd will absolutely get broken by my change as > well. > > Something like this will allow to restore the status-quo: > -#define IS_MNT_NEW(m) (!(m)->mnt_ns) > +#define IS_MNT_NEW(m) (!(m)->mnt_ns || is_anon_ns((m)->mnt_ns)) FWIW, I'm not sure that ever had been quite correct, no matter how you call the macro. I'm not up to building a counterexample right now, will do in the morning... ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-23 6:32 ` Al Viro @ 2025-05-23 8:41 ` Christian Brauner 2025-05-23 21:29 ` Al Viro 0 siblings, 1 reply; 11+ messages in thread From: Christian Brauner @ 2025-05-23 8:41 UTC (permalink / raw) To: Al Viro; +Cc: Allison Karlitskaya, linux-fsdevel On Fri, May 23, 2025 at 07:32:38AM +0100, Al Viro wrote: > On Thu, May 15, 2025 at 01:25:27PM +0200, Christian Brauner wrote: > > > Al, I want to kill this again and restore the pre v6.15 behavior. > > Allowing mount propagation for detached trees was a crazy > > idea on my part. It's a pain and it regresses userspace. If composefs is > > broken by this then systemd will absolutely get broken by my change as > > well. > > > > Something like this will allow to restore the status-quo: > > > -#define IS_MNT_NEW(m) (!(m)->mnt_ns) > > +#define IS_MNT_NEW(m) (!(m)->mnt_ns || is_anon_ns((m)->mnt_ns)) > > FWIW, I'm not sure that ever had been quite correct, no matter how you > call the macro. I'm not up to building a counterexample right now, > will do in the morning... The point is that we can't do mount propagation with detached trees without regressing userspace. And we didn't do it before. I don't specifically care how we block this but it needs to go out again. Otherwise we release a kernel with the new semantics that regress userspace. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-23 8:41 ` Christian Brauner @ 2025-05-23 21:29 ` Al Viro 2025-05-23 21:37 ` Al Viro 0 siblings, 1 reply; 11+ messages in thread From: Al Viro @ 2025-05-23 21:29 UTC (permalink / raw) To: Christian Brauner; +Cc: Allison Karlitskaya, linux-fsdevel On Fri, May 23, 2025 at 10:41:14AM +0200, Christian Brauner wrote: > On Fri, May 23, 2025 at 07:32:38AM +0100, Al Viro wrote: > > On Thu, May 15, 2025 at 01:25:27PM +0200, Christian Brauner wrote: > > > > > Al, I want to kill this again and restore the pre v6.15 behavior. > > > Allowing mount propagation for detached trees was a crazy > > > idea on my part. It's a pain and it regresses userspace. If composefs is > > > broken by this then systemd will absolutely get broken by my change as > > > well. > > > > > > Something like this will allow to restore the status-quo: > > > > > -#define IS_MNT_NEW(m) (!(m)->mnt_ns) > > > +#define IS_MNT_NEW(m) (!(m)->mnt_ns || is_anon_ns((m)->mnt_ns)) > > > > FWIW, I'm not sure that ever had been quite correct, no matter how you > > call the macro. I'm not up to building a counterexample right now, > > will do in the morning... > > The point is that we can't do mount propagation with detached trees > without regressing userspace. And we didn't do it before. I don't > specifically care how we block this but it needs to go out again. > Otherwise we release a kernel with the new semantics that regress > userspace. The problem is not in propagation *into* detached trees. All you need is A -> B -> C (all in your normal namespace, nothing detached, etc.) followed by make-private on B. C should (and does) keep getting events from A. Now, have open_tree() create a detached clone of B. Shouldn't affect anything between A, B and C, right? With your patch doing open_tree before make-private B ends up with no propagation from A to C... until the detached tree is closed. See the problem? You have nodes on detached trees still in propagation graph. And your variant _stops_ propagation at nodes in anon namespaces, not just skips such nodes themselves. This is bogus, IMO. I'm perfectly fine with propagate_one() returning 0 on anon_ns(m->mnt); that would refuse to propagate into *any* anon ns, but won't screw the propagation between the mounts that are in normal, non-anon namespaces. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-23 21:29 ` Al Viro @ 2025-05-23 21:37 ` Al Viro 2025-05-23 23:22 ` Al Viro 0 siblings, 1 reply; 11+ messages in thread From: Al Viro @ 2025-05-23 21:37 UTC (permalink / raw) To: Christian Brauner; +Cc: Allison Karlitskaya, linux-fsdevel On Fri, May 23, 2025 at 10:29:58PM +0100, Al Viro wrote: > This is bogus, IMO. I'm perfectly fine with propagate_one() returning 0 > on anon_ns(m->mnt); that would refuse to propagate into *any* anon ns, > but won't screw the propagation between the mounts that are in normal, non-anon > namespaces. IOW, I mean this variant - the only difference from what you've posted is the location of is_anon_ns() test; you do it in IS_MNT_NEW(), this variant has it in propagate_one(). Does the variant below fix regression? diff --git a/fs/mount.h b/fs/mount.h index 7aecf2a60472..ad7173037924 100644 --- a/fs/mount.h +++ b/fs/mount.h @@ -7,10 +7,6 @@ extern struct list_head notify_list; -typedef __u32 __bitwise mntns_flags_t; - -#define MNTNS_PROPAGATING ((__force mntns_flags_t)(1 << 0)) - struct mnt_namespace { struct ns_common ns; struct mount * root; @@ -37,7 +33,6 @@ struct mnt_namespace { struct rb_node mnt_ns_tree_node; /* node in the mnt_ns_tree */ struct list_head mnt_ns_list; /* entry in the sequential list of mounts namespace */ refcount_t passive; /* number references not pinning @mounts */ - mntns_flags_t mntns_flags; } __randomize_layout; struct mnt_pcp { diff --git a/fs/namespace.c b/fs/namespace.c index 1b466c54a357..623cd110076d 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3648,7 +3648,7 @@ static int do_move_mount(struct path *old_path, if (!(attached ? check_mnt(old) : is_anon_ns(ns))) goto out; - if (is_anon_ns(ns)) { + if (is_anon_ns(ns) && ns == p->mnt_ns) { /* * Ending up with two files referring to the root of the * same anonymous mount namespace would cause an error @@ -3656,16 +3656,7 @@ static int do_move_mount(struct path *old_path, * twice into the mount tree which would be rejected * later. But be explicit about it right here. */ - if ((is_anon_ns(p->mnt_ns) && ns == p->mnt_ns)) - goto out; - - /* - * If this is an anonymous mount tree ensure that mount - * propagation can detect mounts that were just - * propagated to the target mount tree so we don't - * propagate onto them. - */ - ns->mntns_flags |= MNTNS_PROPAGATING; + goto out; } else if (is_anon_ns(p->mnt_ns)) { /* * Don't allow moving an attached mount tree to an @@ -3722,8 +3713,6 @@ static int do_move_mount(struct path *old_path, if (attached) put_mountpoint(old_mp); out: - if (is_anon_ns(ns)) - ns->mntns_flags &= ~MNTNS_PROPAGATING; unlock_mount(mp); if (!err) { if (attached) { diff --git a/fs/pnode.c b/fs/pnode.c index fb77427df39e..ffd429b760d5 100644 --- a/fs/pnode.c +++ b/fs/pnode.c @@ -231,8 +231,8 @@ static int propagate_one(struct mount *m, struct mountpoint *dest_mp) /* skip if mountpoint isn't visible in m */ if (!is_subdir(dest_mp->m_dentry, m->mnt.mnt_root)) return 0; - /* skip if m is in the anon_ns we are emptying */ - if (m->mnt_ns->mntns_flags & MNTNS_PROPAGATING) + /* skip if m is in the anon_ns */ + if (is_anon_ns(m->mnt_ns)) return 0; if (peers(m, last_dest)) { ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-23 21:37 ` Al Viro @ 2025-05-23 23:22 ` Al Viro 2025-05-26 4:47 ` Christian Brauner 2025-05-26 7:18 ` Allison Karlitskaya 0 siblings, 2 replies; 11+ messages in thread From: Al Viro @ 2025-05-23 23:22 UTC (permalink / raw) To: Christian Brauner; +Cc: Allison Karlitskaya, linux-fsdevel On Fri, May 23, 2025 at 10:37:35PM +0100, Al Viro wrote: > On Fri, May 23, 2025 at 10:29:58PM +0100, Al Viro wrote: > > > This is bogus, IMO. I'm perfectly fine with propagate_one() returning 0 > > on anon_ns(m->mnt); that would refuse to propagate into *any* anon ns, > > but won't screw the propagation between the mounts that are in normal, non-anon > > namespaces. > > IOW, I mean this variant - the only difference from what you've posted is > the location of is_anon_ns() test; you do it in IS_MNT_NEW(), this variant > has it in propagate_one(). Does the variant below fix regression? AFAICS, it does suffice to revert the behaviour change on the reproducer upthread. I've replaced the top of viro/vfs.git#fixes with that; commit message there is tentative - if nothing else, that's a patch from Christian with slight modifications from me. It also needs reported-by, etc. Said that, could somebody (original reporter) confirm that the variant in git.kernel.org:/pub/scm/linux/kernel/git/viro/vfs.git #fixes (head at 63e90fcc1807) is OK with them? And yes, it will need a proper commit message. Christian? commit 63e90fcc18072638a62196caae93de66fc6cbc37 Author: Al Viro <viro@zeniv.linux.org.uk> Date: Fri May 23 19:20:36 2025 -0400 Don't propagate mounts into detached trees That reverts to behaviour of 6.14 and earlier, with fix from "fix IS_MNT_PROPAGATING uses" preserved. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff --git a/fs/mount.h b/fs/mount.h index 7aecf2a60472..ad7173037924 100644 --- a/fs/mount.h +++ b/fs/mount.h @@ -7,10 +7,6 @@ extern struct list_head notify_list; -typedef __u32 __bitwise mntns_flags_t; - -#define MNTNS_PROPAGATING ((__force mntns_flags_t)(1 << 0)) - struct mnt_namespace { struct ns_common ns; struct mount * root; @@ -37,7 +33,6 @@ struct mnt_namespace { struct rb_node mnt_ns_tree_node; /* node in the mnt_ns_tree */ struct list_head mnt_ns_list; /* entry in the sequential list of mounts namespace */ refcount_t passive; /* number references not pinning @mounts */ - mntns_flags_t mntns_flags; } __randomize_layout; struct mnt_pcp { diff --git a/fs/namespace.c b/fs/namespace.c index 1b466c54a357..623cd110076d 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3648,7 +3648,7 @@ static int do_move_mount(struct path *old_path, if (!(attached ? check_mnt(old) : is_anon_ns(ns))) goto out; - if (is_anon_ns(ns)) { + if (is_anon_ns(ns) && ns == p->mnt_ns) { /* * Ending up with two files referring to the root of the * same anonymous mount namespace would cause an error @@ -3656,16 +3656,7 @@ static int do_move_mount(struct path *old_path, * twice into the mount tree which would be rejected * later. But be explicit about it right here. */ - if ((is_anon_ns(p->mnt_ns) && ns == p->mnt_ns)) - goto out; - - /* - * If this is an anonymous mount tree ensure that mount - * propagation can detect mounts that were just - * propagated to the target mount tree so we don't - * propagate onto them. - */ - ns->mntns_flags |= MNTNS_PROPAGATING; + goto out; } else if (is_anon_ns(p->mnt_ns)) { /* * Don't allow moving an attached mount tree to an @@ -3722,8 +3713,6 @@ static int do_move_mount(struct path *old_path, if (attached) put_mountpoint(old_mp); out: - if (is_anon_ns(ns)) - ns->mntns_flags &= ~MNTNS_PROPAGATING; unlock_mount(mp); if (!err) { if (attached) { diff --git a/fs/pnode.c b/fs/pnode.c index fb77427df39e..ffd429b760d5 100644 --- a/fs/pnode.c +++ b/fs/pnode.c @@ -231,8 +231,8 @@ static int propagate_one(struct mount *m, struct mountpoint *dest_mp) /* skip if mountpoint isn't visible in m */ if (!is_subdir(dest_mp->m_dentry, m->mnt.mnt_root)) return 0; - /* skip if m is in the anon_ns we are emptying */ - if (m->mnt_ns->mntns_flags & MNTNS_PROPAGATING) + /* skip if m is in the anon_ns */ + if (is_anon_ns(m->mnt_ns)) return 0; if (peers(m, last_dest)) { ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-23 23:22 ` Al Viro @ 2025-05-26 4:47 ` Christian Brauner 2025-05-26 21:32 ` Al Viro 2025-05-26 7:18 ` Allison Karlitskaya 1 sibling, 1 reply; 11+ messages in thread From: Christian Brauner @ 2025-05-26 4:47 UTC (permalink / raw) To: Al Viro; +Cc: Allison Karlitskaya, linux-fsdevel On Sat, May 24, 2025 at 12:22:13AM +0100, Al Viro wrote: > On Fri, May 23, 2025 at 10:37:35PM +0100, Al Viro wrote: > > On Fri, May 23, 2025 at 10:29:58PM +0100, Al Viro wrote: > > > > > This is bogus, IMO. I'm perfectly fine with propagate_one() returning 0 > > > on anon_ns(m->mnt); that would refuse to propagate into *any* anon ns, > > > but won't screw the propagation between the mounts that are in normal, non-anon > > > namespaces. > > > > IOW, I mean this variant - the only difference from what you've posted is > > the location of is_anon_ns() test; you do it in IS_MNT_NEW(), this variant > > has it in propagate_one(). Does the variant below fix regression? > > AFAICS, it does suffice to revert the behaviour change on the reproducer > upthread. > > I've replaced the top of viro/vfs.git#fixes with that; commit message there > is tentative - if nothing else, that's a patch from Christian with slight > modifications from me. It also needs reported-by, etc. > > Said that, could somebody (original reporter) confirm that the variant > in git.kernel.org:/pub/scm/linux/kernel/git/viro/vfs.git #fixes (head at > 63e90fcc1807) is OK with them? > > And yes, it will need a proper commit message. Christian? Yes, that looks good to me, thank you! > > commit 63e90fcc18072638a62196caae93de66fc6cbc37 > Author: Al Viro <viro@zeniv.linux.org.uk> > Date: Fri May 23 19:20:36 2025 -0400 > > Don't propagate mounts into detached trees > > That reverts to behaviour of 6.14 and earlier, with > fix from "fix IS_MNT_PROPAGATING uses" preserved. > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> > > diff --git a/fs/mount.h b/fs/mount.h > index 7aecf2a60472..ad7173037924 100644 > --- a/fs/mount.h > +++ b/fs/mount.h > @@ -7,10 +7,6 @@ > > extern struct list_head notify_list; > > -typedef __u32 __bitwise mntns_flags_t; > - > -#define MNTNS_PROPAGATING ((__force mntns_flags_t)(1 << 0)) > - > struct mnt_namespace { > struct ns_common ns; > struct mount * root; > @@ -37,7 +33,6 @@ struct mnt_namespace { > struct rb_node mnt_ns_tree_node; /* node in the mnt_ns_tree */ > struct list_head mnt_ns_list; /* entry in the sequential list of mounts namespace */ > refcount_t passive; /* number references not pinning @mounts */ > - mntns_flags_t mntns_flags; > } __randomize_layout; > > struct mnt_pcp { > diff --git a/fs/namespace.c b/fs/namespace.c > index 1b466c54a357..623cd110076d 100644 > --- a/fs/namespace.c > +++ b/fs/namespace.c > @@ -3648,7 +3648,7 @@ static int do_move_mount(struct path *old_path, > if (!(attached ? check_mnt(old) : is_anon_ns(ns))) > goto out; > > - if (is_anon_ns(ns)) { > + if (is_anon_ns(ns) && ns == p->mnt_ns) { > /* > * Ending up with two files referring to the root of the > * same anonymous mount namespace would cause an error > @@ -3656,16 +3656,7 @@ static int do_move_mount(struct path *old_path, > * twice into the mount tree which would be rejected > * later. But be explicit about it right here. > */ > - if ((is_anon_ns(p->mnt_ns) && ns == p->mnt_ns)) > - goto out; > - > - /* > - * If this is an anonymous mount tree ensure that mount > - * propagation can detect mounts that were just > - * propagated to the target mount tree so we don't > - * propagate onto them. > - */ > - ns->mntns_flags |= MNTNS_PROPAGATING; > + goto out; > } else if (is_anon_ns(p->mnt_ns)) { > /* > * Don't allow moving an attached mount tree to an > @@ -3722,8 +3713,6 @@ static int do_move_mount(struct path *old_path, > if (attached) > put_mountpoint(old_mp); > out: > - if (is_anon_ns(ns)) > - ns->mntns_flags &= ~MNTNS_PROPAGATING; > unlock_mount(mp); > if (!err) { > if (attached) { > diff --git a/fs/pnode.c b/fs/pnode.c > index fb77427df39e..ffd429b760d5 100644 > --- a/fs/pnode.c > +++ b/fs/pnode.c > @@ -231,8 +231,8 @@ static int propagate_one(struct mount *m, struct mountpoint *dest_mp) > /* skip if mountpoint isn't visible in m */ > if (!is_subdir(dest_mp->m_dentry, m->mnt.mnt_root)) > return 0; > - /* skip if m is in the anon_ns we are emptying */ > - if (m->mnt_ns->mntns_flags & MNTNS_PROPAGATING) > + /* skip if m is in the anon_ns */ > + if (is_anon_ns(m->mnt_ns)) > return 0; > > if (peers(m, last_dest)) { ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-26 4:47 ` Christian Brauner @ 2025-05-26 21:32 ` Al Viro 0 siblings, 0 replies; 11+ messages in thread From: Al Viro @ 2025-05-26 21:32 UTC (permalink / raw) To: Christian Brauner; +Cc: Allison Karlitskaya, linux-fsdevel On Mon, May 26, 2025 at 06:47:25AM +0200, Christian Brauner wrote: > On Sat, May 24, 2025 at 12:22:13AM +0100, Al Viro wrote: > > On Fri, May 23, 2025 at 10:37:35PM +0100, Al Viro wrote: > > > On Fri, May 23, 2025 at 10:29:58PM +0100, Al Viro wrote: > > > > > > > This is bogus, IMO. I'm perfectly fine with propagate_one() returning 0 > > > > on anon_ns(m->mnt); that would refuse to propagate into *any* anon ns, > > > > but won't screw the propagation between the mounts that are in normal, non-anon > > > > namespaces. > > > > > > IOW, I mean this variant - the only difference from what you've posted is > > > the location of is_anon_ns() test; you do it in IS_MNT_NEW(), this variant > > > has it in propagate_one(). Does the variant below fix regression? > > > > AFAICS, it does suffice to revert the behaviour change on the reproducer > > upthread. > > > > I've replaced the top of viro/vfs.git#fixes with that; commit message there > > is tentative - if nothing else, that's a patch from Christian with slight > > modifications from me. It also needs reported-by, etc. > > > > Said that, could somebody (original reporter) confirm that the variant > > in git.kernel.org:/pub/scm/linux/kernel/git/viro/vfs.git #fixes (head at > > 63e90fcc1807) is OK with them? > > > > And yes, it will need a proper commit message. Christian? > > Yes, that looks good to me, thank you! OK, I went with the following for commit message: ----- Don't propagate mounts into detached trees All versions up to 6.14 did not propagate mount events into detached tree. Shortly after 6.14 a merge of vfs-6.15-rc1.mount.namespace (130e696aa68b) has changed that. Unfortunately, that has caused userland regressions (reported in https://lore.kernel.org/all/CAOYeF9WQhFDe+BGW=Dp5fK8oRy5AgZ6zokVyTj1Wp4EUiYgt4w@mail.gmail.com/) Straight revert wouldn't be an option - in particular, the variant in 6.14 had a bug that got fixed in d1ddc6f1d9f0 ("fix IS_MNT_PROPAGATING uses") and we don't want to bring the bug back. This is a modification of manual revert posted by Christian, with changes needed to avoid reintroducing the breakage in scenario described in d1ddc6f1d9f0. Cc: stable@vger.kernel.org Reported-by: Allison Karlitskaya <lis@redhat.com> Tested-by: Allison Karlitskaya <lis@redhat.com> Acked-by: Christian Brauner <brauner@kernel.org> Co-developed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> ----- It's in viro/vfs.git #fixes; if everyone's OK with the commit message, I'm sending a pull request tomorrow. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-23 23:22 ` Al Viro 2025-05-26 4:47 ` Christian Brauner @ 2025-05-26 7:18 ` Allison Karlitskaya 2025-05-26 8:00 ` Allison Karlitskaya 1 sibling, 1 reply; 11+ messages in thread From: Allison Karlitskaya @ 2025-05-26 7:18 UTC (permalink / raw) To: Al Viro; +Cc: Christian Brauner, linux-fsdevel good morning, On Sat, 24 May 2025 at 01:22, Al Viro <viro@zeniv.linux.org.uk> wrote: > Said that, could somebody (original reporter) confirm that the variant > in git.kernel.org:/pub/scm/linux/kernel/git/viro/vfs.git #fixes (head at > 63e90fcc1807) is OK with them? I've tested the commit (and its parent) against my original usecase that found the bug, along with the latest kernel in Fedora rawhide. Here's the results: broken: Linux fedora 6.15.0-0.rc7.58.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 20 14:10:49 UTC 2025 x86_64 GNU/Linux (current kernel in rawhide) broken: d1ddc6f1d9f0 ("fix IS_MNT_PROPAGATING uses") Linux fedora 6.15.0-rc5+ #8 SMP PREEMPT_DYNAMIC Mon May 26 09:14:09 CEST 2025 x86_64 GNU/Linux (parent commit of the fix) working: 63e90fcc1807 ("Don't propagate mounts into detached trees") Linux fedora 6.15.0-rc5+ #7 SMP PREEMPT_DYNAMIC Mon May 26 09:12:43 CEST 2025 x86_64 GNU/Linux tl;dr: Seems that the fix works as expected. Thanks! lis ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Apparent mount behaviour change in 6.15 2025-05-26 7:18 ` Allison Karlitskaya @ 2025-05-26 8:00 ` Allison Karlitskaya 0 siblings, 0 replies; 11+ messages in thread From: Allison Karlitskaya @ 2025-05-26 8:00 UTC (permalink / raw) To: Al Viro; +Cc: Christian Brauner, linux-fsdevel On Mon, 26 May 2025 at 09:18, Allison Karlitskaya <lis@redhat.com> wrote: > I've tested the commit (and its parent) against my original usecase > that found the bug, along with the latest kernel in Fedora rawhide. > Here's the results: > > > broken: > Linux fedora 6.15.0-0.rc7.58.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Tue > May 20 14:10:49 UTC 2025 x86_64 GNU/Linux > (current kernel in rawhide) > > broken: > d1ddc6f1d9f0 ("fix IS_MNT_PROPAGATING uses") > Linux fedora 6.15.0-rc5+ #8 SMP PREEMPT_DYNAMIC Mon May 26 09:14:09 > CEST 2025 x86_64 GNU/Linux > (parent commit of the fix) > > working: > 63e90fcc1807 ("Don't propagate mounts into detached trees") > Linux fedora 6.15.0-rc5+ #7 SMP PREEMPT_DYNAMIC Mon May 26 09:12:43 > CEST 2025 x86_64 GNU/Linux ...and worryingly: broken: 0ff41df1cb26 ("Linux 6.15") Linux fedora 6.15.0 #9 SMP PREEMPT_DYNAMIC Mon May 26 09:51:47 CEST 2025 x86_64 GNU/Linux ie: the release went out with the regression :( lis ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-05-26 21:32 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-05-15 4:06 Apparent mount behaviour change in 6.15 Allison Karlitskaya 2025-05-15 11:25 ` Christian Brauner 2025-05-23 6:32 ` Al Viro 2025-05-23 8:41 ` Christian Brauner 2025-05-23 21:29 ` Al Viro 2025-05-23 21:37 ` Al Viro 2025-05-23 23:22 ` Al Viro 2025-05-26 4:47 ` Christian Brauner 2025-05-26 21:32 ` Al Viro 2025-05-26 7:18 ` Allison Karlitskaya 2025-05-26 8:00 ` Allison Karlitskaya
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).