* [BUG] propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
@ 2025-05-09 8:26 Al Viro
2025-05-09 8:28 ` Al Viro
0 siblings, 1 reply; 4+ messages in thread
From: Al Viro @ 2025-05-09 8:26 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Pavel Tikhomirov, Christian Brauner
AFAICS, 9ffb14ef61ba "move_mount: allow to add a mount into an existing
group" breaks assertions on ->mnt_share/->mnt_slave. For once, the data
structures in question are actually documented.
Documentation/filesystem/sharedsubtree.rst:
All vfsmounts in a peer group have the same ->mnt_master. If it is
non-NULL, they form a contiguous (ordered) segment of slave list.
fs/pnode.c:
* Note that peer groups form contiguous segments of slave lists.
fs/namespace.c:do_set_group():
if (IS_MNT_SLAVE(from)) {
struct mount *m = from->mnt_master;
list_add(&to->mnt_slave, &m->mnt_slave_list);
to->mnt_master = m;
}
if (IS_MNT_SHARED(from)) {
to->mnt_group_id = from->mnt_group_id;
list_add(&to->mnt_share, &from->mnt_share);
lock_mount_hash();
set_mnt_shared(to);
unlock_mount_hash();
}
Note that 'to' goes right after 'from' in ->mnt_share (i.e. peer group
list) and into the beginning of the slave list 'from' belongs to. IOW,
contiguity gets broken if 'from' is both IS_MNT_SLAVE and IS_MNT_SHARED.
Which is what happens when the peer group 'from' is in gets propagation
from somewhere.
It's not hard to fix - something like
if (IS_MNT_SHARED(from)) {
to->mnt_group_id = from->mnt_group_id;
list_add(&to->mnt_share, &from->mnt_share);
if (IS_MNT_SLAVE(from))
list_add(&to->mnt_slave, &from->mnt_slave);
to->mnt_master = from->mnt_master;
lock_mount_hash();
set_mnt_shared(to);
unlock_mount_hash();
} else if (IS_MNT_SLAVE(from)) {
to->mnt_master = from->mnt_master;
list_add(&to->mnt_slave, &from->mnt_master->mnt_slave_list);
}
ought to do it. I'm nowhere near sufficiently awake right now to put
together a regression test, but unless I'm missing something subtle, it
should be possible to get a fairly obvious breakage of propagate_mnt()
out of that...
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [BUG] propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
2025-05-09 8:26 [BUG] propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2) Al Viro
@ 2025-05-09 8:28 ` Al Viro
2025-05-09 10:59 ` Pavel Tikhomirov
0 siblings, 1 reply; 4+ messages in thread
From: Al Viro @ 2025-05-09 8:28 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Pavel Tikhomirov, Christian Brauner
On Fri, May 09, 2025 at 09:26:28AM +0100, Al Viro wrote:
> AFAICS, 9ffb14ef61ba "move_mount: allow to add a mount into an existing
> group" breaks assertions on ->mnt_share/->mnt_slave. For once, the data
> structures in question are actually documented.
>
> Documentation/filesystem/sharedsubtree.rst:
> All vfsmounts in a peer group have the same ->mnt_master. If it is
> non-NULL, they form a contiguous (ordered) segment of slave list.
>
> fs/pnode.c:
> * Note that peer groups form contiguous segments of slave lists.
>
> fs/namespace.c:do_set_group():
> if (IS_MNT_SLAVE(from)) {
> struct mount *m = from->mnt_master;
>
> list_add(&to->mnt_slave, &m->mnt_slave_list);
> to->mnt_master = m;
> }
>
> if (IS_MNT_SHARED(from)) {
> to->mnt_group_id = from->mnt_group_id;
> list_add(&to->mnt_share, &from->mnt_share);
> lock_mount_hash();
> set_mnt_shared(to);
> unlock_mount_hash();
> }
>
> Note that 'to' goes right after 'from' in ->mnt_share (i.e. peer group
> list) and into the beginning of the slave list 'from' belongs to. IOW,
> contiguity gets broken if 'from' is both IS_MNT_SLAVE and IS_MNT_SHARED.
> Which is what happens when the peer group 'from' is in gets propagation
> from somewhere.
>
> It's not hard to fix - something like
>
> if (IS_MNT_SHARED(from)) {
> to->mnt_group_id = from->mnt_group_id;
> list_add(&to->mnt_share, &from->mnt_share);
> if (IS_MNT_SLAVE(from))
> list_add(&to->mnt_slave, &from->mnt_slave);
> to->mnt_master = from->mnt_master;
> lock_mount_hash();
> set_mnt_shared(to);
> unlock_mount_hash();
> } else if (IS_MNT_SLAVE(from)) {
> to->mnt_master = from->mnt_master;
> list_add(&to->mnt_slave, &from->mnt_master->mnt_slave_list);
> }
>
> ought to do it. I'm nowhere near sufficiently awake right now to put
> together a regression test, but unless I'm missing something subtle, it
> should be possible to get a fairly obvious breakage of propagate_mnt()
> out of that...
Not sufficiently awake is right - wrong address on Cc... Anyway, bedtime
for me...
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [BUG] propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
2025-05-09 8:28 ` Al Viro
@ 2025-05-09 10:59 ` Pavel Tikhomirov
2025-05-09 11:06 ` Pavel Tikhomirov
0 siblings, 1 reply; 4+ messages in thread
From: Pavel Tikhomirov @ 2025-05-09 10:59 UTC (permalink / raw)
To: Al Viro, linux-fsdevel; +Cc: Christian Brauner
On 5/9/25 16:28, Al Viro wrote:
> On Fri, May 09, 2025 at 09:26:28AM +0100, Al Viro wrote:
>> AFAICS, 9ffb14ef61ba "move_mount: allow to add a mount into an existing
>> group" breaks assertions on ->mnt_share/->mnt_slave. For once, the data
>> structures in question are actually documented.
>>
>> Documentation/filesystem/sharedsubtree.rst:
>> All vfsmounts in a peer group have the same ->mnt_master. If it is
>> non-NULL, they form a contiguous (ordered) segment of slave list.
>>
>> fs/pnode.c:
>> * Note that peer groups form contiguous segments of slave lists.
>>
>> fs/namespace.c:do_set_group():
>> if (IS_MNT_SLAVE(from)) {
>> struct mount *m = from->mnt_master;
>>
>> list_add(&to->mnt_slave, &m->mnt_slave_list);
>> to->mnt_master = m;
>> }
>>
>> if (IS_MNT_SHARED(from)) {
>> to->mnt_group_id = from->mnt_group_id;
>> list_add(&to->mnt_share, &from->mnt_share);
>> lock_mount_hash();
>> set_mnt_shared(to);
>> unlock_mount_hash();
>> }
>>
>> Note that 'to' goes right after 'from' in ->mnt_share (i.e. peer group
>> list) and into the beginning of the slave list 'from' belongs to. IOW,
>> contiguity gets broken if 'from' is both IS_MNT_SLAVE and IS_MNT_SHARED.
>> Which is what happens when the peer group 'from' is in gets propagation
>> from somewhere.
Agreed, list ordering consistency looks broken by my commit.
>>
>> It's not hard to fix - something like
>>
>> if (IS_MNT_SHARED(from)) {
>> to->mnt_group_id = from->mnt_group_id;
>> list_add(&to->mnt_share, &from->mnt_share);
>> if (IS_MNT_SLAVE(from))
>> list_add(&to->mnt_slave, &from->mnt_slave);
>> to->mnt_master = from->mnt_master;
>> lock_mount_hash();
>> set_mnt_shared(to);
>> unlock_mount_hash();
>> } else if (IS_MNT_SLAVE(from)) {
>> to->mnt_master = from->mnt_master;
>> list_add(&to->mnt_slave, &from->mnt_master->mnt_slave_list);
>> }
>>
>> ought to do it.
Yes it should work.
In case (IS_MNT_SLAVE(from) && !IS_MNT_SHARED(from)) we can probably
also do:
list_add(&to->mnt_slave, &from->mnt_slave);
as next slave after "from" is definitely not from the same shared group
with "from" (as it's not in a shared group) so we won't break list
continuity.
That will allow to simplify code change to:
if (IS_MNT_SLAVE(from)) {
struct mount *m = from->mnt_master;
- list_add(&to->mnt_slave, &m->mnt_slave_list);
+ list_add(&to->mnt_slave, &from->mnt_slave);
to->mnt_master = m;
}
if (IS_MNT_SHARED(from)) {
to->mnt_group_id = from->mnt_group_id;
list_add(&to->mnt_share, &from->mnt_share);
lock_mount_hash();
set_mnt_shared(to);
unlock_mount_hash();
}
If I'm not missing something (didn't test yet).
>> I'm nowhere near sufficiently awake right now to put
>> together a regression test, but unless I'm missing something subtle, it
>> should be possible to get a fairly obvious breakage of propagate_mnt()
>> out of that...
I managed to see weird behavior like that:
# rmdir /tmp/{A,B,C,D,E,Z}
# unshare -m
mkdir /tmp/{A,B,C,D,E,Z}
mount --make-rprivate /
mount -t tmpfs tmpfs /tmp/A
mount --bind /tmp/A /tmp/Z
mount --make-shared /tmp/A
mount --bind /tmp/A /tmp/B
mount --make-slave /tmp/B
mount --make-shared /tmp/B
mount --bind /tmp/B /tmp/C
mount --bind /tmp/C /tmp/D
mount --bind /tmp/D /tmp/E
./setgroup-v2 /tmp/C /tmp/Z
mkdir /tmp/A/subdir
mount -t tmpfs tmpfs /tmp/A/subdir
This creates 16 subdir mounts instead of expected 6:
cat /proc/self/mountinfo | grep /tmp/
1071 1065 0:109 / /tmp/A rw,relatime shared:556 - tmpfs tmpfs
rw,seclabel,inode64
1073 1065 0:109 / /tmp/Z rw,relatime shared:1040 master:556 - tmpfs
tmpfs rw,seclabel,inode64
1076 1065 0:109 / /tmp/B rw,relatime shared:1040 master:556 - tmpfs
tmpfs rw,seclabel,inode64
1077 1065 0:109 / /tmp/C rw,relatime shared:1040 master:556 - tmpfs
tmpfs rw,seclabel,inode64
1078 1065 0:109 / /tmp/D rw,relatime shared:1040 master:556 - tmpfs
tmpfs rw,seclabel,inode64
1079 1065 0:109 / /tmp/E rw,relatime shared:1040 master:556 - tmpfs
tmpfs rw,seclabel,inode64
1080 1071 0:136 / /tmp/A/subdir rw,relatime shared:1041 - tmpfs tmpfs
rw,seclabel,inode64
1081 1073 0:136 / /tmp/Z/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1082 1078 0:136 / /tmp/D/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1083 1079 0:136 / /tmp/E/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1084 1076 0:136 / /tmp/B/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1085 1077 0:136 / /tmp/C/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1086 1084 0:136 / /tmp/B/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1087 1085 0:136 / /tmp/C/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1088 1081 0:136 / /tmp/Z/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1089 1082 0:136 / /tmp/D/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1090 1083 0:136 / /tmp/E/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1142 1089 0:136 / /tmp/D/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1143 1090 0:136 / /tmp/E/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1144 1086 0:136 / /tmp/B/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1145 1087 0:136 / /tmp/C/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
1146 1088 0:136 / /tmp/Z/subdir rw,relatime shared:1042 master:1041 -
tmpfs tmpfs rw,seclabel,inode64
Maybe that can be converted to a regression test.
>
> Not sufficiently awake is right - wrong address on Cc... Anyway, bedtime
> for me...
--
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [BUG] propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
2025-05-09 10:59 ` Pavel Tikhomirov
@ 2025-05-09 11:06 ` Pavel Tikhomirov
0 siblings, 0 replies; 4+ messages in thread
From: Pavel Tikhomirov @ 2025-05-09 11:06 UTC (permalink / raw)
To: Al Viro, linux-fsdevel; +Cc: Christian Brauner
On 5/9/25 18:59, Pavel Tikhomirov wrote:
>
>
> On 5/9/25 16:28, Al Viro wrote:
>> On Fri, May 09, 2025 at 09:26:28AM +0100, Al Viro wrote:
>>> AFAICS, 9ffb14ef61ba "move_mount: allow to add a mount into an existing
>>> group" breaks assertions on ->mnt_share/->mnt_slave. For once, the data
>>> structures in question are actually documented.
>>>
>>> Documentation/filesystem/sharedsubtree.rst:
>>> All vfsmounts in a peer group have the same ->mnt_master.
>>> If it is
>>> non-NULL, they form a contiguous (ordered) segment of slave list.
>>>
>>> fs/pnode.c:
>>> * Note that peer groups form contiguous segments of slave lists.
>>>
>>> fs/namespace.c:do_set_group():
>>> if (IS_MNT_SLAVE(from)) {
>>> struct mount *m = from->mnt_master;
>>>
>>> list_add(&to->mnt_slave, &m->mnt_slave_list);
>>> to->mnt_master = m;
>>> }
>>>
>>> if (IS_MNT_SHARED(from)) {
>>> to->mnt_group_id = from->mnt_group_id;
>>> list_add(&to->mnt_share, &from->mnt_share);
>>> lock_mount_hash();
>>> set_mnt_shared(to);
>>> unlock_mount_hash();
>>> }
>>>
>>> Note that 'to' goes right after 'from' in ->mnt_share (i.e. peer group
>>> list) and into the beginning of the slave list 'from' belongs to. IOW,
>>> contiguity gets broken if 'from' is both IS_MNT_SLAVE and IS_MNT_SHARED.
>>> Which is what happens when the peer group 'from' is in gets propagation
>>> from somewhere.
>
> Agreed, list ordering consistency looks broken by my commit.
>
>>>
>>> It's not hard to fix - something like
>>>
>>> if (IS_MNT_SHARED(from)) {
>>> to->mnt_group_id = from->mnt_group_id;
>>> list_add(&to->mnt_share, &from->mnt_share);
>>> if (IS_MNT_SLAVE(from))
>>> list_add(&to->mnt_slave, &from->mnt_slave);
>>> to->mnt_master = from->mnt_master;
>>> lock_mount_hash();
>>> set_mnt_shared(to);
>>> unlock_mount_hash();
>>> } else if (IS_MNT_SLAVE(from)) {
>>> to->mnt_master = from->mnt_master;
>>> list_add(&to->mnt_slave, &from->mnt_master->mnt_slave_list);
>>> }
>>>
>>> ought to do it.
>
> Yes it should work.
>
> In case (IS_MNT_SLAVE(from) && !IS_MNT_SHARED(from)) we can probably
> also do:
>
> list_add(&to->mnt_slave, &from->mnt_slave);
>
> as next slave after "from" is definitely not from the same shared group
> with "from" (as it's not in a shared group) so we won't break list
> continuity.
>
> That will allow to simplify code change to:
>
> if (IS_MNT_SLAVE(from)) {
> struct mount *m = from->mnt_master;
>
> - list_add(&to->mnt_slave, &m->mnt_slave_list);
> + list_add(&to->mnt_slave, &from->mnt_slave);
> to->mnt_master = m;
> }
>
> if (IS_MNT_SHARED(from)) {
> to->mnt_group_id = from->mnt_group_id;
> list_add(&to->mnt_share, &from->mnt_share);
> lock_mount_hash();
> set_mnt_shared(to);
> unlock_mount_hash();
> }
>
> If I'm not missing something (didn't test yet).
>
>>> I'm nowhere near sufficiently awake right now to put
>>> together a regression test, but unless I'm missing something subtle, it
>>> should be possible to get a fairly obvious breakage of propagate_mnt()
>>> out of that...
>
> I managed to see weird behavior like that:
>
> # rmdir /tmp/{A,B,C,D,E,Z}
> # unshare -m
> mkdir /tmp/{A,B,C,D,E,Z}
> mount --make-rprivate /
> mount -t tmpfs tmpfs /tmp/A
> mount --bind /tmp/A /tmp/Z
> mount --make-shared /tmp/A
> mount --bind /tmp/A /tmp/B
> mount --make-slave /tmp/B
> mount --make-shared /tmp/B
> mount --bind /tmp/B /tmp/C
> mount --bind /tmp/C /tmp/D
> mount --bind /tmp/D /tmp/E
> ./setgroup-v2 /tmp/C /tmp/Z
Note I put ./setgroup-v2 code here:
https://github.com/Snorch/linux-helpers/blob/master/mount_set_group.c
> mkdir /tmp/A/subdir
> mount -t tmpfs tmpfs /tmp/A/subdir
>
> This creates 16 subdir mounts instead of expected 6:
>
> cat /proc/self/mountinfo | grep /tmp/
> 1071 1065 0:109 / /tmp/A rw,relatime shared:556 - tmpfs tmpfs
> rw,seclabel,inode64
> 1073 1065 0:109 / /tmp/Z rw,relatime shared:1040 master:556 - tmpfs
> tmpfs rw,seclabel,inode64
> 1076 1065 0:109 / /tmp/B rw,relatime shared:1040 master:556 - tmpfs
> tmpfs rw,seclabel,inode64
> 1077 1065 0:109 / /tmp/C rw,relatime shared:1040 master:556 - tmpfs
> tmpfs rw,seclabel,inode64
> 1078 1065 0:109 / /tmp/D rw,relatime shared:1040 master:556 - tmpfs
> tmpfs rw,seclabel,inode64
> 1079 1065 0:109 / /tmp/E rw,relatime shared:1040 master:556 - tmpfs
> tmpfs rw,seclabel,inode64
> 1080 1071 0:136 / /tmp/A/subdir rw,relatime shared:1041 - tmpfs tmpfs
> rw,seclabel,inode64
> 1081 1073 0:136 / /tmp/Z/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1082 1078 0:136 / /tmp/D/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1083 1079 0:136 / /tmp/E/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1084 1076 0:136 / /tmp/B/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1085 1077 0:136 / /tmp/C/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1086 1084 0:136 / /tmp/B/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1087 1085 0:136 / /tmp/C/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1088 1081 0:136 / /tmp/Z/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1089 1082 0:136 / /tmp/D/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1090 1083 0:136 / /tmp/E/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1142 1089 0:136 / /tmp/D/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1143 1090 0:136 / /tmp/E/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1144 1086 0:136 / /tmp/B/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1145 1087 0:136 / /tmp/C/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
> 1146 1088 0:136 / /tmp/Z/subdir rw,relatime shared:1042 master:1041 -
> tmpfs tmpfs rw,seclabel,inode64
>
> Maybe that can be converted to a regression test.
>
>>
>> Not sufficiently awake is right - wrong address on Cc... Anyway, bedtime
>> for me...
>
>
>
--
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-05-09 11:07 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-09 8:26 [BUG] propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2) Al Viro
2025-05-09 8:28 ` Al Viro
2025-05-09 10:59 ` Pavel Tikhomirov
2025-05-09 11:06 ` Pavel Tikhomirov
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.