Linux filesystem development
* Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace
       [not found] ` <20260602011907.GM2636677@ZenIV>
@ 2026-06-02  1:35   ` Al Viro
  2026-06-02  2:04     ` [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace) Al Viro
  0 siblings, 1 reply; 8+ messages in thread
From: Al Viro @ 2026-06-02  1:35 UTC (permalink / raw)
  To: Denis Arefev
  Cc: linux-fsdevel, Jens Axboe, linux-block, linux-kernel, lvc-project,
	stable

On Tue, Jun 02, 2026 at 02:19:07AM +0100, Al Viro wrote:
> On Thu, May 21, 2026 at 10:28:56AM +0300, Denis Arefev wrote:
> > The bdev pseudo-filesystem is an internal kernel filesystem with which
> > userspace should not interfere. Unregister it so that userspace cannot
> > even attempt to mount it.
> > 
> > This fixes a bug [1] that occurs when attempting to access files,
> > because the system call move_mount() uses pointers declared in the
> > inode_operations structure, which for the bdev pseudo-filesystem
> > are always equal to 0. `inode->i_op = &empty_iops;`
> 
> What?  init_pseudo() sets SB_NOUSER; what are you talking about?

... which doesn't suffice, apparently, since now bdev has become
mountable, along with the rest of pseudo-fs.  *THAT* is a bug.

> And assuming you've somehow managed to mount the sucker, which
> ->i_op method had been accessed?

->lookup(), apparently.  Which means that 'directory' should've been
rejected by d_can_lookup(), no matter which filesystem it's been
from.  Which might or might not be a bug in its own right.

In any case, NAK on that patch - it's papering over the real bug that
has nothing to do with block layer.

mount -t bdev none /mnt

must fail, same as for pipefs, sockfs, etc.  It doesn't.
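
For anyone who wants to check a given kernel, here is a minimal sketch of a probe that goes through the new mount API instead of mount(2). It assumes the libc headers define SYS_fsopen, SYS_fsconfig and SYS_fsmount; the helper name probe_fsmount() and the DEMO_ constant are made up for this illustration. It only asks for a detached mount and never attaches it anywhere.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value of FSCONFIG_CMD_CREATE in the uapi enum fsconfig_command, spelled
 * out under a hypothetical name so that <linux/mount.h> is not needed. */
#define DEMO_FSCONFIG_CMD_CREATE 6

/*
 * Hypothetical helper: try to obtain a detached mount of @fstype via
 * fsopen(2)/fsconfig(2)/fsmount(2).  Returns 0 if the kernel handed out a
 * mount fd (the type is mountable through the new API), or -errno from
 * the first step that failed.
 */
static int probe_fsmount(const char *fstype)
{
	assert(fstype != NULL);
#if defined(SYS_fsopen) && defined(SYS_fsconfig) && defined(SYS_fsmount)
	int ret = 0;
	long fs_fd, mnt_fd;

	fs_fd = syscall(SYS_fsopen, fstype, 0U);
	if (fs_fd < 0)
		return -errno;

	if (syscall(SYS_fsconfig, (int)fs_fd, DEMO_FSCONFIG_CMD_CREATE,
		    NULL, NULL, 0) < 0) {
		ret = -errno;
	} else {
		mnt_fd = syscall(SYS_fsmount, (int)fs_fd, 0U, 0U);
		if (mnt_fd < 0)
			ret = -errno;
		else
			close((int)mnt_fd);	/* drops the detached mount */
	}
	close((int)fs_fd);
	return ret;
#else
	/* Headers too old to know the syscall numbers. */
	return -ENOSYS;
#endif
}
```

If the kernel honours SB_NOUSER on this path, a privileged probe_fsmount("bdev") should come back with a negative errno from one of the three steps; a return of 0 means the pseudo-filesystem was mountable. An unprivileged caller should normally get -EPERM from fsopen(2) either way.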

fsdevel Cc'd, as it should've been from the very beginning.


* [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace)
  2026-06-02  1:35   ` [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace Al Viro
@ 2026-06-02  2:04     ` Al Viro
  2026-06-02  9:11       ` Jan Kara
  2026-06-02 14:55       ` Christian Brauner
  0 siblings, 2 replies; 8+ messages in thread
From: Al Viro @ 2026-06-02  2:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christian Brauner, Jan Kara, linux-fsdevel, Jens Axboe,
	linux-block, linux-kernel, lvc-project, stable, Denis Arefev

one should *not* be allowed to mount one of those, new API or not.

Reported-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
[[ I still want to see the rest of the reproducer - report smells like a missing
d_can_lookup() somewhere, on top of fsmount(2) bug]]
diff --git a/fs/namespace.c b/fs/namespace.c
index fe919abd2f01..17777c837683 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4499,6 +4499,10 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
 	new_mnt = vfs_create_mount(fc);
 	if (IS_ERR(new_mnt))
 		return PTR_ERR(new_mnt);
+	if (new_mnt->mnt_sb->s_flags & SB_NOUSER) {
+		mntput(new_mnt);
+		return -EINVAL;
+	}
 	new_mnt->mnt_flags = mnt_flags;
 
 	new_path.dentry = dget(fc->root);


* Re: [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace)
  2026-06-02  2:04     ` [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace) Al Viro
@ 2026-06-02  9:11       ` Jan Kara
  2026-06-02 13:23         ` Arefev
  2026-06-02 14:07         ` Al Viro
  2026-06-02 14:55       ` Christian Brauner
  1 sibling, 2 replies; 8+ messages in thread
From: Jan Kara @ 2026-06-02  9:11 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Christian Brauner, Jan Kara, linux-fsdevel,
	Jens Axboe, linux-block, linux-kernel, lvc-project, stable,
	Denis Arefev

On Tue 02-06-26 03:04:44, Al Viro wrote:
> one should *not* be allowed to mount one of those, new API or not.
> 
> Reported-by: Denis Arefev <arefev@swemel.ru>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Won't it make sense to actually check fc->sb_flags before we call
vfs_create_mount()? Otherwise it looks good to me.

								Honza

> ---
> [[ I still want to see the rest of the reproducer - report smells like a missing
> d_can_lookup() somewhere, on top of fsmount(2) bug]]
> diff --git a/fs/namespace.c b/fs/namespace.c
> index fe919abd2f01..17777c837683 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -4499,6 +4499,10 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
>  	new_mnt = vfs_create_mount(fc);
>  	if (IS_ERR(new_mnt))
>  		return PTR_ERR(new_mnt);
> +	if (new_mnt->mnt_sb->s_flags & SB_NOUSER) {
> +		mntput(new_mnt);
> +		return -EINVAL;
> +	}
>  	new_mnt->mnt_flags = mnt_flags;
>  
>  	new_path.dentry = dget(fc->root);
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace)
  2026-06-02  9:11       ` Jan Kara
@ 2026-06-02 13:23         ` Arefev
  2026-06-02 14:54           ` Al Viro
  2026-06-02 14:07         ` Al Viro
  1 sibling, 1 reply; 8+ messages in thread
From: Arefev @ 2026-06-02 13:23 UTC (permalink / raw)
  To: Jan Kara, Al Viro
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, Jens Axboe,
	linux-block, linux-kernel, lvc-project, stable


02.06.2026 12:11, Jan Kara wrote:
> On Tue 02-06-26 03:04:44, Al Viro wrote:
>> one should *not* be allowed to mount one of those, new API or not.
>>
>> Reported-by: Denis Arefev <arefev@swemel.ru>
>> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> Won't it make sense to actually check fc->sb_flags before we call
> vfs_create_mount()? Otherwise it looks good to me.
>
> 								Honza

Hi all.

The sequence of system calls before the crash could be as follows:

fsopen("bdev", ...)
fsconfig(fd_fs, FSCONFIG_CMD_CREATE, 0,0,0)
fsmount(fd_fs, 0,0)
move_mount(fd_mnt, "", AT_FDCWD, "./file1", 0x46ul)
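
For readers decoding the last argument: assuming the MOVE_MOUNT_* values from the uapi header <linux/mount.h>, 0x46 can be broken down with a small helper like the one below. This is only a sketch; the DEMO_ prefixed names and decode_move_mount_flags() are hypothetical, used so that it builds without kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies of the MOVE_MOUNT_* bits from the uapi <linux/mount.h>, renamed
 * so that this sketch does not depend on kernel headers. */
#define DEMO_MOVE_MOUNT_F_SYMLINKS	0x01u
#define DEMO_MOVE_MOUNT_F_AUTOMOUNTS	0x02u
#define DEMO_MOVE_MOUNT_F_EMPTY_PATH	0x04u
#define DEMO_MOVE_MOUNT_T_SYMLINKS	0x10u
#define DEMO_MOVE_MOUNT_T_AUTOMOUNTS	0x20u
#define DEMO_MOVE_MOUNT_T_EMPTY_PATH	0x40u

struct demo_flag_name {
	unsigned int bit;
	const char *name;
};

static const struct demo_flag_name demo_flags[] = {
	{ DEMO_MOVE_MOUNT_F_SYMLINKS,	"MOVE_MOUNT_F_SYMLINKS" },
	{ DEMO_MOVE_MOUNT_F_AUTOMOUNTS,	"MOVE_MOUNT_F_AUTOMOUNTS" },
	{ DEMO_MOVE_MOUNT_F_EMPTY_PATH,	"MOVE_MOUNT_F_EMPTY_PATH" },
	{ DEMO_MOVE_MOUNT_T_SYMLINKS,	"MOVE_MOUNT_T_SYMLINKS" },
	{ DEMO_MOVE_MOUNT_T_AUTOMOUNTS,	"MOVE_MOUNT_T_AUTOMOUNTS" },
	{ DEMO_MOVE_MOUNT_T_EMPTY_PATH,	"MOVE_MOUNT_T_EMPTY_PATH" },
};

/* Write a '|'-separated list of the known flag names set in @flags into
 * @buf (at most @len bytes including the terminating NUL).  Bits outside
 * the table above are silently ignored. */
static void decode_move_mount_flags(unsigned int flags, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(demo_flags) / sizeof(demo_flags[0]); i++) {
		if (!(flags & demo_flags[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, demo_flags[i].name, len - strlen(buf) - 1);
	}
}
```

With a 128-byte buffer, decode_move_mount_flags(0x46, buf, sizeof(buf)) leaves "MOVE_MOUNT_F_AUTOMOUNTS|MOVE_MOUNT_F_EMPTY_PATH|MOVE_MOUNT_T_EMPTY_PATH" in buf.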

The system call executed at the time of the crash:

open("/dev/media0", ...);

Simplified stacktrace:

path_openat
|-> link_path_walk
    |-> walk_component
       |-> __lookup_slow
          |-> ld = inode->i_op->lookup(inode, dentry, flags);   <- Oops


Searching for possible solutions in the commit history yielded the 
following result:

commit fd3e007f6c6a0f677e4ee8aca4b9bab8ad6cab9a
commit 1a6e9e76b713d9632783efe78295ed3507fdad64
commit d6f2589ad561aa5fa39f347eca6942668b7560a1

Checking fc->sb_flags before calling vfs_create_mount() is a great idea
if it helps prevent crashes in two more filesystems, 'sockfs' and 'pipefs'.

Best regards, Denis.
>
>> ---
>> [[ I still want to see the rest of the reproducer - report smells like a missing
>> d_can_lookup() somewhere, on top of fsmount(2) bug]]
>> diff --git a/fs/namespace.c b/fs/namespace.c
>> index fe919abd2f01..17777c837683 100644
>> --- a/fs/namespace.c
>> +++ b/fs/namespace.c
>> @@ -4499,6 +4499,10 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
>>   	new_mnt = vfs_create_mount(fc);
>>   	if (IS_ERR(new_mnt))
>>   		return PTR_ERR(new_mnt);
>> +	if (new_mnt->mnt_sb->s_flags & SB_NOUSER) {
>> +		mntput(new_mnt);
>> +		return -EINVAL;
>> +	}
>>   	new_mnt->mnt_flags = mnt_flags;
>>   
>>   	new_path.dentry = dget(fc->root);


* Re: [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace)
  2026-06-02  9:11       ` Jan Kara
  2026-06-02 13:23         ` Arefev
@ 2026-06-02 14:07         ` Al Viro
  2026-06-08 10:22           ` Jan Kara
  1 sibling, 1 reply; 8+ messages in thread
From: Al Viro @ 2026-06-02 14:07 UTC (permalink / raw)
  To: Jan Kara
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, Jens Axboe,
	linux-block, linux-kernel, lvc-project, stable, Denis Arefev

On Tue, Jun 02, 2026 at 11:11:11AM +0200, Jan Kara wrote:
> On Tue 02-06-26 03:04:44, Al Viro wrote:
> > one should *not* be allowed to mount one of those, new API or not.
> > 
> > Reported-by: Denis Arefev <arefev@swemel.ru>
> > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> 
> Won't it make sense to actually check fc->sb_flags before we call
> vfs_create_mount()? Otherwise it looks good to me.

Interpretation of fc->sb_flags is up to your ->get_tree().  What matters
is ->s_flags in the resulting superblock; that's type-independent and
that's what we ought to check...
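
A toy userspace model of that distinction, in case it helps. This is not kernel code: every name below is invented for the example, and the "forced" ->get_tree() is a made-up filesystem that decides the superblock flags on its own, which is the case a check on the requested flags cannot see.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mirrors the bit the kernel uses for SB_NOUSER; hypothetical name. */
#define DEMO_SB_NOUSER	(1u << 31)

struct demo_sb {
	unsigned int s_flags;		/* flags of the resulting superblock */
};

struct demo_fc {
	unsigned int sb_flags;		/* flags requested via the context */
	void (*get_tree)(const struct demo_fc *fc, struct demo_sb *sb);
};

/* Made-up fs type: marks its superblock NOUSER regardless of the request. */
static void demo_forced_get_tree(const struct demo_fc *fc, struct demo_sb *sb)
{
	sb->s_flags = fc->sb_flags | DEMO_SB_NOUSER;
}

/* Made-up fs type: passes the requested flags straight through. */
static void demo_plain_get_tree(const struct demo_fc *fc, struct demo_sb *sb)
{
	sb->s_flags = fc->sb_flags;
}

/* Check done before the superblock exists: only sees the request. */
static int demo_check_before(const struct demo_fc *fc)
{
	assert(fc != NULL);
	return (fc->sb_flags & DEMO_SB_NOUSER) ? -EINVAL : 0;
}

/* Check done on the result: independent of how the type got there. */
static int demo_check_after(const struct demo_fc *fc)
{
	struct demo_sb sb = { 0 };

	assert(fc != NULL && fc->get_tree != NULL);
	fc->get_tree(fc, &sb);
	return (sb.s_flags & DEMO_SB_NOUSER) ? -EINVAL : 0;
}
```

For a context with sb_flags == 0 and the forced ->get_tree(), demo_check_before() returns 0 while demo_check_after() returns -EINVAL; with the plain one both return 0.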


* Re: [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace)
  2026-06-02 13:23         ` Arefev
@ 2026-06-02 14:54           ` Al Viro
  0 siblings, 0 replies; 8+ messages in thread
From: Al Viro @ 2026-06-02 14:54 UTC (permalink / raw)
  To: Arefev
  Cc: Jan Kara, Linus Torvalds, Christian Brauner, linux-fsdevel,
	Jens Axboe, linux-block, linux-kernel, lvc-project, stable

On Tue, Jun 02, 2026 at 04:23:21PM +0300, Arefev wrote:

> The sequence of system calls before the crash could be as follows:
> 
> fsopen("bdev", ...)
> fsconfig(fd_fs, FSCONFIG_CMD_CREATE, 0,0,0)
> fsmount(fd_fs, 0,0)
> move_mount(fd_mnt, "", AT_FDCWD, "./file1", 0x46ul)

	Huh?  Was "file1" a regular file, or was it actually
a directory?  AFAICS, the d_is_dir() mismatch would be rejected
by do_move_mount()...

> The system call executed at the time of the crash:
> 
> open("/dev/media0", ...);
> 
> Simplified stacktrace:
> 
> path_openat
> |-> link_path_walk
>    |-> walk_component
>       |-> __lookup_slow
>          |-> ld = inode->i_op->lookup(inode, dentry, flags);   <- Oops

How the hell does that thing bound on top of "./file1" lead to
resolution of "/dev/media0" walking anywhere near it?  Something's
missing here.

> Checking fc->sb_flags before calling vfs_create_mount() is a great idea
> if it helps prevent crashes in two more filesystems, 'sockfs' and 'pipefs'.

Calling vfs_create_mount() is not a problem; refusing to attach
the result if SB_NOUSER has ended up in ->s_flags is the right
thing to do, but I still would like to understand how did this call
of walk_component() manage to evade
		if (unlikely(!d_can_lookup(nd->path.dentry))) {
			if (nd->flags & LOOKUP_RCU) {
				if (!try_to_unlazy(nd))
					return -ECHILD;
			}
			return -ENOTDIR;
		}
on the previous iteration through link_path_walk() or, if it had been
the first one, the corresponding checks at chroot()/chdir()/fchdir() time.

Note that there are very legitimate objects with NULL ->lookup() - every
regular file is like that, obviously, but there also exist ones that look
like directories in mode bits, but still have NULL ->lookup().  See
d_flags_for_inode() and look for DCACHE_AUTODIR_TYPE there.

So whatever scenario has played out, you've got a call of walk_component()
with nd->path.dentry that should have failed d_can_lookup().  That ought
to have been prevented and this prevention would better be much closer
than anything fsmount(2) does.

Don't get me wrong - userland mounting of bdev and friends should not be
allowed, but that's not the only thing that went wrong in the reproducer.
BTW, how easy is it to trigger?  Is that "you need to run for a few months
on a bunch of boxen" or "run this sequence and it'll crash that way"?


* Re: [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace)
  2026-06-02  2:04     ` [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace) Al Viro
  2026-06-02  9:11       ` Jan Kara
@ 2026-06-02 14:55       ` Christian Brauner
  1 sibling, 0 replies; 8+ messages in thread
From: Christian Brauner @ 2026-06-02 14:55 UTC (permalink / raw)
  To: Linus Torvalds, Al Viro
  Cc: Christian Brauner, Jan Kara, linux-fsdevel, Jens Axboe,
	linux-block, linux-kernel, lvc-project, stable, Denis Arefev

On Tue, 02 Jun 2026 03:04:44 +0100, Al Viro wrote:
> one should *not* be allowed to mount one of those, new API or not.

Applied to the vfs-7.2.misc branch of the vfs/vfs.git tree.
Patches in the vfs-7.2.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-7.2.misc

[1/1] mount: honour SB_NOUSER in the new mount API 
      https://git.kernel.org/vfs/vfs/c/67d8c452fae1


* Re: [PATCH] make new mount API honour SB_NOUSER (was Re: [PATCH] block: Avoid mounting the bdev pseudo-filesystem in userspace)
  2026-06-02 14:07         ` Al Viro
@ 2026-06-08 10:22           ` Jan Kara
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2026-06-08 10:22 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Linus Torvalds, Christian Brauner, linux-fsdevel,
	Jens Axboe, linux-block, linux-kernel, lvc-project, stable,
	Denis Arefev

On Tue 02-06-26 15:07:51, Al Viro wrote:
> On Tue, Jun 02, 2026 at 11:11:11AM +0200, Jan Kara wrote:
> > On Tue 02-06-26 03:04:44, Al Viro wrote:
> > > one should *not* be allowed to mount one of those, new API or not.
> > > 
> > > Reported-by: Denis Arefev <arefev@swemel.ru>
> > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > 
> > Won't it make sense to actually check fc->sb_flags before we call
> > vfs_create_mount()? Otherwise it looks good to me.
> 
> Interpretation of fc->sb_flags is up to your ->get_tree().  What matters
> is ->s_flags in the resulting superblock; that's type-independent and
> that's what we ought to check...

Ah, right. Thanks for explanation!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

