public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
* [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
@ 2024-04-03 18:23 syzbot
  2024-04-03 23:51 ` syzbot
  0 siblings, 1 reply; 29+ messages in thread
From: syzbot @ 2024-04-03 18:23 UTC (permalink / raw)
  To: gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj

Hello,

syzbot found the following issue on:

HEAD commit:    fe46a7dd189e Merge tag 'sound-6.9-rc1' of git://git.kernel..
git tree:       upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=136eb2f6180000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4d90a36f0cab495a
dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17f1d93d180000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15c38139180000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/72ab73815344/disk-fe46a7dd.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/2d6d6b0d7071/vmlinux-fe46a7dd.xz
kernel image: https://storage.googleapis.com/syzbot-assets/48e275e5478b/bzImage-fe46a7dd.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+9a5b0ced8b1bfb238b56@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.8.0-syzkaller-08951-gfe46a7dd189e #0 Not tainted
------------------------------------------------------
syz-executor250/5062 is trying to acquire lock:
ffff888022c36888 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_llseek+0x7e/0x2a0 fs/kernfs/file.c:867

but task is already holding lock:
ffff88807edeadd8 (&ovl_i_lock_key[depth]){+.+.}-{3:3}, at: ovl_inode_lock fs/overlayfs/overlayfs.h:649 [inline]
ffff88807edeadd8 (&ovl_i_lock_key[depth]){+.+.}-{3:3}, at: ovl_llseek+0x26b/0x470 fs/overlayfs/file.c:214

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&ovl_i_lock_key[depth]){+.+.}-{3:3}:
       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       ovl_inode_lock_interruptible fs/overlayfs/overlayfs.h:654 [inline]
       ovl_nlink_start+0xdc/0x390 fs/overlayfs/util.c:1162
       ovl_do_remove+0x1fa/0xd90 fs/overlayfs/dir.c:893
       vfs_rmdir+0x367/0x4c0 fs/namei.c:4209
       do_rmdir+0x3b5/0x580 fs/namei.c:4268
       __do_sys_rmdir fs/namei.c:4287 [inline]
       __se_sys_rmdir fs/namei.c:4285 [inline]
       __x64_sys_rmdir+0x49/0x60 fs/namei.c:4285
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x6d/0x75

-> #1 (&ovl_i_mutex_dir_key[depth]){++++}-{3:3}:
       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
       down_read+0xb1/0xa40 kernel/locking/rwsem.c:1526
       inode_lock_shared include/linux/fs.h:803 [inline]
       lookup_slow+0x45/0x70 fs/namei.c:1708
       walk_component+0x2e1/0x410 fs/namei.c:2004
       lookup_last fs/namei.c:2461 [inline]
       path_lookupat+0x16f/0x450 fs/namei.c:2485
       filename_lookup+0x256/0x610 fs/namei.c:2514
       kern_path+0x35/0x50 fs/namei.c:2622
       lookup_bdev+0xc5/0x290 block/bdev.c:1072
       resume_store+0x1a0/0x710 kernel/power/hibernate.c:1235
       kernfs_fop_write_iter+0x3a4/0x500 fs/kernfs/file.c:334
       call_write_iter include/linux/fs.h:2108 [inline]
       new_sync_write fs/read_write.c:497 [inline]
       vfs_write+0xa84/0xcb0 fs/read_write.c:590
       ksys_write+0x1a0/0x2c0 fs/read_write.c:643
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x6d/0x75

-> #0 (&of->mutex){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       kernfs_fop_llseek+0x7e/0x2a0 fs/kernfs/file.c:867
       ovl_llseek+0x314/0x470 fs/overlayfs/file.c:218
       vfs_llseek fs/read_write.c:289 [inline]
       ksys_lseek fs/read_write.c:302 [inline]
       __do_sys_lseek fs/read_write.c:313 [inline]
       __se_sys_lseek fs/read_write.c:311 [inline]
       __x64_sys_lseek+0x153/0x1e0 fs/read_write.c:311
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x6d/0x75

other info that might help us debug this:

Chain exists of:
  &of->mutex --> &ovl_i_mutex_dir_key[depth] --> &ovl_i_lock_key[depth]

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ovl_i_lock_key[depth]);
                               lock(&ovl_i_mutex_dir_key[depth]);
                               lock(&ovl_i_lock_key[depth]);
  lock(&of->mutex);

 *** DEADLOCK ***

1 lock held by syz-executor250/5062:
 #0: ffff88807edeadd8 (&ovl_i_lock_key[depth]){+.+.}-{3:3}, at: ovl_inode_lock fs/overlayfs/overlayfs.h:649 [inline]
 #0: ffff88807edeadd8 (&ovl_i_lock_key[depth]){+.+.}-{3:3}, at: ovl_llseek+0x26b/0x470 fs/overlayfs/file.c:214

stack backtrace:
CPU: 1 PID: 5062 Comm: syz-executor250 Not tainted 6.8.0-syzkaller-08951-gfe46a7dd189e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
 lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
 __mutex_lock_common kernel/locking/mutex.c:608 [inline]
 __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
 kernfs_fop_llseek+0x7e/0x2a0 fs/kernfs/file.c:867
 ovl_llseek+0x314/0x470 fs/overlayfs/file.c:218
 vfs_llseek fs/read_write.c:289 [inline]
 ksys_lseek fs/read_write.c:302 [inline]
 __do_sys_lseek fs/read_write.c:313 [inline]
 __se_sys_lseek fs/read_write.c:311 [inline]
 __x64_sys_lseek+0x153/0x1e0 fs/read_write.c:311
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7f0e2bdfd219
Code: 48 83 c4 28 c3 e8 67 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd2f80f3f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000008
RAX: ffffffffffffffda RBX: 00007ffd2f80f400 RCX: 00007f0e2bdfd219
RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 00007ffd2f80f408 R08: 00007f0e2bdca000 R09: 00007f0e2bdca000
R10: 00007f0e2bdca000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffd2f80f668 R14: 0000000000000001 R15: 0000000000000001


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-03 18:23 [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek syzbot
@ 2024-04-03 23:51 ` syzbot
  2024-04-04  6:54   ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: syzbot @ 2024-04-03 23:51 UTC (permalink / raw)
  To: gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj, valesini

syzbot has bisected this issue to:

commit 0fedefd4c4e33dd24f726b13b5d7c143e2b483be
Author: Valentine Sinitsyn <valesini@yandex-team.ru>
Date:   Mon Sep 25 08:40:12 2023 +0000

    kernfs: sysfs: support custom llseek method for sysfs entries

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17cb5e03180000
start commit:   fe46a7dd189e Merge tag 'sound-6.9-rc1' of git://git.kernel..
git tree:       upstream
final oops:     https://syzkaller.appspot.com/x/report.txt?x=142b5e03180000
console output: https://syzkaller.appspot.com/x/log.txt?x=102b5e03180000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4d90a36f0cab495a
dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17f1d93d180000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15c38139180000

Reported-by: syzbot+9a5b0ced8b1bfb238b56@syzkaller.appspotmail.com
Fixes: 0fedefd4c4e3 ("kernfs: sysfs: support custom llseek method for sysfs entries")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-03 23:51 ` syzbot
@ 2024-04-04  6:54   ` Amir Goldstein
  2024-04-04  8:11     ` Al Viro
  2024-04-05 15:08     ` Amir Goldstein
  0 siblings, 2 replies; 29+ messages in thread
From: Amir Goldstein @ 2024-04-04  6:54 UTC (permalink / raw)
  To: syzbot
  Cc: gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj, valesini,
	Christoph Hellwig, Christian Brauner, Jan Kara, Miklos Szeredi,
	Al Viro

On Thu, Apr 4, 2024 at 2:51 AM syzbot
<syzbot+9a5b0ced8b1bfb238b56@syzkaller.appspotmail.com> wrote:
>
> syzbot has bisected this issue to:
>
> commit 0fedefd4c4e33dd24f726b13b5d7c143e2b483be
> Author: Valentine Sinitsyn <valesini@yandex-team.ru>
> Date:   Mon Sep 25 08:40:12 2023 +0000
>
>     kernfs: sysfs: support custom llseek method for sysfs entries
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17cb5e03180000
> start commit:   fe46a7dd189e Merge tag 'sound-6.9-rc1' of git://git.kernel..
> git tree:       upstream
> final oops:     https://syzkaller.appspot.com/x/report.txt?x=142b5e03180000
> console output: https://syzkaller.appspot.com/x/log.txt?x=102b5e03180000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=4d90a36f0cab495a
> dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17f1d93d180000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15c38139180000
>
> Reported-by: syzbot+9a5b0ced8b1bfb238b56@syzkaller.appspotmail.com
> Fixes: 0fedefd4c4e3 ("kernfs: sysfs: support custom llseek method for sysfs entries")
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
>

I think this commit is only the trigger for lockdep warning in this
specific scenario, but the conceptual issue existed before that
for example, with read from sysfs, which also can take of->mutex.

I think (not sure) that the potential deadlock is real, not a false
positive. OTOH, hibernation code may be crawling with potential
and more likely deadlocks...

The conceptual issue (I think) is this:

Overlayfs is a stacked filesystem which regularly calls vfs helpers
such as path lookup on other filesystems.
This specialized behavior is accompanied with a declaration of
s_stack_depth > 0, annotating ovl inode locks per stack depth
(ovl_lockdep_annotate_inode_mutex_key) and restricting the
types of filesystems that are allowed for writable upper layer.

In the lockdep dependency chain, overlayfs inode lock is taken
before kernfs internal of->mutex, where kernfs (sysfs) is the lower
layer of overlayfs, which is sane.

With /sys/power/resume (and probably other files), sysfs also
behaves as a stacking filesystem, calling vfs helpers, such as
lookup_bdev() -> kern_path(), which is a behavior of a stacked
filesystem, without all the precautions that comes with behaving
as a stacked filesystem.

If an overlayfs path is written into /sys/power/resume and that
overlayfs has sysfs as its lower layer, then there may be a small
chance of hitting the ABBA deadlock, but there could very well
be some conditions that prevent it.

It's a shame that converting blockdev path to major:minor is
not always done as a separate step by usersapce, but not much
to do about it now...

I don't think that Christoph's refactoring to blockdev interfaces
have changed anything in this regard, but I wasn't sure so CCed
the developers involved.

A relatively easy way to avert this use case is to define a lookup
flag LOOKUP_NO_STACKED, which does not traverse into any
stacked fs (s_stack_depth > 0) and use this flag for looking up
the hibernation blockdev.

I suppose writing the hibernation blockdev from inside containers
is not a common thing to do, so this mitigation could be good enough.

Thoughts?
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04  6:54   ` Amir Goldstein
@ 2024-04-04  8:11     ` Al Viro
  2024-04-04  8:21       ` Al Viro
  2024-04-05 15:08     ` Amir Goldstein
  1 sibling, 1 reply; 29+ messages in thread
From: Al Viro @ 2024-04-04  8:11 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj,
	valesini, Christoph Hellwig, Christian Brauner, Jan Kara,
	Miklos Szeredi

On Thu, Apr 04, 2024 at 09:54:35AM +0300, Amir Goldstein wrote:
> 
> In the lockdep dependency chain, overlayfs inode lock is taken
> before kernfs internal of->mutex, where kernfs (sysfs) is the lower
> layer of overlayfs, which is sane.
> 
> With /sys/power/resume (and probably other files), sysfs also
> behaves as a stacking filesystem, calling vfs helpers, such as
> lookup_bdev() -> kern_path(), which is a behavior of a stacked
> filesystem, without all the precautions that comes with behaving
> as a stacked filesystem.

No.  This is far worse than anything stacked filesystems do - it's
an arbitrary pathname resolution while holding a lock.
It's not local.  Just about anything (including automounts, etc.)
can be happening there and it pushes the lock in question outside
of *ALL* pathwalk-related locks.  Pathname doesn't have to
resolve to anything on overlayfs - it can just go through
a symlink on it, or walk into it and traverse a bunch of ..
afterwards, etc.

Don't confuse that with stacking - it's not even close.
You can't use that anywhere near overlayfs layers.

Maybe isolate it into a separate filesystem, to be automounted
on /sys/power.  And make anyone playing with overlayfs with
sysfs as a layer mount the damn thing on top of power/ in your
overlayfs.  But using that thing as a part of layer is
a non-starter.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04  8:11     ` Al Viro
@ 2024-04-04  8:21       ` Al Viro
  2024-04-04  8:40         ` Al Viro
  2024-04-04  9:33         ` Amir Goldstein
  0 siblings, 2 replies; 29+ messages in thread
From: Al Viro @ 2024-04-04  8:21 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj,
	valesini, Christoph Hellwig, Christian Brauner, Jan Kara,
	Miklos Szeredi

On Thu, Apr 04, 2024 at 09:11:22AM +0100, Al Viro wrote:
> On Thu, Apr 04, 2024 at 09:54:35AM +0300, Amir Goldstein wrote:
> > 
> > In the lockdep dependency chain, overlayfs inode lock is taken
> > before kernfs internal of->mutex, where kernfs (sysfs) is the lower
> > layer of overlayfs, which is sane.
> > 
> > With /sys/power/resume (and probably other files), sysfs also
> > behaves as a stacking filesystem, calling vfs helpers, such as
> > lookup_bdev() -> kern_path(), which is a behavior of a stacked
> > filesystem, without all the precautions that comes with behaving
> > as a stacked filesystem.
> 
> No.  This is far worse than anything stacked filesystems do - it's
> an arbitrary pathname resolution while holding a lock.
> It's not local.  Just about anything (including automounts, etc.)
> can be happening there and it pushes the lock in question outside
> of *ALL* pathwalk-related locks.  Pathname doesn't have to
> resolve to anything on overlayfs - it can just go through
> a symlink on it, or walk into it and traverse a bunch of ..
> afterwards, etc.
> 
> Don't confuse that with stacking - it's not even close.
> You can't use that anywhere near overlayfs layers.
> 
> Maybe isolate it into a separate filesystem, to be automounted
> on /sys/power.  And make anyone playing with overlayfs with
> sysfs as a layer mount the damn thing on top of power/ in your
> overlayfs.  But using that thing as a part of layer is
> a non-starter.

Incidentally, why do you need to lock overlayfs inode to call
vfs_llseek() on the underlying file?  It might (or might not)
need to lock the underlying file (for things like ->i_size,
etc.), but that will be done by ->llseek() instance and it
would deal with the inode in the layer, not overlayfs one.

Similar question applies to ovl_write_iter() - why do you
need to hold the overlayfs inode locked during the call of
backing_file_write_iter()?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04  8:21       ` Al Viro
@ 2024-04-04  8:40         ` Al Viro
  2024-04-04  9:33         ` Amir Goldstein
  1 sibling, 0 replies; 29+ messages in thread
From: Al Viro @ 2024-04-04  8:40 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj,
	valesini, Christoph Hellwig, Christian Brauner, Jan Kara,
	Miklos Szeredi

On Thu, Apr 04, 2024 at 09:21:10AM +0100, Al Viro wrote:

> Similar question applies to ovl_write_iter() - why do you
> need to hold the overlayfs inode locked during the call of
> backing_file_write_iter()?

Consider the scenario when unlink() is called on that sucker
during the write() that triggers that pathwalk.  We have

unlink: blocked on overlayfs inode of file, while holding the parent directory.
write: holding the overlayfs inode of file and trying to resolve a pathname
that contains .../power/suspend_stats/../../...; blocked on attempt to lock
parent so we could do a lookup in it.

No llseek involved anywhere, kernfs of->mutex held, but not contended,
deadlock purely on ->i_rwsem of overlayfs inodes.

Holding overlayfs inode locked during the call of lookup_bdev() is really
no-go.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04  8:21       ` Al Viro
  2024-04-04  8:40         ` Al Viro
@ 2024-04-04  9:33         ` Amir Goldstein
  2024-04-04 22:01           ` Al Viro
                             ` (3 more replies)
  1 sibling, 4 replies; 29+ messages in thread
From: Amir Goldstein @ 2024-04-04  9:33 UTC (permalink / raw)
  To: Al Viro
  Cc: syzbot, gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj,
	valesini, Christoph Hellwig, Christian Brauner, Jan Kara,
	Miklos Szeredi

On Thu, Apr 4, 2024 at 11:21 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Apr 04, 2024 at 09:11:22AM +0100, Al Viro wrote:
> > On Thu, Apr 04, 2024 at 09:54:35AM +0300, Amir Goldstein wrote:
> > >
> > > In the lockdep dependency chain, overlayfs inode lock is taken
> > > before kernfs internal of->mutex, where kernfs (sysfs) is the lower
> > > layer of overlayfs, which is sane.
> > >
> > > With /sys/power/resume (and probably other files), sysfs also
> > > behaves as a stacking filesystem, calling vfs helpers, such as
> > > lookup_bdev() -> kern_path(), which is a behavior of a stacked
> > > filesystem, without all the precautions that comes with behaving
> > > as a stacked filesystem.
> >
> > No.  This is far worse than anything stacked filesystems do - it's
> > an arbitrary pathname resolution while holding a lock.
> > It's not local.  Just about anything (including automounts, etc.)
> > can be happening there and it pushes the lock in question outside
> > of *ALL* pathwalk-related locks.  Pathname doesn't have to
> > resolve to anything on overlayfs - it can just go through
> > a symlink on it, or walk into it and traverse a bunch of ..
> > afterwards, etc.
> >
> > Don't confuse that with stacking - it's not even close.
> > You can't use that anywhere near overlayfs layers.
> >
> > Maybe isolate it into a separate filesystem, to be automounted
> > on /sys/power.  And make anyone playing with overlayfs with
> > sysfs as a layer mount the damn thing on top of power/ in your
> > overlayfs.  But using that thing as a part of layer is
> > a non-starter.

I don't follow what you are saying.
Which code is in non-starter violation?
kernfs for calling lookup_bdev() with internal of->mutex held?
Overlayfs for allowing sysfs as a lower layer and calling
vfs_llseek(lower_sysfs_file,...) during copy up while ovl inode is held
for legit reasons (e.g. from ovl_rename())?

>
> Incidentally, why do you need to lock overlayfs inode to call
> vfs_llseek() on the underlying file?  It might (or might not)
> need to lock the underlying file (for things like ->i_size,
> etc.), but that will be done by ->llseek() instance and it
> would deal with the inode in the layer, not overlayfs one.

We do not (anymore) lock ovl inode in ovl_llseek(), see:
b1f9d3858f72 ovl: use ovl_inode_lock in ovl_llseek()
but ovl inode is held in operations (e.g. ovl_rename)
which trigger copy up and call vfs_llseek() on the lower file.

>
> Similar question applies to ovl_write_iter() - why do you
> need to hold the overlayfs inode locked during the call of
> backing_file_write_iter()?
>

Not sure. This question I need to defer to Miklos.
I see in several places the pattern:
        inode_lock(inode);
        /* Update mode */
        ovl_copyattr(inode);
        ret = file_remove_privs(file);
...
        /* Update size */
        ovl_file_modified(file);
...
        inode_unlock(inode);

so it could be related to atomic remove privs and update mtime,
but possibly we could convert all of those inode_lock() to
ovl_inode_lock() (i.e. internal lock below vfs inode lock).

[...]
> Consider the scenario when unlink() is called on that sucker
> during the write() that triggers that pathwalk.  We have
>
> unlink: blocked on overlayfs inode of file, while holding the parent directory.
> write: holding the overlayfs inode of file and trying to resolve a pathname
> that contains .../power/suspend_stats/../../...; blocked on attempt to lock
> parent so we could do a lookup in it.

This specifically cannot happen because sysfs is not allowed as an
upper layer only as a lower layer, so overlayfs itself will not be writing to
/sys/power/resume.

>
> No llseek involved anywhere, kernfs of->mutex held, but not contended,
> deadlock purely on ->i_rwsem of overlayfs inodes.
>
> Holding overlayfs inode locked during the call of lookup_bdev() is really
> no-go.

Yes. I see that, but how can this be resolved?

If the specific file /sys/power/resume will not have FMODE_LSEEK,
then ovl_copy_up_file() will not try to skip_hole on it at all and the
llseek lock dependency will be averted.

The question is whether opt-out of FMODE_LSEEK for /sys/power/resume
will break anything or if we should mark seq files in another way so that
overlayfs would not try to seek_hole on any of them categorically.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04  9:33         ` Amir Goldstein
@ 2024-04-04 22:01           ` Al Viro
  2024-04-05 10:34             ` Amir Goldstein
  2024-04-05  6:51           ` Christoph Hellwig
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 29+ messages in thread
From: Al Viro @ 2024-04-04 22:01 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj,
	valesini, Christoph Hellwig, Christian Brauner, Jan Kara,
	Miklos Szeredi

On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:

> This specifically cannot happen because sysfs is not allowed as an
> upper layer only as a lower layer, so overlayfs itself will not be writing to
> /sys/power/resume.

Then how could you possibly get a deadlock there?  What would your minimal
deadlocked set look like?

1.  Something is blocked in lookup_bdev() called from resume_store(), called
from sysfs_kf_write(), called from kernfs_write_iter(), which has acquired
->mutex of struct kernfs_open_file that had been allocated by
kernfs_fop_open() back when the file had been opened.  Note that each
struct file instance gets a separate struct kernfs_open_file.  Since we are
calling ->write_iter(), the file *MUST* have been opened for write.

2.  Something is blocked in kernfs_fop_llseek() on the same of->mutex,
i.e. using the same struct file as (1).  That something is holding an
overlayfs inode lock, which is what the next thread is blocked on.

+ at least one more thread, to complete the cycle.

Right?  How could that possibly happen without overlayfs opening /sys/power/resume
for write?  Again, each struct file instance gets a separate of->mutex;
for a deadlock you need a cycle of threads and a cycle of locks, such
that each thread is holding the corresponding lock and is blocked on
attempt to get the lock that comes next in the cyclic order.

If overlayfs never writes to that sucker, it can't participate in that
cycle.  Sure, you can get overlayfs llseek grabbing of->mutex of *ANOTHER*
struct file opened for the same sysfs file.  Since it's not the same
struct file and since each struct file there gets a separate kernfs_open_file
instance, the mutex won't be the same.

Unless I'm missing something else, that can't deadlock.  For a quick and
dirty experiment, try to give of->mutex on r/o opens a class separate from
that on r/w and w/o opens (mutex_init() in kernfs_fop_open()) and see
if lockdep warnings persist.

Something like

        if (has_mmap)
                mutex_init(&of->mutex);
        else if (file->f_mode & FMODE_WRITE)
                mutex_init(&of->mutex);
	else
                mutex_init(&of->mutex);

circa fs/kernfs/file.c:642.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04  9:33         ` Amir Goldstein
  2024-04-04 22:01           ` Al Viro
@ 2024-04-05  6:51           ` Christoph Hellwig
  2024-04-05 11:19             ` Christian Brauner
                               ` (2 more replies)
  2024-04-05 10:47           ` Christian Brauner
  2024-04-06  4:09           ` Al Viro
  3 siblings, 3 replies; 29+ messages in thread
From: Christoph Hellwig @ 2024-04-05  6:51 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Al Viro, syzbot, gregkh, linux-fsdevel, linux-kernel,
	syzkaller-bugs, tj, valesini, Christoph Hellwig,
	Christian Brauner, Jan Kara, Miklos Szeredi

On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
> I don't follow what you are saying.
> Which code is in non-starter violation?
> kernfs for calling lookup_bdev() with internal of->mutex held?

That is a huge problem, and has been causing endless annoying lockdep
chains in the block layer for us.  If we have some way to kill this
the whole block layer would benefit.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04 22:01           ` Al Viro
@ 2024-04-05 10:34             ` Amir Goldstein
  0 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2024-04-05 10:34 UTC (permalink / raw)
  To: Al Viro
  Cc: syzbot, gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj,
	valesini, Christoph Hellwig, Christian Brauner, Jan Kara,
	Miklos Szeredi

On Fri, Apr 5, 2024 at 1:01 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
>
> > This specifically cannot happen because sysfs is not allowed as an
> > upper layer only as a lower layer, so overlayfs itself will not be writing to
> > /sys/power/resume.
>
> Then how could you possibly get a deadlock there?  What would your minimal
> deadlocked set look like?
>
> 1.  Something is blocked in lookup_bdev() called from resume_store(), called
> from sysfs_kf_write(), called from kernfs_write_iter(), which has acquired
> ->mutex of struct kernfs_open_file that had been allocated by
> kernfs_fop_open() back when the file had been opened.  Note that each
> struct file instance gets a separate struct kernfs_open_file.  Since we are
> calling ->write_iter(), the file *MUST* have been opened for write.
>
> 2.  Something is blocked in kernfs_fop_llseek() on the same of->mutex,
> i.e. using the same struct file as (1).  That something is holding an
> overlayfs inode lock, which is what the next thread is blocked on.
>
> + at least one more thread, to complete the cycle.
>
> Right?  How could that possibly happen without overlayfs opening /sys/power/resume
> for write?  Again, each struct file instance gets a separate of->mutex;
> for a deadlock you need a cycle of threads and a cycle of locks, such
> that each thread is holding the corresponding lock and is blocked on
> attempt to get the lock that comes next in the cyclic order.

Absolutely right.
I had it in my mind that this was a node lock. Did not look closely.

>
> If overlayfs never writes to that sucker, it can't participate in that
> cycle.  Sure, you can get overlayfs llseek grabbing of->mutex of *ANOTHER*
> struct file opened for the same sysfs file.  Since it's not the same
> struct file and since each struct file there gets a separate kernfs_open_file
> instance, the mutex won't be the same.
>
> Unless I'm missing something else, that can't deadlock.  For a quick and
> dirty experiment, try to give of->mutex on r/o opens a class separate from
> that on r/w and w/o opens (mutex_init() in kernfs_fop_open()) and see
> if lockdep warnings persist.
>
> Something like
>
>         if (has_mmap)
>                 mutex_init(&of->mutex);
>         else if (file->f_mode & FMODE_WRITE)
>                 mutex_init(&of->mutex);
>         else
>                 mutex_init(&of->mutex);

Why a quick experiment?
Why not a permanent kludge?

It is not any better or worse than the already existing has_mmap
subclass annotation. huh?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04  9:33         ` Amir Goldstein
  2024-04-04 22:01           ` Al Viro
  2024-04-05  6:51           ` Christoph Hellwig
@ 2024-04-05 10:47           ` Christian Brauner
  2024-04-05 14:48             ` Amir Goldstein
  2024-04-06  4:09           ` Al Viro
  3 siblings, 1 reply; 29+ messages in thread
From: Christian Brauner @ 2024-04-05 10:47 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Al Viro, syzbot, gregkh, linux-fsdevel, linux-kernel,
	syzkaller-bugs, tj, valesini, Christoph Hellwig, Jan Kara,
	Miklos Szeredi

On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
> On Thu, Apr 4, 2024 at 11:21 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Thu, Apr 04, 2024 at 09:11:22AM +0100, Al Viro wrote:
> > > On Thu, Apr 04, 2024 at 09:54:35AM +0300, Amir Goldstein wrote:
> > > >
> > > > In the lockdep dependency chain, overlayfs inode lock is taken
> > > > before kernfs internal of->mutex, where kernfs (sysfs) is the lower
> > > > layer of overlayfs, which is sane.
> > > >
> > > > With /sys/power/resume (and probably other files), sysfs also
> > > > behaves as a stacking filesystem, calling vfs helpers, such as
> > > > lookup_bdev() -> kern_path(), which is a behavior of a stacked
> > > > filesystem, without all the precautions that comes with behaving
> > > > as a stacked filesystem.
> > >
> > > No.  This is far worse than anything stacked filesystems do - it's
> > > an arbitrary pathname resolution while holding a lock.
> > > It's not local.  Just about anything (including automounts, etc.)
> > > can be happening there and it pushes the lock in question outside
> > > of *ALL* pathwalk-related locks.  Pathname doesn't have to
> > > resolve to anything on overlayfs - it can just go through
> > > a symlink on it, or walk into it and traverse a bunch of ..
> > > afterwards, etc.
> > >
> > > Don't confuse that with stacking - it's not even close.
> > > You can't use that anywhere near overlayfs layers.
> > >
> > > Maybe isolate it into a separate filesystem, to be automounted
> > > on /sys/power.  And make anyone playing with overlayfs with
> > > sysfs as a layer mount the damn thing on top of power/ in your
> > > overlayfs.  But using that thing as a part of layer is
> > > a non-starter.
> 
> I don't follow what you are saying.
> Which code is in non-starter violation?
> kernfs for calling lookup_bdev() with internal of->mutex held?
> Overlayfs for allowing sysfs as a lower layer and calling
> vfs_llseek(lower_sysfs_file,...) during copy up while ovl inode is held
> for legit reasons (e.g. from ovl_rename())?
> 
> >
> > Incidentally, why do you need to lock overlayfs inode to call
> > vfs_llseek() on the underlying file?  It might (or might not)
> > need to lock the underlying file (for things like ->i_size,
> > etc.), but that will be done by ->llseek() instance and it
> > would deal with the inode in the layer, not overlayfs one.
> 
> We do not (anymore) lock ovl inode in ovl_llseek(), see:
> b1f9d3858f72 ovl: use ovl_inode_lock in ovl_llseek()
> but ovl inode is held in operations (e.g. ovl_rename)
> which trigger copy up and call vfs_llseek() on the lower file.
> 
> >
> > Similar question applies to ovl_write_iter() - why do you
> > need to hold the overlayfs inode locked during the call of
> > backing_file_write_iter()?
> >
> 
> Not sure. This question I need to defer to Miklos.
> I see in several places the pattern:
>         inode_lock(inode);
>         /* Update mode */
>         ovl_copyattr(inode);
>         ret = file_remove_privs(file);
> ...
>         /* Update size */
>         ovl_file_modified(file);
> ...
>         inode_unlock(inode);
> 
> so it could be related to atomic remove privs and update mtime,
> but possibly we could convert all of those inode_lock() to
> ovl_inode_lock() (i.e. internal lock below vfs inode lock).
> 
> [...]
> > Consider the scenario when unlink() is called on that sucker
> > during the write() that triggers that pathwalk.  We have
> >
> > unlink: blocked on overlayfs inode of file, while holding the parent directory.
> > write: holding the overlayfs inode of file and trying to resolve a pathname
> > that contains .../power/suspend_stats/../../...; blocked on attempt to lock
> > parent so we could do a lookup in it.
> 
> This specifically cannot happen because sysfs is not allowed as an
> upper layer only as a lower layer, so overlayfs itself will not be writing to
> /sys/power/resume.

I don't understand that part. If overlayfs uses /sys/power/ as a lower
layer it can open and write to /sys/power/resume, no?

Honestly, why don't you just block /sys/power from appearing in any
layer in overlayfs? This seems like such a niche use-case that it's so
unlikely that this will be used that I would just try and kill it.

If you do it like Al suggested and switch it to an automount you get
that for free. But I guess you can also just block it without that.

(Frankly, I find it weird that sysfs is allowed as a layer in any case. I
completely forgot about this. Imho, both procfs and sysfs should not be
usable as a lower layer - procfs is, I know - and then only select parts
should be like /sys/fs/cgroup or sm where I can see the container people
somehow using this to mess with the cgroup tree or something.)

> 
> >
> > No llseek involved anywhere, kernfs of->mutex held, but not contended,
> > deadlock purely on ->i_rwsem of overlayfs inodes.
> >
> > Holding overlayfs inode locked during the call of lookup_bdev() is really
> > no-go.
> 
> Yes. I see that, but how can this be resolved?
> 
> If the specific file /sys/power/resume will not have FMODE_LSEEK,
> then ovl_copy_up_file() will not try to skip_hole on it at all and the
> llseek lock dependency will be averted.
> 
> The question is whether opt-out of FMODE_LSEEK for /sys/power/resume
> will break anything or if we should mark seq files in another way so that
> overlayfs would not try to seek_hole on any of them categorically.
> 
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-05  6:51           ` Christoph Hellwig
@ 2024-04-05 11:19             ` Christian Brauner
  2024-04-05 13:48               ` Christian Brauner
  2024-04-05 15:41             ` Tejun Heo
  2024-04-06  3:54             ` Al Viro
  2 siblings, 1 reply; 29+ messages in thread
From: Christian Brauner @ 2024-04-05 11:19 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Amir Goldstein, Al Viro, syzbot, gregkh, linux-fsdevel,
	linux-kernel, syzkaller-bugs, tj, valesini, Jan Kara,
	Miklos Szeredi

On Fri, Apr 05, 2024 at 08:51:35AM +0200, Christoph Hellwig wrote:
> On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
> > I don't follow what you are saying.
> > Which code is in non-starter violation?
> > kernfs for calling lookup_bdev() with internal of->mutex held?
> 
> That is a huge problem, and has been causing endless annoying lockdep
> chains in the block layer for us.  If we have some way to kill this
> the whole block layer would benefit.

Why not just try and add a better resume api that forces resume to not
use a path argument neither for resume_file nor for writes to
/sys/power/resume. IOW, extend the protocol what can get written to
/sys/power/resume and then phase out the path argument. It'll take a
while but it's a possibly clean solution.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-05 11:19             ` Christian Brauner
@ 2024-04-05 13:48               ` Christian Brauner
  2024-04-05 14:52                 ` Christoph Hellwig
  0 siblings, 1 reply; 29+ messages in thread
From: Christian Brauner @ 2024-04-05 13:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Amir Goldstein, Al Viro, syzbot, gregkh, linux-fsdevel,
	linux-kernel, syzkaller-bugs, tj, valesini, Jan Kara,
	Miklos Szeredi

On Fri, Apr 05, 2024 at 01:19:35PM +0200, Christian Brauner wrote:
> On Fri, Apr 05, 2024 at 08:51:35AM +0200, Christoph Hellwig wrote:
> > On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
> > > I don't follow what you are saying.
> > > Which code is in non-starter violation?
> > > kernfs for calling lookup_bdev() with internal of->mutex held?
> > 
> > That is a huge problem, and has been causing endless annoying lockdep
> > chains in the block layer for us.  If we have some way to kill this
> > the whole block layer would benefit.
> 
> Why not just try and add a better resume api that forces resume to not
> use a path argument neither for resume_file nor for writes to
> /sys/power/resume. IOW, extend the protocol what can get written to
> /sys/power/resume and then phase out the path argument. It'll take a
> while but it's a possibly clean solution.

In fact, just looking at this code with a naive set of eyes for a second:

* There's early_lookup_bdev() which deals with PARTUUID,
  PARTLABEL, raw device number, and lookup based on /dev. No actual path
  lookup involved in that.

* So the only interesting case is lookup_bdev() for /sys/power/suspend.
  That one takes arbitrary paths. But being realistic for a moment...
  How many people will specify a device path that's _not_ some variant
  of /dev/...? IOW, how many people will specify a device path that's
  not on devtmpfs or a symlink on devtmpfs? Probably almost no one.

  Containers come to mind ofc. But they can't mount devtmpfs so usually
  what they do is that they create a tmpfs mount at /dev and then
  bind-mount device nodes they need into there. But unprivileged
  containers cannot use suspend because that requires init_user_ns
  capabilities. And privileged containers that are allowed to hibernate
  and use custom paths seem extremly unlikely as well.

So really, _naively_ it seems to me that one could factor out the /dev/*
part of the device number parsing logic in early_lookup_bdev() and port
resume_store() to use that first and only if that fails fall back to
full lookup_bdev(). (Possibly combined with some sort of logging that the
user should use /dev/... paths or at least a way to recognize that this
arbitrary path stuff is actually used.)

And citing from a chat with the hibernation maintainer in systemd:

<brauner> So /sys/power/resume does systemd ever write anything other than a /dev/* path in to there?
<maintainer> Hmm? You never do that? It only accepts devno.

So that takes away one of the main users of this api. So I really
suspect that arbitrary device path is unused in practice. Maybe I'm all
wrong though.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-05 10:47           ` Christian Brauner
@ 2024-04-05 14:48             ` Amir Goldstein
  0 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2024-04-05 14:48 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Al Viro, syzbot, gregkh, linux-fsdevel, linux-kernel,
	syzkaller-bugs, tj, valesini, Christoph Hellwig, Jan Kara,
	Miklos Szeredi

On Fri, Apr 5, 2024 at 1:47 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
> > On Thu, Apr 4, 2024 at 11:21 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > On Thu, Apr 04, 2024 at 09:11:22AM +0100, Al Viro wrote:
> > > > On Thu, Apr 04, 2024 at 09:54:35AM +0300, Amir Goldstein wrote:
> > > > >
> > > > > In the lockdep dependency chain, overlayfs inode lock is taken
> > > > > before kernfs internal of->mutex, where kernfs (sysfs) is the lower
> > > > > layer of overlayfs, which is sane.
> > > > >
> > > > > With /sys/power/resume (and probably other files), sysfs also
> > > > > behaves as a stacking filesystem, calling vfs helpers, such as
> > > > > lookup_bdev() -> kern_path(), which is a behavior of a stacked
> > > > > filesystem, without all the precautions that comes with behaving
> > > > > as a stacked filesystem.
> > > >
> > > > No.  This is far worse than anything stacked filesystems do - it's
> > > > an arbitrary pathname resolution while holding a lock.
> > > > It's not local.  Just about anything (including automounts, etc.)
> > > > can be happening there and it pushes the lock in question outside
> > > > of *ALL* pathwalk-related locks.  Pathname doesn't have to
> > > > resolve to anything on overlayfs - it can just go through
> > > > a symlink on it, or walk into it and traverse a bunch of ..
> > > > afterwards, etc.
> > > >
> > > > Don't confuse that with stacking - it's not even close.
> > > > You can't use that anywhere near overlayfs layers.
> > > >
> > > > Maybe isolate it into a separate filesystem, to be automounted
> > > > on /sys/power.  And make anyone playing with overlayfs with
> > > > sysfs as a layer mount the damn thing on top of power/ in your
> > > > overlayfs.  But using that thing as a part of layer is
> > > > a non-starter.
> >
> > I don't follow what you are saying.
> > Which code is in non-starter violation?
> > kernfs for calling lookup_bdev() with internal of->mutex held?
> > Overlayfs for allowing sysfs as a lower layer and calling
> > vfs_llseek(lower_sysfs_file,...) during copy up while ovl inode is held
> > for legit reasons (e.g. from ovl_rename())?
> >
> > >
> > > Incidentally, why do you need to lock overlayfs inode to call
> > > vfs_llseek() on the underlying file?  It might (or might not)
> > > need to lock the underlying file (for things like ->i_size,
> > > etc.), but that will be done by ->llseek() instance and it
> > > would deal with the inode in the layer, not overlayfs one.
> >
> > We do not (anymore) lock ovl inode in ovl_llseek(), see:
> > b1f9d3858f72 ovl: use ovl_inode_lock in ovl_llseek()
> > but ovl inode is held in operations (e.g. ovl_rename)
> > which trigger copy up and call vfs_llseek() on the lower file.
> >
> > >
> > > Similar question applies to ovl_write_iter() - why do you
> > > need to hold the overlayfs inode locked during the call of
> > > backing_file_write_iter()?
> > >
> >
> > Not sure. This question I need to defer to Miklos.
> > I see in several places the pattern:
> >         inode_lock(inode);
> >         /* Update mode */
> >         ovl_copyattr(inode);
> >         ret = file_remove_privs(file);
> > ...
> >         /* Update size */
> >         ovl_file_modified(file);
> > ...
> >         inode_unlock(inode);
> >
> > so it could be related to atomic remove privs and update mtime,
> > but possibly we could convert all of those inode_lock() to
> > ovl_inode_lock() (i.e. internal lock below vfs inode lock).
> >
> > [...]
> > > Consider the scenario when unlink() is called on that sucker
> > > during the write() that triggers that pathwalk.  We have
> > >
> > > unlink: blocked on overlayfs inode of file, while holding the parent directory.
> > > write: holding the overlayfs inode of file and trying to resolve a pathname
> > > that contains .../power/suspend_stats/../../...; blocked on attempt to lock
> > > parent so we could do a lookup in it.
> >
> > This specifically cannot happen because sysfs is not allowed as an
> > upper layer only as a lower layer, so overlayfs itself will not be writing to
> > /sys/power/resume.
>
> I don't understand that part. If overlayfs uses /sys/power/ as a lower
> layer it can open and write to /sys/power/resume, no?
>
> Honestly, why don't you just block /sys/power from appearing in any
> layer in overlayfs? This seems like such a niche use-case that it's so
> unlikely that this will be used that I would just try and kill it.

I do not want to special case /sys/power in overlayfs.

>
> If you do it like Al suggested and switch it to an automount you get

Not important enough IMO to make this change.

> that for free. But I guess you can also just block it without that.
>
> (Frankly, I find it weird that sysfs is allowed as a layer in any case. I
> completely forgot about this. Imho, both procfs and sysfs should not be
> usable as a lower layer - procfs is, I know - and then only select parts
> should be like /sys/fs/cgroup or sm where I can see the container people
> somehow using this to mess with the cgroup tree or something.)
>

I do not know if using sysfs as a lower layer is an important use case,
but I have a feeling that people already may do it, so I cannot regress it
without a good reason.

Al's suggestion to annotate writable kernfs files as a different class from
readonly kernfs files seems fine by me to silence lockdep false positive.

I will try to feed this solution to syzbot.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-05 13:48               ` Christian Brauner
@ 2024-04-05 14:52                 ` Christoph Hellwig
  0 siblings, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2024-04-05 14:52 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christoph Hellwig, Amir Goldstein, Al Viro, syzbot, gregkh,
	linux-fsdevel, linux-kernel, syzkaller-bugs, tj, valesini,
	Jan Kara, Miklos Szeredi

On Fri, Apr 05, 2024 at 03:48:20PM +0200, Christian Brauner wrote:
> * There's early_lookup_bdev() which deals with PARTUUID,
>   PARTLABEL, raw device number, and lookup based on /dev. No actual path
>   lookup involved in that.
> 
> * So the only interesting case is lookup_bdev() for /sys/power/suspend.
>   That one takes arbitrary paths. But being realistic for a moment...
>   How many people will specify a device path that's _not_ some variant
>   of /dev/...? IOW, how many people will specify a device path that's
>   not on devtmpfs or a symlink on devtmpfs? Probably almost no one.

That's not the point.  The poins is that trying to do the dumb name
to bdev translation in early_lookup_bdev is wrong.  Distro had and have
their own numbering schemes, and not using them bypasses access
control.  We should never use that at runtime.

> <brauner> So /sys/power/resume does systemd ever write anything other than a /dev/* path in to there?
> <maintainer> Hmm? You never do that? It only accepts devno.
> 
> So that takes away one of the main users of this api. So I really
> suspect that arbitrary device path is unused in practice. Maybe I'm all
> wrong though.

I'm all fine with just accepting a devno and no name.  But I fear it
will break something as someone added it for whatever use case they had
(and we should not have allowed that back then, but that ship has sailed
unfortunately)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04  6:54   ` Amir Goldstein
  2024-04-04  8:11     ` Al Viro
@ 2024-04-05 15:08     ` Amir Goldstein
  2024-04-05 15:37       ` syzbot
  1 sibling, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2024-04-05 15:08 UTC (permalink / raw)
  To: syzbot
  Cc: gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj, valesini,
	Christoph Hellwig, Christian Brauner, Jan Kara, Miklos Szeredi,
	Al Viro

On Thu, Apr 4, 2024 at 9:54 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Thu, Apr 4, 2024 at 2:51 AM syzbot
> <syzbot+9a5b0ced8b1bfb238b56@syzkaller.appspotmail.com> wrote:
> >
> > syzbot has bisected this issue to:
> >
> > commit 0fedefd4c4e33dd24f726b13b5d7c143e2b483be
> > Author: Valentine Sinitsyn <valesini@yandex-team.ru>
> > Date:   Mon Sep 25 08:40:12 2023 +0000
> >
> >     kernfs: sysfs: support custom llseek method for sysfs entries
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17cb5e03180000
> > start commit:   fe46a7dd189e Merge tag 'sound-6.9-rc1' of git://git.kernel..
> > git tree:       upstream
> > final oops:     https://syzkaller.appspot.com/x/report.txt?x=142b5e03180000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=102b5e03180000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=4d90a36f0cab495a
> > dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17f1d93d180000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15c38139180000
> >
> > Reported-by: syzbot+9a5b0ced8b1bfb238b56@syzkaller.appspotmail.com
> > Fixes: 0fedefd4c4e3 ("kernfs: sysfs: support custom llseek method for sysfs entries")
> >
> > For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> >
>

Let's test Al's solution.

#syz test: https://github.com/amir73il/linux/ vfs-fixes

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-05 15:08     ` Amir Goldstein
@ 2024-04-05 15:37       ` syzbot
  2024-04-05 16:23         ` Al Viro
  0 siblings, 1 reply; 29+ messages in thread
From: syzbot @ 2024-04-05 15:37 UTC (permalink / raw)
  To: amir73il, brauner, gregkh, hch, jack, linux-fsdevel, linux-kernel,
	miklos, syzkaller-bugs, tj, valesini, viro

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

viperboard
[    8.144672][    T1] usbcore: registered new interface driver dln2
[    8.146814][    T1] usbcore: registered new interface driver pn533_usb
[    8.154721][    T1] nfcsim 0.2 initialized
[    8.156856][    T1] usbcore: registered new interface driver port100
[    8.158745][    T1] usbcore: registered new interface driver nfcmrvl
[    8.168709][    T1] Loading iSCSI transport class v2.0-870.
[    8.188275][    T1] virtio_scsi virtio0: 1/0/0 default/read/poll queues
[    8.202624][    T1] ------------[ cut here ]------------
[    8.204252][    T1] refcount_t: decrement hit 0; leaking memory.
[    8.206219][    T1] WARNING: CPU: 0 PID: 1 at lib/refcount.c:31 refcount_warn_saturate+0xfa/0x1d0
[    8.208638][    T1] Modules linked in:
[    8.209556][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc1-syzkaller-00001-g70d370568b75 #0
[    8.214301][    T1] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
[    8.217361][    T1] RIP: 0010:refcount_warn_saturate+0xfa/0x1d0
[    8.218680][    T1] Code: b2 00 00 00 e8 77 99 f2 fc 5b 5d c3 cc cc cc cc e8 6b 99 f2 fc c6 05 11 1e f0 0a 01 90 48 c7 c7 e0 5e 1e 8c e8 b7 35 b5 fc 90 <0f> 0b 90 90 eb d9 e8 4b 99 f2 fc c6 05 ee 1d f0 0a 01 90 48 c7 c7
[    8.223425][    T1] RSP: 0000:ffffc90000066e18 EFLAGS: 00010246
[    8.225385][    T1] RAX: 1f757896feb95b00 RBX: ffff8881482a2d7c RCX: ffff888016ac8000
[    8.227629][    T1] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    8.229990][    T1] RBP: 0000000000000004 R08: ffffffff8157ffe2 R09: fffffbfff1c396e0
[    8.232520][    T1] R10: dffffc0000000000 R11: fffffbfff1c396e0 R12: ffffea000502cdc0
[    8.234601][    T1] R13: ffffea000502cdc8 R14: 1ffffd4000a059b9 R15: 0000000000000000
[    8.236876][    T1] FS:  0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
[    8.239600][    T1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.240914][    T1] CR2: ffff88823ffff000 CR3: 000000000e132000 CR4: 00000000003506f0
[    8.243303][    T1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    8.245750][    T1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    8.247945][    T1] Call Trace:
[    8.248964][    T1]  <TASK>
[    8.250217][    T1]  ? __warn+0x163/0x4e0
[    8.251589][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.253743][    T1]  ? report_bug+0x2b3/0x500
[    8.254770][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.256336][    T1]  ? handle_bug+0x3e/0x70
[    8.257030][    T1]  ? exc_invalid_op+0x1a/0x50
[    8.258667][    T1]  ? asm_exc_invalid_op+0x1a/0x20
[    8.259957][    T1]  ? __warn_printk+0x292/0x360
[    8.261215][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.263208][    T1]  ? refcount_warn_saturate+0xf9/0x1d0
[    8.264293][    T1]  __free_pages_ok+0xc54/0xd80
[    8.265404][    T1]  make_alloc_exact+0xa3/0xf0
[    8.266560][    T1]  vring_alloc_queue_split+0x20a/0x600
[    8.267988][    T1]  ? __pfx_vring_alloc_queue_split+0x10/0x10
[    8.269965][    T1]  ? vp_find_vqs+0x4c/0x4e0
[    8.271058][    T1]  ? virtscsi_probe+0x3ea/0xf60
[    8.272057][    T1]  ? virtio_dev_probe+0x991/0xaf0
[    8.273466][    T1]  ? really_probe+0x2b8/0xad0
[    8.275102][    T1]  ? driver_probe_device+0x50/0x430
[    8.276174][    T1]  vring_create_virtqueue_split+0xc6/0x310
[    8.277826][    T1]  ? ret_from_fork+0x4b/0x80
[    8.279140][    T1]  ? __pfx_vring_create_virtqueue_split+0x10/0x10
[    8.281586][    T1]  vring_create_virtqueue+0xca/0x110
[    8.283311][    T1]  ? __pfx_vp_notify+0x10/0x10
[    8.285146][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.286554][    T1]  setup_vq+0xe9/0x2d0
[    8.287916][    T1]  ? __pfx_vp_notify+0x10/0x10
[    8.289684][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.291329][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.292741][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.294786][    T1]  vp_setup_vq+0xbf/0x330
[    8.296422][    T1]  ? __pfx_vp_config_changed+0x10/0x10
[    8.297504][    T1]  ? ioread16+0x2f/0x90
[    8.298989][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.300925][    T1]  vp_find_vqs_msix+0x8b2/0xc80
[    8.302291][    T1]  vp_find_vqs+0x4c/0x4e0
[    8.303546][    T1]  virtscsi_init+0x8db/0xd00
[    8.304820][    T1]  ? __pfx_virtscsi_init+0x10/0x10
[    8.305808][    T1]  ? __pfx_default_calc_sets+0x10/0x10
[    8.307015][    T1]  ? scsi_host_alloc+0xa57/0xea0
[    8.308290][    T1]  ? vp_get+0xfd/0x140
[    8.309405][    T1]  virtscsi_probe+0x3ea/0xf60
[    8.310389][    T1]  ? __pfx_virtscsi_probe+0x10/0x10
[    8.311500][    T1]  ? kernfs_add_one+0x156/0x8b0
[    8.312348][    T1]  ? virtio_no_restricted_mem_acc+0x9/0x10
[    8.313977][    T1]  ? virtio_features_ok+0x10c/0x270
[    8.314859][    T1]  virtio_dev_probe+0x991/0xaf0
[    8.315766][    T1]  ? __pfx_virtio_dev_probe+0x10/0x10
[    8.316886][    T1]  really_probe+0x2b8/0xad0
[    8.317711][    T1]  __driver_probe_device+0x1a2/0x390
[    8.318525][    T1]  driver_probe_device+0x50/0x430
[    8.319567][    T1]  __driver_attach+0x45f/0x710
[    8.320525][    T1]  ? __pfx___driver_attach+0x10/0x10
[    8.321841][    T1]  bus_for_each_dev+0x239/0x2b0
[    8.322872][    T1]  ? __pfx___driver_attach+0x10/0x10
[    8.324235][    T1]  ? __pfx_bus_for_each_dev+0x10/0x10
[    8.326127][    T1]  ? do_raw_spin_unlock+0x13c/0x8b0
[    8.327579][    T1]  bus_add_driver+0x347/0x620
[    8.328989][    T1]  driver_register+0x23a/0x320
[    8.330318][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.332178][    T1]  virtio_scsi_init+0x65/0xe0
[    8.333752][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.334950][    T1]  do_one_initcall+0x248/0x880
[    8.336015][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.337202][    T1]  ? __pfx_do_one_initcall+0x10/0x10
[    8.338254][    T1]  ? lockdep_hardirqs_on_prepare+0x43d/0x780
[    8.339378][    T1]  ? __pfx_parse_args+0x10/0x10
[    8.341464][    T1]  ? do_initcalls+0x1c/0x80
[    8.342223][    T1]  ? rcu_is_watching+0x15/0xb0
[    8.343161][    T1]  do_initcall_level+0x157/0x210
[    8.344252][    T1]  do_initcalls+0x3f/0x80
[    8.345400][    T1]  kernel_init_freeable+0x435/0x5d0
[    8.346571][    T1]  ? __pfx_kernel_init_freeable+0x10/0x10
[    8.349024][    T1]  ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[    8.350664][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.352143][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.353430][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.354609][    T1]  kernel_init+0x1d/0x2b0
[    8.355433][    T1]  ret_from_fork+0x4b/0x80
[    8.356203][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.357092][    T1]  ret_from_fork_asm+0x1a/0x30
[    8.358282][    T1]  </TASK>
[    8.359166][    T1] Kernel panic - not syncing: kernel: panic_on_warn set ...
[    8.360941][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc1-syzkaller-00001-g70d370568b75 #0
[    8.363085][    T1] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
[    8.363411][    T1] Call Trace:
[    8.363411][    T1]  <TASK>
[    8.363411][    T1]  dump_stack_lvl+0x241/0x360
[    8.363411][    T1]  ? __pfx_dump_stack_lvl+0x10/0x10
[    8.363411][    T1]  ? __pfx__printk+0x10/0x10
[    8.363411][    T1]  ? _printk+0xd5/0x120
[    8.363411][    T1]  ? vscnprintf+0x5d/0x90
[    8.363411][    T1]  panic+0x349/0x860
[    8.363411][    T1]  ? __warn+0x172/0x4e0
[    8.363411][    T1]  ? __pfx_panic+0x10/0x10
[    8.363411][    T1]  ? show_trace_log_lvl+0x4e6/0x520
[    8.363411][    T1]  ? ret_from_fork_asm+0x1a/0x30
[    8.363411][    T1]  __warn+0x346/0x4e0
[    8.363411][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.363411][    T1]  report_bug+0x2b3/0x500
[    8.363411][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.363411][    T1]  handle_bug+0x3e/0x70
[    8.363411][    T1]  exc_invalid_op+0x1a/0x50
[    8.363411][    T1]  asm_exc_invalid_op+0x1a/0x20
[    8.363411][    T1] RIP: 0010:refcount_warn_saturate+0xfa/0x1d0
[    8.363411][    T1] Code: b2 00 00 00 e8 77 99 f2 fc 5b 5d c3 cc cc cc cc e8 6b 99 f2 fc c6 05 11 1e f0 0a 01 90 48 c7 c7 e0 5e 1e 8c e8 b7 35 b5 fc 90 <0f> 0b 90 90 eb d9 e8 4b 99 f2 fc c6 05 ee 1d f0 0a 01 90 48 c7 c7
[    8.363411][    T1] RSP: 0000:ffffc90000066e18 EFLAGS: 00010246
[    8.363411][    T1] RAX: 1f757896feb95b00 RBX: ffff8881482a2d7c RCX: ffff888016ac8000
[    8.363411][    T1] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    8.363411][    T1] RBP: 0000000000000004 R08: ffffffff8157ffe2 R09: fffffbfff1c396e0
[    8.363411][    T1] R10: dffffc0000000000 R11: fffffbfff1c396e0 R12: ffffea000502cdc0
[    8.363411][    T1] R13: ffffea000502cdc8 R14: 1ffffd4000a059b9 R15: 0000000000000000
[    8.363411][    T1]  ? __warn_printk+0x292/0x360
[    8.363411][    T1]  ? refcount_warn_saturate+0xf9/0x1d0
[    8.363411][    T1]  __free_pages_ok+0xc54/0xd80
[    8.363411][    T1]  make_alloc_exact+0xa3/0xf0
[    8.363411][    T1]  vring_alloc_queue_split+0x20a/0x600
[    8.363411][    T1]  ? __pfx_vring_alloc_queue_split+0x10/0x10
[    8.363411][    T1]  ? vp_find_vqs+0x4c/0x4e0
[    8.363411][    T1]  ? virtscsi_probe+0x3ea/0xf60
[    8.363411][    T1]  ? virtio_dev_probe+0x991/0xaf0
[    8.363411][    T1]  ? really_probe+0x2b8/0xad0
[    8.363411][    T1]  ? driver_probe_device+0x50/0x430
[    8.363411][    T1]  vring_create_virtqueue_split+0xc6/0x310
[    8.363411][    T1]  ? ret_from_fork+0x4b/0x80
[    8.363411][    T1]  ? __pfx_vring_create_virtqueue_split+0x10/0x10
[    8.363411][    T1]  vring_create_virtqueue+0xca/0x110
[    8.363411][    T1]  ? __pfx_vp_notify+0x10/0x10
[    8.363411][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.363411][    T1]  setup_vq+0xe9/0x2d0
[    8.363411][    T1]  ? __pfx_vp_notify+0x10/0x10
[    8.363411][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.363411][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.363411][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.363411][    T1]  vp_setup_vq+0xbf/0x330
[    8.363411][    T1]  ? __pfx_vp_config_changed+0x10/0x10
[    8.363411][    T1]  ? ioread16+0x2f/0x90
[    8.363411][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.363411][    T1]  vp_find_vqs_msix+0x8b2/0xc80
[    8.363411][    T1]  vp_find_vqs+0x4c/0x4e0
[    8.363411][    T1]  virtscsi_init+0x8db/0xd00
[    8.363411][    T1]  ? __pfx_virtscsi_init+0x10/0x10
[    8.363411][    T1]  ? __pfx_default_calc_sets+0x10/0x10
[    8.363411][    T1]  ? scsi_host_alloc+0xa57/0xea0
[    8.363411][    T1]  ? vp_get+0xfd/0x140
[    8.363411][    T1]  virtscsi_probe+0x3ea/0xf60
[    8.363411][    T1]  ? __pfx_virtscsi_probe+0x10/0x10
[    8.363411][    T1]  ? kernfs_add_one+0x156/0x8b0
[    8.363411][    T1]  ? virtio_no_restricted_mem_acc+0x9/0x10
[    8.363411][    T1]  ? virtio_features_ok+0x10c/0x270
[    8.363411][    T1]  virtio_dev_probe+0x991/0xaf0
[    8.363411][    T1]  ? __pfx_virtio_dev_probe+0x10/0x10
[    8.363411][    T1]  really_probe+0x2b8/0xad0
[    8.363411][    T1]  __driver_probe_device+0x1a2/0x390
[    8.363411][    T1]  driver_probe_device+0x50/0x430
[    8.363411][    T1]  __driver_attach+0x45f/0x710
[    8.363411][    T1]  ? __pfx___driver_attach+0x10/0x10
[    8.363411][    T1]  bus_for_each_dev+0x239/0x2b0
[    8.363411][    T1]  ? __pfx___driver_attach+0x10/0x10
[    8.363411][    T1]  ? __pfx_bus_for_each_dev+0x10/0x10
[    8.363411][    T1]  ? do_raw_spin_unlock+0x13c/0x8b0
[    8.363411][    T1]  bus_add_driver+0x347/0x620
[    8.363411][    T1]  driver_register+0x23a/0x320
[    8.363411][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.363411][    T1]  virtio_scsi_init+0x65/0xe0
[    8.363411][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.363411][    T1]  do_one_initcall+0x248/0x880
[    8.363411][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.363411][    T1]  ? __pfx_do_one_initcall+0x10/0x10
[    8.363411][    T1]  ? lockdep_hardirqs_on_prepare+0x43d/0x780
[    8.363411][    T1]  ? __pfx_parse_args+0x10/0x10
[    8.363411][    T1]  ? do_initcalls+0x1c/0x80
[    8.363411][    T1]  ? rcu_is_watching+0x15/0xb0
[    8.363411][    T1]  do_initcall_level+0x157/0x210
[    8.363411][    T1]  do_initcalls+0x3f/0x80
[    8.363411][    T1]  kernel_init_freeable+0x435/0x5d0
[    8.363411][    T1]  ? __pfx_kernel_init_freeable+0x10/0x10
[    8.363411][    T1]  ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[    8.363411][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.363411][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.363411][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.363411][    T1]  kernel_init+0x1d/0x2b0
[    8.363411][    T1]  ret_from_fork+0x4b/0x80
[    8.363411][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.363411][    T1]  ret_from_fork_asm+0x1a/0x30
[    8.363411][    T1]  </TASK>
[    8.363411][    T1] Kernel Offset: disabled
[    8.363411][    T1] Rebooting in 86400 seconds..


syzkaller build log:
go env (err=<nil>)
GO111MODULE='auto'
GOARCH='amd64'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/syzkaller/jobs-2/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs-2/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21.4'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/syzkaller/jobs-2/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2829307758=/tmp/go-build -gno-record-gcc-switches'

git status (err=<nil>)
HEAD detached at 51c4dcff8
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=51c4dcff83b0574620c280cc5130ef59cc4a2e32 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240403-123700'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=51c4dcff83b0574620c280cc5130ef59cc4a2e32 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240403-123700'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=51c4dcff83b0574620c280cc5130ef59cc4a2e32 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240403-123700'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-stress github.com/google/syzkaller/tools/syz-stress
mkdir -p ./bin/linux_amd64
gcc -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"51c4dcff83b0574620c280cc5130ef59cc4a2e32\"


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=16ae7ead180000


Tested on:

commit:         70d37056 kernfs: annotate different lockdep class for ..
git tree:       https://github.com/amir73il/linux/ vfs-fixes
kernel config:  https://syzkaller.appspot.com/x/.config?x=d45c08b2154c215d
dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-05  6:51           ` Christoph Hellwig
  2024-04-05 11:19             ` Christian Brauner
@ 2024-04-05 15:41             ` Tejun Heo
  2024-04-06  3:54             ` Al Viro
  2 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2024-04-05 15:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Amir Goldstein, Al Viro, syzbot, gregkh, linux-fsdevel,
	linux-kernel, syzkaller-bugs, valesini, Christian Brauner,
	Jan Kara, Miklos Szeredi

Hello,

On Fri, Apr 05, 2024 at 08:51:35AM +0200, Christoph Hellwig wrote:
> On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
> > I don't follow what you are saying.
> > Which code is in non-starter violation?
> > kernfs for calling lookup_bdev() with internal of->mutex held?
> 
> That is a huge problem, and has been causing endless annoying lockdep
> chains in the block layer for us.  If we have some way to kill this
> the whole block layer would benefit.

of->mutex is mostly there as a convenience to kernfs (here, sysfs) users so
that they don't have to worry about concurrent invocation of the callbacks.
It needs more careful look but on cursory observation, it shouldn't be
difficult to implement a flag or different op type which skips of->mutex if
this causes a lot of pain.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-05 15:37       ` syzbot
@ 2024-04-05 16:23         ` Al Viro
  2024-04-06  5:34           ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Al Viro @ 2024-04-05 16:23 UTC (permalink / raw)
  To: syzbot
  Cc: amir73il, brauner, gregkh, hch, jack, linux-fsdevel, linux-kernel,
	miklos, syzkaller-bugs, tj, valesini

On Fri, Apr 05, 2024 at 08:37:03AM -0700, syzbot wrote:
> Hello,
> 
> syzbot tried to test the proposed patch but the build/boot failed:

WTF?  The patch is

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index e9df2f87072c6..8502ef68459b9 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -636,11 +636,18 @@ static int kernfs_fop_open(struct inode *inode, struct file *file)
 	 * each file a separate locking class.  Let's differentiate on
 	 * whether the file has mmap or not for now.
 	 *
-	 * Both paths of the branch look the same.  They're supposed to
+	 * For similar reasons, writable and readonly files are given different
+	 * lockdep key, because the writable file /sys/power/resume may call vfs
+	 * lookup helpers for arbitrary paths and readonly files can be read by
+	 * overlayfs from vfs helpers when sysfs is a lower layer of overalyfs.
+	 *
+	 * All three cases look the same.  They're supposed to
 	 * look that way and give @of->mutex different static lockdep keys.
 	 */
 	if (has_mmap)
 		mutex_init(&of->mutex);
+	else if (file->f_mode & FMODE_WRITE)
+		mutex_init(&of->mutex);
 	else
 		mutex_init(&of->mutex);
 
How could it possibly trigger boot failure?  Test the parent, perhaps?

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-05  6:51           ` Christoph Hellwig
  2024-04-05 11:19             ` Christian Brauner
  2024-04-05 15:41             ` Tejun Heo
@ 2024-04-06  3:54             ` Al Viro
  2 siblings, 0 replies; 29+ messages in thread
From: Al Viro @ 2024-04-06  3:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Amir Goldstein, syzbot, gregkh, linux-fsdevel, linux-kernel,
	syzkaller-bugs, tj, valesini, Christian Brauner, Jan Kara,
	Miklos Szeredi

On Fri, Apr 05, 2024 at 08:51:35AM +0200, Christoph Hellwig wrote:
> On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
> > I don't follow what you are saying.
> > Which code is in non-starter violation?
> > kernfs for calling lookup_bdev() with internal of->mutex held?
> 
> That is a huge problem, and has been causing endless annoying lockdep
> chains in the block layer for us.  If we have some way to kill this
> the whole block layer would benefit.

Specifically of->mutex or having pathwalk done from some ->write_iter()?

Incidentally, losetup happily accepts sysfs files as backing store.
I don't think it adds problems we don't have otherwise, but IMO
that's really bogus...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-04  9:33         ` Amir Goldstein
                             ` (2 preceding siblings ...)
  2024-04-05 10:47           ` Christian Brauner
@ 2024-04-06  4:09           ` Al Viro
  2024-04-06  5:25             ` Amir Goldstein
  3 siblings, 1 reply; 29+ messages in thread
From: Al Viro @ 2024-04-06  4:09 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj,
	valesini, Christoph Hellwig, Christian Brauner, Jan Kara,
	Miklos Szeredi

On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:

> We do not (anymore) lock ovl inode in ovl_llseek(), see:
> b1f9d3858f72 ovl: use ovl_inode_lock in ovl_llseek()
> but ovl inode is held in operations (e.g. ovl_rename)
> which trigger copy up and call vfs_llseek() on the lower file.

OK, but why do we bother with ovl_inode_lock() there?
Note that serialization on struct file level is provided
on syscall level - see call of fdget_pos() in there.
IOW, which object are you protecting?  If it's struct file
passed your way, you should already have the serialization.
If it's underlying file on disk, that's up to vfs_llseek().
Exclusion with copyup by a different operation?

I'm not saying it's wrong - it's just that the thing is
tricky enough, so some clarification might be a good idea.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-06  4:09           ` Al Viro
@ 2024-04-06  5:25             ` Amir Goldstein
  0 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2024-04-06  5:25 UTC (permalink / raw)
  To: Al Viro, Miklos Szeredi
  Cc: syzbot, gregkh, linux-fsdevel, linux-kernel, syzkaller-bugs, tj,
	valesini, Christoph Hellwig, Christian Brauner, Jan Kara

On Sat, Apr 6, 2024 at 7:09 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
>
> > We do not (anymore) lock ovl inode in ovl_llseek(), see:
> > b1f9d3858f72 ovl: use ovl_inode_lock in ovl_llseek()
> > but ovl inode is held in operations (e.g. ovl_rename)
> > which trigger copy up and call vfs_llseek() on the lower file.
>
> OK, but why do we bother with ovl_inode_lock() there?
> Note that serialization on struct file level is provided
> on syscall level - see call of fdget_pos() in there.
> IOW, which object are you protecting?  If it's struct file
> passed your way, you should already have the serialization.
> If it's underlying file on disk, that's up to vfs_llseek().

You're right.

> Exclusion with copyup by a different operation?

Nah, don't see how this is relevant to file->f_pos.

>
> I'm not saying it's wrong - it's just that the thing is
> tricky enough, so some clarification might be a good idea.

I think I just used inode_lock() in
9e46b840c705 ("ovl: support stacked SEEK_HOLE/SEEK_DATA")
as a common coding pattern in overlayfs when protecting the
"master" copy of overlay inode attributes, but it was not needed
for file->f_pos.

Miklos, please ack that I am not missing anything and that
ovl_inode_lock() is indeed redundant in ovl_llseek().

Anyway, this lock is not part of the lockdep issue that started this thread.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-05 16:23         ` Al Viro
@ 2024-04-06  5:34           ` Amir Goldstein
  2024-04-06  7:05             ` syzbot
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2024-04-06  5:34 UTC (permalink / raw)
  To: Al Viro
  Cc: syzbot, brauner, gregkh, hch, jack, linux-fsdevel, linux-kernel,
	miklos, syzkaller-bugs, tj, valesini

On Fri, Apr 5, 2024 at 7:23 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Fri, Apr 05, 2024 at 08:37:03AM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot tried to test the proposed patch but the build/boot failed:
>
> WTF?  The patch is
>
> diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
> index e9df2f87072c6..8502ef68459b9 100644
> --- a/fs/kernfs/file.c
> +++ b/fs/kernfs/file.c
> @@ -636,11 +636,18 @@ static int kernfs_fop_open(struct inode *inode, struct file *file)
>          * each file a separate locking class.  Let's differentiate on
>          * whether the file has mmap or not for now.
>          *
> -        * Both paths of the branch look the same.  They're supposed to
> +        * For similar reasons, writable and readonly files are given different
> +        * lockdep key, because the writable file /sys/power/resume may call vfs
> +        * lookup helpers for arbitrary paths and readonly files can be read by
> +        * overlayfs from vfs helpers when sysfs is a lower layer of overalyfs.
> +        *
> +        * All three cases look the same.  They're supposed to
>          * look that way and give @of->mutex different static lockdep keys.
>          */
>         if (has_mmap)
>                 mutex_init(&of->mutex);
> +       else if (file->f_mode & FMODE_WRITE)
> +               mutex_init(&of->mutex);
>         else
>                 mutex_init(&of->mutex);
>
> How could it possibly trigger boot failure?  Test the parent, perhaps?

Let's try again, rebased on current master:

#syz test: https://github.com/amir73il/linux/ vfs-fixes

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-06  5:34           ` Amir Goldstein
@ 2024-04-06  7:05             ` syzbot
  2024-04-06  7:11               ` Al Viro
  0 siblings, 1 reply; 29+ messages in thread
From: syzbot @ 2024-04-06  7:05 UTC (permalink / raw)
  To: amir73il, brauner, gregkh, hch, jack, linux-fsdevel, linux-kernel,
	miklos, syzkaller-bugs, tj, valesini, viro

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

viperboard
[    8.566509][    T1] usbcore: registered new interface driver dln2
[    8.568801][    T1] usbcore: registered new interface driver pn533_usb
[    8.579446][    T1] nfcsim 0.2 initialized
[    8.581035][    T1] usbcore: registered new interface driver port100
[    8.583546][    T1] usbcore: registered new interface driver nfcmrvl
[    8.595440][    T1] Loading iSCSI transport class v2.0-870.
[    8.614503][    T1] virtio_scsi virtio0: 1/0/0 default/read/poll queues
[    8.624792][    T1] ------------[ cut here ]------------
[    8.626075][    T1] refcount_t: decrement hit 0; leaking memory.
[    8.627319][    T1] WARNING: CPU: 0 PID: 1 at lib/refcount.c:31 refcount_warn_saturate+0xfa/0x1d0
[    8.629053][    T1] Modules linked in:
[    8.629654][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc2-syzkaller-00388-g3398bf34c993 #0
[    8.631954][    T1] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
[    8.634054][    T1] RIP: 0010:refcount_warn_saturate+0xfa/0x1d0
[    8.635268][    T1] Code: b2 00 00 00 e8 d7 93 f0 fc 5b 5d c3 cc cc cc cc e8 cb 93 f0 fc c6 05 ae 5c ee 0a 01 90 48 c7 c7 40 80 1e 8c e8 e7 2e b3 fc 90 <0f> 0b 90 90 eb d9 e8 ab 93 f0 fc c6 05 8b 5c ee 0a 01 90 48 c7 c7
[    8.638923][    T1] RSP: 0000:ffffc90000066e18 EFLAGS: 00010246
[    8.639822][    T1] RAX: 9b6fe4ff1df42a00 RBX: ffff888021006ccc RCX: ffff888016ac0000
[    8.641085][    T1] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    8.642968][    T1] RBP: 0000000000000004 R08: ffffffff8157ffc2 R09: fffffbfff1c39af8
[    8.645656][    T1] R10: dffffc0000000000 R11: fffffbfff1c39af8 R12: ffffea000502fdc0
[    8.646945][    T1] R13: ffffea000502fdc8 R14: 1ffffd4000a05fb9 R15: 0000000000000000
[    8.648121][    T1] FS:  0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
[    8.650496][    T1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.652023][    T1] CR2: ffff88823ffff000 CR3: 000000000e134000 CR4: 00000000003506f0
[    8.653935][    T1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    8.655934][    T1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    8.657851][    T1] Call Trace:
[    8.658338][    T1]  <TASK>
[    8.658807][    T1]  ? __warn+0x163/0x4e0
[    8.659517][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.660549][    T1]  ? report_bug+0x2b3/0x500
[    8.661574][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.663372][    T1]  ? handle_bug+0x3e/0x70
[    8.664928][    T1]  ? exc_invalid_op+0x1a/0x50
[    8.666599][    T1]  ? asm_exc_invalid_op+0x1a/0x20
[    8.668117][    T1]  ? __warn_printk+0x292/0x360
[    8.670085][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.671620][    T1]  ? refcount_warn_saturate+0xf9/0x1d0
[    8.673520][    T1]  __free_pages_ok+0xc54/0xd80
[    8.675020][    T1]  make_alloc_exact+0xa3/0xf0
[    8.676151][    T1]  vring_alloc_queue_split+0x20a/0x600
[    8.677685][    T1]  ? __pfx_vring_alloc_queue_split+0x10/0x10
[    8.678762][    T1]  ? vp_find_vqs+0x4c/0x4e0
[    8.679763][    T1]  ? virtscsi_probe+0x3ea/0xf60
[    8.681408][    T1]  ? virtio_dev_probe+0x991/0xaf0
[    8.683147][    T1]  ? really_probe+0x2b8/0xad0
[    8.684759][    T1]  ? driver_probe_device+0x50/0x430
[    8.685967][    T1]  vring_create_virtqueue_split+0xc6/0x310
[    8.687403][    T1]  ? ret_from_fork+0x4b/0x80
[    8.688676][    T1]  ? __pfx_vring_create_virtqueue_split+0x10/0x10
[    8.690045][    T1]  vring_create_virtqueue+0xca/0x110
[    8.691014][    T1]  ? __pfx_vp_notify+0x10/0x10
[    8.692612][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.693924][    T1]  setup_vq+0xe9/0x2d0
[    8.695355][    T1]  ? __pfx_vp_notify+0x10/0x10
[    8.696851][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.699030][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.700639][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.702131][    T1]  vp_setup_vq+0xbf/0x330
[    8.703279][    T1]  ? __pfx_vp_config_changed+0x10/0x10
[    8.704850][    T1]  ? ioread16+0x2f/0x90
[    8.706423][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.707372][    T1]  vp_find_vqs_msix+0x8b2/0xc80
[    8.708975][    T1]  vp_find_vqs+0x4c/0x4e0
[    8.709918][    T1]  virtscsi_init+0x8db/0xd00
[    8.710723][    T1]  ? __pfx_virtscsi_init+0x10/0x10
[    8.711813][    T1]  ? __pfx_default_calc_sets+0x10/0x10
[    8.713608][    T1]  ? scsi_host_alloc+0xa57/0xea0
[    8.714998][    T1]  ? vp_get+0xfd/0x140
[    8.715786][    T1]  virtscsi_probe+0x3ea/0xf60
[    8.716933][    T1]  ? __pfx_virtscsi_probe+0x10/0x10
[    8.718052][    T1]  ? kernfs_add_one+0x156/0x8b0
[    8.719084][    T1]  ? virtio_no_restricted_mem_acc+0x9/0x10
[    8.719983][    T1]  ? virtio_features_ok+0x10c/0x270
[    8.721406][    T1]  virtio_dev_probe+0x991/0xaf0
[    8.722392][    T1]  ? __pfx_virtio_dev_probe+0x10/0x10
[    8.723354][    T1]  really_probe+0x2b8/0xad0
[    8.725227][    T1]  __driver_probe_device+0x1a2/0x390
[    8.726307][    T1]  driver_probe_device+0x50/0x430
[    8.727156][    T1]  __driver_attach+0x45f/0x710
[    8.727968][    T1]  ? __pfx___driver_attach+0x10/0x10
[    8.728999][    T1]  bus_for_each_dev+0x239/0x2b0
[    8.730291][    T1]  ? __pfx___driver_attach+0x10/0x10
[    8.731396][    T1]  ? __pfx_bus_for_each_dev+0x10/0x10
[    8.732523][    T1]  ? do_raw_spin_unlock+0x13c/0x8b0
[    8.733601][    T1]  bus_add_driver+0x347/0x620
[    8.734733][    T1]  driver_register+0x23a/0x320
[    8.735723][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.736859][    T1]  virtio_scsi_init+0x65/0xe0
[    8.737729][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.739109][    T1]  do_one_initcall+0x248/0x880
[    8.740693][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.742428][    T1]  ? __pfx_do_one_initcall+0x10/0x10
[    8.745322][    T1]  ? lockdep_hardirqs_on_prepare+0x43d/0x780
[    8.746957][    T1]  ? __pfx_parse_args+0x10/0x10
[    8.748064][    T1]  ? do_initcalls+0x1c/0x80
[    8.749434][    T1]  ? rcu_is_watching+0x15/0xb0
[    8.750232][    T1]  do_initcall_level+0x157/0x210
[    8.751547][    T1]  do_initcalls+0x3f/0x80
[    8.752397][    T1]  kernel_init_freeable+0x435/0x5d0
[    8.754091][    T1]  ? __pfx_kernel_init_freeable+0x10/0x10
[    8.755229][    T1]  ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[    8.757834][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.759572][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.761202][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.762156][    T1]  kernel_init+0x1d/0x2b0
[    8.763250][    T1]  ret_from_fork+0x4b/0x80
[    8.764029][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.765039][    T1]  ret_from_fork_asm+0x1a/0x30
[    8.766178][    T1]  </TASK>
[    8.766835][    T1] Kernel panic - not syncing: kernel: panic_on_warn set ...
[    8.768906][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc2-syzkaller-00388-g3398bf34c993 #0
[    8.770531][    T1] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
[    8.772129][    T1] Call Trace:
[    8.772129][    T1]  <TASK>
[    8.772129][    T1]  dump_stack_lvl+0x241/0x360
[    8.772129][    T1]  ? __pfx_dump_stack_lvl+0x10/0x10
[    8.772129][    T1]  ? __pfx__printk+0x10/0x10
[    8.772129][    T1]  ? _printk+0xd5/0x120
[    8.772129][    T1]  ? vscnprintf+0x5d/0x90
[    8.772129][    T1]  panic+0x349/0x860
[    8.772129][    T1]  ? __warn+0x172/0x4e0
[    8.772129][    T1]  ? __pfx_panic+0x10/0x10
[    8.772129][    T1]  ? show_trace_log_lvl+0x4e6/0x520
[    8.781563][    T1]  ? ret_from_fork_asm+0x1a/0x30
[    8.781563][    T1]  __warn+0x346/0x4e0
[    8.781563][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.781563][    T1]  report_bug+0x2b3/0x500
[    8.781563][    T1]  ? refcount_warn_saturate+0xfa/0x1d0
[    8.781563][    T1]  handle_bug+0x3e/0x70
[    8.781563][    T1]  exc_invalid_op+0x1a/0x50
[    8.781563][    T1]  asm_exc_invalid_op+0x1a/0x20
[    8.781563][    T1] RIP: 0010:refcount_warn_saturate+0xfa/0x1d0
[    8.791455][    T1] Code: b2 00 00 00 e8 d7 93 f0 fc 5b 5d c3 cc cc cc cc e8 cb 93 f0 fc c6 05 ae 5c ee 0a 01 90 48 c7 c7 40 80 1e 8c e8 e7 2e b3 fc 90 <0f> 0b 90 90 eb d9 e8 ab 93 f0 fc c6 05 8b 5c ee 0a 01 90 48 c7 c7
[    8.791455][    T1] RSP: 0000:ffffc90000066e18 EFLAGS: 00010246
[    8.791455][    T1] RAX: 9b6fe4ff1df42a00 RBX: ffff888021006ccc RCX: ffff888016ac0000
[    8.791455][    T1] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    8.791455][    T1] RBP: 0000000000000004 R08: ffffffff8157ffc2 R09: fffffbfff1c39af8
[    8.801572][    T1] R10: dffffc0000000000 R11: fffffbfff1c39af8 R12: ffffea000502fdc0
[    8.801572][    T1] R13: ffffea000502fdc8 R14: 1ffffd4000a05fb9 R15: 0000000000000000
[    8.801572][    T1]  ? __warn_printk+0x292/0x360
[    8.801572][    T1]  ? refcount_warn_saturate+0xf9/0x1d0
[    8.801572][    T1]  __free_pages_ok+0xc54/0xd80
[    8.801572][    T1]  make_alloc_exact+0xa3/0xf0
[    8.801572][    T1]  vring_alloc_queue_split+0x20a/0x600
[    8.801572][    T1]  ? __pfx_vring_alloc_queue_split+0x10/0x10
[    8.811433][    T1]  ? vp_find_vqs+0x4c/0x4e0
[    8.811433][    T1]  ? virtscsi_probe+0x3ea/0xf60
[    8.811433][    T1]  ? virtio_dev_probe+0x991/0xaf0
[    8.811433][    T1]  ? really_probe+0x2b8/0xad0
[    8.811433][    T1]  ? driver_probe_device+0x50/0x430
[    8.811433][    T1]  vring_create_virtqueue_split+0xc6/0x310
[    8.811433][    T1]  ? ret_from_fork+0x4b/0x80
[    8.811433][    T1]  ? __pfx_vring_create_virtqueue_split+0x10/0x10
[    8.811433][    T1]  vring_create_virtqueue+0xca/0x110
[    8.821568][    T1]  ? __pfx_vp_notify+0x10/0x10
[    8.821568][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.821568][    T1]  setup_vq+0xe9/0x2d0
[    8.821568][    T1]  ? __pfx_vp_notify+0x10/0x10
[    8.821568][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.821568][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.821568][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.821568][    T1]  vp_setup_vq+0xbf/0x330
[    8.831450][    T1]  ? __pfx_vp_config_changed+0x10/0x10
[    8.831450][    T1]  ? ioread16+0x2f/0x90
[    8.831450][    T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[    8.831450][    T1]  vp_find_vqs_msix+0x8b2/0xc80
[    8.831450][    T1]  vp_find_vqs+0x4c/0x4e0
[    8.831450][    T1]  virtscsi_init+0x8db/0xd00
[    8.831450][    T1]  ? __pfx_virtscsi_init+0x10/0x10
[    8.831450][    T1]  ? __pfx_default_calc_sets+0x10/0x10
[    8.831450][    T1]  ? scsi_host_alloc+0xa57/0xea0
[    8.831450][    T1]  ? vp_get+0xfd/0x140
[    8.831450][    T1]  virtscsi_probe+0x3ea/0xf60
[    8.841557][    T1]  ? __pfx_virtscsi_probe+0x10/0x10
[    8.841557][    T1]  ? kernfs_add_one+0x156/0x8b0
[    8.841557][    T1]  ? virtio_no_restricted_mem_acc+0x9/0x10
[    8.841557][    T1]  ? virtio_features_ok+0x10c/0x270
[    8.841557][    T1]  virtio_dev_probe+0x991/0xaf0
[    8.841557][    T1]  ? __pfx_virtio_dev_probe+0x10/0x10
[    8.841557][    T1]  really_probe+0x2b8/0xad0
[    8.841557][    T1]  __driver_probe_device+0x1a2/0x390
[    8.841557][    T1]  driver_probe_device+0x50/0x430
[    8.841557][    T1]  __driver_attach+0x45f/0x710
[    8.851437][    T1]  ? __pfx___driver_attach+0x10/0x10
[    8.851437][    T1]  bus_for_each_dev+0x239/0x2b0
[    8.851437][    T1]  ? __pfx___driver_attach+0x10/0x10
[    8.851437][    T1]  ? __pfx_bus_for_each_dev+0x10/0x10
[    8.851437][    T1]  ? do_raw_spin_unlock+0x13c/0x8b0
[    8.851437][    T1]  bus_add_driver+0x347/0x620
[    8.851437][    T1]  driver_register+0x23a/0x320
[    8.851437][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.851437][    T1]  virtio_scsi_init+0x65/0xe0
[    8.851437][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.851437][    T1]  do_one_initcall+0x248/0x880
[    8.861551][    T1]  ? __pfx_virtio_scsi_init+0x10/0x10
[    8.861551][    T1]  ? __pfx_do_one_initcall+0x10/0x10
[    8.861551][    T1]  ? lockdep_hardirqs_on_prepare+0x43d/0x780
[    8.861551][    T1]  ? __pfx_parse_args+0x10/0x10
[    8.861551][    T1]  ? do_initcalls+0x1c/0x80
[    8.861551][    T1]  ? rcu_is_watching+0x15/0xb0
[    8.861551][    T1]  do_initcall_level+0x157/0x210
[    8.861551][    T1]  do_initcalls+0x3f/0x80
[    8.861551][    T1]  kernel_init_freeable+0x435/0x5d0
[    8.871429][    T1]  ? __pfx_kernel_init_freeable+0x10/0x10
[    8.871429][    T1]  ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[    8.871429][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.871429][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.871429][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.871429][    T1]  kernel_init+0x1d/0x2b0
[    8.871429][    T1]  ret_from_fork+0x4b/0x80
[    8.871429][    T1]  ? __pfx_kernel_init+0x10/0x10
[    8.871429][    T1]  ret_from_fork_asm+0x1a/0x30
[    8.871429][    T1]  </TASK>
[    8.871429][    T1] Kernel Offset: disabled
[    8.871429][    T1] Rebooting in 86400 seconds..


syzkaller build log:
go env (err=<nil>)
GO111MODULE='auto'
GOARCH='amd64'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/syzkaller/jobs-2/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs-2/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21.4'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/syzkaller/jobs-2/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3147238964=/tmp/go-build -gno-record-gcc-switches'

git status (err=<nil>)
HEAD detached at 51c4dcff8
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=51c4dcff83b0574620c280cc5130ef59cc4a2e32 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240403-123700'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=51c4dcff83b0574620c280cc5130ef59cc4a2e32 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240403-123700'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=51c4dcff83b0574620c280cc5130ef59cc4a2e32 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240403-123700'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-stress github.com/google/syzkaller/tools/syz-stress
mkdir -p ./bin/linux_amd64
gcc -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"51c4dcff83b0574620c280cc5130ef59cc4a2e32\"


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=16a356f6180000


Tested on:

commit:         3398bf34 kernfs: annotate different lockdep class for ..
git tree:       https://github.com/amir73il/linux/ vfs-fixes
kernel config:  https://syzkaller.appspot.com/x/.config?x=c5cda112a8438056
dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-06  7:05             ` syzbot
@ 2024-04-06  7:11               ` Al Viro
  2024-04-06  8:57                 ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Al Viro @ 2024-04-06  7:11 UTC (permalink / raw)
  To: syzbot
  Cc: amir73il, brauner, gregkh, hch, jack, linux-fsdevel, linux-kernel,
	miklos, syzkaller-bugs, tj, valesini

On Sat, Apr 06, 2024 at 12:05:04AM -0700, syzbot wrote:

> commit:         3398bf34 kernfs: annotate different lockdep class for ..
> git tree:       https://github.com/amir73il/linux/ vfs-fixes
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c5cda112a8438056
> dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> 
> Note: no patches were applied.

How about the same test on 6c6e47d69d821047097909288b6d7f1aafb3b9b1?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-06  7:11               ` Al Viro
@ 2024-04-06  8:57                 ` Amir Goldstein
  2024-04-07  0:50                   ` Al Viro
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2024-04-06  8:57 UTC (permalink / raw)
  To: Al Viro
  Cc: syzbot, brauner, gregkh, hch, jack, linux-fsdevel, linux-kernel,
	miklos, syzkaller-bugs, tj, valesini, Hillf Danton

On Sat, Apr 6, 2024 at 10:11 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Sat, Apr 06, 2024 at 12:05:04AM -0700, syzbot wrote:
>
> > commit:         3398bf34 kernfs: annotate different lockdep class for ..
> > git tree:       https://github.com/amir73il/linux/ vfs-fixes
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c5cda112a8438056
> > dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
> > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >
> > Note: no patches were applied.
>

Looks like it fixes the problem:
https://lore.kernel.org/lkml/000000000000a386f2061562ba6a@google.com/

Al,

Are you ok with going with your solution?
Do you want to pick it up through your tree?
Or shall I post it and ask Christian or Greg to pick it up?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-06  8:57                 ` Amir Goldstein
@ 2024-04-07  0:50                   ` Al Viro
  2024-04-07 11:02                     ` Amir Goldstein
  2024-04-09  9:18                     ` Christian Brauner
  0 siblings, 2 replies; 29+ messages in thread
From: Al Viro @ 2024-04-07  0:50 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, brauner, gregkh, hch, jack, linux-fsdevel, linux-kernel,
	miklos, syzkaller-bugs, tj, valesini, Hillf Danton

On Sat, Apr 06, 2024 at 11:57:11AM +0300, Amir Goldstein wrote:
> On Sat, Apr 6, 2024 at 10:11 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Sat, Apr 06, 2024 at 12:05:04AM -0700, syzbot wrote:
> >
> > > commit:         3398bf34 kernfs: annotate different lockdep class for ..
> > > git tree:       https://github.com/amir73il/linux/ vfs-fixes
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c5cda112a8438056
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
> > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > >
> > > Note: no patches were applied.
> >
> 
> Looks like it fixes the problem:
> https://lore.kernel.org/lkml/000000000000a386f2061562ba6a@google.com/
> 
> Al,
> 
> Are you ok with going with your solution?
> Do you want to pick it up through your tree?
> Or shall I post it and ask Christian or Greg to pick it up?

Umm...  I can grab it into #fixes.  Would be nice to get the result of
that test both on current mainline and mainline + patch, though - looks
like the same test keeps hitting some unrelated shite in virtio_scsi.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-07  0:50                   ` Al Viro
@ 2024-04-07 11:02                     ` Amir Goldstein
  2024-04-09  9:18                     ` Christian Brauner
  1 sibling, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2024-04-07 11:02 UTC (permalink / raw)
  To: Al Viro
  Cc: syzbot, brauner, gregkh, hch, jack, linux-fsdevel, linux-kernel,
	miklos, syzkaller-bugs, tj, valesini, Hillf Danton

On Sun, Apr 7, 2024 at 3:51 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Sat, Apr 06, 2024 at 11:57:11AM +0300, Amir Goldstein wrote:
> > On Sat, Apr 6, 2024 at 10:11 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > On Sat, Apr 06, 2024 at 12:05:04AM -0700, syzbot wrote:
> > >
> > > > commit:         3398bf34 kernfs: annotate different lockdep class for ..
> > > > git tree:       https://github.com/amir73il/linux/ vfs-fixes
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c5cda112a8438056
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
> > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > >
> > > > Note: no patches were applied.
> > >
> >
> > Looks like it fixes the problem:
> > https://lore.kernel.org/lkml/000000000000a386f2061562ba6a@google.com/
> >
> > Al,
> >
> > Are you ok with going with your solution?
> > Do you want to pick it up through your tree?
> > Or shall I post it and ask Christian or Greg to pick it up?
>
> Umm...  I can grab it into #fixes.

Cool.

Also:

#syz dup: possible deadlock in seq_read_iter (3)

https://lore.kernel.org/linux-fsdevel/CAJfpegumD0gDXpZwy53pPu54ifOfW-tTBLniLHep3sW2Z96MjQ@mail.gmail.com/

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek
  2024-04-07  0:50                   ` Al Viro
  2024-04-07 11:02                     ` Amir Goldstein
@ 2024-04-09  9:18                     ` Christian Brauner
  1 sibling, 0 replies; 29+ messages in thread
From: Christian Brauner @ 2024-04-09  9:18 UTC (permalink / raw)
  To: Al Viro
  Cc: Amir Goldstein, syzbot, gregkh, hch, jack, linux-fsdevel,
	linux-kernel, miklos, syzkaller-bugs, tj, valesini, Hillf Danton

On Sun, Apr 07, 2024 at 01:50:56AM +0100, Al Viro wrote:
> On Sat, Apr 06, 2024 at 11:57:11AM +0300, Amir Goldstein wrote:
> > On Sat, Apr 6, 2024 at 10:11 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > On Sat, Apr 06, 2024 at 12:05:04AM -0700, syzbot wrote:
> > >
> > > > commit:         3398bf34 kernfs: annotate different lockdep class for ..
> > > > git tree:       https://github.com/amir73il/linux/ vfs-fixes
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c5cda112a8438056
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=9a5b0ced8b1bfb238b56
> > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > >
> > > > Note: no patches were applied.
> > >
> > 
> > Looks like it fixes the problem:
> > https://lore.kernel.org/lkml/000000000000a386f2061562ba6a@google.com/
> > 
> > Al,
> > 
> > Are you ok with going with your solution?
> > Do you want to pick it up through your tree?
> > Or shall I post it and ask Christian or Greg to pick it up?
> 
> Umm...  I can grab it into #fixes.  Would be nice to get the result of

Fine by me if you have other stuff pending for -rc4 anyway. Otherwise I
can grab it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2024-04-09  9:18 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-03 18:23 [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek syzbot
2024-04-03 23:51 ` syzbot
2024-04-04  6:54   ` Amir Goldstein
2024-04-04  8:11     ` Al Viro
2024-04-04  8:21       ` Al Viro
2024-04-04  8:40         ` Al Viro
2024-04-04  9:33         ` Amir Goldstein
2024-04-04 22:01           ` Al Viro
2024-04-05 10:34             ` Amir Goldstein
2024-04-05  6:51           ` Christoph Hellwig
2024-04-05 11:19             ` Christian Brauner
2024-04-05 13:48               ` Christian Brauner
2024-04-05 14:52                 ` Christoph Hellwig
2024-04-05 15:41             ` Tejun Heo
2024-04-06  3:54             ` Al Viro
2024-04-05 10:47           ` Christian Brauner
2024-04-05 14:48             ` Amir Goldstein
2024-04-06  4:09           ` Al Viro
2024-04-06  5:25             ` Amir Goldstein
2024-04-05 15:08     ` Amir Goldstein
2024-04-05 15:37       ` syzbot
2024-04-05 16:23         ` Al Viro
2024-04-06  5:34           ` Amir Goldstein
2024-04-06  7:05             ` syzbot
2024-04-06  7:11               ` Al Viro
2024-04-06  8:57                 ` Amir Goldstein
2024-04-07  0:50                   ` Al Viro
2024-04-07 11:02                     ` Amir Goldstein
2024-04-09  9:18                     ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox