* [syzbot] [ocfs2?] possible deadlock in ocfs2_finish_quota_recovery
@ 2025-02-02 9:01 syzbot
2025-03-13 13:50 ` syzbot
2025-04-17 4:03 ` [syzbot] Re: [PATCH 0/3] ocfs2: Fix deadlocks in quota recovery syzbot
0 siblings, 2 replies; 5+ messages in thread
From: syzbot @ 2025-02-02 9:01 UTC (permalink / raw)
To: jlbec, joseph.qi, linux-kernel, mark, ocfs2-devel, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 69b8923f5003 Merge tag 'for-linus-6.14-ofs4' of git://git...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=100c4eb0580000
kernel config: https://syzkaller.appspot.com/x/.config?x=57ab43c279fa614d
dashboard link: https://syzkaller.appspot.com/bug?extid=f59a1ae7b7227c859b8f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/ea84ac864e92/disk-69b8923f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/6a465997b4e0/vmlinux-69b8923f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/d72b67b2bd15/bzImage-69b8923f.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f59a1ae7b7227c859b8f@syzkaller.appspotmail.com
ocfs2: Finishing quota recovery on device (7,0) for slot 0
======================================================
WARNING: possible circular locking dependency detected
6.13.0-syzkaller-09793-g69b8923f5003 #0 Not tainted
------------------------------------------------------
kworker/u8:6/1142 is trying to acquire lock:
ffff888055ab40e0 (&type->s_umount_key#51){++++}-{4:4}, at: ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
but task is already holding lock:
ffffc90003c4fc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3212 [inline]
ffffc90003c4fc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x976/0x1840 kernel/workqueue.c:3317
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
process_one_work kernel/workqueue.c:3212 [inline]
process_scheduled_works+0x994/0x1840 kernel/workqueue.c:3317
worker_thread+0x870/0xd30 kernel/workqueue.c:3398
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
-> #1 ((wq_completion)ocfs2_wq){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
touch_wq_lockdep_map+0xc7/0x170 kernel/workqueue.c:3905
__flush_workqueue+0x14a/0x1280 kernel/workqueue.c:3947
ocfs2_shutdown_local_alloc+0x109/0xa90 fs/ocfs2/localalloc.c:380
ocfs2_dismount_volume+0x202/0x910 fs/ocfs2/super.c:1822
generic_shutdown_super+0x139/0x2d0 fs/super.c:642
kill_block_super+0x44/0x90 fs/super.c:1710
deactivate_locked_super+0xc4/0x130 fs/super.c:473
cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1413
task_work_run+0x24f/0x310 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #0 (&type->s_umount_key#51){++++}-{4:4}:
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3236 [inline]
process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3317
worker_thread+0x870/0xd30 kernel/workqueue.c:3398
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
other info that might help us debug this:
Chain exists of:
&type->s_umount_key#51 --> (wq_completion)ocfs2_wq --> (work_completion)(&journal->j_recovery_work)
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&journal->j_recovery_work));
                               lock((wq_completion)ocfs2_wq);
                               lock((work_completion)(&journal->j_recovery_work));
  rlock(&type->s_umount_key#51);
*** DEADLOCK ***
2 locks held by kworker/u8:6/1142:
#0: ffff88802420b148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3211 [inline]
#0: ffff88802420b148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_scheduled_works+0x93b/0x1840 kernel/workqueue.c:3317
#1: ffffc90003c4fc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3212 [inline]
#1: ffffc90003c4fc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x976/0x1840 kernel/workqueue.c:3317
stack backtrace:
CPU: 0 UID: 0 PID: 1142 Comm: kworker/u8:6 Not tainted 6.13.0-syzkaller-09793-g69b8923f5003 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
Workqueue: ocfs2_wq ocfs2_complete_recovery
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2076
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2208
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3236 [inline]
process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3317
worker_thread+0x870/0xd30 kernel/workqueue.c:3398
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
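
The circular dependency in the report above can be pictured as a cycle in a "held lock -> wanted lock" graph. The following is only a toy user-space sketch of that idea, not lockdep's actual algorithm: the function name `find_cycle` and the shortened lock labels are made up for illustration, and the three edges are read off the "Chain exists of" line plus the new acquisition in frame #0.

```python
from collections import defaultdict

# Edges are (held, wanted) pairs taken from the report; labels are shortened
# stand-ins for the lock classes, not real kernel identifiers.
REPORTED_EDGES = [
    # dismount path: s_umount is held while the workqueue is flushed (#1)
    ("s_umount", "wq_completion(ocfs2_wq)"),
    # worker: the workqueue map is held while the work item runs (#2)
    ("wq_completion(ocfs2_wq)", "work_completion(j_recovery_work)"),
    # recovery work: down_read(s_umount) from the work item (#0)
    ("work_completion(j_recovery_work)", "s_umount"),
]


def find_cycle(edges):
    """Return one cycle as a list of nodes (first == last), or None."""
    graph = defaultdict(list)
    for held, wanted in edges:
        graph[held].append(wanted)

    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts WHITE
    stack = []                # current depth-first path

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in graph[node]:
            if color[nxt] == GREY:
                # nxt is already on the current path: the path closes a cycle
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    print(" -> ".join(find_cycle(REPORTED_EDGES)))
```

Dropping any one of the three edges makes the graph acyclic in this model, which is why a fix only needs to break a single dependency.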
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [ocfs2?] possible deadlock in ocfs2_finish_quota_recovery
From: syzbot @ 2025-03-13 13:50 UTC (permalink / raw)
To: jlbec, joseph.qi, linux-kernel, mark, ocfs2-devel, syzkaller-bugs
syzbot has found a reproducer for the following issue on:
HEAD commit: b7f94fcf5546 Merge tag 'sched_ext-for-6.14-rc6-fixes' of g..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=113addb0580000
kernel config: https://syzkaller.appspot.com/x/.config?x=efbd4e7089941bb6
dashboard link: https://syzkaller.appspot.com/bug?extid=f59a1ae7b7227c859b8f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1695e698580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12a6e04c580000
Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-b7f94fcf.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/961a37fe09ad/vmlinux-b7f94fcf.xz
kernel image: https://storage.googleapis.com/syzbot-assets/483b4fd1ba55/bzImage-b7f94fcf.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/c0b6b6f0715a/mount_0.gz
fsck result: OK (log: https://syzkaller.appspot.com/x/fsck.log?x=14a6e04c580000)
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f59a1ae7b7227c859b8f@syzkaller.appspotmail.com
ocfs2: Finishing quota recovery on device (7,0) for slot 0
======================================================
WARNING: possible circular locking dependency detected
6.14.0-rc6-syzkaller-00022-gb7f94fcf5546 #0 Not tainted
------------------------------------------------------
kworker/u4:10/1087 is trying to acquire lock:
ffff88803c49e0e0 (&type->s_umount_key#42){++++}-{4:4}, at: ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
but task is already holding lock:
ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3214 [inline]
ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x9c6/0x18e0 kernel/workqueue.c:3319
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
process_one_work kernel/workqueue.c:3214 [inline]
process_scheduled_works+0x9e4/0x18e0 kernel/workqueue.c:3319
worker_thread+0x870/0xd30 kernel/workqueue.c:3400
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
-> #1 ((wq_completion)ocfs2_wq){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
touch_wq_lockdep_map+0xc7/0x170 kernel/workqueue.c:3907
__flush_workqueue+0x14a/0x1280 kernel/workqueue.c:3949
ocfs2_shutdown_local_alloc+0x109/0xa90 fs/ocfs2/localalloc.c:380
ocfs2_dismount_volume+0x202/0x910 fs/ocfs2/super.c:1822
generic_shutdown_super+0x139/0x2d0 fs/super.c:642
kill_block_super+0x44/0x90 fs/super.c:1710
deactivate_locked_super+0xc4/0x130 fs/super.c:473
cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1413
task_work_run+0x24f/0x310 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #0 (&type->s_umount_key#42){++++}-{4:4}:
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xabe/0x18e0 kernel/workqueue.c:3319
worker_thread+0x870/0xd30 kernel/workqueue.c:3400
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
other info that might help us debug this:
Chain exists of:
&type->s_umount_key#42 --> (wq_completion)ocfs2_wq --> (work_completion)(&journal->j_recovery_work)
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&journal->j_recovery_work));
                               lock((wq_completion)ocfs2_wq);
                               lock((work_completion)(&journal->j_recovery_work));
  rlock(&type->s_umount_key#42);
*** DEADLOCK ***
2 locks held by kworker/u4:10/1087:
#0: ffff8880403eb148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3213 [inline]
#0: ffff8880403eb148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_scheduled_works+0x98b/0x18e0 kernel/workqueue.c:3319
#1: ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3214 [inline]
#1: ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x9c6/0x18e0 kernel/workqueue.c:3319
stack backtrace:
CPU: 0 UID: 0 PID: 1087 Comm: kworker/u4:10 Not tainted 6.14.0-rc6-syzkaller-00022-gb7f94fcf5546 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Workqueue: ocfs2_wq ocfs2_complete_recovery
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2076
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2208
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xabe/0x18e0 kernel/workqueue.c:3319
worker_thread+0x870/0xd30 kernel/workqueue.c:3400
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
* Re: [syzbot] Re: [PATCH 0/3] ocfs2: Fix deadlocks in quota recovery
From: syzbot @ 2025-04-17 4:03 UTC (permalink / raw)
To: linux-kernel, syzkaller-bugs
For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.
***
Subject: Re: [PATCH 0/3] ocfs2: Fix deadlocks in quota recovery
Author: heming.zhao@suse.com
I created a branch for these 3 patch files. Let's ask syzbot to test them.
syzbot page: https://syzkaller.appspot.com/bug?extid=f59a1ae7b7227c859b8f
#syz test: https://github.com/zhaohem/linux jans_ocfs2
On 4/3/25 19:32, Jan Kara wrote:
> Hello,
>
> this implements another approach to fixing quota recovery deadlocks. We avoid
> grabbing sb->s_umount semaphore from ocfs2_finish_quota_recovery() and instead
> stop quota recovery early in ocfs2_dismount_volume(). Please review and test,
> the series has been only lightly tested in local mode as I don't have
> proper OCFS2 test setup.
>
> Honza
>
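
The approach described in the quoted cover letter (do not take sb->s_umount in the recovery work; instead have the dismount path stop recovery early) can be modelled in user space roughly as below. This is a hedged sketch and not the actual patch: `ToyJournal`, `toy_finish_quota_recovery`, `toy_dismount` and the `recovery_disabled` flag are hypothetical names, and a `threading.Event` stands in for whatever mechanism the real series uses.

```python
import threading


class ToyJournal:
    """Hypothetical stand-in for per-journal recovery state."""

    def __init__(self, chunks_total):
        self.recovery_disabled = threading.Event()  # set by the dismount path
        self.chunks_total = chunks_total            # units of recovery work
        self.chunks_done = 0                        # written only by the worker


def toy_finish_quota_recovery(journal):
    # The worker takes no lock owned by the dismount path; it only polls a
    # flag, so the dismount side can safely wait for it to finish.
    for _ in range(journal.chunks_total):
        if journal.recovery_disabled.is_set():
            return
        journal.chunks_done += 1


def toy_dismount(journal, worker, timeout=5.0):
    # Stop recovery first, then wait for the worker (the "flush" step).
    journal.recovery_disabled.set()
    worker.join(timeout)
    return not worker.is_alive()


def demo(chunks_total=1_000_000):
    journal = ToyJournal(chunks_total)
    worker = threading.Thread(target=toy_finish_quota_recovery, args=(journal,))
    worker.start()
    finished = toy_dismount(journal, worker)
    return finished, journal.chunks_done


if __name__ == "__main__":
    print(demo())
```

In this model the waiter never holds anything the worker needs, so the wait cannot close a cycle; how many chunks complete before the flag is seen depends on thread scheduling.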
* Re: [syzbot] [ocfs2?] possible deadlock in ocfs2_finish_quota_recovery
From: syzbot @ 2025-04-17 4:21 UTC (permalink / raw)
To: heming.zhao, jack, joseph.qi, linux-kernel, m.masimov,
ocfs2-devel, syzkaller-bugs
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
unregister_netdevice: waiting for DEV to become free
unregister_netdevice: waiting for batadv0 to become free. Usage count = 3
Tested on:
commit: 11376431 ocfs2: Stop quota recovery before disabling q..
git tree: https://github.com/zhaohem/linux jans_ocfs2
console output: https://syzkaller.appspot.com/x/log.txt?x=14183a3f980000
kernel config: https://syzkaller.appspot.com/x/.config?x=5348e7fd1b89a770
dashboard link: https://syzkaller.appspot.com/bug?extid=f59a1ae7b7227c859b8f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
Note: no patches were applied.
* Re: [syzbot] [ocfs2?] possible deadlock in ocfs2_finish_quota_recovery
From: Heming Zhao @ 2025-04-17 5:05 UTC (permalink / raw)
To: syzbot, jack, joseph.qi, linux-kernel, m.masimov, ocfs2-devel,
syzkaller-bugs
On 4/17/25 12:21, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> unregister_netdevice: waiting for DEV to become free
>
> unregister_netdevice: waiting for batadv0 to become free. Usage count = 3
That warning comes from a different module; the deadlock itself no longer appears.
- Heming
>
>
> Tested on:
>
> commit: 11376431 ocfs2: Stop quota recovery before disabling q..
> git tree: https://github.com/zhaohem/linux jans_ocfs2
> console output: https://syzkaller.appspot.com/x/log.txt?x=14183a3f980000
> kernel config: https://syzkaller.appspot.com/x/.config?x=5348e7fd1b89a770
> dashboard link: https://syzkaller.appspot.com/bug?extid=f59a1ae7b7227c859b8f
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>
> Note: no patches were applied.