* [syzbot] [ocfs2?] possible deadlock in ocfs2_finish_quota_recovery
@ 2025-02-02 9:01 syzbot
2025-03-13 13:50 ` syzbot
2025-04-17 4:03 ` [syzbot] Re: [PATCH 0/3] ocfs2: Fix deadlocks in quota recovery syzbot
0 siblings, 2 replies; 3+ messages in thread
From: syzbot @ 2025-02-02 9:01 UTC (permalink / raw)
To: jlbec, joseph.qi, linux-kernel, mark, ocfs2-devel, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 69b8923f5003 Merge tag 'for-linus-6.14-ofs4' of git://git...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=100c4eb0580000
kernel config: https://syzkaller.appspot.com/x/.config?x=57ab43c279fa614d
dashboard link: https://syzkaller.appspot.com/bug?extid=f59a1ae7b7227c859b8f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/ea84ac864e92/disk-69b8923f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/6a465997b4e0/vmlinux-69b8923f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/d72b67b2bd15/bzImage-69b8923f.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f59a1ae7b7227c859b8f@syzkaller.appspotmail.com
ocfs2: Finishing quota recovery on device (7,0) for slot 0
======================================================
WARNING: possible circular locking dependency detected
6.13.0-syzkaller-09793-g69b8923f5003 #0 Not tainted
------------------------------------------------------
kworker/u8:6/1142 is trying to acquire lock:
ffff888055ab40e0 (&type->s_umount_key#51){++++}-{4:4}, at: ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
but task is already holding lock:
ffffc90003c4fc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3212 [inline]
ffffc90003c4fc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x976/0x1840 kernel/workqueue.c:3317
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
process_one_work kernel/workqueue.c:3212 [inline]
process_scheduled_works+0x994/0x1840 kernel/workqueue.c:3317
worker_thread+0x870/0xd30 kernel/workqueue.c:3398
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
-> #1 ((wq_completion)ocfs2_wq){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
touch_wq_lockdep_map+0xc7/0x170 kernel/workqueue.c:3905
__flush_workqueue+0x14a/0x1280 kernel/workqueue.c:3947
ocfs2_shutdown_local_alloc+0x109/0xa90 fs/ocfs2/localalloc.c:380
ocfs2_dismount_volume+0x202/0x910 fs/ocfs2/super.c:1822
generic_shutdown_super+0x139/0x2d0 fs/super.c:642
kill_block_super+0x44/0x90 fs/super.c:1710
deactivate_locked_super+0xc4/0x130 fs/super.c:473
cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1413
task_work_run+0x24f/0x310 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #0 (&type->s_umount_key#51){++++}-{4:4}:
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3236 [inline]
process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3317
worker_thread+0x870/0xd30 kernel/workqueue.c:3398
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
other info that might help us debug this:
Chain exists of:
&type->s_umount_key#51 --> (wq_completion)ocfs2_wq --> (work_completion)(&journal->j_recovery_work)
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&journal->j_recovery_work));
                               lock((wq_completion)ocfs2_wq);
                               lock((work_completion)(&journal->j_recovery_work));
  rlock(&type->s_umount_key#51);
*** DEADLOCK ***
2 locks held by kworker/u8:6/1142:
#0: ffff88802420b148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3211 [inline]
#0: ffff88802420b148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_scheduled_works+0x93b/0x1840 kernel/workqueue.c:3317
#1: ffffc90003c4fc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3212 [inline]
#1: ffffc90003c4fc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x976/0x1840 kernel/workqueue.c:3317
stack backtrace:
CPU: 0 UID: 0 PID: 1142 Comm: kworker/u8:6 Not tainted 6.13.0-syzkaller-09793-g69b8923f5003 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
Workqueue: ocfs2_wq ocfs2_complete_recovery
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2076
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2208
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3236 [inline]
process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3317
worker_thread+0x870/0xd30 kernel/workqueue.c:3398
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
---
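The dependency chain in the splat above can be read as a small directed graph: an edge A -> B means "B was acquired while A was held". The following is only a toy sketch of that reading, not lockdep's actual implementation; the enum, the matrix and every function name are hypothetical and chosen for illustration. It records the two existing edges (#1 and #2 in the report) and checks whether the new acquisition (#0) would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three lock classes named in the report. */
enum lock_class { S_UMOUNT, WQ_COMPLETION, WORK_COMPLETION, NR_CLASSES };

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reaches(bool dep[NR_CLASSES][NR_CLASSES], int from, int to,
                    bool seen[NR_CLASSES])
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++)
        if (dep[from][next] && !seen[next] && reaches(dep, next, to, seen))
            return true;
    return false;
}

/*
 * Taking `wanted` while holding `held` adds the edge held -> wanted.
 * That closes a cycle if `held` is already reachable from `wanted`.
 */
bool would_deadlock(bool dep[NR_CLASSES][NR_CLASSES], int held, int wanted)
{
    bool seen[NR_CLASSES] = { false };
    return reaches(dep, wanted, held, seen);
}

/* Rebuilds the chain from the report and tests the new acquisition. */
bool report_scenario(void)
{
    bool dep[NR_CLASSES][NR_CLASSES] = { { false } };

    /* #1: dismount flushes ocfs2_wq while holding s_umount. */
    dep[S_UMOUNT][WQ_COMPLETION] = true;
    /* #2: the workqueue runs j_recovery_work. */
    dep[WQ_COMPLETION][WORK_COMPLETION] = true;

    /* #0: the recovery work now wants s_umount. */
    return would_deadlock(dep, WORK_COMPLETION, S_UMOUNT);
}
```

In this model `report_scenario()` returns true, matching the reported cycle; dropping either existing edge makes the same query return false.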
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [ocfs2?] possible deadlock in ocfs2_finish_quota_recovery
From: syzbot @ 2025-03-13 13:50 UTC (permalink / raw)
To: jlbec, joseph.qi, linux-kernel, mark, ocfs2-devel, syzkaller-bugs
syzbot has found a reproducer for the following issue on:
HEAD commit: b7f94fcf5546 Merge tag 'sched_ext-for-6.14-rc6-fixes' of g..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=113addb0580000
kernel config: https://syzkaller.appspot.com/x/.config?x=efbd4e7089941bb6
dashboard link: https://syzkaller.appspot.com/bug?extid=f59a1ae7b7227c859b8f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1695e698580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12a6e04c580000
Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-b7f94fcf.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/961a37fe09ad/vmlinux-b7f94fcf.xz
kernel image: https://storage.googleapis.com/syzbot-assets/483b4fd1ba55/bzImage-b7f94fcf.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/c0b6b6f0715a/mount_0.gz
fsck result: OK (log: https://syzkaller.appspot.com/x/fsck.log?x=14a6e04c580000)
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f59a1ae7b7227c859b8f@syzkaller.appspotmail.com
ocfs2: Finishing quota recovery on device (7,0) for slot 0
======================================================
WARNING: possible circular locking dependency detected
6.14.0-rc6-syzkaller-00022-gb7f94fcf5546 #0 Not tainted
------------------------------------------------------
kworker/u4:10/1087 is trying to acquire lock:
ffff88803c49e0e0 (&type->s_umount_key#42){++++}-{4:4}, at: ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
but task is already holding lock:
ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3214 [inline]
ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x9c6/0x18e0 kernel/workqueue.c:3319
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
process_one_work kernel/workqueue.c:3214 [inline]
process_scheduled_works+0x9e4/0x18e0 kernel/workqueue.c:3319
worker_thread+0x870/0xd30 kernel/workqueue.c:3400
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
-> #1 ((wq_completion)ocfs2_wq){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
touch_wq_lockdep_map+0xc7/0x170 kernel/workqueue.c:3907
__flush_workqueue+0x14a/0x1280 kernel/workqueue.c:3949
ocfs2_shutdown_local_alloc+0x109/0xa90 fs/ocfs2/localalloc.c:380
ocfs2_dismount_volume+0x202/0x910 fs/ocfs2/super.c:1822
generic_shutdown_super+0x139/0x2d0 fs/super.c:642
kill_block_super+0x44/0x90 fs/super.c:1710
deactivate_locked_super+0xc4/0x130 fs/super.c:473
cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1413
task_work_run+0x24f/0x310 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #0 (&type->s_umount_key#42){++++}-{4:4}:
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xabe/0x18e0 kernel/workqueue.c:3319
worker_thread+0x870/0xd30 kernel/workqueue.c:3400
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
other info that might help us debug this:
Chain exists of:
&type->s_umount_key#42 --> (wq_completion)ocfs2_wq --> (work_completion)(&journal->j_recovery_work)
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&journal->j_recovery_work));
                               lock((wq_completion)ocfs2_wq);
                               lock((work_completion)(&journal->j_recovery_work));
  rlock(&type->s_umount_key#42);
*** DEADLOCK ***
2 locks held by kworker/u4:10/1087:
#0: ffff8880403eb148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3213 [inline]
#0: ffff8880403eb148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_scheduled_works+0x98b/0x18e0 kernel/workqueue.c:3319
#1: ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3214 [inline]
#1: ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x9c6/0x18e0 kernel/workqueue.c:3319
stack backtrace:
CPU: 0 UID: 0 PID: 1087 Comm: kworker/u4:10 Not tainted 6.14.0-rc6-syzkaller-00022-gb7f94fcf5546 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Workqueue: ocfs2_wq ocfs2_complete_recovery
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2076
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2208
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xabe/0x18e0 kernel/workqueue.c:3319
worker_thread+0x870/0xd30 kernel/workqueue.c:3400
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
* Re: [syzbot] Re: [PATCH 0/3] ocfs2: Fix deadlocks in quota recovery
From: syzbot @ 2025-04-17 4:03 UTC (permalink / raw)
To: linux-kernel, syzkaller-bugs
For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.
***
Subject: Re: [PATCH 0/3] ocfs2: Fix deadlocks in quota recovery
Author: heming.zhao@suse.com
I created a branch for these 3 patch files. Let's ask syzbot to test them.
syzbot page: https://syzkaller.appspot.com/bug?extid=f59a1ae7b7227c859b8f
#syz test: https://github.com/zhaohem/linux jans_ocfs2
On 4/3/25 19:32, Jan Kara wrote:
> Hello,
>
> this implements another approach to fixing quota recovery deadlocks. We avoid
> grabbing sb->s_umount semaphore from ocfs2_finish_quota_recovery() and instead
> stop quota recovery early in ocfs2_dismount_volume(). Please review and test,
> the series has been only lightly tested in local mode as I don't have
> proper OCFS2 test setup.
>
> Honza
>
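The approach described in the quoted cover letter (do not take sb->s_umount in the recovery work; have dismount stop recovery before it waits) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the code from the patch series: the struct, the stop flag and all function names are hypothetical, a pthread stands in for the workqueue item, and pthread_join() stands in for the workqueue flush.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-superblock state. */
struct model_sb {
    atomic_bool recovery_stopped;  /* set by dismount before it waits */
    atomic_int  slots_recovered;   /* slots the worker finished */
    int         slots_total;
};

/*
 * Models the recovery work item. It takes no umount-style lock;
 * it only checks the stop flag before handling each slot.
 */
static void *recovery_work(void *arg)
{
    struct model_sb *sb = arg;

    for (int i = 0; i < sb->slots_total; i++) {
        if (atomic_load(&sb->recovery_stopped))
            break;
        atomic_fetch_add(&sb->slots_recovered, 1);
    }
    return NULL;
}

/* Models dismount: stop recovery first, then wait for the worker. */
static void dismount(struct model_sb *sb, pthread_t worker)
{
    atomic_store(&sb->recovery_stopped, true);
    pthread_join(worker, NULL);
}

/* Runs one recovery/dismount race; returns the slots recovered. */
int run_model(int slots_total)
{
    struct model_sb sb;
    pthread_t worker;

    atomic_init(&sb.recovery_stopped, false);
    atomic_init(&sb.slots_recovered, 0);
    sb.slots_total = slots_total;

    if (pthread_create(&worker, NULL, recovery_work, &sb) != 0)
        return -1;
    dismount(&sb, worker);

    int recovered = atomic_load(&sb.slots_recovered);
    assert(recovered >= 0 && recovered <= slots_total);
    return recovered;
}
```

Because the worker never waits on anything the dismount path holds, the join always completes. How many slots get recovered before the stop flag is seen depends on scheduling, so the model only bounds that number.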