From: bugzilla-daemon@kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 218230] xfstests xfs/538 hung
Date: Mon, 24 Jun 2024 09:42:51 +0000
Message-ID: <bug-218230-201763-U2vMSafKRz@https.bugzilla.kernel.org/>
In-Reply-To: <bug-218230-201763@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=218230
--- Comment #2 from chandanbabu@kernel.org ---
>
> I have root caused the bug and hope to post the patches soon.
Sorry, I had forgotten about this bug. Luis reminded me about this
recently. The patches I had written turned out to be incorrect. However, the
following provides the root cause analysis of the bug.
Executing xfs/538 on a Linux v6.6 kernel can lead to the following
deadlock:
|------------------+------------------+------------------|
| Task A           | Task B           | Task C           |
|------------------+------------------+------------------|
| Lock AG 1's AGF  |                  |                  |
|                  | Lock AG 2's AGI  |                  |
|                  | Wait for lock on |                  |
|                  | AG 1's AGF       |                  |
|                  |                  | Lock AG 3's AGF  |
|                  |                  | Wait for lock on |
|                  |                  | AG 2's AGI       |
| Wait for lock on |                  |                  |
| AG 3's AGF       |                  |                  |
|------------------+------------------+------------------|
As illustrated above, Tasks B and C both violate the AG locking order
rule: AGI and AGF buffers must be locked in increasing AG order, and
within an AG the AGI must be locked before the AGF.
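The table and the ordering rule can be checked mechanically. The
following is a toy model only, not kernel code: locks are represented as
(AG number, header type) tuples, and all names in it (tasks,
violates_order, find_cycle) are made up for illustration.

```python
# Toy model of the lock table above; none of these names exist in the kernel.
AGI, AGF = 0, 1  # within an AG, the AGI sorts before the AGF

# task -> (lock already held, lock being waited for)
# Each lock is a tuple (AG number, header type).
tasks = {
    "A": ((1, AGF), (3, AGF)),
    "B": ((2, AGI), (1, AGF)),
    "C": ((3, AGF), (2, AGI)),
}


def violates_order(held, wanted):
    """True if taking `wanted` while holding `held` breaks the ordering rule.

    Tuple comparison orders by AG number first, then AGI before AGF.
    """
    return wanted <= held


violators = sorted(
    name for name, (held, wanted) in tasks.items() if violates_order(held, wanted)
)


def find_cycle(tasks):
    """Return the tasks forming a wait-for cycle, or [] if there is none."""
    owner = {held: name for name, (held, _) in tasks.items()}
    waits_for = {name: owner.get(wanted) for name, (_, wanted) in tasks.items()}
    for start in tasks:
        seen = []
        name = start
        while name is not None and name not in seen:
            seen.append(name)
            name = waits_for[name]
        if name is not None:
            return seen[seen.index(name):]
    return []


print(violators)          # tasks that break the ordering rule
print(find_cycle(tasks))  # tasks that wait on each other in a loop
```

In this model Task A obeys the rule but is still stuck, because it is
part of the cycle created by the out-of-order requests of the other two
tasks.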
Task B's call trace:
context_switch (kernel/sched/core.c:5382:2)
__schedule (kernel/sched/core.c:6695:8)
schedule (kernel/sched/core.c:6771:3)
schedule_timeout (kernel/time/timer.c:2143:3)
___down_common (kernel/locking/semaphore.c:225:13)
__down_common (kernel/locking/semaphore.c:246:8)
down (kernel/locking/semaphore.c:63:3)
xfs_buf_lock (fs/xfs/xfs_buf.c:1126:2)
xfs_buf_find_lock (fs/xfs/xfs_buf.c:553:3)
xfs_buf_lookup (fs/xfs/xfs_buf.c:592:10)
xfs_buf_get_map (fs/xfs/xfs_buf.c:702:10)
xfs_buf_read_map (fs/xfs/xfs_buf.c:817:10)
xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289:10)
xfs_trans_read_buf (./fs/xfs/xfs_trans.h:212:9)
xfs_read_agf (fs/xfs/libxfs/xfs_alloc.c:3153:10)
xfs_alloc_read_agf (fs/xfs/libxfs/xfs_alloc.c:3185:10)
xfs_alloc_fix_freelist (fs/xfs/libxfs/xfs_alloc.c:2658:11)
xfs_alloc_vextent_prepare_ag (fs/xfs/libxfs/xfs_alloc.c:3321:10)
xfs_alloc_vextent_iterate_ags (fs/xfs/libxfs/xfs_alloc.c:3506:11)
xfs_alloc_vextent_first_ag (fs/xfs/libxfs/xfs_alloc.c:3641:10)
xfs_bmap_exact_minlen_extent_alloc (fs/xfs/libxfs/xfs_bmap.c:3434:10)
xfs_bmap_alloc_userdata (fs/xfs/libxfs/xfs_bmap.c:4084:10)
xfs_bmapi_allocate (fs/xfs/libxfs/xfs_bmap.c:4129:11)
xfs_bmapi_write (fs/xfs/libxfs/xfs_bmap.c:4438:12)
xfs_symlink (fs/xfs/xfs_symlink.c:271:11)
xfs_vn_symlink (fs/xfs/xfs_iops.c:419:10)
vfs_symlink (fs/namei.c:4480:10)
vfs_symlink (fs/namei.c:4464:5)
do_symlinkat (fs/namei.c:4506:11)
__do_sys_symlink (fs/namei.c:4527:9)
__se_sys_symlink (fs/namei.c:4525:1)
__x64_sys_symlink (fs/namei.c:4525:1)
do_syscall_x64 (arch/x86/entry/common.c:50:14)
do_syscall_64 (arch/x86/entry/common.c:80:7)
entry_SYSCALL_64+0xaa/0x1a6 (arch/x86/entry/entry_64.S:120)
Task B above locked AG 2's AGI, allocated an on-disk inode, and then
tried to allocate blocks (required to hold the pathname that the
symbolic link points to) from AG 1. This happened because
xfs_bmap_exact_minlen_extent_alloc() iterates across AGs starting
from AG 0.
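Why the starting AG matters can be seen in a small sketch. It is a
simplified model, not the logic of xfs_alloc_vextent_iterate_ags(): the
helper names and the AG count of 4 are assumptions, and the second call
is shown only as a contrast, not as a proposed fix.

```python
# Simplified model; helper names and AGCOUNT are assumptions for illustration.
def scan_order(start_agno, agcount):
    """AG numbers visited by a scan from start_agno up to the last AG."""
    return list(range(start_agno, agcount))


def ags_below_held(held_agno, start_agno, agcount):
    """AGs the scan would touch that are numbered below the AG already locked."""
    return [agno for agno in scan_order(start_agno, agcount) if agno < held_agno]


AGCOUNT = 4   # assumed number of AGs
HELD_AGNO = 2  # Task B already holds AG 2's AGI

# Scan starting at AG 0, as described for Task B.
from_zero = ags_below_held(HELD_AGNO, 0, AGCOUNT)
# Contrast: a scan that starts at the AG whose AGI is already held.
from_held = ags_below_held(HELD_AGNO, HELD_AGNO, AGCOUNT)

print(from_zero)
print(from_held)
```

Any AG in the first list is one where taking the AGF would run against
the increasing-AG ordering rule; AG 1 is the one Task B ended up
waiting on.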
Task C's call trace:
context_switch (kernel/sched/core.c:5382:2)
__schedule (kernel/sched/core.c:6695:8)
schedule (kernel/sched/core.c:6771:3)
schedule_timeout (kernel/time/timer.c:2143:3)
___down_common (kernel/locking/semaphore.c:225:13)
__down_common (kernel/locking/semaphore.c:246:8)
down (kernel/locking/semaphore.c:63:3)
xfs_buf_lock (fs/xfs/xfs_buf.c:1126:2)
xfs_buf_find_lock (fs/xfs/xfs_buf.c:553:3)
xfs_buf_lookup (fs/xfs/xfs_buf.c:592:10)
xfs_buf_get_map (fs/xfs/xfs_buf.c:702:10)
xfs_buf_read_map (fs/xfs/xfs_buf.c:817:10)
xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289:10)
xfs_trans_read_buf (./fs/xfs/xfs_trans.h:212:9)
xfs_read_agi (fs/xfs/libxfs/xfs_ialloc.c:2598:10)
xfs_ialloc_read_agi (fs/xfs/libxfs/xfs_ialloc.c:2626:10)
xfs_dialloc_try_ag (fs/xfs/libxfs/xfs_ialloc.c:1690:10)
xfs_dialloc (fs/xfs/libxfs/xfs_ialloc.c:1803:12)
xfs_create (fs/xfs/xfs_inode.c:1020:10)
xfs_generic_create (fs/xfs/xfs_iops.c:199:11)
vfs_mkdir (fs/namei.c:4120:10)
do_mkdirat (fs/namei.c:4143:11)
__do_sys_mkdir (fs/namei.c:4163:9)
__se_sys_mkdir (fs/namei.c:4161:1)
__x64_sys_mkdir (fs/namei.c:4161:1)
do_syscall_x64 (arch/x86/entry/common.c:50:14)
do_syscall_64 (arch/x86/entry/common.c:80:7)
entry_SYSCALL_64+0xaa/0x1a6 (arch/x86/entry/entry_64.S:120)
Task C above was trying to allocate an inode chunk to serve a mkdir()
syscall request. Task C locked AG 3's AGF and searched for the
required extent. However, the only suitable extent was found to be
straddling xfs_alloc_arg->max_agbno. Hence, it moved on to the next AG
and ended up wrapping around the AG list, which is how it came to wait
for AG 2's AGI while still holding AG 3's AGF.
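The two conditions in Task C's case, an extent that crosses the block
limit and a move to the next AG that wraps, can be written down as a
small sketch. The block numbers, the AG count of 4 and both helper names
are invented for illustration; the kernel's actual checks are more
involved.

```python
# Illustrative only; the numbers and helper names are assumptions.
def straddles_limit(start, length, max_agbno):
    """True if the extent begins at or below max_agbno but ends beyond it."""
    end = start + length - 1
    return start <= max_agbno < end


def next_ag(agno, agcount):
    """AG visited after agno when the scan wraps around the AG list."""
    return (agno + 1) % agcount


AGCOUNT = 4  # assumed number of AGs

# A 20-block extent starting at block 90 crosses a limit of 100.
rejected = straddles_limit(start=90, length=20, max_agbno=100)
# After the last AG (AG 3), the scan wraps to a lower-numbered AG.
after_ag3 = next_ag(3, AGCOUNT)

print(rejected, after_ag3)
```

Once the scan wraps, every AG it visits is numbered below AG 3, so any
blocking lock request made from that point on is out of order with
respect to the AGF that is still held.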
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.