public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: bugzilla-daemon@kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 218230] xfstests xfs/538 hung
Date: Mon, 24 Jun 2024 09:42:51 +0000	[thread overview]
Message-ID: <bug-218230-201763-U2vMSafKRz@https.bugzilla.kernel.org/> (raw)
In-Reply-To: <bug-218230-201763@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=218230

--- Comment #2 from chandanbabu@kernel.org ---
>
> I have root caused the bug and hope to post the patches soon.

Sorry, I had forgotten about this bug. Luis reminded me about this
recently. The patches I had written turned out to be incorrect. However, the
following provides the root cause analysis of the bug.

Executing xfs/538 on a Linux v6.6 kernel can lead to the following
deadlock,

   |------------------+------------------+------------------|
   | Task A           | Task B           | Task C           |
   |------------------+------------------+------------------|
   | Lock AG 1's AGF  |                  |                  |
   |                  | Lock AG 2's AGI  |                  |
   |                  | Wait for lock on |                  |
   |                  | AG 1's AGF       |                  |
   |                  |                  | Lock AG 3's AGF  |
   |                  |                  | Wait for lock on |
   |                  |                  | AG 2's AGI       |
   | Wait for lock on |                  |                  |
   | AG 3's AGF       |                  |                  |
   |------------------+------------------+------------------|

As illustrated above, Task B and C are violating the AG locking order
rule i.e. AGI/AGF must be locked in increasing AG order and that
within an AG, AGI must be locked before an AGF.

Task B's call trace:
  context_switch (kernel/sched/core.c:5382:2)
  __schedule (kernel/sched/core.c:6695:8)
  schedule (kernel/sched/core.c:6771:3)
  schedule_timeout (kernel/time/timer.c:2143:3)
  ___down_common (kernel/locking/semaphore.c:225:13)
  __down_common (kernel/locking/semaphore.c:246:8)
  down (kernel/locking/semaphore.c:63:3)
  xfs_buf_lock (fs/xfs/xfs_buf.c:1126:2)
  xfs_buf_find_lock (fs/xfs/xfs_buf.c:553:3)
  xfs_buf_lookup (fs/xfs/xfs_buf.c:592:10)
  xfs_buf_get_map (fs/xfs/xfs_buf.c:702:10)
  xfs_buf_read_map (fs/xfs/xfs_buf.c:817:10)
  xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289:10)
  xfs_trans_read_buf (./fs/xfs/xfs_trans.h:212:9)
  xfs_read_agf (fs/xfs/libxfs/xfs_alloc.c:3153:10)
  xfs_alloc_read_agf (fs/xfs/libxfs/xfs_alloc.c:3185:10)
  xfs_alloc_fix_freelist (fs/xfs/libxfs/xfs_alloc.c:2658:11)
  xfs_alloc_vextent_prepare_ag (fs/xfs/libxfs/xfs_alloc.c:3321:10)
  xfs_alloc_vextent_iterate_ags (fs/xfs/libxfs/xfs_alloc.c:3506:11)
  xfs_alloc_vextent_first_ag (fs/xfs/libxfs/xfs_alloc.c:3641:10)
  xfs_bmap_exact_minlen_extent_alloc (fs/xfs/libxfs/xfs_bmap.c:3434:10)
  xfs_bmap_alloc_userdata (fs/xfs/libxfs/xfs_bmap.c:4084:10)
  xfs_bmapi_allocate (fs/xfs/libxfs/xfs_bmap.c:4129:11)
  xfs_bmapi_write (fs/xfs/libxfs/xfs_bmap.c:4438:12)
  xfs_symlink (fs/xfs/xfs_symlink.c:271:11)
  xfs_vn_symlink (fs/xfs/xfs_iops.c:419:10)
  vfs_symlink (fs/namei.c:4480:10)
  vfs_symlink (fs/namei.c:4464:5)
  do_symlinkat (fs/namei.c:4506:11)
  __do_sys_symlink (fs/namei.c:4527:9)
  __se_sys_symlink (fs/namei.c:4525:1)
  __x64_sys_symlink (fs/namei.c:4525:1)
  do_syscall_x64 (arch/x86/entry/common.c:50:14)
  do_syscall_64 (arch/x86/entry/common.c:80:7)
  entry_SYSCALL_64+0xaa/0x1a6 (arch/x86/entry/entry_64.S:120)

Task B above locked AG 2's AGI, allocated an ondisk inode, then tried
to allocate blocks (required for holding pathname representing the
symbolic link) from AG 1. This happened due to
xfs_bmap_exact_minlen_extent_alloc() iterating across AGs starting
from AG 0.

Task C's call trace:
  context_switch (kernel/sched/core.c:5382:2)
  __schedule (kernel/sched/core.c:6695:8)
  schedule (kernel/sched/core.c:6771:3)
  schedule_timeout (kernel/time/timer.c:2143:3)
  ___down_common (kernel/locking/semaphore.c:225:13)
  __down_common (kernel/locking/semaphore.c:246:8)
  down (kernel/locking/semaphore.c:63:3)
  xfs_buf_lock (fs/xfs/xfs_buf.c:1126:2)
  xfs_buf_find_lock (fs/xfs/xfs_buf.c:553:3)
  xfs_buf_lookup (fs/xfs/xfs_buf.c:592:10)
  xfs_buf_get_map (fs/xfs/xfs_buf.c:702:10)
  xfs_buf_read_map (fs/xfs/xfs_buf.c:817:10)
  xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289:10)
  xfs_trans_read_buf (./fs/xfs/xfs_trans.h:212:9)
  xfs_read_agi (fs/xfs/libxfs/xfs_ialloc.c:2598:10)
  xfs_ialloc_read_agi (fs/xfs/libxfs/xfs_ialloc.c:2626:10)
  xfs_dialloc_try_ag (fs/xfs/libxfs/xfs_ialloc.c:1690:10)
  xfs_dialloc (fs/xfs/libxfs/xfs_ialloc.c:1803:12)
  xfs_create (fs/xfs/xfs_inode.c:1020:10)
  xfs_generic_create (fs/xfs/xfs_iops.c:199:11)
  vfs_mkdir (fs/namei.c:4120:10)
  do_mkdirat (fs/namei.c:4143:11)
  __do_sys_mkdir (fs/namei.c:4163:9)
  __se_sys_mkdir (fs/namei.c:4161:1)
  __x64_sys_mkdir (fs/namei.c:4161:1)
  do_syscall_x64 (arch/x86/entry/common.c:50:14)
  do_syscall_64 (arch/x86/entry/common.c:80:7)
  entry_SYSCALL_64+0xaa/0x1a6 (arch/x86/entry/entry_64.S:120)

Task C above was trying to allocate an inode chunk to serve a mkdir()
syscall request. Task C locked AG 3's AGF and searched for the
required extent. However, the only suitable extent was found to be
straddling xfs_alloc_arg->max_agbno. Hence, it moved to the next AG
and ended up wrapping around the AG list.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

      parent reply	other threads:[~2024-06-24  9:42 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-05 12:31 [Bug 218230] New: xfstests xfs/538 hung bugzilla-daemon
2023-12-05 14:29 ` [Bug 218230] " bugzilla-daemon
2024-06-24  9:39   ` Chandan Babu R
2024-06-24  9:42 ` bugzilla-daemon [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-218230-201763-U2vMSafKRz@https.bugzilla.kernel.org/ \
    --to=bugzilla-daemon@kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox