From: bugzilla-daemon@kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 218230] xfstests xfs/538 hung
Date: Mon, 24 Jun 2024 09:42:51 +0000
Message-ID: <bug-218230-201763-U2vMSafKRz@https.bugzilla.kernel.org/>
In-Reply-To: <bug-218230-201763@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=218230

--- Comment #2 from chandanbabu@kernel.org ---
>
> I have root caused the bug and hope to post the patches soon.

Sorry, I had forgotten about this bug; Luis reminded me about it
recently. The patches I had written turned out to be incorrect. However,
the root cause analysis of the bug is provided below.

Executing xfs/538 on a Linux v6.6 kernel can lead to the following
deadlock:

   |------------------+------------------+------------------|
   | Task A           | Task B           | Task C           |
   |------------------+------------------+------------------|
   | Lock AG 1's AGF  |                  |                  |
   |                  | Lock AG 2's AGI  |                  |
   |                  | Wait for lock on |                  |
   |                  | AG 1's AGF       |                  |
   |                  |                  | Lock AG 3's AGF  |
   |                  |                  | Wait for lock on |
   |                  |                  | AG 2's AGI       |
   | Wait for lock on |                  |                  |
   | AG 3's AGF       |                  |                  |
   |------------------+------------------+------------------|

As illustrated above, Tasks B and C violate the AG locking order rule,
i.e. AGIs/AGFs must be locked in increasing AG order and, within an AG,
the AGI must be locked before the AGF.
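
The ordering rule can be checked mechanically against the table. The
following is only an illustrative Python sketch, not kernel code: the
names lock_key(), first_violation() and TASKS are hypothetical, and each
task's sequence is simply read off the table (the lock it holds, then
the lock it waits on).

```python
from typing import Dict, List, Optional, Tuple

# A lock is (AG number, header type). Within one AG, the AGI sorts before the AGF.
Lock = Tuple[int, str]
HEADER_RANK = {"AGI": 0, "AGF": 1}


def lock_key(lock: Lock) -> Tuple[int, int]:
    """Sort key encoding the rule: increasing AG order, AGI before AGF."""
    agno, header = lock
    return (agno, HEADER_RANK[header])


def first_violation(sequence: List[Lock]) -> Optional[Tuple[Lock, Lock]]:
    """Return the first (held, requested) pair that breaks the rule, or None."""
    for held, requested in zip(sequence, sequence[1:]):
        if lock_key(requested) <= lock_key(held):
            return (held, requested)
    return None


# Acquisition sequences taken from the table (hypothetical encoding).
TASKS: Dict[str, List[Lock]] = {
    "A": [(1, "AGF"), (3, "AGF")],
    "B": [(2, "AGI"), (1, "AGF")],
    "C": [(3, "AGF"), (2, "AGI")],
}

violations = {name: first_violation(seq) for name, seq in TASKS.items()}
for name, v in violations.items():
    if v is None:
        print(f"Task {name}: ordering respected")
    else:
        print(f"Task {name}: holds {v[0]}, then requests {v[1]}")
```

Under this encoding Task A's sequence is in order, while Tasks B and C
each request a lock that sorts before the one they already hold.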

Task B's call trace:
  context_switch (kernel/sched/core.c:5382:2)
  __schedule (kernel/sched/core.c:6695:8)
  schedule (kernel/sched/core.c:6771:3)
  schedule_timeout (kernel/time/timer.c:2143:3)
  ___down_common (kernel/locking/semaphore.c:225:13)
  __down_common (kernel/locking/semaphore.c:246:8)
  down (kernel/locking/semaphore.c:63:3)
  xfs_buf_lock (fs/xfs/xfs_buf.c:1126:2)
  xfs_buf_find_lock (fs/xfs/xfs_buf.c:553:3)
  xfs_buf_lookup (fs/xfs/xfs_buf.c:592:10)
  xfs_buf_get_map (fs/xfs/xfs_buf.c:702:10)
  xfs_buf_read_map (fs/xfs/xfs_buf.c:817:10)
  xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289:10)
  xfs_trans_read_buf (./fs/xfs/xfs_trans.h:212:9)
  xfs_read_agf (fs/xfs/libxfs/xfs_alloc.c:3153:10)
  xfs_alloc_read_agf (fs/xfs/libxfs/xfs_alloc.c:3185:10)
  xfs_alloc_fix_freelist (fs/xfs/libxfs/xfs_alloc.c:2658:11)
  xfs_alloc_vextent_prepare_ag (fs/xfs/libxfs/xfs_alloc.c:3321:10)
  xfs_alloc_vextent_iterate_ags (fs/xfs/libxfs/xfs_alloc.c:3506:11)
  xfs_alloc_vextent_first_ag (fs/xfs/libxfs/xfs_alloc.c:3641:10)
  xfs_bmap_exact_minlen_extent_alloc (fs/xfs/libxfs/xfs_bmap.c:3434:10)
  xfs_bmap_alloc_userdata (fs/xfs/libxfs/xfs_bmap.c:4084:10)
  xfs_bmapi_allocate (fs/xfs/libxfs/xfs_bmap.c:4129:11)
  xfs_bmapi_write (fs/xfs/libxfs/xfs_bmap.c:4438:12)
  xfs_symlink (fs/xfs/xfs_symlink.c:271:11)
  xfs_vn_symlink (fs/xfs/xfs_iops.c:419:10)
  vfs_symlink (fs/namei.c:4480:10)
  vfs_symlink (fs/namei.c:4464:5)
  do_symlinkat (fs/namei.c:4506:11)
  __do_sys_symlink (fs/namei.c:4527:9)
  __se_sys_symlink (fs/namei.c:4525:1)
  __x64_sys_symlink (fs/namei.c:4525:1)
  do_syscall_x64 (arch/x86/entry/common.c:50:14)
  do_syscall_64 (arch/x86/entry/common.c:80:7)
  entry_SYSCALL_64+0xaa/0x1a6 (arch/x86/entry/entry_64.S:120)

Task B above locked AG 2's AGI, allocated an ondisk inode, then tried
to allocate blocks (required to hold the symbolic link's target
pathname) from AG 1. This happened because
xfs_bmap_exact_minlen_extent_alloc() iterates across AGs starting
from AG 0.
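
Why a scan that starts at AG 0 is a problem for a task already holding
AG 2's AGI can be shown with a toy model. This is a minimal Python
sketch, not the kernel's iteration logic: ags_scanned_before_held() and
the AG count of 4 are assumptions made for illustration, and the sketch
does not model why a given AG is skipped or selected.

```python
from typing import List


def ags_scanned_before_held(held_agno: int, start_agno: int,
                            ag_count: int = 4) -> List[int]:
    """List the AGs a forward scan visits that are lower than the held AG.

    Locking an AGF in any of these AGs would break the increasing-AG
    ordering rule, because a header in a higher AG is already locked.
    """
    return [agno for agno in range(start_agno, ag_count) if agno < held_agno]


# Task B holds AG 2's AGI and the scan starts at AG 0.
print(ags_scanned_before_held(held_agno=2, start_agno=0))   # [0, 1]

# Had the scan started at the held AG, no lower AG would be visited.
print(ags_scanned_before_held(held_agno=2, start_agno=2))   # []
```

AG 1, where Task B blocked, is among the AGs that the scan from AG 0
reaches before the AG whose AGI is held.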

Task C's call trace:
  context_switch (kernel/sched/core.c:5382:2)
  __schedule (kernel/sched/core.c:6695:8)
  schedule (kernel/sched/core.c:6771:3)
  schedule_timeout (kernel/time/timer.c:2143:3)
  ___down_common (kernel/locking/semaphore.c:225:13)
  __down_common (kernel/locking/semaphore.c:246:8)
  down (kernel/locking/semaphore.c:63:3)
  xfs_buf_lock (fs/xfs/xfs_buf.c:1126:2)
  xfs_buf_find_lock (fs/xfs/xfs_buf.c:553:3)
  xfs_buf_lookup (fs/xfs/xfs_buf.c:592:10)
  xfs_buf_get_map (fs/xfs/xfs_buf.c:702:10)
  xfs_buf_read_map (fs/xfs/xfs_buf.c:817:10)
  xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289:10)
  xfs_trans_read_buf (./fs/xfs/xfs_trans.h:212:9)
  xfs_read_agi (fs/xfs/libxfs/xfs_ialloc.c:2598:10)
  xfs_ialloc_read_agi (fs/xfs/libxfs/xfs_ialloc.c:2626:10)
  xfs_dialloc_try_ag (fs/xfs/libxfs/xfs_ialloc.c:1690:10)
  xfs_dialloc (fs/xfs/libxfs/xfs_ialloc.c:1803:12)
  xfs_create (fs/xfs/xfs_inode.c:1020:10)
  xfs_generic_create (fs/xfs/xfs_iops.c:199:11)
  vfs_mkdir (fs/namei.c:4120:10)
  do_mkdirat (fs/namei.c:4143:11)
  __do_sys_mkdir (fs/namei.c:4163:9)
  __se_sys_mkdir (fs/namei.c:4161:1)
  __x64_sys_mkdir (fs/namei.c:4161:1)
  do_syscall_x64 (arch/x86/entry/common.c:50:14)
  do_syscall_64 (arch/x86/entry/common.c:80:7)
  entry_SYSCALL_64+0xaa/0x1a6 (arch/x86/entry/entry_64.S:120)

Task C above was trying to allocate an inode chunk to serve a mkdir()
syscall request. Task C locked AG 3's AGF and searched for the
required extent. However, the only suitable extent was found to be
straddling xfs_alloc_arg->max_agbno. Hence, it moved to the next AG
and ended up wrapping around the AG list, which led it to wait for
AG 2's AGI while still holding AG 3's AGF.
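
The effect of wrapping around can be illustrated with a small model.
Again, this is a hedged Python sketch rather than kernel code:
wrapped_scan() is a hypothetical helper and the AG count of 4 is an
assumption (the table only shows that AGs 1 to 3 exist).

```python
from typing import List


def wrapped_scan(start_agno: int, ag_count: int) -> List[int]:
    """Visit every AG once, starting at start_agno and wrapping past the end."""
    return [(start_agno + i) % ag_count for i in range(ag_count)]


AG_COUNT = 4      # assumed for illustration
START_AGNO = 3    # Task C already holds AG 3's AGF

order = wrapped_scan(START_AGNO, AG_COUNT)
print(order)      # [3, 0, 1, 2]

# Every AG visited after the wrap is lower than the AG whose AGF is held,
# so locking a header there is out of order.
out_of_order = [agno for agno in order if agno < START_AGNO]
print(out_of_order)   # [0, 1, 2]
```

AG 2, whose AGI Task C ends up waiting on, is one of the AGs reached
only after the wrap.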
