From: Catherine Hoang <catherine.hoang@oracle.com>
To: linux-xfs@vger.kernel.org
Subject: [PATCH 6.6 CANDIDATE 4/8] xfs: shrink failure needs to hold AGI buffer
Date: Thu, 13 Jun 2024 18:49:42 -0700 [thread overview]
Message-ID: <20240614014946.43237-5-catherine.hoang@oracle.com> (raw)
In-Reply-To: <20240614014946.43237-1-catherine.hoang@oracle.com>
From: Dave Chinner <dchinner@redhat.com>
commit 75bcffbb9e7563259b7aed0fa77459d6a3a35627 upstream.
Chandan reported a AGI/AGF lock order hang on xfs/168 during recent
testing. The cause of the problem was the task running xfs_growfs
to shrink the filesystem. A failure occurred trying to remove the
free space from the btrees that the shrink would make disappear,
and that meant it ran the error handling for a partial failure.
This error path involves restoring the per-ag block reservations,
and that requires calculating the amount of space needed to be
reserved for the free inode btree. The growfs operation hung here:
[18679.536829] down+0x71/0xa0
[18679.537657] xfs_buf_lock+0xa4/0x290 [xfs]
[18679.538731] xfs_buf_find_lock+0xf7/0x4d0 [xfs]
[18679.539920] xfs_buf_lookup.constprop.0+0x289/0x500 [xfs]
[18679.542628] xfs_buf_get_map+0x2b3/0xe40 [xfs]
[18679.547076] xfs_buf_read_map+0xbb/0x900 [xfs]
[18679.562616] xfs_trans_read_buf_map+0x449/0xb10 [xfs]
[18679.569778] xfs_read_agi+0x1cd/0x500 [xfs]
[18679.573126] xfs_ialloc_read_agi+0xc2/0x5b0 [xfs]
[18679.578708] xfs_finobt_calc_reserves+0xe7/0x4d0 [xfs]
[18679.582480] xfs_ag_resv_init+0x2c5/0x490 [xfs]
[18679.586023] xfs_ag_shrink_space+0x736/0xd30 [xfs]
[18679.590730] xfs_growfs_data_private.isra.0+0x55e/0x990 [xfs]
[18679.599764] xfs_growfs_data+0x2f1/0x410 [xfs]
[18679.602212] xfs_file_ioctl+0xd1e/0x1370 [xfs]
trying to get the AGI lock. The AGI lock was held by a fstress task
trying to do an inode allocation, and it was waiting on the AGF
lock to allocate a new inode chunk on disk. Hence deadlock.
The fix for this is for the growfs code to hold the AGI over the
transaction roll it does in the error path. It already holds the AGF
locked across this, and that is what causes the lock order inversion
in the xfs_ag_resv_init() call.
Reported-by: Chandan Babu R <chandanbabu@kernel.org>
Fixes: 46141dc891f7 ("xfs: introduce xfs_ag_shrink_space()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
---
fs/xfs/libxfs/xfs_ag.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
index 18d9bb2ebe8e..1531bd0ee359 100644
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -979,14 +979,23 @@ xfs_ag_shrink_space(
if (error) {
/*
- * if extent allocation fails, need to roll the transaction to
+ * If extent allocation fails, need to roll the transaction to
* ensure that the AGFL fixup has been committed anyway.
+ *
+ * We need to hold the AGF across the roll to ensure nothing can
+ * access the AG for allocation until the shrink is fully
+ * cleaned up. And due to the resetting of the AG block
+ * reservation space needing to lock the AGI, we also have to
+ * hold that so we don't get AGI/AGF lock order inversions in
+ * the error handling path.
*/
xfs_trans_bhold(*tpp, agfbp);
+ xfs_trans_bhold(*tpp, agibp);
err2 = xfs_trans_roll(tpp);
if (err2)
return err2;
xfs_trans_bjoin(*tpp, agfbp);
+ xfs_trans_bjoin(*tpp, agibp);
goto resv_init_out;
}
--
2.39.3
next prev parent reply other threads:[~2024-06-14 1:50 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-14 1:49 [PATCH 6.6 CANDIDATE 0/8] xfs backports for 6.6.y (from 6.9) Catherine Hoang
2024-06-14 1:49 ` [PATCH 6.6 CANDIDATE 1/8] xfs: fix imprecise logic in xchk_btree_check_block_owner Catherine Hoang
2024-06-14 1:49 ` [PATCH 6.6 CANDIDATE 2/8] xfs: fix scrub stats file permissions Catherine Hoang
2024-06-14 1:49 ` [PATCH 6.6 CANDIDATE 3/8] xfs: fix SEEK_HOLE/DATA for regions with active COW extents Catherine Hoang
2024-06-14 1:49 ` Catherine Hoang [this message]
2024-06-14 1:49 ` [PATCH 6.6 CANDIDATE 5/8] xfs: ensure submit buffers on LSN boundaries in error handlers Catherine Hoang
2024-06-14 1:49 ` [PATCH 6.6 CANDIDATE 6/8] xfs: allow sunit mount option to repair bad primary sb stripe values Catherine Hoang
2024-06-14 1:49 ` [PATCH 6.6 CANDIDATE 7/8] xfs: don't use current->journal_info Catherine Hoang
2024-06-14 1:49 ` [PATCH 6.6 CANDIDATE 8/8] xfs: allow cross-linking special files without project quota Catherine Hoang
2024-06-14 3:57 ` [PATCH 6.6 CANDIDATE 0/8] xfs backports for 6.6.y (from 6.9) Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240614014946.43237-5-catherine.hoang@oracle.com \
--to=catherine.hoang@oracle.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox