From: Chandan Babu R <chandan.babu@oracle.com>
To: djwong@kernel.org
Cc: chandan.babu@oracle.com, linux-xfs@vger.kernel.org,
amir73il@gmail.com, leah.rumancik@gmail.com
Subject: [PATCH 5.4 CANDIDATE V2 08/17] xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename()
Date: Tue, 20 Sep 2022 18:18:27 +0530
Message-ID: <20220920124836.1914918-9-chandan.babu@oracle.com>
In-Reply-To: <20220920124836.1914918-1-chandan.babu@oracle.com>
From: kaixuxia <xiakaixu1987@gmail.com>
commit 93597ae8dac0149b5c00b787cba6bf7ba213e666 upstream.
When target_ip exists in xfs_rename(), the xfs_dir_replace() call may
need to take the AGF lock to allocate more blocks. xfs_rename() then
calls xfs_droplink(), which takes the AGI lock to drop target_ip onto
the unlinked list, so we get the lock order AGF->AGI. This breaks the
ordering constraint on AGI and AGF locking: inode allocation locks
the AGI first and can then allocate a new extent for new inodes,
locking the AGF after the AGI.
This patch first checks whether the replace operation needs more
blocks. If so, it acquires the AGI lock first to preserve the
locking order (AGI -> AGF). In fact, the locking order problem only
occurs when we lock the AGI and AGF of the same AG. For multiple
AGs the AGI lock will be released after the transaction commits.
Signed-off-by: kaixuxia <kaixuxia@tencent.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: reword the comment]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
---
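Reader's note, not part of the upstream commit: the ordering rule from
the commit message can be illustrated with a small userspace sketch. It
assumes that AG header locks stay held until the transaction commits
and that only ordering within a single AG matters. The names
lock_step, lock_order_ok, rename_unpatched and rename_patched are
hypothetical and do not exist in XFS.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two per-AG header locks. */
enum ag_lock { AG_LOCK_AGI, AG_LOCK_AGF };

struct lock_step {
	unsigned int	agno;	/* allocation group number */
	enum ag_lock	lock;	/* which AG header gets locked */
};

/*
 * Return true if no step takes the AGI of an AG for the first time
 * after the AGF of that same AG was already taken. Locks are assumed
 * to stay held until the end of the sequence, so re-locking an AGI
 * that is already held is harmless.
 */
bool lock_order_ok(const struct lock_step *steps, size_t nsteps)
{
	for (size_t i = 0; i < nsteps; i++) {
		bool agi_held = false;
		bool agf_held = false;

		if (steps[i].lock != AG_LOCK_AGI)
			continue;

		/* Look at what this AG already holds before step i. */
		for (size_t j = 0; j < i; j++) {
			if (steps[j].agno != steps[i].agno)
				continue;
			if (steps[j].lock == AG_LOCK_AGI)
				agi_held = true;
			else
				agf_held = true;
		}

		if (agf_held && !agi_held)
			return false;	/* AGF -> AGI inversion */
	}
	return true;
}

/* Unpatched rename: xfs_dir_replace() (AGF), then xfs_droplink() (AGI). */
const struct lock_step rename_unpatched[] = {
	{ .agno = 0, .lock = AG_LOCK_AGF },
	{ .agno = 0, .lock = AG_LOCK_AGI },
};

/* Patched rename: xfs_read_agi() first, then the same two calls. */
const struct lock_step rename_patched[] = {
	{ .agno = 0, .lock = AG_LOCK_AGI },
	{ .agno = 0, .lock = AG_LOCK_AGF },
	{ .agno = 0, .lock = AG_LOCK_AGI },
};
```

In this model lock_order_ok() rejects the unpatched sequence and
accepts the patched one. A sequence that touches the AGF and AGI of
different AGs is not flagged, matching the same-AG remark above.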
fs/xfs/libxfs/xfs_dir2.h | 2 ++
fs/xfs/libxfs/xfs_dir2_sf.c | 28 +++++++++++++++++++++++-----
fs/xfs/xfs_inode.c | 17 +++++++++++++++++
3 files changed, 42 insertions(+), 5 deletions(-)
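
Reader's note, not part of the upstream commit: the size check that
xfs_dir2_sf_replace_needblock() performs can be tried outside the
kernel with the sketch below. The struct sf_dir_model and its fields
are hypothetical stand-ins for the inode fields used in the patch, and
the two constants are assumed to mirror XFS_INO64_DIFF (8 - 4 bytes)
and XFS_DIR2_MAX_SHORT_INUM (the largest 32-bit inode number).

```c
#include <stdbool.h>
#include <stdint.h>

#define INO64_DIFF	4u		/* assumed: XFS_INO64_DIFF */
#define MAX_SHORT_INUM	0xffffffffULL	/* assumed: XFS_DIR2_MAX_SHORT_INUM */

/* Hypothetical stand-in for the inode state the helper inspects. */
struct sf_dir_model {
	bool		local_format;	/* di_format == XFS_DINODE_FMT_LOCAL */
	unsigned int	count;		/* sfp->count: number of entries */
	unsigned int	i8count;	/* sfp->i8count: 8-byte inode numbers */
	unsigned int	if_bytes;	/* current shortform size in bytes */
	unsigned int	fork_size;	/* XFS_IFORK_DSIZE(dp) */
};

/*
 * Same arithmetic as the patch: storing the first 8-byte inode number
 * widens every entry plus the parent pointer by INO64_DIFF bytes. If
 * the widened directory no longer fits in the data fork, the replace
 * has to convert to block format, which allocates a block.
 */
bool sf_replace_needblock(const struct sf_dir_model *dp, uint64_t inum)
{
	unsigned int newsize;

	if (!dp->local_format)
		return false;

	newsize = dp->if_bytes + (dp->count + 1) * INO64_DIFF;

	return inum > MAX_SHORT_INUM &&
	       dp->i8count == 0 && newsize > dp->fork_size;
}
```

For example, with 10 entries, 150 bytes in use and a 176-byte fork
(made-up numbers), widening needs 150 + 11 * 4 = 194 bytes, so a
replace with an inode number above 0xffffffff would need a block.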
diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
index f54244779492..01b1722333a9 100644
--- a/fs/xfs/libxfs/xfs_dir2.h
+++ b/fs/xfs/libxfs/xfs_dir2.h
@@ -124,6 +124,8 @@ extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
struct xfs_name *name, xfs_ino_t ino,
xfs_extlen_t tot);
+extern bool xfs_dir2_sf_replace_needblock(struct xfs_inode *dp,
+ xfs_ino_t inum);
extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
struct xfs_name *name, xfs_ino_t inum,
xfs_extlen_t tot);
diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
index ae16ca7c422a..90eff6c2de7e 100644
--- a/fs/xfs/libxfs/xfs_dir2_sf.c
+++ b/fs/xfs/libxfs/xfs_dir2_sf.c
@@ -944,6 +944,27 @@ xfs_dir2_sf_removename(
return 0;
}
+/*
+ * Check whether the sf dir replace operation needs more blocks.
+ */
+bool
+xfs_dir2_sf_replace_needblock(
+ struct xfs_inode *dp,
+ xfs_ino_t inum)
+{
+ int newsize;
+ struct xfs_dir2_sf_hdr *sfp;
+
+ if (dp->i_d.di_format != XFS_DINODE_FMT_LOCAL)
+ return false;
+
+ sfp = (struct xfs_dir2_sf_hdr *)dp->i_df.if_u1.if_data;
+ newsize = dp->i_df.if_bytes + (sfp->count + 1) * XFS_INO64_DIFF;
+
+ return inum > XFS_DIR2_MAX_SHORT_INUM &&
+ sfp->i8count == 0 && newsize > XFS_IFORK_DSIZE(dp);
+}
+
/*
* Replace the inode number of an entry in a shortform directory.
*/
@@ -980,17 +1001,14 @@ xfs_dir2_sf_replace(
*/
if (args->inumber > XFS_DIR2_MAX_SHORT_INUM && sfp->i8count == 0) {
int error; /* error return value */
- int newsize; /* new inode size */
- newsize = dp->i_df.if_bytes + (sfp->count + 1) * XFS_INO64_DIFF;
/*
* Won't fit as shortform, convert to block then do replace.
*/
- if (newsize > XFS_IFORK_DSIZE(dp)) {
+ if (xfs_dir2_sf_replace_needblock(dp, args->inumber)) {
error = xfs_dir2_sf_to_block(args);
- if (error) {
+ if (error)
return error;
- }
return xfs_dir2_block_replace(args);
}
/*
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 7a9048c4c2f9..8990be13a16c 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3215,6 +3215,7 @@ xfs_rename(
struct xfs_trans *tp;
struct xfs_inode *wip = NULL; /* whiteout inode */
struct xfs_inode *inodes[__XFS_SORT_INODES];
+ struct xfs_buf *agibp;
int num_inodes = __XFS_SORT_INODES;
bool new_parent = (src_dp != target_dp);
bool src_is_directory = S_ISDIR(VFS_I(src_ip)->i_mode);
@@ -3379,6 +3380,22 @@ xfs_rename(
* In case there is already an entry with the same
* name at the destination directory, remove it first.
*/
+
+ /*
+ * Check whether the replace operation will need to allocate
+ * blocks. This happens when the shortform directory lacks
+ * space and we have to convert it to a block format directory.
+ * When more blocks are necessary, we must lock the AGI first
+ * to preserve locking order (AGI -> AGF).
+ */
+ if (xfs_dir2_sf_replace_needblock(target_dp, src_ip->i_ino)) {
+ error = xfs_read_agi(mp, tp,
+ XFS_INO_TO_AGNO(mp, target_ip->i_ino),
+ &agibp);
+ if (error)
+ goto out_trans_cancel;
+ }
+
error = xfs_dir_replace(tp, target_dp, target_name,
src_ip->i_ino, spaceres);
if (error)
--
2.35.1