From: Brian Foster <bfoster@redhat.com>
To: kaixuxia <xiakaixu1987@gmail.com>
Cc: linux-xfs@vger.kernel.org, darrick.wong@oracle.com,
newtongao@tencent.com, jasperwang@tencent.com
Subject: Re: [PATCH RFC] xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename()
Date: Thu, 31 Oct 2019 08:27:01 -0400
Message-ID: <20191031122701.GB54006@bfoster>
In-Reply-To: <1572428974-8657-1-git-send-email-kaixuxia@tencent.com>

On Wed, Oct 30, 2019 at 05:49:34PM +0800, kaixuxia wrote:
> When target_ip exists in xfs_rename(), the xfs_dir_replace() call may
> need to hold the AGF lock to allocate more blocks, and the subsequent
> xfs_droplink() call then takes the AGI lock to put target_ip on the
> unlinked list, so we get the lock order AGF->AGI. This breaks the
> ordering constraint on AGI and AGF locking: inode allocation locks
> the AGI, then can allocate a new extent for new inodes, locking the
> AGF after the AGI.
>
> In this patch we first check whether the replace operation needs more
> blocks. If so, we acquire the AGI lock first to preserve the
> locking order (AGI->AGF). Note that the locking order problem only
> occurs when we are locking the AGI and AGF of the same AG. For multiple
> AGs the AGI lock will be released after the transaction is committed.
>
> Signed-off-by: kaixuxia <kaixuxia@tencent.com>
> ---
> fs/xfs/libxfs/xfs_dir2.c | 30 ++++++++++++++++++++++++++++++
> fs/xfs/libxfs/xfs_dir2.h | 2 ++
> fs/xfs/xfs_inode.c | 14 ++++++++++++++
> 3 files changed, 46 insertions(+)
>
> diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> index 867c5de..9d9ae16 100644
> --- a/fs/xfs/libxfs/xfs_dir2.c
> +++ b/fs/xfs/libxfs/xfs_dir2.c
> @@ -463,6 +463,36 @@
> }
>
> /*
> + * Check whether the replace operation need more blocks. Ignore
> + * the parameters check since the real replace() call below will
> + * do that.
> + */
> +bool
> +xfs_dir_replace_needblock(
> +	struct xfs_inode *dp,
> +	xfs_ino_t inum)
> +{
> +	int newsize;
> +	xfs_dir2_sf_hdr_t *sfp;
> +
> +	/*
> +	 * Only convert the shortform directory to block form maybe need
> +	 * more blocks.
> +	 */
> +	if (dp->i_d.di_format != XFS_DINODE_FMT_LOCAL)
> +		return false;
> +
> +	sfp = (xfs_dir2_sf_hdr_t *)dp->i_df.if_u1.if_data;
> +	newsize = dp->i_df.if_bytes + (sfp->count + 1) * XFS_INO64_DIFF;
> +
> +	if (inum > XFS_DIR2_MAX_SHORT_INUM &&
> +	    sfp->i8count == 0 && newsize > XFS_IFORK_DSIZE(dp))
> +		return true;
> +	else
> +		return false;
> +}
> +

It's slightly unfortunate we need to do this kind of double check, but
it seems reasonable enough as an isolated fix. From a factoring
standpoint, it might be a little cleaner to move this down into
xfs_dir2_sf.c as an xfs_dir2_sf_replace_needblock() helper, use it in
xfs_dir2_sf_replace() where these checks are currently open coded, and
then export it so the higher level function can call it as well for the
locking fix.
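
A minimal, untested sketch of that factoring, as a standalone userspace
model rather than kernel code. Everything in it is a stand-in: struct
sf_dir, struct trace, take(), sf_replace() and rename_over_target() are
hypothetical names, and the two constants only mirror what the quoted
patch checks. The point is that one helper holds the "does replace need
a block?" test, and both the low-level replace and the rename-level
caller use it, so the caller can take the AGI before the AGF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel definitions; names and values are illustrative. */
#define MAX_SHORT_INUM	0xffffffffULL	/* largest inum that fits in 4 bytes */
#define INO64_DIFF	4		/* growth per entry, 4 -> 8 byte inums */

/* Hypothetical, flattened view of a shortform directory. */
struct sf_dir {
	bool	is_local;	/* directory data lives inline in the inode */
	int	count;		/* number of entries */
	int	i8count;	/* entries already stored with 8-byte inums */
	int	bytes;		/* current size of the inline data */
	int	fork_size;	/* inline space available in the inode */
};

/* Single copy of the "would replace force a conversion to block form?" test. */
static bool sf_replace_needblock(const struct sf_dir *dp, uint64_t inum)
{
	int newsize;

	if (!dp->is_local)
		return false;
	newsize = dp->bytes + (dp->count + 1) * INO64_DIFF;
	return inum > MAX_SHORT_INUM && dp->i8count == 0 &&
	       newsize > dp->fork_size;
}

enum lock { LOCK_AGI, LOCK_AGF };

#define MAX_TRACE 4
struct trace {
	enum lock	order[MAX_TRACE];	/* locks in acquisition order */
	int		n;
	unsigned	held;			/* bitmask of locks already held */
};

/* Record a lock acquisition once; re-taking a held lock is a no-op here. */
static void take(struct trace *t, enum lock l)
{
	if (t->held & (1u << l))
		return;
	assert(t->n < MAX_TRACE);
	t->held |= 1u << l;
	t->order[t->n++] = l;
}

/* Low-level replace: reuses the helper instead of open coding the check. */
static void sf_replace(struct trace *t, const struct sf_dir *dp, uint64_t inum)
{
	if (sf_replace_needblock(dp, inum))
		take(t, LOCK_AGF);	/* modelled block allocation */
}

/* High-level caller: the same helper decides whether to take the AGI first. */
static void rename_over_target(struct trace *t, const struct sf_dir *dp,
			       uint64_t inum)
{
	if (sf_replace_needblock(dp, inum))
		take(t, LOCK_AGI);	/* preserve AGI -> AGF ordering */
	sf_replace(t, dp, inum);
	take(t, LOCK_AGI);		/* modelled droplink of the target */
}
```

With a directory that would overflow its inline space and an inode
number above the 4-byte limit, the recorded order in this model is AGI
then AGF; when no block is needed, only the AGI is recorded.
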
Brian
> +/*
> * Replace the inode number of a directory entry.
> */
> int
> diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
> index f542447..e436c14 100644
> --- a/fs/xfs/libxfs/xfs_dir2.h
> +++ b/fs/xfs/libxfs/xfs_dir2.h
> @@ -124,6 +124,8 @@ extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
> extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
> struct xfs_name *name, xfs_ino_t ino,
> xfs_extlen_t tot);
> +extern bool xfs_dir_replace_needblock(struct xfs_inode *dp,
> + xfs_ino_t inum);
> extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
> struct xfs_name *name, xfs_ino_t inum,
> xfs_extlen_t tot);
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 18f4b26..c239070 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -3196,6 +3196,7 @@ struct xfs_iunlink {
>  	struct xfs_trans *tp;
>  	struct xfs_inode *wip = NULL; /* whiteout inode */
>  	struct xfs_inode *inodes[__XFS_SORT_INODES];
> +	struct xfs_buf *agibp;
>  	int num_inodes = __XFS_SORT_INODES;
>  	bool new_parent = (src_dp != target_dp);
>  	bool src_is_directory = S_ISDIR(VFS_I(src_ip)->i_mode);
> @@ -3361,6 +3362,19 @@ struct xfs_iunlink {
>  		 * In case there is already an entry with the same
>  		 * name at the destination directory, remove it first.
>  		 */
> +
> +		/*
> +		 * Check whether the replace operation need more blocks.
> +		 * If so, acquire the agi lock firstly to preserve locking
> +		 * order(AGI/AGF).
> +		 */
> +		if (xfs_dir_replace_needblock(target_dp, src_ip->i_ino)) {
> +			error = xfs_read_agi(mp, tp,
> +					XFS_INO_TO_AGNO(mp, target_ip->i_ino), &agibp);
> +			if (error)
> +				goto out_trans_cancel;
> +		}
> +
>  		error = xfs_dir_replace(tp, target_dp, target_name,
>  					src_ip->i_ino, spaceres);
>  		if (error)
> --
> 1.8.3.1
>