public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: kaixuxia <xiakaixu1987@gmail.com>
Cc: linux-xfs@vger.kernel.org,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Dave Chinner <david@fromorbit.com>,
	newtongao@tencent.com, jasperwang@tencent.com
Subject: Re: [PATCH V2] xfs: Fix agi&agf ABBA deadlock when performing rename with RENAME_WHITEOUT flag
Date: Mon, 19 Aug 2019 11:13:35 -0400	[thread overview]
Message-ID: <20190819151335.GB2875@bfoster> (raw)
In-Reply-To: <8eda2397-b7fb-6dd4-a448-a81628b48edc@gmail.com>

On Mon, Aug 19, 2019 at 09:06:39PM +0800, kaixuxia wrote:
> When performing rename operation with RENAME_WHITEOUT flag, we will
> hold AGF lock to allocate or free extents in manipulating the dirents
> firstly, and then doing the xfs_iunlink_remove() call last to hold
> AGI lock to modify the tmpfile info, so we the lock order AGI->AGF.
> 
> The big problem here is that we have an ordering constraint on AGF
> and AGI locking - inode allocation locks the AGI, then can allocate
> a new extent for new inodes, locking the AGF after the AGI. Hence
> the ordering that is imposed by other parts of the code is AGI before
> AGF. So we get the ABBA agi&agf deadlock here.
> 
...
> 
> In this patch we move the xfs_iunlink_remove() call to between
> xfs_dir_canenter() and xfs_dir_createname(). By doing xfs_iunlink
> _remove() firstly, we remove the AGI/AGF lock inversion problem.
> 
> Signed-off-by: kaixuxia <kaixuxia@tencent.com>
> ---
>  fs/xfs/xfs_inode.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 6467d5e..48691f2 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -3294,6 +3294,18 @@ struct xfs_iunlink {
>  			if (error)
>  				goto out_trans_cancel;
>  		}
> +
> +		/*
> +		 * Handle the whiteout inode and acquire the AGI lock, so
> +		 * fix the AGI/AGF lock inversion problem.
> +		 */

The comment could be a little more specific. For example:

"Directory entry creation may acquire the AGF. Remove the whiteout from
the unlinked list first to preserve correct AGI/AGF locking order."

> +		if (wip) {
> +			ASSERT(VFS_I(wip)->i_nlink == 0);
> +			error = xfs_iunlink_remove(tp, wip);
> +			if (error)
> +				goto out_trans_cancel;
> +		}
> +
>  		/*
>  		 * If target does not exist and the rename crosses
>  		 * directories, adjust the target directory link count
> @@ -3428,9 +3440,11 @@ struct xfs_iunlink {
>  	if (wip) {
>  		ASSERT(VFS_I(wip)->i_nlink == 0);
>  		xfs_bumplink(tp, wip);
> -		error = xfs_iunlink_remove(tp, wip);
> -		if (error)
> -			goto out_trans_cancel;
> +		if (target_ip != NULL) {
> +			error = xfs_iunlink_remove(tp, wip);
> +			if (error)
> +				goto out_trans_cancel;
> +		}

The comment above this hunk needs to be updated. I'm also not a big fan
of this factoring of doing the removal in the if branch above and then
encoding the else logic down here. It might be cleaner and more
consistent to have a call in each branch of the if/else above.

FWIW, I'm also curious if this could be cleaned up further by pulling
the -ENOSPC/-EEXIST checks out of the earlier branch, following that
with the whiteout removal, and then doing the dir_create/replace. For
example, something like:

	/* error checks before we dirty the transaction */
	if (!target_ip && !spaceres) {
		error = xfs_dir_canenter();
		...
	} else if (S_ISDIR() && !(empty || nlink > 2))
		error = -EEXIST;
		...
	}

	if (wip) {
		...
		xfs_iunlink_remove();
	}

	if (!target_ip) {
		xfs_dir_create();
		...
	} else {
		xfs_dir_replace();
		...
	}

... but that may not be any cleaner..? It could also be done as a
followup cleanup patch as well.

Brian

>  		xfs_trans_log_inode(tp, wip, XFS_ILOG_CORE);
> 
>  		/*
> -- 
> 1.8.3.1
> 
> -- 
> kaixuxia

  reply	other threads:[~2019-08-19 15:13 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-19 13:06 [PATCH V2] xfs: Fix agi&agf ABBA deadlock when performing rename with RENAME_WHITEOUT flag kaixuxia
2019-08-19 15:13 ` Brian Foster [this message]
2019-08-19 22:12   ` Dave Chinner
2019-08-20  6:45   ` kaixuxia
2019-08-20  8:07     ` Dave Chinner
2019-08-20  8:53       ` kaixuxia
2019-08-20 10:51         ` Brian Foster
2019-08-20 11:23           ` Dave Chinner
2019-08-20 12:23             ` Brian Foster
2019-08-20 22:13               ` Dave Chinner
2019-08-21 11:25                 ` Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190819151335.GB2875@bfoster \
    --to=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=jasperwang@tencent.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=newtongao@tencent.com \
    --cc=xiakaixu1987@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox