Re: [PATCH] xfs: don't reuse busy extents on extent trim

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

From: Chandan Babu R <chandanrlinux@gmail.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: don't reuse busy extents on extent trim
Date: Tue, 23 Feb 2021 11:54:32 +0530	[thread overview]
Message-ID: <87k0qzmhcv.fsf@garuda> (raw)
In-Reply-To: <20210222153442.897089-1-bfoster@redhat.com>

On 22 Feb 2021 at 21:04, Brian Foster wrote:
> Freed extents are marked busy from the point the freeing transaction
> commits until the associated CIL context is checkpointed to the log.
> This prevents reuse and overwrite of recently freed blocks before
> the changes are committed to disk, which can lead to corruption
> after a crash. The exception to this rule is that metadata
> allocation is allowed to reuse busy extents because metadata changes
> are also logged.
>
> As of commit 97d3ac75e5e0 ("xfs: exact busy extent tracking"), XFS
> has allowed modification or complete invalidation of outstanding
> busy extents for metadata allocations. This implementation assumes
> that use of the associated extent is imminent, which is not always
> the case. For example, the trimmed extent might not satisfy the
> minimum length of the allocation request, or the allocation
> algorithm might be involved in a search for the optimal result based
> on locality.
>
> generic/019 reproduces a corruption caused by this scenario. First,
> a metadata block (usually a bmbt or symlink block) is freed from an
> inode. A subsequent bmbt split on an unrelated inode attempts a near
> mode allocation request that invalidates the busy block during the
> search, but does not ultimately allocate it. Due to the busy state
> invalidation, the block is no longer considered busy to subsequent
> allocation. A direct I/O write request immediately allocates the
> block and writes to it. Finally, the filesystem crashes while in a
> state where the initial metadata block free had not committed to the
> on-disk log. After recovery, the original metadata block is in its
> original location as expected, but has been corrupted by the
> aforementioned dio.
>
> This demonstrates that it is fundamentally unsafe to modify busy
> extent state for extents that are not guaranteed to be allocated.
> This applies to pretty much all of the code paths that currently
> trim busy extents for one reason or another. Therefore to address
> this problem, drop the reuse mechanism from the busy extent trim
> path. This code already knows how to return partial non-busy ranges
> of the targeted free extent and higher level code tracks the busy
> state of the allocation attempt. If a block allocation fails where
> one or more candidate extents is busy, we force the log and retry
> the allocation.
>

The changes look good to me.

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>

> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_extent_busy.c | 14 --------------
>  1 file changed, 14 deletions(-)
>
> diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
> index 3991e59cfd18..ef17c1f6db32 100644
> --- a/fs/xfs/xfs_extent_busy.c
> +++ b/fs/xfs/xfs_extent_busy.c
> @@ -344,7 +344,6 @@ xfs_extent_busy_trim(
>  	ASSERT(*len > 0);
>  
>  	spin_lock(&args->pag->pagb_lock);
> -restart:
>  	fbno = *bno;
>  	flen = *len;
>  	rbp = args->pag->pagb_tree.rb_node;
> @@ -363,19 +362,6 @@ xfs_extent_busy_trim(
>  			continue;
>  		}
>  
> -		/*
> -		 * If this is a metadata allocation, try to reuse the busy
> -		 * extent instead of trimming the allocation.
> -		 */
> -		if (!(args->datatype & XFS_ALLOC_USERDATA) &&
> -		    !(busyp->flags & XFS_EXTENT_BUSY_DISCARDED)) {
> -			if (!xfs_extent_busy_update_extent(args->mp, args->pag,
> -							  busyp, fbno, flen,
> -							  false))
> -				goto restart;
> -			continue;
> -		}
> -
>  		if (bbno <= fbno) {
>  			/* start overlap */


-- 
chandan

next prev parent reply	other threads:[~2021-02-23  6:27 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-22 15:34 [PATCH] xfs: don't reuse busy extents on extent trim Brian Foster
2021-02-22 18:27 ` Darrick J. Wong
2021-02-23 12:31   ` Brian Foster
2021-02-23 18:22     ` Darrick J. Wong
2022-05-26 11:34     ` Amir Goldstein
2022-05-26 14:21       ` Brian Foster
2022-05-26 15:28         ` Amir Goldstein
2022-05-26 19:33           ` Amir Goldstein
2022-05-26 19:47           ` Brian Foster
2022-05-26 20:56             ` Amir Goldstein
2022-05-27  7:06               ` Amir Goldstein
2022-05-27 12:50                 ` Brian Foster
2022-05-27 14:16                   ` Amir Goldstein
2022-05-28 14:23             ` Amir Goldstein
2022-05-31 15:55               ` Brian Foster
2023-01-11 14:41       ` Gao Xiang
2023-01-11 21:06         ` Dave Chinner
2021-02-23  6:24 ` Chandan Babu R [this message]
2021-02-25  7:51 ` Christoph Hellwig
2021-02-25 18:05   ` Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k0qzmhcv.fsf@garuda \
    --to=chandanrlinux@gmail.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox