From: Chandan Babu R <chandanrlinux@gmail.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: don't reuse busy extents on extent trim
Date: Tue, 23 Feb 2021 11:54:32 +0530 [thread overview]
Message-ID: <87k0qzmhcv.fsf@garuda> (raw)
In-Reply-To: <20210222153442.897089-1-bfoster@redhat.com>
On 22 Feb 2021 at 21:04, Brian Foster wrote:
> Freed extents are marked busy from the point the freeing transaction
> commits until the associated CIL context is checkpointed to the log.
> This prevents reuse and overwrite of recently freed blocks before
> the changes are committed to disk, which can lead to corruption
> after a crash. The exception to this rule is that metadata
> allocation is allowed to reuse busy extents because metadata changes
> are also logged.
>
> As of commit 97d3ac75e5e0 ("xfs: exact busy extent tracking"), XFS
> has allowed modification or complete invalidation of outstanding
> busy extents for metadata allocations. This implementation assumes
> that use of the associated extent is imminent, which is not always
> the case. For example, the trimmed extent might not satisfy the
> minimum length of the allocation request, or the allocation
> algorithm might be involved in a search for the optimal result based
> on locality.
>
> generic/019 reproduces a corruption caused by this scenario. First,
> a metadata block (usually a bmbt or symlink block) is freed from an
> inode. A subsequent bmbt split on an unrelated inode attempts a near
> mode allocation request that invalidates the busy block during the
> search, but does not ultimately allocate it. Due to the busy state
> invalidation, the block is no longer considered busy to subsequent
> allocation. A direct I/O write request immediately allocates the
> block and writes to it. Finally, the filesystem crashes while in a
> state where the initial metadata block free had not committed to the
> on-disk log. After recovery, the original metadata block is in its
> original location as expected, but has been corrupted by the
> aforementioned dio.
>
> This demonstrates that it is fundamentally unsafe to modify busy
> extent state for extents that are not guaranteed to be allocated.
> This applies to pretty much all of the code paths that currently
> trim busy extents for one reason or another. Therefore to address
> this problem, drop the reuse mechanism from the busy extent trim
> path. This code already knows how to return partial non-busy ranges
> of the targeted free extent and higher level code tracks the busy
> state of the allocation attempt. If a block allocation fails where
> one or more candidate extents is busy, we force the log and retry
> the allocation.
>
The changes look good to me.
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> fs/xfs/xfs_extent_busy.c | 14 --------------
> 1 file changed, 14 deletions(-)
>
> diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
> index 3991e59cfd18..ef17c1f6db32 100644
> --- a/fs/xfs/xfs_extent_busy.c
> +++ b/fs/xfs/xfs_extent_busy.c
> @@ -344,7 +344,6 @@ xfs_extent_busy_trim(
> ASSERT(*len > 0);
>
> spin_lock(&args->pag->pagb_lock);
> -restart:
> fbno = *bno;
> flen = *len;
> rbp = args->pag->pagb_tree.rb_node;
> @@ -363,19 +362,6 @@ xfs_extent_busy_trim(
> continue;
> }
>
> - /*
> - * If this is a metadata allocation, try to reuse the busy
> - * extent instead of trimming the allocation.
> - */
> - if (!(args->datatype & XFS_ALLOC_USERDATA) &&
> - !(busyp->flags & XFS_EXTENT_BUSY_DISCARDED)) {
> - if (!xfs_extent_busy_update_extent(args->mp, args->pag,
> - busyp, fbno, flen,
> - false))
> - goto restart;
> - continue;
> - }
> -
> if (bbno <= fbno) {
> /* start overlap */
--
chandan
next prev parent reply other threads:[~2021-02-23 6:27 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-22 15:34 [PATCH] xfs: don't reuse busy extents on extent trim Brian Foster
2021-02-22 18:27 ` Darrick J. Wong
2021-02-23 12:31 ` Brian Foster
2021-02-23 18:22 ` Darrick J. Wong
2022-05-26 11:34 ` Amir Goldstein
2022-05-26 14:21 ` Brian Foster
2022-05-26 15:28 ` Amir Goldstein
2022-05-26 19:33 ` Amir Goldstein
2022-05-26 19:47 ` Brian Foster
2022-05-26 20:56 ` Amir Goldstein
2022-05-27 7:06 ` Amir Goldstein
2022-05-27 12:50 ` Brian Foster
2022-05-27 14:16 ` Amir Goldstein
2022-05-28 14:23 ` Amir Goldstein
2022-05-31 15:55 ` Brian Foster
2023-01-11 14:41 ` Gao Xiang
2023-01-11 21:06 ` Dave Chinner
2021-02-23 6:24 ` Chandan Babu R [this message]
2021-02-25 7:51 ` Christoph Hellwig
2021-02-25 18:05 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87k0qzmhcv.fsf@garuda \
--to=chandanrlinux@gmail.com \
--cc=bfoster@redhat.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox