From: Boris Burkov <boris@bur.io>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: don't call end_extent_writepage() in __extent_writepage() when IO failed
Date: Wed, 8 Sep 2021 10:03:41 -0700 [thread overview]
Message-ID: <YTjstM7duaGeAgwK@zen> (raw)
In-Reply-To: <20210803055348.170042-1-wqu@suse.com>
On Tue, Aug 03, 2021 at 01:53:48PM +0800, Qu Wenruo wrote:
> [BUG]
> When running generic/475 with 64K page size and 4K sectorsize (aka
> subpage), it can trigger the following BUG_ON() inside
> btrfs_csum_one_bio(), the possibility is around 1/20 ~ 1/5:
>
> bio_for_each_segment(bvec, bio, iter) {
> if (!contig)
> offset = page_offset(bvec.bv_page) + bvec.bv_offset;
>
> if (!ordered) {
> ordered = btrfs_lookup_ordered_extent(inode, offset);
> BUG_ON(!ordered); /* Logic error */ <<<<
> }
>
> nr_sectors = BTRFS_BYTES_TO_BLKS(fs_info,
>
> [CAUSE]
> Test case generic/475 uses dm-errors to emulate IO failure.
>
> Here if we have a page cache which has the following delalloc range:
>
> 0 32K 64K
> |/////| |////| |
> \- [0, 4K) \- [32K, 36K)
>
> And then __extent_writepage() can go through the following race:
>
> T1 (writeback) | T2 (endio)
> --------------------------------+----------------------------------
> __extent_writepage() |
> |- writepage_delalloc() |
> | |- run_delalloc_range() |
> | | Add OE for [0, 4K) |
> | |- run_delalloc_range() |
> | Add OE for [32K, 36K) |
> | |
> |- __extent_writepage_io() |
> | |- submit_extent_page() |
> | | |- Assemble the bio for |
> | | range [0, 4K) |
> | |- submit_extent_page() |
> | | |- Submit the bio for |
> | | | range [0, 4K) |
> | | | | end_bio_extent_writepage()
> | | | | |- error = -EIO;
> | | | | |- end_extent_writepage( error=-EIO);
> | | | | |- writepage_endio_finish_ordered()
> | | | | | Remove OE for range [0, 4K)
> | | | | |- btrfs_page_set_error()
> | |- submit_extent_page() |
> | |- Assemble the bio for |
> | range [32K, 36K) |
> |- if (PageError(page)) |
> |- end_extent_writepage() |
> |- endio_finish_ordered() |
> Remove OE [32K, 36K) |
> |
> Submit bio for [32K, 36K) |
> |- btrfs_csum_one_bio() |
> |- BUG_ON(!ordered_extent) |
> OE [32K, 36K) is already |
> removed. |
>
> This can only happen for subpage case, as for regular sectorsize, we
> never submit current page, thus IO error will never mark the current
> page Error.
>
> [FIX]
> Just remove the end_extent_writepage() call and the if (PageError())
> check.
>
> As mentioned, the end_extent_writepage() never really get executed for
> regular sectorsize, and could cause above BUG_ON() for subpage.
I was a little surprised to see this assertion, because it begs the
question: "why was this call added in the first place?"
As best as I can tell, it was introduced by Filipe in
"Btrfs: fix hang on error (such as ENOSPC) when writing extent pages"
That looks like a reasonably niche case that might not be covered by
xfstests, so I was wondering if you had already convinced yourself that
it no longer applies.
I'll try to see if I can reproduce his issue with this patch, or if the
code has changed by enough that it no longer reproduces.
>
> This also means, inside __extent_writepage() we should not bother any IO
> failure, but only focus on the error hit during bio assembly and
> submission.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/extent_io.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index e665779c046d..a1a6ac787faf 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -4111,8 +4111,8 @@ static int __extent_writepage(struct page *page, struct writeback_control *wbc,
> * Here we used to have a check for PageError() and then set @ret and
> * call end_extent_writepage().
> *
> - * But in fact setting @ret here will cause different error paths
> - * between subpage and regular sectorsize.
> + * But in fact setting @ret and call end_extent_writepage() here will
> + * cause different error paths between subpage and regular sectorsize.
> *
> * For regular page size, we never submit current page, but only add
> * current page to current bio.
> @@ -4124,7 +4124,12 @@ static int __extent_writepage(struct page *page, struct writeback_control *wbc,
> * thus can get PageError() set by submitted bio of the same page,
> * while our @ret is still 0.
> *
> - * So here we unify the behavior and don't set @ret.
> + * The same is also for end_extent_writepage(), which can finish
> + * ordered extent before submitting the real bio, causing
> + * BUG_ON() in btrfs_csum_one_bio().
> + *
> + * So here we unify the behavior and don't set @ret nor call
> + * end_extent_writepage().
> * Error can still be properly passed to higher layer as page will
> * be set error, here we just don't handle the IO failure.
> *
> @@ -4138,8 +4143,7 @@ static int __extent_writepage(struct page *page, struct writeback_control *wbc,
> * Currently the full page based __extent_writepage_io() is not
> * capable of that.
> */
> - if (PageError(page))
> - end_extent_writepage(page, ret, start, page_end);
> +
> unlock_page(page);
> ASSERT(ret <= 0);
> return ret;
> --
> 2.32.0
>
next prev parent reply other threads:[~2021-09-08 17:03 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-03 5:53 [PATCH] btrfs: don't call end_extent_writepage() in __extent_writepage() when IO failed Qu Wenruo
2021-09-08 17:03 ` Boris Burkov [this message]
2021-09-08 22:36 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YTjstM7duaGeAgwK@zen \
--to=boris@bur.io \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.