linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 0/8] btrfs: error handling fixes
Date: Sun,  8 Dec 2024 13:20:57 +1030
Message-ID: <cover.1733624454.git.wqu@suse.com>

I believe there is a regression in the last 2 or 3 releases where the
metadata/data space reservation code no longer works properly,
causing us to hit -ENOSPC during btrfs_run_delalloc_range().

One of the most common ways to hit this problem is running
generic/750, along with other long-running generic tests.

Although I should start bisecting the space reservation bug, I cannot
help fixing the exposed bugs first.

This exposed quite a few long-standing bugs, all in the error handling
paths, that can lead to the following crashes:

- Double ordered extent accounting
  Triggers WARN_ON_ONCE() inside can_finish_ordered_extent(), then crashes.

  This bug is fixed by the first 3 patches.

- Subpage ASSERT() triggered, where the subpage folio bitmap differs from
  the folio status
  This most likely happens in submit_uncompressed_range(), where it
  unlocks the folio without updating the subpage bitmaps.

  This bug is fixed by the 3rd patch.

- WARN_ON() if the out-of-tree patch "btrfs: reject out-of-band dirty
  folios during writeback" is applied
  This is a more complex case, where error handling leaves some folios
  dirty, but with the EXTENT_DELALLOC flag cleared from the extent io
  tree.

  Such dirty folios can still be written back later, but since there is
  no EXTENT_DELALLOC flag, they will be treated as out-of-band dirty
  folios and trigger COW fixup.

  This bug is fixed by the 4th and 5th patches.
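
To illustrate the double ordered extent accounting from the first item,
here is a minimal userspace sketch. It is not the kernel code: struct
toy_ordered_extent and toy_finish_range() are hypothetical names made up
for this example, and the check only loosely mirrors the kind of
"len > bytes_left" test that a double accounting would trip.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for an ordered extent; NOT struct btrfs_ordered_extent. */
struct toy_ordered_extent {
	uint64_t file_offset;	/* start of the range covered */
	uint64_t num_bytes;	/* total length of the range */
	uint64_t bytes_left;	/* bytes not yet accounted as finished */
};

/*
 * Account [start, start + len) as finished (hypothetical helper).
 *
 * Returns 1 if the whole ordered extent is now finished, 0 if bytes are
 * still pending, and -1 if the range is outside the extent or would
 * account more bytes than are left (i.e. a double accounting).
 */
static int toy_finish_range(struct toy_ordered_extent *oe,
			    uint64_t start, uint64_t len)
{
	if (start < oe->file_offset ||
	    start + len > oe->file_offset + oe->num_bytes)
		return -1;

	if (len > oe->bytes_left) {
		/* The kernel would warn here; the toy model just reports. */
		fprintf(stderr,
			"double accounting: start=%" PRIu64 " len=%" PRIu64
			" bytes_left=%" PRIu64 "\n",
			start, len, oe->bytes_left);
		return -1;
	}

	oe->bytes_left -= len;
	return oe->bytes_left == 0;
}
```

In this model, a normal completion path that finishes both halves of an
8K extent followed by an error path that finishes the first half again
makes the third call return -1, which is the situation the first
patches are meant to avoid.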

With so many existing bugs exposed, there is more than enough motivation
to make btrfs_run_delalloc_range() (and its delalloc range functions)
output extra error messages so that at least we know something is wrong.

And those error messages have already helped a lot during my
development.

Patches 6~8 are here to enhance the error messages.
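
As a rough idea of what such an error message could carry, here is a
small userspace sketch. The helper toy_report_delalloc_error() and the
message layout are assumptions made for illustration only; the actual
messages and fields are defined by the patches themselves.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if @ret is an error, format a one-line message
 * identifying the failing function, root, inode and file range into
 * @buf. On success (@ret >= 0) the buffer is left as an empty string.
 * Always returns @ret unchanged so it can wrap an existing call site.
 */
static int toy_report_delalloc_error(char *buf, size_t size,
				     const char *func, uint64_t root,
				     uint64_t ino, uint64_t start,
				     uint64_t len, int ret)
{
	if (size == 0)
		return ret;

	if (ret >= 0) {
		buf[0] = '\0';
		return ret;
	}

	snprintf(buf, size,
		 "%s failed, root=%" PRIu64 " inode=%" PRIu64
		 " start=%" PRIu64 " len=%" PRIu64 ": %d",
		 func, root, ino, start, len, ret);
	return ret;
}
```

The point of the sketch is only that the message names the function and
the exact range, so a failure in one of the delalloc range functions is
visible without having to reproduce the crash.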

With all these patches applied, at least fstests can finish reliably.
Without them, it crashes so frequently in generic tests that I have
been unable to finish even one full run since the space reservation
regression.

Qu Wenruo (8):
  btrfs: fix double accounting race when btrfs_run_delalloc_range()
    failed
  btrfs: fix double accounting race when extent_writepage_io() failed
  btrfs: fix the error handling of submit_uncompressed_range()
  btrfs: do proper folio cleanup when cow_file_range() failed
  btrfs: do proper folio cleanup when run_delalloc_nocow() failed
  btrfs: subpage: fix the bitmap dump for the locked flags
  btrfs: subpage: dump the involved bitmap when ASSERT() failed
  btrfs: add extra error messages for delalloc range related errors

 fs/btrfs/extent_io.c |  79 ++++++++++++++----
 fs/btrfs/inode.c     | 188 +++++++++++++++++++++++++++++++------------
 fs/btrfs/subpage.c   |  48 ++++++++---
 3 files changed, 234 insertions(+), 81 deletions(-)

-- 
2.47.1

