From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v5.3 00/11] btrfs: tree-checker: Write time tree checker
Date: Wed, 20 Mar 2019 14:27:38 +0800 [thread overview]
Message-ID: <20190320062749.30953-1-wqu@suse.com> (raw)
Patchset can be fetched from github:
https://github.com/adam900710/linux/tree/write_time_tree_checker
Which is based on v5.1-rc1 tag.
This patchset has the following 3 features:
- Tree block validation output enhancement
* Output validation failure timing (write time or read time)
* Always output tree block level/key mismatch error message
This part is already submitted and reviewed.
- Write time tree block validation check
To catch memory corruption either from hardware or kernel.
Example output would be:
BTRFS critical (device dm-3): corrupt leaf: root=2 block=1350630375424 slot=68, bad key order, prev (10510212874240 169 0) current (1714119868416 169 0)
BTRFS error (device dm-3): block=1350630375424 write time tree block corruption detected
BTRFS: error (device dm-3) in btrfs_commit_transaction:2220: errno=-5 IO failure (Error while writing out transaction)
BTRFS info (device dm-3): forced readonly
BTRFS warning (device dm-3): Skipping commit of aborted transaction.
BTRFS: error (device dm-3) in cleanup_transaction:1839: errno=-5 IO failure
BTRFS info (device dm-3): delayed_refs has NO entry
- Better error handling before calling flush_write_bio()
One hidden reason of calling flush_write_bio() under all cases is,
flush_write_bio() will trigger endio function and endio function of
epd->bio will free the bio under all cases.
So we're in fact abusing flush_write_bio() as cleanup.
Since now flush_write_bio() has its own return value, we shouldn't call
flush_write_bio() no-brain, here we introduce proper cleanup helper,
end_write_bio(). Now we call flush_write_bio() like:
New | Old
--------------------------------------------------------------
ret = do_some_evil(&epd); | ret = do_some_evil(&epd);
if (ret < 0) { | flush_write_bio(&epd);
end_write_bio(&epd, ret); | ^^^ submitting half-backed epd->bio?
return ret; | return ret;
} |
ret = flush_write_bio(&epd); |
return ret; |
Above code should be more streamline for the error handling part.
Changelog:
v2:
- Unlock locked pages in lock_extent_buffer_for_io() for error handling.
- Added Reviewed-by tags.
v3:
- Remove duplicated error message.
- Use IS_ENABLED() macro to replace #ifdef.
- Added Reviewed-by tags.
v4:
- Re-organized patch split
Now each BUG_ON() cleanup has its own patch
- Dig much further into the call sites to eliminate unexpected >0 return
May be a little paranoid and abuse some ASSERT(), but it should be
much safer against further code change.
- Fix the false alert caused by balance and memory pressure
The fix it skip owner checker for non-essential tree at write time.
Since owner root can't always be reliable, either due to commit root
created in current transaction or balance + memory pressure.
v5:
- Do proper error-out handling other than relying on flush_write_bio()
to clean up.
This has a side effect that no Reviewed-by tags for modified patches.
- New comment for why we don't need to do anything about ebp->bio when
submit_one_bio() fails.
- Add some Reviewed-by tag.
v5.1:
- Add "block=%llu " output for write/read time error line.
- Also output read time error message for fsid/start/level check.
v5.2:
- Fix a missing page_unlock() in error hanlding
v5.3:
- Rebase to v5.1-rc1 tag
Qu Wenruo (11):
btrfs: Always output error message when key/level verification fails
btrfs: disk-io: Show the timing of corrupted tree block explicitly
btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up
btrfs: extent_io: Handle error better in extent_write_full_page()
btrfs: extent_io: Handle error better in btree_write_cache_pages()
btrfs: extent_io: Kill the dead branch in extent_write_cache_pages()
btrfs: extent_io: Handle error better in extent_write_locked_range()
btrfs: extent_io: Kill the BUG_ON() in lock_extent_buffer_for_io()
btrfs: extent_io: Kill the BUG_ON() in extent_write_cache_pages()
btrfs: extent_io: Handle error better in extent_writepages()
btrfs: Do mandatory tree block check before submitting bio
fs/btrfs/disk-io.c | 24 ++++++---
fs/btrfs/extent_io.c | 109 +++++++++++++++++++++++++++++++++-------
fs/btrfs/tree-checker.c | 24 +++++++--
fs/btrfs/tree-checker.h | 8 +++
4 files changed, 137 insertions(+), 28 deletions(-)
--
2.21.0
next reply other threads:[~2019-03-20 6:27 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-20 6:27 Qu Wenruo [this message]
2019-03-20 6:27 ` [PATCH v5.3 01/11] btrfs: Always output error message when key/level verification fails Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 02/11] btrfs: disk-io: Show the timing of corrupted tree block explicitly Qu Wenruo
2019-03-29 14:11 ` David Sterba
2019-03-29 14:18 ` Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 03/11] btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up Qu Wenruo
2019-03-20 19:39 ` David Sterba
2019-03-20 6:27 ` [PATCH v5.3 04/11] btrfs: extent_io: Handle error better in extent_write_full_page() Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 05/11] btrfs: extent_io: Handle error better in btree_write_cache_pages() Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 06/11] btrfs: extent_io: Kill the dead branch in extent_write_cache_pages() Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 07/11] btrfs: extent_io: Handle error better in extent_write_locked_range() Qu Wenruo
2019-03-21 13:19 ` David Sterba
2019-03-21 13:45 ` Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 08/11] btrfs: extent_io: Kill the BUG_ON() in lock_extent_buffer_for_io() Qu Wenruo
2019-03-21 13:30 ` David Sterba
2019-03-20 6:27 ` [PATCH v5.3 09/11] btrfs: extent_io: Kill the BUG_ON() in extent_write_cache_pages() Qu Wenruo
2019-03-21 14:14 ` David Sterba
2019-03-20 6:27 ` [PATCH v5.3 10/11] btrfs: extent_io: Handle error better in extent_writepages() Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 11/11] btrfs: Do mandatory tree block check before submitting bio Qu Wenruo
2019-04-03 9:07 ` Qu Wenruo
2019-03-21 14:34 ` [PATCH v5.3 00/11] btrfs: tree-checker: Write time tree checker David Sterba
2019-03-21 23:56 ` Qu Wenruo
2019-03-22 17:52 ` David Sterba
2019-03-25 4:32 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190320062749.30953-1-wqu@suse.com \
--to=wqu@suse.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).