From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: dsterba@suse.cz, Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v5.3 00/11] btrfs: tree-checker: Write time tree checker
Date: Fri, 22 Mar 2019 07:56:37 +0800 [thread overview]
Message-ID: <ec2d0bb8-49bf-a7ec-3bc5-7c0f80adf018@gmx.com> (raw)
In-Reply-To: <20190321143406.GF26179@twin.jikos.cz>
[-- Attachment #1.1: Type: text/plain, Size: 4970 bytes --]
On 2019/3/21 下午10:34, David Sterba wrote:
> On Wed, Mar 20, 2019 at 02:27:38PM +0800, Qu Wenruo wrote:
>> Patchset can be fetched from github:
>> https://github.com/adam900710/linux/tree/write_time_tree_checker
>> Which is based on v5.1-rc1 tag.
>>
>> This patchset has the following 3 features:
>> - Tree block validation output enhancement
>> * Output validation failure timing (write time or read time)
>> * Always output tree block level/key mismatch error message
>> This part is already submitted and reviewed.
>>
>> - Write time tree block validation check
>> To catch memory corruption either from hardware or kernel.
>> Example output would be:
>>
>> BTRFS critical (device dm-3): corrupt leaf: root=2 block=1350630375424 slot=68, bad key order, prev (10510212874240 169 0) current (1714119868416 169 0)
>> BTRFS error (device dm-3): block=1350630375424 write time tree block corruption detected
>> BTRFS: error (device dm-3) in btrfs_commit_transaction:2220: errno=-5 IO failure (Error while writing out transaction)
>> BTRFS info (device dm-3): forced readonly
>> BTRFS warning (device dm-3): Skipping commit of aborted transaction.
>> BTRFS: error (device dm-3) in cleanup_transaction:1839: errno=-5 IO failure
>> BTRFS info (device dm-3): delayed_refs has NO entry
>>
>> - Better error handling before calling flush_write_bio()
>> One hidden reason of calling flush_write_bio() under all cases is,
>> flush_write_bio() will trigger endio function and endio function of
>> epd->bio will free the bio under all cases.
>> So we're in fact abusing flush_write_bio() as cleanup.
>>
>> Since now flush_write_bio() has its own return value, we shouldn't call
>> flush_write_bio() no-brain, here we introduce proper cleanup helper,
>> end_write_bio(). Now we call flush_write_bio() like:
>> New | Old
>> --------------------------------------------------------------
>> ret = do_some_evil(&epd); | ret = do_some_evil(&epd);
>> if (ret < 0) { | flush_write_bio(&epd);
>> end_write_bio(&epd, ret); | ^^^ submitting half-backed epd->bio?
>> return ret; | return ret;
>> } |
>> ret = flush_write_bio(&epd); |
>> return ret; |
>>
>> Above code should be more streamline for the error handling part.
>>
>> Changelog:
>> v2:
>> - Unlock locked pages in lock_extent_buffer_for_io() for error handling.
>> - Added Reviewed-by tags.
>>
>> v3:
>> - Remove duplicated error message.
>> - Use IS_ENABLED() macro to replace #ifdef.
>> - Added Reviewed-by tags.
>>
>> v4:
>> - Re-organized patch split
>> Now each BUG_ON() cleanup has its own patch
>> - Dig much further into the call sites to eliminate unexpected >0 return
>> May be a little paranoid and abuse some ASSERT(), but it should be
>> much safer against further code change.
>> - Fix the false alert caused by balance and memory pressure
>> The fix it skip owner checker for non-essential tree at write time.
>> Since owner root can't always be reliable, either due to commit root
>> created in current transaction or balance + memory pressure.
>>
>> v5:
>> - Do proper error-out handling other than relying on flush_write_bio()
>> to clean up.
>> This has a side effect that no Reviewed-by tags for modified patches.
>> - New comment for why we don't need to do anything about ebp->bio when
>> submit_one_bio() fails.
>> - Add some Reviewed-by tag.
>>
>> v5.1:
>> - Add "block=%llu " output for write/read time error line.
>> - Also output read time error message for fsid/start/level check.
>>
>> v5.2:
>> - Fix a missing page_unlock() in error hanlding
>>
>> v5.3:
>> - Rebase to v5.1-rc1 tag
>>
>> Qu Wenruo (11):
>> btrfs: Always output error message when key/level verification fails
>> btrfs: disk-io: Show the timing of corrupted tree block explicitly
>> btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up
>> btrfs: extent_io: Handle error better in extent_write_full_page()
>> btrfs: extent_io: Handle error better in btree_write_cache_pages()
>> btrfs: extent_io: Kill the dead branch in extent_write_cache_pages()
>> btrfs: extent_io: Handle error better in extent_write_locked_range()
>> btrfs: extent_io: Kill the BUG_ON() in lock_extent_buffer_for_io()
>> btrfs: extent_io: Kill the BUG_ON() in extent_write_cache_pages()
>> btrfs: extent_io: Handle error better in extent_writepages()
>> btrfs: Do mandatory tree block check before submitting bio
>
> All except 9 and 11 are going to misc-next after a final test. The two
> patches are only postponed, one for review and the other one due to
> conflict with another patch in misc-next.
Do I need to resend the last patch to solve the conflict by myself?
Thanks,
Qu
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2019-03-21 23:56 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-20 6:27 [PATCH v5.3 00/11] btrfs: tree-checker: Write time tree checker Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 01/11] btrfs: Always output error message when key/level verification fails Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 02/11] btrfs: disk-io: Show the timing of corrupted tree block explicitly Qu Wenruo
2019-03-29 14:11 ` David Sterba
2019-03-29 14:18 ` Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 03/11] btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up Qu Wenruo
2019-03-20 19:39 ` David Sterba
2019-03-20 6:27 ` [PATCH v5.3 04/11] btrfs: extent_io: Handle error better in extent_write_full_page() Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 05/11] btrfs: extent_io: Handle error better in btree_write_cache_pages() Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 06/11] btrfs: extent_io: Kill the dead branch in extent_write_cache_pages() Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 07/11] btrfs: extent_io: Handle error better in extent_write_locked_range() Qu Wenruo
2019-03-21 13:19 ` David Sterba
2019-03-21 13:45 ` Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 08/11] btrfs: extent_io: Kill the BUG_ON() in lock_extent_buffer_for_io() Qu Wenruo
2019-03-21 13:30 ` David Sterba
2019-03-20 6:27 ` [PATCH v5.3 09/11] btrfs: extent_io: Kill the BUG_ON() in extent_write_cache_pages() Qu Wenruo
2019-03-21 14:14 ` David Sterba
2019-03-20 6:27 ` [PATCH v5.3 10/11] btrfs: extent_io: Handle error better in extent_writepages() Qu Wenruo
2019-03-20 6:27 ` [PATCH v5.3 11/11] btrfs: Do mandatory tree block check before submitting bio Qu Wenruo
2019-04-03 9:07 ` Qu Wenruo
2019-03-21 14:34 ` [PATCH v5.3 00/11] btrfs: tree-checker: Write time tree checker David Sterba
2019-03-21 23:56 ` Qu Wenruo [this message]
2019-03-22 17:52 ` David Sterba
2019-03-25 4:32 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ec2d0bb8-49bf-a7ec-3bc5-7c0f80adf018@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=dsterba@suse.cz \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).