From: Qu Wenruo <wqu@suse.com>
To: dsterba@suse.cz
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: tree-checker: add btrfs dev extent checks
Date: Wed, 14 Aug 2024 08:47:40 +0930
Message-ID: <0d1da382-8cf1-480d-941a-9e01298e466f@suse.com>
In-Reply-To: <20240813231146.GW25962@twin.jikos.cz>
On 2024/8/14 08:41, David Sterba wrote:
> On Sun, Aug 11, 2024 at 03:20:08PM +0930, Qu Wenruo wrote:
>> [REPORT]
>> There is a corruption report that btrfs refuses to mount a filesystem
>> that has overlapping dev extents:
>>
>> BTRFS error (device sdc): dev extent devid 4 physical offset
>> 14263979671552 overlap with previous dev extent end 14263980982272
>> BTRFS error (device sdc): failed to verify dev extents against chunks: -117
>> BTRFS error (device sdc): open_ctree failed
>>
>> [CAUSE]
>> The cause is very obvious: there is a bad dev extent item with an
>> incorrect length.
>> Although we cannot be 100% sure of the root cause before getting the dev
>> tree dump, I'm already surprised that we do not have any checks on the
>> dev tree.
>>
>> Currently we only do the dev-extent verification at mount time, but if the
>> corruption is caused by a memory bitflip, we really want to catch it before
>> writing the corruption to storage.
>>
>> Furthermore, dev extent items have the following key definition:
>>
>> (<device id> DEV_EXTENT <physical offset>)
>>
>> The key only records where a dev extent starts; its length is stored in
>> the item itself. Thus we cannot rely on the generic key order check alone
>> to make sure there is no overlap.
>>
>> [ENHANCEMENT]
>> Introduce dedicated dev extent checks, including:
>>
>> - Fixed member checks
>>   * chunk_tree should always be BTRFS_CHUNK_TREE_OBJECTID (3)
>>   * chunk_objectid should always be
>>     BTRFS_FIRST_CHUNK_TREE_OBJECTID (256)
>>
>> - Alignment checks
>>   * chunk_offset should be aligned to sectorsize
>>   * length should be aligned to sectorsize
>>   * key.offset should be aligned to sectorsize
>>
>> - Overlap checks
>>   If the previous key is also a dev-extent item with the same
>>   device id, make sure we do not overlap with the previous dev extent.
>>
>> Reported-by: Stefan N <stefannnau@gmail.com>
>> Link: https://lore.kernel.org/linux-btrfs/CA+W5K0rSO3koYTo=nzxxTm1-Pdu1HYgVxEpgJ=aGc7d=E8mGEg@mail.gmail.com/
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>
> Looks like we missed some simple tree item checks indeed.

The original idea was that we have btrfs_verify_dev_extents() at mount time,
so that alone is enough to reject bad dev extents, and the tree-checker
does not need dev-extent checks.

But this method doesn't prevent a bitflip from sneaking in at runtime.

So in the long run, our sanity checks should:

- Do cross-checks at mount time for critical infrastructure
  To prevent corruption from sneaking in undetected.

- Do in-leaf checks in the tree-checker
  To prevent corruption from reaching storage.

- Do extra read-time cross-checks
  Just like the dir item checks we did.
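
To make the in-leaf part concrete, here is a rough user-space sketch of the
three groups of checks from the patch description (fixed members, alignment,
overlap with the previous item). This is not the code from the patch: the
struct layouts and the check_dev_extent() helper are simplified, hypothetical
stand-ins for illustration, and only the three constant values are taken from
the on-disk format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values from the btrfs on-disk format. */
#define BTRFS_CHUNK_TREE_OBJECTID        3ULL
#define BTRFS_FIRST_CHUNK_TREE_OBJECTID  256ULL
#define BTRFS_DEV_EXTENT_KEY             204

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct key {
	uint64_t objectid;	/* device id */
	uint8_t type;		/* BTRFS_DEV_EXTENT_KEY for dev extents */
	uint64_t offset;	/* physical offset on the device */
};

struct dev_extent {
	uint64_t chunk_tree;
	uint64_t chunk_objectid;
	uint64_t chunk_offset;
	uint64_t length;
};

static bool is_aligned(uint64_t value, uint32_t sectorsize)
{
	return (value & ((uint64_t)sectorsize - 1)) == 0;
}

/*
 * Return true if the dev extent passes all in-leaf checks.
 * prev_key and prev may be NULL when there is no previous item in the leaf.
 */
static bool check_dev_extent(const struct key *key,
			     const struct dev_extent *de,
			     const struct key *prev_key,
			     const struct dev_extent *prev,
			     uint32_t sectorsize)
{
	/* The alignment helper assumes a power-of-two sectorsize. */
	assert(sectorsize && (sectorsize & (sectorsize - 1)) == 0);

	/* Fixed member checks. */
	if (de->chunk_tree != BTRFS_CHUNK_TREE_OBJECTID)
		return false;
	if (de->chunk_objectid != BTRFS_FIRST_CHUNK_TREE_OBJECTID)
		return false;

	/* Alignment checks. */
	if (!is_aligned(de->chunk_offset, sectorsize) ||
	    !is_aligned(de->length, sectorsize) ||
	    !is_aligned(key->offset, sectorsize))
		return false;

	/*
	 * Overlap check: only meaningful when the previous item is a dev
	 * extent of the same device. Key order guarantees that the previous
	 * start is not larger, but says nothing about the previous length.
	 */
	if (prev_key && prev &&
	    prev_key->type == BTRFS_DEV_EXTENT_KEY &&
	    prev_key->objectid == key->objectid &&
	    prev_key->offset + prev->length > key->offset)
		return false;

	return true;
}
```

The point of the last check is that two keys can be perfectly ordered while
the previous item's length still runs past the start of the current one, and
the generic key order check would never notice that.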
Thanks,
Qu
>
> Reviewed-by: David Sterba <dsterba@suse.com>