Re: [PATCH v2 3/4] btrfs: Add sanity check for EXTENT_DATA when reading out leaf

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: dsterba@suse.cz, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 3/4] btrfs: Add sanity check for EXTENT_DATA when reading out leaf
Date: Tue, 26 Sep 2017 20:42:53 +0800	[thread overview]
Message-ID: <64782ff3-143c-f400-b791-6ec7f3adfb19@gmx.com> (raw)
In-Reply-To: <20170926120501.GR31640@twin.jikos.cz>



On 2017年09月26日 20:05, David Sterba wrote:
> On Tue, Sep 26, 2017 at 08:28:25AM +0800, Qu Wenruo wrote:
>>
>>
>> On 2017年09月25日 23:45, David Sterba wrote:
>>> On Wed, Aug 23, 2017 at 04:57:58PM +0900, Qu Wenruo wrote:
>>>> Add extra checker for item with EXTENT_DATA type.
>>>> This checks the following thing:
>>>> 0) Key offset
>>>>      All key offset must be aligned to sectorsize.
>>>>      Inline extent must have 0 for key offset.
>>>>
>>>> 1) Item size
>>>>      Plain text inline file extent size must match item size.
>>>
>>> 'plain text' seems to be a bit misleading, I don't think we've ever
>>> referred to uncompressed extent as such, although it makes some sense. I
>>> think 'uncompressed' would work too.
>>
>> I'll use 'uncompressed' instead.
> 
> I've applied an fixed that in the changelog, as there were some other
> changes needed due to other patch removing the BTRFS_COMPRESSION_LAST.

Checked the commit 0826e7faa895f5463e4790082392cdaaff98d8d8, which uses 
BTRFS_FILE_EXTENT_TYPES and doesn't increase but using the last value.

Looks very good to me.

> 
>>>>      (compressed inline file extent has no info about its on-disk size)
>>>>      Regular/preallocated file extent size must be a fixed value.
>>>>
>>>> 2) Every member of regular file extent item
>>>>      Including alignment for bytenr and offset, possible value for
>>>>      compression/encryption/type.
>>>>
>>>> 3) Type/compression/encode must be one of the valid values.
>>>>
>>>> This should be the most comprehensive and restrict check in the context
>>>> of btrfs_item for EXTENT_DATA.
>>>>
>>>> Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
>>>> ---
>>>>    fs/btrfs/disk-io.c              | 108 ++++++++++++++++++++++++++++++++++++++++
>>>>    include/uapi/linux/btrfs_tree.h |   1 +
>>>>    2 files changed, 109 insertions(+)
>>>>
>>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>>>> index e034d08bd036..b92296c6a698 100644
>>>> --- a/fs/btrfs/disk-io.c
>>>> +++ b/fs/btrfs/disk-io.c
>>>> @@ -549,6 +549,103 @@ static int check_tree_block_fsid(struct btrfs_fs_info *fs_info,
>>>>    		   btrfs_header_level(eb) == 0 ? "leaf" : "node",	\
>>>>    		   reason, btrfs_header_bytenr(eb), root->objectid, slot)
>>>>    
>>>> +static int check_extent_data_item(struct btrfs_root *root,
>>>> +				  struct extent_buffer *leaf,
>>>> +				  struct btrfs_key *key, int slot)
>>>> +{
>>>> +	struct btrfs_file_extent_item *fi;
>>>> +	u32 sectorsize = root->fs_info->sectorsize;
>>>> +	u32 item_size = btrfs_item_size_nr(leaf, slot);
>>>> +
>>>> +	if (!IS_ALIGNED(key->offset, sectorsize)) {
>>>> +		CORRUPT("unaligned key offset for file extent",
>>>
>>> The CORRUPT macro does not print any details beyond what it gets from
>>> the parameters, so here we'd like to know which extent it is and what's
>>> the size. The sectorsize can be found elsewhere so it does not need
>>> to be printed.
>>
>> Did you mean despite bytenr of the tree block and root objectid, we
>> should output more info about the key?
> 
> Possibly yes, but rather more about the item. If the key objectid/offset
> are eg. some structural value like the extent offset etc.

Understood.

So in short, the reporter should do:
1) Report the wrong value
2) Report expected value (or value range)
3) Using meaningful name other than key values
4) Report extra meaningful values if they passed their checker
5) Checker order must follow member order

And following above principle, using wrong file_extent_type as example,
it should report like:
---
root=512 ino=768 file_offset=4096 invalid file_extent_type, have 0x4, 
expected range [0, 2]
---

What about this principle?

(Although I think it's a little long, especially when extra fs_uuid 
appended)

<snip>
>>
>> Please note that, this condition is only for regular/prealloc file
>> extent items, so ram_bytes should be sectorsize aligned.
> 
> I can't find the tree dump and code confirms that the value should be
> aligned.

Strange.

I just did a 6K write with compression, it produced such dump-tree result:
---
         item 4 key (257 INODE_ITEM 0) itemoff 15881 itemsize 160
                 generation 7 transid 7 size 6144 nbytes 8192 <<<
                 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
                 sequence 0 flags 0x2(none)
                 <snip>
         item 5 key (257 INODE_REF 256) itemoff 15866 itemsize 15
                 index 2 namelen 5 name: file1
         item 6 key (257 EXTENT_DATA 0) itemoff 15813 itemsize 53
                 generation 7 type 1 (regular)
                 extent data disk byte 12845056 nr 4096 <<<
                 extent data offset 0 nr 8192 ram 8192 <<<
                 extent compression 1 (zlib)
---

Since the compression is all page based, I didn't think the ram_bytes 
can be page unaligned.

Anyway, I can remove the ram_bytes check until we get a better 
understanding of its definition.
(And better record it in wiki for later developers)

All these checker must be fully reviewed and get agreement from all 
reviewers.
Or it will cause tons of error report from end users.
So I'm pretty OK to delay the merge 1 or 2 cycles.



And if above error report principles are OK for you, I think I'll need 
to create new report mechanisms for these patches.
And all of these patches may get a big update.
(Of course, I'll use function other than macro unless I need 
stringification)

So I'm afraid you may need to re-apply all of them then.
Sorry for the inconvience in advance.

Thanks,
Qu

next prev parent reply	other threads:[~2017-09-26 12:43 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-23  7:57 [PATCH v2 0/3] Introduce comprehensive sanity check framework and Qu Wenruo
2017-08-23  7:57 ` [PATCH v2 1/4] btrfs: Refactor check_leaf function for later expansion Qu Wenruo
2017-08-23  7:57 ` [PATCH v2 2/4] btrfs: Check if item pointer overlap with item itself Qu Wenruo
2017-08-23  7:57 ` [PATCH v2 3/4] btrfs: Add sanity check for EXTENT_DATA when reading out leaf Qu Wenruo
2017-09-25 15:45   ` David Sterba
2017-09-26  0:28     ` Qu Wenruo
2017-09-26  0:46       ` Qu Wenruo
2017-09-26 12:13         ` David Sterba
2017-09-26 12:49           ` Qu Wenruo
2017-09-26 14:42             ` David Sterba
2017-09-26 12:05       ` David Sterba
2017-09-26 12:42         ` Qu Wenruo [this message]
2017-08-23  7:57 ` [PATCH v2 4/4] btrfs: Add checker for EXTENT_CSUM Qu Wenruo
2017-08-23 15:19 ` [PATCH v2 0/3] Introduce comprehensive sanity check framework and Nikolay Borisov
2017-09-25 15:17 ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=64782ff3-143c-f400-b791-6ec7f3adfb19@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).