From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: ethanwu <ethanwu@synology.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: add extra ending condition for indirect data backref resolution
Date: Fri, 3 Jan 2020 20:32:25 +0800
Message-ID: <f9807a7e-ffbc-7f81-47a5-861cfef3a008@gmx.com>
In-Reply-To: <af1fafa0b9e6617ab3bccbdfe908956d@synology.com>


On 2020/1/3 7:37 PM, ethanwu wrote:
> Qu Wenruo wrote on 2020-01-03 18:15:
>> On 2020/1/3 5:44 PM, ethanwu wrote:
[snip]
>> What if the backref offset already underflows?
>> Like this:
>>   item 10 key (13631488 EXTENT_ITEM 1048576) itemoff 15860 itemsize 111
>>        refs 3 gen 6 flags DATA
>>        extent data backref root FS_TREE objectid 259 offset 18446744073709547520 count 1 <<<
>>        extent data backref root FS_TREE objectid 257 offset 0 count 1
>>        extent data backref root FS_TREE objectid 258 offset 4096 count 1
>>
>>
>> Since the backref offset is not the file offset, but file_offset -
>> file_extent_item::offset, it can be a super large number for reflinked
>> extents.
>>
>>
>> Current kernel handles this by a very ugly but working hack: resetting
>> key_for_search.offset to 0 in add_prelim_ref() if it detects such case.
>>
>> Then this would screw up your check, causing unexpected early exit.
> 
> Thanks for the reminder.
> I think in this case the check won't fail. Even when we revert the
> working hack in the future, it still works unless we use u64 to do the
> calculation.
> 
> (u64) 18446744073709547520 = (s64) -4096
> 
> Suppose this very large offset is equal to X
>            The next line is the original file view.
>            [                                 ]
>            ^           ^                     ^     ......    ^
>            0           (u64)X + num_bytes    EOF             X
>       [----oooooooooooo]  Original range to check. - part is
>       ^                   the very large offset where no file extents
>       X in terms of s64   exist, so actually the range [0,X+num_bytes)
>            [oooooooooooooooo]  range to check after hack X=>0
>                             ^
>                             0 + num_bytes
> 
> With my patch, applying this hack only makes my check condition looser,
> causing a larger range to be checked (represented by o) than without
> the hack.

Ah, you're right.

Since file_extent_item::offset is always smaller than the extent size, the
backref offset can only be in the range (file_pos - extent_size, file_pos].

If we get a negative backref offset, it means file_pos -
file_extent_item::offset < 0, i.e. file_pos < file_extent_item::offset.
And since file_extent_item::offset < extent_size, file_pos < extent_size.

Thus even with the current hack the check still works: we search from
file_pos 0 and end at file_pos extent_size, which covers the
file_extent_item.

Great explanation and patch.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> 
> The only way I think this check could fail is:
> A file at offset 2^64 - 4096 uses offset 0 of a 1MB data extent, so
> key_for_search->offset + num_bytes = 2^64 - 4096 + 1048576 = 1044480
> after u64 wraparound.
> Therefore, when iterating through the leaves, we'd break early at
> offset 1044480, leaving the EXTENT_DATA key @2^64 - 4096 behind.
> But AFAIK, a file of that size is not allowed in btrfs.
> 
> Thanks,
> ethanwu
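
The overflow case quoted above can be reproduced with plain u64
arithmetic. This is only a sketch: check_end() and breaks_early() are
hypothetical names that mirror the comparison in the patch, not kernel
functions.

```c
#include <assert.h>
#include <stdint.h>

/* Exclusive end of the checked range; the sum wraps modulo 2^64. */
static uint64_t check_end(uint64_t search_offset, uint64_t num_bytes)
{
	assert(num_bytes > 0);
	return search_offset + num_bytes;
}

/* Mirrors the patch condition: stop once key_offset >= end. */
static int breaks_early(uint64_t key_offset, uint64_t search_offset,
			uint64_t num_bytes)
{
	return key_offset >= check_end(search_offset, num_bytes);
}
```

For a search offset of 2^64 - 4096 (UINT64_MAX - 4095) and num_bytes of
1048576, check_end() wraps to 1044480, so breaks_early() is true for the
very key being looked for. For ordinary offsets no wraparound happens and
the key at the search offset is not skipped.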
>>
>> I guess we have to find a new method to solve the problem then.
>>
>> Thanks,
>> Qu
>>
[...]
>>> +        if (key_for_search->type == BTRFS_EXTENT_DATA_KEY &&
>>> +            key.offset >= key_for_search->offset + num_bytes)
>>> +               break;
>>>          if (disk_byte == wanted_disk_byte) {
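
To illustrate the effect of the extra ending condition in the hunk above,
here is a simplified userspace model. The structs and scan_extents() are
hypothetical stand-ins: the scan starts at the first array element and
assumes items are sorted by file offset, and the key type test from the
patch is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an EXTENT_DATA item in a leaf. */
struct file_extent {
	uint64_t key_offset;	/* file offset, i.e. key.offset */
	uint64_t disk_byte;	/* start of the data extent on disk */
};

struct scan_result {
	size_t visited;		/* items examined before stopping */
	size_t matches;		/* items pointing at wanted_disk_byte */
};

static struct scan_result scan_extents(const struct file_extent *items,
				       size_t n, uint64_t search_offset,
				       uint64_t num_bytes,
				       uint64_t wanted_disk_byte)
{
	struct scan_result res = { 0, 0 };

	assert(items != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		/* Extra ending condition: past the range of interest. */
		if (items[i].key_offset >= search_offset + num_bytes)
			break;
		res.visited++;
		if (items[i].disk_byte == wanted_disk_byte)
			res.matches++;
	}
	return res;
}
```

With three items at file offsets 0, 1MB and 2MB and a 1MB range starting
at 0, only the first item is visited; without the break, all three would
be examined.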



