From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Nikolay Borisov <nborisov@suse.com>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 15/18] btrfs: disk-io: introduce subpage metadata validation check
Date: Mon, 14 Dec 2020 18:50:40 +0800
Message-ID: <0741b6ef-106e-8a12-b6c4-267a3ee57b67@gmx.com>
In-Reply-To: <a0ff059f-b1d1-29fa-6d0d-2d37a5c5a5e3@suse.com>
On 2020/12/14 6:21 PM, Nikolay Borisov wrote:
>
>
> On 10.12.20 8:39, Qu Wenruo wrote:
>> For subpage metadata validation check, there are some differences:
>> - Read must finish in one bvec
>> Since we're just reading one subpage range in one page, it should
>> never be split into two bios nor two bvecs.
>>
>> - How to grab the existing eb
>> Instead of grabbing eb using page->private, we have to search the radix
>> tree as we don't have any direct pointer at hand.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> fs/btrfs/disk-io.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 82 insertions(+)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index b6c03a8b0c72..adda76895058 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -591,6 +591,84 @@ static int validate_extent_buffer(struct extent_buffer *eb)
>> return ret;
>> }
>>
>> +static int validate_subpage_buffer(struct page *page, u64 start, u64 end,
>> + int mirror)
>> +{
>> + struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb);
>> + struct extent_buffer *eb;
>> + int reads_done;
>> + int ret = 0;
>> +
>> + if (!IS_ALIGNED(start, fs_info->sectorsize) ||
>
> That's guaranteed by the allocator.
>
>> + !IS_ALIGNED(end - start + 1, fs_info->sectorsize) ||
> That's guaranteed by the fact that nodesize is a multiple of sectorsize.
>
>> + !IS_ALIGNED(end - start + 1, fs_info->nodesize)) {
>
> And that's also guaranteed, since the size of an eb is always nodesize.
> Also, aren't those checks already performed by the tree-checker during
> write? Just remove this as it adds noise.
>
>> + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>> + btrfs_err(fs_info, "invalid tree read bytenr");
>> + return -EUCLEAN;
>> + }
>> +
>> + /*
>> + * We don't allow bio merge for subpage metadata read, so we should
>> + * only get one eb for each endio hook.
>> + */
>> + ASSERT(end == start + fs_info->nodesize - 1);
>> + ASSERT(PagePrivate(page));
>> +
>> + rcu_read_lock();
>> + eb = radix_tree_lookup(&fs_info->buffer_radix,
>> + start / fs_info->sectorsize);
>
> This division op likely produces the kernel robot's warning. It could be
> written to use >> fs_info->sectorsize_bits. Furthermore, this usage of
> radix tree + rcu without acquiring the refs is unsafe, as per my
> explanation of an essentially identical issue in patch 12 and our offline
> chat about it.
Another relic I forgot in the long update history, nice find.
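
For reference, a minimal userspace sketch of the two points above, using
C11 atomics rather than kernel primitives. All demo_* names are
hypothetical and not taken from the patch; only sectorsize_bits is named
in the review. It assumes sectorsize is a power of two, so the shift
equals the division, and it shows the "take a ref only if the count is
still non-zero" step that the kernel expresses with atomic_inc_not_zero().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent buffer: only the refcount matters. */
struct demo_eb {
	atomic_int refs;
};

/*
 * Lookup index of a tree block at sector granularity.
 * Assumes sectorsize == 1 << sectorsize_bits, so this equals
 * start / sectorsize without needing a 64-bit division.
 */
static inline uint64_t demo_eb_index(uint64_t start,
				     unsigned int sectorsize_bits)
{
	return start >> sectorsize_bits;
}

/*
 * Take a reference only if the object is still alive (refs > 0).
 * Returns false when the count has already dropped to zero, in which
 * case the caller must treat the lookup as a miss.
 */
static inline bool demo_try_get_ref(struct demo_eb *eb)
{
	int old = atomic_load(&eb->refs);

	while (old > 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&eb->refs, &old, old + 1))
			return true;
	}
	return false;
}
```

In the lookup path the reference would have to be taken before leaving
the RCU read-side section, not after it as in the hunk quoted below.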
>
>> + rcu_read_unlock();
>> +
>> + /*
>> + * When we are reading one tree block, eb must have been
>> + * inserted into the radix tree. If not something is wrong.
>> + */
>> + if (!eb) {
>> + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>> + btrfs_err(fs_info,
>> + "can't find extent buffer for bytenr %llu",
>> + start);
>> + return -EUCLEAN;
>> + }
>
> That's impossible to execute and such a failure will result in a crash
> so just remove this code.
>
>> + /*
>> + * The pending IO might have been the only thing that kept
>> + * this buffer in memory. Make sure we have a ref for all
>> + * these other checks
>> + */
>> + atomic_inc(&eb->refs);
>> +
>> + reads_done = atomic_dec_and_test(&eb->io_pages);
>> + /* Subpage read must finish in page read */
>> + ASSERT(reads_done);
>
> Just ASSERT(atomic_dec_and_test(&eb->io_pages)). Again, for subpage I
> think that's a bit much since it only has 1 page so it's guaranteed that
> it will always be true.
IIRC ASSERT() won't evaluate its argument on non-debug builds.
Thus ASSERT(atomic_*) would cause a non-debug kernel to never decrease
io_pages and hang the system.
Exactly the pitfall I'm thinking of.
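
A small userspace sketch of that pitfall, with hypothetical DEMO_*
macros standing in for a disabled assertion. Whether the side effect is
lost depends entirely on how the disabled macro is defined: a variant
that discards its argument skips the decrement, while a variant that
expands to (void)(expr) still performs it. Which of the two matches the
btrfs ASSERT() needs to be checked against its actual definition; this
is not a copy of it.

```c
#include <stdatomic.h>

/* Hypothetical disabled assertion that discards its argument. */
#define DEMO_ASSERT_DROP(expr)	((void)0)

/* Hypothetical disabled assertion that still evaluates its argument. */
#define DEMO_ASSERT_EVAL(expr)	((void)(expr))

/* Side effect inside a discarding macro: the decrement never happens. */
static int demo_after_drop(int io_pages_init)
{
	atomic_int io_pages;

	atomic_init(&io_pages, io_pages_init);
	DEMO_ASSERT_DROP(atomic_fetch_sub(&io_pages, 1) == 1);
	return atomic_load(&io_pages);
}

/* Side effect inside an evaluating macro: the decrement happens. */
static int demo_after_eval(int io_pages_init)
{
	atomic_int io_pages;

	atomic_init(&io_pages, io_pages_init);
	DEMO_ASSERT_EVAL(atomic_fetch_sub(&io_pages, 1) == 1);
	return atomic_load(&io_pages);
}

/*
 * Definition-independent form, as in the patch: do the decrement in a
 * plain statement and assert only on the saved result.
 * Returns the remaining count, or -1 if the read was not done.
 */
static int demo_after_safe(int io_pages_init)
{
	atomic_int io_pages;
	int reads_done;

	atomic_init(&io_pages, io_pages_init);
	reads_done = (atomic_fetch_sub(&io_pages, 1) == 1);
	DEMO_ASSERT_DROP(reads_done);
	return reads_done ? atomic_load(&io_pages) : -1;
}
```

Keeping reads_done as a separate variable works with either macro
definition, which is why the patch is written that way.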
Thanks,
Qu
>> +
>> + eb->read_mirror = mirror;
>> + if (test_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags)) {
>> + ret = -EIO;
>> + goto err;
>> + }
>> + ret = validate_extent_buffer(eb);
>> + if (ret < 0)
>> + goto err;
>> +
>> + if (test_and_clear_bit(EXTENT_BUFFER_READAHEAD, &eb->bflags))
>> + btree_readahead_hook(eb, ret);
>> +
>> + set_extent_buffer_uptodate(eb);
>> +
>> + free_extent_buffer(eb);
>> + return ret;
>> +err:
>> + /*
>> + * our io error hook is going to dec the io pages
>> + * again, we have to make sure it has something to
>> + * decrement
>> + */
>
> That comment is slightly ambiguous - it's not the io error hook that
> does the decrement but end_bio_extent_readpage. Just rewrite the comment
> to :
>
> "end_bio_extent_readpage decrements io_pages in case of error, make sure
> it has ...."
>
>> + atomic_inc(&eb->io_pages);
>> + clear_extent_buffer_uptodate(eb);
>> + free_extent_buffer(eb);
>> + return ret;
>> +}
>> +
>> int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio,
>> struct page *page, u64 start, u64 end,
>> int mirror)
>> @@ -600,6 +678,10 @@ int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio,
>> int reads_done;
>>
>> ASSERT(page->private);
>> +
>> + if (btrfs_sb(page->mapping->host->i_sb)->sectorsize < PAGE_SIZE)
>> + return validate_subpage_buffer(page, start, end, mirror);
>
> nit: validate_metadata_buffer is called in only one place, so I'm
> wondering whether it would be more readable if this check were lifted to
> its sole caller, so that when reading end_bio_extent_readpage it's
> apparent what's going on. That said, the nesting in the caller would get
> somewhat unwieldy, so I won't be pressing hard for this.
>> +
>> eb = (struct extent_buffer *)page->private;
>>
>>
>>