From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 06/18] btrfs: extent_io: make attach_extent_buffer_page() to handle subpage case
Date: Mon, 21 Dec 2020 18:15:22 +0800
Message-ID: <8e25cab6-5b9f-452f-e3ef-20e48c2455e1@gmx.com>
In-Reply-To: <77af1d2b-18d6-bd7a-64cb-fce6b247e61a@gmx.com>
On 2020/12/19 8:24 AM, Qu Wenruo wrote:
>
>
> On 2020/12/18 11:41 PM, Josef Bacik wrote:
>> On 12/17/20 7:44 PM, Qu Wenruo wrote:
>>>
>>>
>>> On 2020/12/18 12:00 AM, Josef Bacik wrote:
>>>> On 12/10/20 1:38 AM, Qu Wenruo wrote:
>>>>> For subpage case, we need to allocate new memory for each metadata
>>>>> page.
>>>>>
>>>>> So we need to:
>>>>> - Allow attach_extent_buffer_page() to return int
>>>>> To indicate allocation failure
>>>>>
>>>>> - Prealloc page->private for alloc_extent_buffer()
>>>>> We don't want to call memory allocation with spinlock hold, so
>>>>> do preallocation before we acquire the spin lock.
>>>>>
>>>>> - Handle subpage and regular case differently in
>>>>> attach_extent_buffer_page()
>>>>> For regular case, just do the usual thing.
>>>>> For subpage case, allocate new memory and update the tree_block
>>>>> bitmap.
>>>>>
>>>>> The bitmap update will be handled by new subpage specific helper,
>>>>> btrfs_subpage_set_tree_block().
>>>>>
>>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>>> ---
>>>>> fs/btrfs/extent_io.c | 69
>>>>> +++++++++++++++++++++++++++++++++++---------
>>>>> fs/btrfs/subpage.h | 44 ++++++++++++++++++++++++++++
>>>>> 2 files changed, 99 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>>>>> index 6350c2687c7e..51dd7ec3c2b3 100644
>>>>> --- a/fs/btrfs/extent_io.c
>>>>> +++ b/fs/btrfs/extent_io.c
>>>>> @@ -24,6 +24,7 @@
>>>>> #include "rcu-string.h"
>>>>> #include "backref.h"
>>>>> #include "disk-io.h"
>>>>> +#include "subpage.h"
>>>>> static struct kmem_cache *extent_state_cache;
>>>>> static struct kmem_cache *extent_buffer_cache;
>>>>> @@ -3142,22 +3143,41 @@ static int submit_extent_page(unsigned int
>>>>> opf,
>>>>> return ret;
>>>>> }
>>>>> -static void attach_extent_buffer_page(struct extent_buffer *eb,
>>>>> +static int attach_extent_buffer_page(struct extent_buffer *eb,
>>>>> struct page *page)
>>>>> {
>>>>> - /*
>>>>> - * If the page is mapped to btree inode, we should hold the
>>>>> private
>>>>> - * lock to prevent race.
>>>>> - * For cloned or dummy extent buffers, their pages are not
>>>>> mapped and
>>>>> - * will not race with any other ebs.
>>>>> - */
>>>>> - if (page->mapping)
>>>>> - lockdep_assert_held(&page->mapping->private_lock);
>>>>> + struct btrfs_fs_info *fs_info = eb->fs_info;
>>>>> + int ret;
>>>>> - if (!PagePrivate(page))
>>>>> - attach_page_private(page, eb);
>>>>> - else
>>>>> - WARN_ON(page->private != (unsigned long)eb);
>>>>> + if (fs_info->sectorsize == PAGE_SIZE) {
>>>>> + /*
>>>>> + * If the page is mapped to btree inode, we should hold the
>>>>> + * private lock to prevent race.
>>>>> + * For cloned or dummy extent buffers, their pages are not
>>>>> + * mapped and will not race with any other ebs.
>>>>> + */
>>>>> + if (page->mapping)
>>>>> + lockdep_assert_held(&page->mapping->private_lock);
>>>>> +
>>>>> + if (!PagePrivate(page))
>>>>> + attach_page_private(page, eb);
>>>>> + else
>>>>> + WARN_ON(page->private != (unsigned long)eb);
>>>>> + return 0;
>>>>> + }
>>>>> +
>>>>> + /* Already mapped, just update the existing range */
>>>>> + if (PagePrivate(page))
>>>>> + goto update_bitmap;
>>>>> +
>>>>> + /* Do new allocation to attach subpage */
>>>>> + ret = btrfs_attach_subpage(fs_info, page);
>>>>> + if (ret < 0)
>>>>> + return ret;
>>>>> +
>>>>> +update_bitmap:
>>>>> + btrfs_subpage_set_tree_block(fs_info, page, eb->start, eb->len);
>>>>> + return 0;
>>>>> }
>>>>> void set_page_extent_mapped(struct page *page)
>>>>> @@ -5067,12 +5087,19 @@ struct extent_buffer
>>>>> *btrfs_clone_extent_buffer(const struct extent_buffer *src)
>>>>> return NULL;
>>>>> for (i = 0; i < num_pages; i++) {
>>>>> + int ret;
>>>>> +
>>>>> p = alloc_page(GFP_NOFS);
>>>>> if (!p) {
>>>>> btrfs_release_extent_buffer(new);
>>>>> return NULL;
>>>>> }
>>>>> - attach_extent_buffer_page(new, p);
>>>>> + ret = attach_extent_buffer_page(new, p);
>>>>> + if (ret < 0) {
>>>>> + put_page(p);
>>>>> + btrfs_release_extent_buffer(new);
>>>>> + return NULL;
>>>>> + }
>>>>> WARN_ON(PageDirty(p));
>>>>> SetPageUptodate(p);
>>>>> new->pages[i] = p;
>>>>> @@ -5321,6 +5348,18 @@ struct extent_buffer
>>>>> *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
>>>>> goto free_eb;
>>>>> }
>>>>> + /*
>>>>> + * Preallocate page->private for subpage case, so that
>>>>> + * we won't allocate memory with private_lock hold.
>>>>> + */
>>>>> + ret = btrfs_attach_subpage(fs_info, p);
>>>>> + if (ret < 0) {
>>>>> + unlock_page(p);
>>>>> + put_page(p);
>>>>> + exists = ERR_PTR(-ENOMEM);
>>>>> + goto free_eb;
>>>>> + }
>>>>> +
>>>>
>>>> This is broken, if we race with another thread adding an extent
>>>> buffer for this same range we'll overwrite the page private with the
>>>> new thing, losing any of the work that was done previously. Thanks,
>>>
>>> Firstly the page is locked, so there should be only one to grab the
>>> page.
>>>
>>> Secondly, btrfs_attach_subpage() would just exit if it detects the
>>> page is already private.
>>>
>>> So there shouldn't be a race.
>>>
>> Task1                             Task2
>> alloc_extent_buffer(4096)         alloc_extent_buffer(4096)
>> find_extent_buffer, nothing       find_extent_buffer, nothing
>> find_or_create_page(1)
>>                                   find_or_create_page(1)
>>                                     waits on page lock
>> btrfs_attach_subpage()
>> radix_tree_insert()
>> unlock pages
>>                                   exit find_or_create_page()
>>                                   btrfs_attach_subpage(), BAD
To be clearer: in the above case, btrfs_attach_subpage() in Task2 would
find that the page is already private, and thus exit without doing
anything (no extra attaching and no bitmap update).
Thus no btrfs_subpage info is overwritten.
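
A rough userspace sketch of that "attach only if not already private"
behavior follows. It is not the kernel code: struct fake_page,
struct fake_subpage and fake_attach_subpage() are hypothetical stand-ins
for struct page, struct btrfs_subpage and btrfs_attach_subpage(), and the
page lock is modeled as a plain flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct btrfs_subpage. */
struct fake_subpage {
	unsigned int tree_block_bitmap;
};

/* Hypothetical stand-in for struct page. */
struct fake_page {
	bool locked;                  /* models the page lock */
	struct fake_subpage *private; /* models page->private */
};

/*
 * Attach subpage info only if none is attached yet.
 * The caller must hold the page lock, which serializes attachers.
 * Returns 0 on success (including "already attached"), -1 on
 * allocation failure.
 */
static int fake_attach_subpage(struct fake_page *page)
{
	assert(page->locked);

	/* A later caller lands here and leaves the existing info alone. */
	if (page->private)
		return 0;

	page->private = calloc(1, sizeof(*page->private));
	if (!page->private)
		return -1;
	return 0;
}
```

In this model, a second call on the same page returns 0 without touching
the pointer or the bitmap set up by the first call.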
>>
>> there's definitely a race, again this is why the code does the check
>> to see if there's a private attached to the EB already. Thanks,
That's exactly what btrfs_attach_subpage() is doing.
Anyway, all this hassle is needed just to avoid memory allocation inside
the spinlock.
Personally, I don't see any better solution than pre-allocating
right now.
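
As a minimal sketch of the ordering in question (allocate first, then
only flip bits under the lock), assuming a simplified userspace model:
all fake_* names are hypothetical, the spinlock is modeled as a flag so
the asserts can check the ordering, and the bitmap is indexed by a plain
sector number rather than by eb->start/eb->len.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct fake_subpage {
	unsigned int tree_block_bitmap;
};

struct fake_mapping {
	bool private_lock_held;       /* models mapping->private_lock */
};

struct fake_page {
	struct fake_mapping *mapping;
	struct fake_subpage *private; /* models page->private */
};

/* Step 1: may allocate, so it must run before the lock is taken. */
static int fake_prealloc_subpage(struct fake_page *page)
{
	assert(!page->mapping->private_lock_held);
	if (page->private)
		return 0;
	page->private = calloc(1, sizeof(*page->private));
	return page->private ? 0 : -1;
}

/* Step 2: runs under the lock; only sets bits, never allocates. */
static void fake_set_tree_block(struct fake_page *page, unsigned int sector)
{
	assert(page->mapping->private_lock_held);
	assert(page->private);
	page->private->tree_block_bitmap |= 1u << sector;
}

/* Models the relevant part of alloc_extent_buffer(). */
static int fake_alloc_extent_buffer(struct fake_page *page, unsigned int sector)
{
	int ret = fake_prealloc_subpage(page);

	if (ret < 0)
		return ret;

	page->mapping->private_lock_held = true;   /* spin_lock() */
	fake_set_tree_block(page, sector);
	page->mapping->private_lock_held = false;  /* spin_unlock() */
	return 0;
}
```

Calling fake_alloc_extent_buffer() twice on the same page with different
sectors reuses the first allocation and just accumulates bits.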
Thanks,
Qu
>
> btrfs_attach_subpage() is already doing the private check.
>
> Thanks,
> Qu
>
>>
>> Josef