Linux Btrfs filesystem development
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Nikolay Borisov <nborisov@suse.com>, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org, David Sterba <dsterba@suse.cz>
Subject: Re: [PATCH v2 06/18] btrfs: extent_io: make attach_extent_buffer_page() to handle subpage case
Date: Thu, 17 Dec 2020 14:48:53 +0800
Message-ID: <419d9a59-4f3a-eaf0-a312-5985b4421704@gmx.com>
In-Reply-To: <4dd63414-5e74-77d1-723b-6fb61ffca5fb@suse.com>



On 2020/12/10 11:30 PM, Nikolay Borisov wrote:
>
>
> On 10.12.20 at 8:38, Qu Wenruo wrote:
>> For subpage case, we need to allocate new memory for each metadata page.
>>
>> So we need to:
>> - Allow attach_extent_buffer_page() to return int
>>    To indicate allocation failure
>>
>> - Prealloc page->private for alloc_extent_buffer()
>>    We don't want to allocate memory with the spinlock held, so
>>    do the preallocation before we acquire the spinlock.
>>
>> - Handle subpage and regular case differently in
>>    attach_extent_buffer_page()
>>    For regular case, just do the usual thing.
>>    For subpage case, allocate new memory and update the tree_block
>>    bitmap.
>>
>>    The bitmap update will be handled by new subpage specific helper,
>>    btrfs_subpage_set_tree_block().
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/extent_io.c | 69 +++++++++++++++++++++++++++++++++++---------
>>   fs/btrfs/subpage.h   | 44 ++++++++++++++++++++++++++++
>>   2 files changed, 99 insertions(+), 14 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 6350c2687c7e..51dd7ec3c2b3 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -24,6 +24,7 @@
>>   #include "rcu-string.h"
>>   #include "backref.h"
>>   #include "disk-io.h"
>> +#include "subpage.h"
>>
>>   static struct kmem_cache *extent_state_cache;
>>   static struct kmem_cache *extent_buffer_cache;
>> @@ -3142,22 +3143,41 @@ static int submit_extent_page(unsigned int opf,
>>   	return ret;
>>   }
>>
>> -static void attach_extent_buffer_page(struct extent_buffer *eb,
>> +static int attach_extent_buffer_page(struct extent_buffer *eb,
>>   				      struct page *page)
>>   {
>> -	/*
>> -	 * If the page is mapped to btree inode, we should hold the private
>> -	 * lock to prevent race.
>> -	 * For cloned or dummy extent buffers, their pages are not mapped and
>> -	 * will not race with any other ebs.
>> -	 */
>> -	if (page->mapping)
>> -		lockdep_assert_held(&page->mapping->private_lock);
>> +	struct btrfs_fs_info *fs_info = eb->fs_info;
>> +	int ret;
>>
>> -	if (!PagePrivate(page))
>> -		attach_page_private(page, eb);
>> -	else
>> -		WARN_ON(page->private != (unsigned long)eb);
>> +	if (fs_info->sectorsize == PAGE_SIZE) {
>> +		/*
>> +		 * If the page is mapped to btree inode, we should hold the
>> +		 * private lock to prevent race.
>> +		 * For cloned or dummy extent buffers, their pages are not
>> +		 * mapped and will not race with any other ebs.
>> +		 */
>> +		if (page->mapping)
>> +			lockdep_assert_held(&page->mapping->private_lock);
>> +
>> +		if (!PagePrivate(page))
>> +			attach_page_private(page, eb);
>> +		else
>> +			WARN_ON(page->private != (unsigned long)eb);
>> +		return 0;
>> +	}
>> +
>> +	/* Already mapped, just update the existing range */
>> +	if (PagePrivate(page))
>> +		goto update_bitmap;
>
> How can this check ever be false? btrfs_attach_subpage() is called
> unconditionally in alloc_extent_buffer() so that you can avoid
> allocating memory with the private lock held, yet in this function you
> check whether memory has been allocated and, if not, proceed to
> allocate it. Also, that memory allocation is done with GFP_NOFS under a
> spinlock. That's not atomic, i.e. IO can still be kicked off, which
> means you can go to sleep while holding a spinlock. Not cool.

There are two callers of attach_extent_buffer_page().

The first is alloc_extent_buffer(), which pre-allocates page::private
before calling attach_extent_buffer_page().

That pre-allocation happens outside the spinlock, so
attach_extent_buffer_page() does no memory allocation at all for that
call site.

The second is btrfs_clone_extent_buffer(), which does need a proper
memory allocation inside attach_extent_buffer_page().
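
To make the two call paths concrete, here is a rough user-space sketch.
It is not the kernel code: every fake_* name is made up for
illustration, a plain bool stands in for the private spinlock, and
calloc() stands in for the sleeping allocation. It also assumes the
caller has exclusive access to the page while pre-allocating.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page, for illustration only. */
struct fake_page {
	bool lock_held;			/* models mapping->private_lock */
	unsigned short *private;	/* models page->private (the bitmap) */
};

static int nr_allocs;	/* counts allocations so the paths can be compared */

/* Models btrfs_attach_subpage(): may sleep, so never call it locked. */
static int fake_attach_subpage(struct fake_page *page)
{
	assert(!page->lock_held);
	page->private = calloc(1, sizeof(*page->private));
	if (!page->private)
		return -ENOMEM;
	nr_allocs++;
	return 0;
}

/* Models the subpage branch of attach_extent_buffer_page(). */
static int fake_attach_eb_page(struct fake_page *page, unsigned short bits)
{
	if (!page->private) {
		/* Only reached when the caller did not pre-allocate. */
		int ret = fake_attach_subpage(page);

		if (ret < 0)
			return ret;
	}
	*page->private |= bits;		/* the "update_bitmap" step */
	return 0;
}

/* Models alloc_extent_buffer(): pre-allocate first, then lock. */
static int fake_alloc_eb(struct fake_page *page, unsigned short bits)
{
	int ret;

	if (!page->private) {
		ret = fake_attach_subpage(page);	/* lock not held yet */
		if (ret < 0)
			return ret;
	}
	page->lock_held = true;
	ret = fake_attach_eb_page(page, bits);		/* bitmap update only */
	page->lock_held = false;
	return ret;
}

/* Models btrfs_clone_extent_buffer(): no lock, allocation in attach. */
static int fake_clone_eb(struct fake_page *page, unsigned short bits)
{
	return fake_attach_eb_page(page, bits);
}
```

In this model the assert in fake_attach_subpage() would fire if the
allocation were ever reached with the lock held, which is exactly the
situation the pre-allocation is meant to rule out.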

>
>> +
>> +	/* Do new allocation to attach subpage */
>> +	ret = btrfs_attach_subpage(fs_info, page);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +update_bitmap:
>> +	btrfs_subpage_set_tree_block(fs_info, page, eb->start, eb->len);
>> +	return 0;
>
> Those are really 2 functions, demarcated by the if. Given that
> attach_extent_buffer_page() is called in only 2 places, can't you
> open-code the if (fs_info->sectorsize == PAGE_SIZE) check in the
> callers and define 2 functions:
>
> 1 for subpage blocksize and the other one for the old code?

I tried that. It looks much worse than the current code, especially
since we would need to add one more indent level in
btrfs_clone_extent_buffer().

>
>>   }
>>
>
> <snip>
>
>> diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h
>> index 96f3b226913e..c2ce603e7848 100644
>> --- a/fs/btrfs/subpage.h
>> +++ b/fs/btrfs/subpage.h
>> @@ -23,9 +23,53 @@
>>   struct btrfs_subpage {
>>   	/* Common members for both data and metadata pages */
>>   	spinlock_t lock;
>> +	union {
>> +		/* Structures only used by metadata */
>> +		struct {
>> +			u16 tree_block_bitmap;
>> +		};
>> +		/* structures only used by data */
>> +	};
>>   };
>>
>>   int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page);
>>   void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page);
>>
>> +/*
>> + * Convert the [start, start + len) range into a u16 bitmap
>> + *
>> + * E.g. if start == page_offset() + 16K, len = 16K, we get 0x00f0.
>> + */
>> +static inline u16 btrfs_subpage_calc_bitmap(struct btrfs_fs_info *fs_info,
>> +			struct page *page, u64 start, u32 len)
>> +{
>> +	int bit_start = (start - page_offset(page)) >> fs_info->sectorsize_bits;
>> +	int nbits = len >> fs_info->sectorsize_bits;
>> +
>> +	/* Basic checks */
>> +	ASSERT(PagePrivate(page) && page->private);
>> +	ASSERT(IS_ALIGNED(start, fs_info->sectorsize) &&
>> +	       IS_ALIGNED(len, fs_info->sectorsize));
>
> Separate the alignment checks, so if they fail it's evident which one failed.

I guess we are starting to forget when ASSERT() should be used.
It's for conditions which shouldn't fail.

It's not meant as a less-terrible BUG_ON(), but to document what is
expected. So I don't really expect it to be triggered, nor would it
matter whether it's two lines or one.

What's your opinion on this, David?

>
>> +	ASSERT(page_offset(page) <= start &&
>> +	       start + len <= page_offset(page) + PAGE_SIZE);
>
> ditto. Also instead of checking 'page_offset(page) <= start' you can
> simply check 'bit_start is >= 0' as that's what you ultimately care about.

ASSERT() usage aside, a check written in terms of start + len and
page_offset() is much easier to grasp, as there is no need to refer to
bit_start.
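
For reference, the calculation and the checks being discussed can be
tried out in user space with something like the sketch below. The
SKETCH_* constants and sketch_calc_bitmap() are made-up names, and it
hard-codes a 64K page with 4K sectors as an assumption. The range
checks are written against the page start, the way the patch does it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry for this sketch: 64K pages, 4K sectors. */
#define SKETCH_PAGE_SIZE	65536ULL
#define SKETCH_SECTORSIZE_BITS	12
#define SKETCH_SECTORSIZE	(1ULL << SKETCH_SECTORSIZE_BITS)

/*
 * Convert [start, start + len) into a 16-bit bitmap, one bit per sector.
 * page_start plays the role of page_offset(page).
 */
static uint16_t sketch_calc_bitmap(uint64_t page_start, uint64_t start,
				   uint32_t len)
{
	unsigned int bit_start;
	unsigned int nbits;

	/* Expectations that should never fail, one condition per assert. */
	assert(start % SKETCH_SECTORSIZE == 0);
	assert(len % SKETCH_SECTORSIZE == 0);
	assert(page_start <= start);
	assert(start + len <= page_start + SKETCH_PAGE_SIZE);

	bit_start = (start - page_start) >> SKETCH_SECTORSIZE_BITS;
	nbits = len >> SKETCH_SECTORSIZE_BITS;

	/*
	 * nbits can be 16, so do the shift in unsigned long (at least
	 * 32 bits wide) and only then truncate to 16 bits.
	 */
	return (uint16_t)(((1UL << nbits) - 1) << bit_start);
}
```

With these assumptions, a 16K range starting 16K into the page gives
0x00f0, matching the example in the patch comment, and a range covering
the whole page gives 0xffff.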

Thanks,
Qu

>
>> +	/*
>> +	 * Here nbits can be 16, thus can go beyond u16 range. Here we make the
>> +	 * first left shift to be calculated in unsigned long (u32), then
>> +	 * truncate the result to u16.
>> +	 */
>> +	return (u16)(((1UL << nbits) - 1) << bit_start);
>> +}
>> +
>> +static inline void btrfs_subpage_set_tree_block(struct btrfs_fs_info *fs_info,
>> +			struct page *page, u64 start, u32 len)
>> +{
>> +	struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private;
>> +	unsigned long flags;
>> +	u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len);
>> +
>> +	spin_lock_irqsave(&subpage->lock, flags);
>> +	subpage->tree_block_bitmap |= tmp;
>> +	spin_unlock_irqrestore(&subpage->lock, flags);
>> +}
>> +
>>   #endif /* BTRFS_SUBPAGE_H */
>>
>
