Re: [PATCH v2 12/18] btrfs: extent_io: implement try_release_extent_buffer() for subpage metadata support

Linux Btrfs filesystem development

From: Qu Wenruo <wqu@suse.com>
To: Nikolay Borisov <nborisov@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 12/18] btrfs: extent_io: implement try_release_extent_buffer() for subpage metadata support
Date: Fri, 11 Dec 2020 20:11:47 +0800
Message-ID: <a2732cae-4dea-744e-2eda-8b8e5f2b6710@suse.com>
In-Reply-To: <d6684ad3-875e-53f1-cf1d-a4490c35c4f9@suse.com>



On 2020/12/11 8:00 PM, Nikolay Borisov wrote:
> 
> 
> On 10.12.20 at 8:38, Qu Wenruo wrote:
>> Unlike the original try_release_extent_buffer,
>> try_release_subpage_extent_buffer() will iterate through
>> btrfs_subpage::tree_block_bitmap, and try to release each extent buffer.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/extent_io.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 73 insertions(+)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 141e414b1ab9..4d55803302e9 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -6258,10 +6258,83 @@ void memmove_extent_buffer(const struct extent_buffer *dst,
>>  	}
>>  }
>>  
>> +static int try_release_subpage_extent_buffer(struct page *page)
>> +{
>> +	struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb);
>> +	u64 page_start = page_offset(page);
>> +	int bitmap_size = BTRFS_SUBPAGE_BITMAP_SIZE;
> 
> Remove this variable and directly use BTRFS_SUBPAGE_BITMAP_SIZE as a
> terminating condition
> 
>> +	int bit_start = 0;
>> +	int ret;
>> +
>> +	while (bit_start < bitmap_size) {
> 
> You really want to iterate for a fixed number of items so switch that to
> a for loop.

The problem is that the step is not always fixed.

If the loop finds a set bit, it skips (nodesize >> sectorsize_bits) bits.

If the bit is not set, it only moves on to the next bit.

So I'm not sure a for loop is a good choice here, given the variable
step size.
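
To make the two step sizes concrete, here is a minimal user-space sketch
of the scan. It is not the kernel code: scan_tree_block_bitmap() and
BITMAP_BITS are hypothetical names, and a 16-bit bitmap (64K page, 4K
sectors) is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define BITMAP_BITS 16 /* assumption: 64K page / 4K sectorsize */

/*
 * Walk the bitmap with a variable step:
 *  - clear bit: advance by one bit
 *  - set bit:   record the start bit, then skip the whole eb
 *               (bits_per_eb models nodesize >> sectorsize_bits)
 * starts[] must have room for BITMAP_BITS entries.
 * Returns the number of extent buffers found.
 */
static int scan_tree_block_bitmap(uint16_t bitmap, int bits_per_eb,
				  int *starts)
{
	int bit_start = 0;
	int found = 0;

	assert(bits_per_eb > 0 && bits_per_eb <= BITMAP_BITS);

	while (bit_start < BITMAP_BITS) {
		if (!(bitmap & (1u << bit_start))) {
			bit_start++;
			continue;
		}
		starts[found++] = bit_start;
		bit_start += bits_per_eb;
	}
	return found;
}
```

With bits 5-8 set and bits_per_eb = 4, the scan steps one bit at a time
through bits 0-4, records bit 5, jumps to bit 9, and then steps one bit
at a time again to the end.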

> 
>> +		struct btrfs_subpage *subpage;
>> +		struct extent_buffer *eb;
>> +		unsigned long flags;
>> +		u16 tmp = 1 << bit_start;
>> +		u64 start;
>> +
>> +		/*
>> +		 * Make sure the page still has private, as previous run can
>> +		 * detach the private
>> +		 */
> 
> But if previous run has run it would have disposed of this eb and you
> won't find this page at all, no ?

By "previous run" I mean the previous iteration of the same loop.

E.g. the page has 4 bits set for a single eb (16K nodesize).

The first iteration releases the only eb in the page and clears page
private.
On the second iteration, since private is cleared, we need to break out.
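
The control flow of that example can be modelled in user space. This is
only a sketch under stated assumptions: struct fake_page,
fake_release_eb() and fake_try_release() are hypothetical stand-ins, and
no locking or refcounting is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a page carrying subpage metadata. */
struct fake_page {
	bool has_private;	/* models PagePrivate(page) */
	uint16_t bitmap;	/* models btrfs_subpage::tree_block_bitmap */
};

/* Clear the eb's bits; detach private once no eb is left in the page. */
static void fake_release_eb(struct fake_page *page, int bit_start,
			    int bits_per_eb)
{
	uint16_t mask = (uint16_t)(((1u << bits_per_eb) - 1) << bit_start);

	page->bitmap &= (uint16_t)~mask;
	if (!page->bitmap)
		page->has_private = false;
}

/* Returns 1 if private ended up cleared; *iterations counts loop passes. */
static int fake_try_release(struct fake_page *page, int bits_per_eb,
			    int *iterations)
{
	int bit_start = 0;

	*iterations = 0;
	while (bit_start < 16) {
		(*iterations)++;
		/* A previous iteration may have detached private. */
		if (!page->has_private)
			break;
		if (!(page->bitmap & (1u << bit_start))) {
			bit_start++;
			continue;
		}
		fake_release_eb(page, bit_start, bits_per_eb);
		bit_start += bits_per_eb;
	}
	return page->has_private ? 0 : 1;
}
```

For a page with bitmap 0x000F and one 16K eb, the first pass releases
the eb and the second pass hits the private check and breaks out.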

> 
>> +		spin_lock(&page->mapping->private_lock);
>> +		if (!PagePrivate(page)) {
>> +			spin_unlock(&page->mapping->private_lock);
>> +			break;
>> +		}
>> +		subpage = (struct btrfs_subpage *)page->private;
>> +		spin_unlock(&page->mapping->private_lock);
>> +
>> +		spin_lock_irqsave(&subpage->lock, flags);
>> +		if (!(tmp & subpage->tree_block_bitmap))  {
>> +			spin_unlock_irqrestore(&subpage->lock, flags);
>> +			bit_start++;
>> +			continue;
>> +		}
>> +		spin_unlock_irqrestore(&subpage->lock, flags);
>> +
>> +		start = bit_start * fs_info->sectorsize + page_start;
>> +		bit_start += fs_info->nodesize >> fs_info->sectorsize_bits;
> 
> By doing this you are really saying "skip all blocks pertaining to this
> eb". In order for this to be correct it would imply that bit_start
> should _always_ be 0,4,8,12 - am I correct? 

Nope. As long as no eb crosses a page boundary, it won't cause a problem.
So in theory we support cases like an eb spanning sectors 1~5.
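
As a rough illustration of the arithmetic, the start offset and the
"does not cross the page boundary" condition could be written as below.
The helper names eb_start_from_bit() and eb_fits_in_page() are
hypothetical; the first mirrors the patch's
"start = bit_start * fs_info->sectorsize + page_start".

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of the eb whose first sector is bit_start in the page. */
static uint64_t eb_start_from_bit(uint64_t page_start, int bit_start,
				  uint32_t sectorsize)
{
	return page_start + (uint64_t)bit_start * sectorsize;
}

/*
 * An eb starting at bit_start stays inside the page if its end does
 * not go past page_size. Alignment to nodesize is not required here.
 */
static int eb_fits_in_page(int bit_start, uint32_t nodesize,
			   uint32_t sectorsize, uint32_t page_size)
{
	assert(sectorsize > 0);
	return (uint64_t)bit_start * sectorsize + nodesize <= page_size;
}
```

With 4K sectors, 16K nodesize and a 64K page, an eb starting at bit 1 is
fine, while one starting at bit 13 would cross into the next page.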

> But what happens if
> if (!(tmp & subpage->tree_block_bitmap))  has executed and bit_start is
> now 1, then you'd make start = page_start + 4k , skip next 4(16k) blocks
> but that would be wrong, no ?

For the (!(tmp & subpage->tree_block_bitmap)) branch, isn't bit_start just
increased by one?
As I said, we check the next sector until we hit the first set bit.

Only when we hit a set bit do we increase bit_start by nodesize /
sectorsize.

> 
> Essentially the page would look like:
> 
> |0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15
> 
> So you want to release the EBs that span 0-3, 4-7, 8-11, 12-15, but
> what if bit_start becomes 1 and you add 4 to that, this offsets all
> further calculation by 1 i.e you are going into the next eb.

Nope, 4 is only added when we hit a set bit.
If we hit a zero bit, we move to the next bit instead of stepping by
nodesize >> sectorsize_bits.

That's exactly why I'm not using a for() loop here: the step size
differs between the two cases.

Thanks,
Qu
> 
> 
>> +		/*
>> +		 * Here we can't call find_extent_buffer() which will increase
>> +		 * eb->refs.
>> +		 */
>> +		rcu_read_lock();
>> +		eb = radix_tree_lookup(&fs_info->buffer_radix,
>> +				start >> fs_info->sectorsize_bits);
>> +		rcu_read_unlock();
>> +		ASSERT(eb);
>> +		spin_lock(&eb->refs_lock);
>> +		if (atomic_read(&eb->refs) != 1 || extent_buffer_under_io(eb) ||
>> +		    !test_and_clear_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags)) {
>> +			spin_unlock(&eb->refs_lock);
>> +			continue;
>> +		}
>> +		/*
>> +		 * Here we don't care the return value, we will always check
>> +		 * the page private at the end.
>> +		 * And release_extent_buffer() will release the refs_lock.
>> +		 */
>> +		release_extent_buffer(eb);
>> +	}
>> +	/* Finally to check if we have cleared page private */
>> +	spin_lock(&page->mapping->private_lock);
>> +	if (!PagePrivate(page))
>> +		ret = 1;
>> +	else
>> +		ret = 0;
>> +	spin_unlock(&page->mapping->private_lock);
>> +	return ret;
>> +
>> +}
>> +
>>  int try_release_extent_buffer(struct page *page)
>>  {
>>  	struct extent_buffer *eb;
>>  
>> +	if (btrfs_sb(page->mapping->host->i_sb)->sectorsize < PAGE_SIZE)
>> +		return try_release_subpage_extent_buffer(page);
>> +
>>  	/*
>>  	 * We need to make sure nobody is attaching this page to an eb right
>>  	 * now.
>>
>


Thread overview: 67+ messages
2020-12-10  6:38 [PATCH v2 00/18] btrfs: add read-only support for subpage sector size Qu Wenruo
2020-12-10  6:38 ` [PATCH v2 01/18] btrfs: extent_io: rename @offset parameter to @disk_bytenr for submit_extent_page() Qu Wenruo
2020-12-17 15:44   ` Josef Bacik
2020-12-10  6:38 ` [PATCH v2 02/18] btrfs: extent_io: refactor __extent_writepage_io() to improve readability Qu Wenruo
2020-12-10 12:12   ` Nikolay Borisov
2020-12-10 12:53     ` Qu Wenruo
2020-12-10 12:58       ` Nikolay Borisov
2020-12-17 15:43   ` Josef Bacik
2020-12-10  6:38 ` [PATCH v2 03/18] btrfs: file: update comment for btrfs_dirty_pages() Qu Wenruo
2020-12-10 12:16   ` Nikolay Borisov
2020-12-10  6:38 ` [PATCH v2 04/18] btrfs: extent_io: introduce a helper to grab an existing extent buffer from a page Qu Wenruo
2020-12-10 13:51   ` Nikolay Borisov
2020-12-17 15:50   ` Josef Bacik
2020-12-10  6:38 ` [PATCH v2 05/18] btrfs: extent_io: introduce the skeleton of btrfs_subpage structure Qu Wenruo
2020-12-17 15:52   ` Josef Bacik
2020-12-10  6:38 ` [PATCH v2 06/18] btrfs: extent_io: make attach_extent_buffer_page() to handle subpage case Qu Wenruo
2020-12-10 15:30   ` Nikolay Borisov
2020-12-17  6:48     ` Qu Wenruo
2020-12-10 16:09   ` Nikolay Borisov
2020-12-17 16:00   ` Josef Bacik
2020-12-18  0:44     ` Qu Wenruo
2020-12-18 15:41       ` Josef Bacik
2020-12-19  0:24         ` Qu Wenruo
2020-12-21 10:15           ` Qu Wenruo
2020-12-10  6:38 ` [PATCH v2 07/18] btrfs: extent_io: make grab_extent_buffer_from_page() " Qu Wenruo
2020-12-10 15:39   ` Nikolay Borisov
2020-12-17  6:55     ` Qu Wenruo
2020-12-17 16:02   ` Josef Bacik
2020-12-18  0:49     ` Qu Wenruo
2020-12-10  6:38 ` [PATCH v2 08/18] btrfs: extent_io: support subpage for extent buffer page release Qu Wenruo
2020-12-10 16:13   ` Nikolay Borisov
2020-12-10  6:38 ` [PATCH v2 09/18] btrfs: subpage: introduce helper for subpage uptodate status Qu Wenruo
2020-12-11 10:10   ` Nikolay Borisov
2020-12-11 10:48     ` Qu Wenruo
2020-12-11 11:41       ` Nikolay Borisov
2020-12-11 11:56         ` Qu Wenruo
2020-12-10  6:38 ` [PATCH v2 10/18] btrfs: subpage: introduce helper for subpage error status Qu Wenruo
2020-12-10  6:38 ` [PATCH v2 11/18] btrfs: extent_io: make set/clear_extent_buffer_uptodate() to support subpage size Qu Wenruo
2020-12-10  6:38 ` [PATCH v2 12/18] btrfs: extent_io: implement try_release_extent_buffer() for subpage metadata support Qu Wenruo
2020-12-11 12:00   ` Nikolay Borisov
2020-12-11 12:11     ` Qu Wenruo [this message]
2020-12-11 16:57       ` Nikolay Borisov
2020-12-12  1:28         ` Qu Wenruo
2020-12-12  9:26           ` Nikolay Borisov
2020-12-12 10:26             ` Qu Wenruo
2020-12-12  5:44         ` Qu Wenruo
2020-12-12 10:30           ` Nikolay Borisov
2020-12-12 10:31             ` Qu Wenruo
2020-12-10  6:39 ` [PATCH v2 13/18] btrfs: extent_io: introduce read_extent_buffer_subpage() Qu Wenruo
2020-12-10  6:39 ` [PATCH v2 14/18] btrfs: extent_io: make endio_readpage_update_page_status() to handle subpage case Qu Wenruo
2020-12-14  9:57   ` Nikolay Borisov
2020-12-14 10:46     ` Qu Wenruo
2020-12-10  6:39 ` [PATCH v2 15/18] btrfs: disk-io: introduce subpage metadata validation check Qu Wenruo
2020-12-10 13:24   ` kernel test robot
2020-12-10 13:39   ` kernel test robot
2020-12-14 10:21   ` Nikolay Borisov
2020-12-14 10:50     ` Qu Wenruo
2020-12-14 11:17       ` Nikolay Borisov
2020-12-14 11:32         ` Qu Wenruo
2020-12-14 12:40           ` Nikolay Borisov
2020-12-10  6:39 ` [PATCH v2 16/18] btrfs: introduce btrfs_subpage for data inodes Qu Wenruo
2020-12-10  9:44   ` kernel test robot
2020-12-11  0:43   ` kernel test robot
2020-12-14 12:46   ` Nikolay Borisov
2020-12-10  6:39 ` [PATCH v2 17/18] btrfs: integrate page status update for read path into begin/end_page_read() Qu Wenruo
2020-12-14 13:59   ` Nikolay Borisov
2020-12-10  6:39 ` [PATCH v2 18/18] btrfs: allow RO mount of 4K sector size fs on 64K page system Qu Wenruo
