Re: [PATCH] btrfs: refactor alloc_extent_buffer() to allocate-then-attach method

public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed

From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>
Cc: Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: refactor alloc_extent_buffer() to allocate-then-attach method
Date: Tue, 28 Nov 2023 08:47:11 +1030	[thread overview]
Message-ID: <ec785c3e-7466-4fcd-9cd6-f9ea454a41fd@gmx.com> (raw)
In-Reply-To: <20231127162801.GE2366036@perftesting>



On 2023/11/28 02:58, Josef Bacik wrote:
> On Thu, Nov 23, 2023 at 06:30:05AM +1030, Qu Wenruo wrote:
>>
>>
>> On 2023/11/23 00:44, Josef Bacik wrote:
>>> On Wed, Nov 22, 2023 at 10:05:04AM +1030, Qu Wenruo wrote:
>>>> Currently alloc_extent_buffer() utilizes find_or_create_page() to
>>>> allocate one page a time for an extent buffer.
>>>>
>>>> This method has the following disadvantages:
>>>>
>>>> - find_or_create_page() is the legacy way of allocating new pages
>>>>     With the new folio infrastructure, find_or_create_page() is just
>>>>     redirected to filemap_get_folio().
>>>>
>>>> - Lacks the way to support higher order (order >= 1) folios
>>>>     As we can not yet let filemap to give us a higher order folio (yet).
>>>>
>>>> This patch would change the workflow by the following way:
>>>>
>>>> 		Old		   |		new
>>>> -----------------------------------+-------------------------------------
>>>>                                      | ret = btrfs_alloc_page_array();
>>>> for (i = 0; i < num_pages; i++) {  | for (i = 0; i < num_pages; i++) {
>>>>       p = find_or_create_page();     |     ret = filemap_add_folio();
>>>>       /* Attach page private */      |     /* Reuse page cache if needed */
>>>>       /* Reused eb if needed */      |
>>>> 				   |     /* Attach page private and
>>>> 				   |        reuse eb if needed */
>>>> 				   | }
>>>>
>>>> By this we split the page allocation and private attaching into two
>>>> parts, allowing future updates to each part more easily, and migrate to
>>>> folio interfaces (especially for possible higher order folios).
>>>>
>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>> ---
>>>>    fs/btrfs/extent_io.c | 173 +++++++++++++++++++++++++++++++------------
>>>>    1 file changed, 126 insertions(+), 47 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>>>> index 99cc16aed9d7..0ea65f248c15 100644
>>>> --- a/fs/btrfs/extent_io.c
>>>> +++ b/fs/btrfs/extent_io.c
>>>> @@ -3084,6 +3084,14 @@ static bool page_range_has_eb(struct btrfs_fs_info *fs_info, struct page *page)
>>>>    static void detach_extent_buffer_page(struct extent_buffer *eb, struct page *page)
>>>>    {
>>>>    	struct btrfs_fs_info *fs_info = eb->fs_info;
>>>> +	/*
>>>> +	 * We can no longer using page->mapping reliably, as some extent buffer
>>>> +	 * may not have any page mapped to btree_inode yet.
>>>> +	 * Furthermore we have to handle dummy ebs during selftests, where
>>>> +	 * btree_inode is not yet initialized.
>>>> +	 */
>>>> +	struct address_space *mapping = fs_info->btree_inode ?
>>>> +					fs_info->btree_inode->i_mapping : NULL;
>>>
>>> I don't understand this, this should only happen if we managed to get
>>> PagePrivate set on the page, and in that case page->mapping is definitely
>>> reliable.  We shouldn't be getting in here with a page that hasn't actually been
>>> attached to the extent buffer, and if we are that needs to be fixed, we don't
>>> need to be dealing with that case in this way.  Thanks,
>>
>> The problem is, with the new way of allocating pages first, then attach
>> them to filemap, we can have a case where the extent buffer has 4 pages,
>> but only the first page is attached to filemap of btree inode.
>>
>> In that case, we still want to lock the btree inode private, but
>> page->mapping is still NULL for the remaining 3 pages.
>>
>
> In this case I think we just change the error handling at the end, as we already
> have to do something special anyways, so we now would do
>
> for (int i = 0; i < attached; i++) {
> 	unlock_page(eb->pages[i]);
> 	detach_extent_buffer_page(eb->pages[i]);
> }
>
> for (int i = 0; i < num_pages; i++) {
> 	if (!eb->pages[i])
> 		break;
> 	put_page(eb->pages[i]);
> }
>
> and then we can leave detach_extent_buffer_page on its own.  Thanks,

This sounds pretty reasonable, I'll do more tests to make sure this works.

Thanks,
Qu
>
> Josef
>

next prev parent reply	other threads:[~2023-11-27 22:17 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-21 23:35 [PATCH] btrfs: refactor alloc_extent_buffer() to allocate-then-attach method Qu Wenruo
2023-11-22 14:14 ` Josef Bacik
2023-11-22 20:00   ` Qu Wenruo
2023-11-27 16:28     ` Josef Bacik
2023-11-27 22:17       ` Qu Wenruo [this message]
2023-11-22 14:38 ` David Sterba
2023-11-22 20:03   ` Qu Wenruo
2023-11-27  5:10     ` Should we still go __GFP_NOFAIL? (Was Re: [PATCH] btrfs: refactor alloc_extent_buffer() to allocate-then-attach method) Qu Wenruo
2023-11-27 16:19       ` Josef Bacik
2023-11-28 16:26       ` David Sterba
2023-11-28 20:06         ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ec785c3e-7466-4fcd-9cd6-f9ea454a41fd@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox