From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: dsterba@suse.cz, Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH RFC] btrfs: allow extent buffer helpers to skip cross-page handling
Date: Tue, 21 Nov 2023 16:35:29 +0100 [thread overview]
Message-ID: <20231121153529.GS11264@twin.jikos.cz> (raw)
In-Reply-To: <a73faeae-1925-4894-9512-7a049ff8353b@gmx.com>
On Tue, Nov 21, 2023 at 06:55:35AM +1030, Qu Wenruo wrote:
> >> @@ -3562,6 +3563,14 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
> >>
> >> WARN_ON(btrfs_page_test_dirty(fs_info, p, eb->start, eb->len));
> >> eb->pages[i] = p;
> >> +
> >> + /*
> >> + * Check if the current page is physically contiguous with previous eb
> >> + * page.
> >> + */
> >> + if (i && eb->pages[i - 1] + 1 != p)
> >> + page_contig = false;
> >
> > This hasn't been fixed from last time, this has almost zero chance to
> > succeed once the system is up for some time.
>
> I have the same counter argument as the last time.
>
> If so, transparent huge page should not work, thus I strongly doubt
> about above statement.
>
> > Page addresses returned
> > from allocator are random. What I was suggesting is to use alloc_page()
> > with the given order (16K pages are 2).
>
> Nope, this patch is not intended to do that at all.
>
> This is just the preparation for the incoming changes.
> In fact alloc_page() with order needs more than those changes, it would
> only come after all the preparation, including:
>
> - Change how we allocate pages for eb
> It has to go allocation in one-go, then attaching those pages to
> filemap.
>
> - Extra changes to how concurrent eb allocation
>
> - Folio flags related changes
> Remember a lot of folio flags are not applied to all its pages.
>
> >
> > This works for all eb sizes we need, the prolematic one could be for 64K
> > because this is order 4 and PAGE_ALLOC_COSTLY_ORDER is 3, so this would
> > cost more on the MM side. But who uses 64K node size on x8_64.
>
> As long as you still want per-page allocation as fallback, this patch
> itself is still required.
>
> All the higher order allocation is only going to be an optimization or
> fast path.
>
> Furthermore, I found this suggestion is conflicting with your previous
> statement on contig pages.
> If you say the system can no longer provides contig pages after some
> uptime, then all above higher order page allocation should all fail.
No, what I say is that alloc_pages would be the fast path and
optimization if there's enough memory, otherwise allocation page-by-page
would happen as a fallback in case of fragmentation.
The idea is to try hard when allocating the extent buffers, with
fallbacks and potentially slower code but with the same guarantees as
now at least.
But as it is now, alloc_pages can't be used as replacement due to how
the pages are obtained, find_or_create_page(). Currently I don't see a
way how to convince folios to allocate the full nodesize range (with a
given order) and then get the individual pages.
Roughly like that in alloc_extent_buffer():
+ fgp_flags = fgf_set_order(fs_info->nodesize);
+ fgp_flags |= FGP_LOCK | FGP_ACCESSED | FGP_CREAT;
+ eb_folio = __filemap_get_folio(mapping, index, fgp_flags, GFP_NOFS | __GFP_NOFAIL);
for (i = 0; i < num_pages; i++, index++) {
- p = find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL);
+ p = folio_file_page(eb_folio, index);
fgf_set_order() does not guarantee the order, it's just a hint so I
guess we'll have to do full conversion to folios.
As an intermediate step we can take your patch with the contig
heuristic.
next prev parent reply other threads:[~2023-11-21 15:42 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-16 5:19 [PATCH RFC] btrfs: allow extent buffer helpers to skip cross-page handling Qu Wenruo
2023-11-20 17:00 ` David Sterba
2023-11-20 20:25 ` Qu Wenruo
2023-11-21 15:35 ` David Sterba [this message]
2023-11-21 20:37 ` Qu Wenruo
2023-11-21 21:14 ` David Sterba
2023-11-21 21:30 ` Qu Wenruo
2023-11-22 13:23 ` David Sterba
2023-11-22 13:34 ` David Sterba
2023-11-22 13:46 ` David Sterba
2023-11-22 20:01 ` Qu Wenruo
2023-11-22 22:05 ` David Sterba
2023-11-23 18:50 ` David Sterba
2023-11-23 20:51 ` Qu Wenruo
2023-11-24 16:03 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231121153529.GS11264@twin.jikos.cz \
--to=dsterba@suse.cz \
--cc=linux-btrfs@vger.kernel.org \
--cc=quwenruo.btrfs@gmx.com \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox