public inbox for linux-btrfs@vger.kernel.org
From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: dsterba@suse.cz, Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH RFC] btrfs: allow extent buffer helpers to skip cross-page handling
Date: Tue, 21 Nov 2023 22:14:37 +0100
Message-ID: <20231121211437.GX11264@twin.jikos.cz>
In-Reply-To: <1b63c587-c2c5-44d5-bbc3-5facc34f5361@gmx.com>

On Wed, Nov 22, 2023 at 07:07:10AM +1030, Qu Wenruo wrote:
> On 2023/11/22 02:05, David Sterba wrote:
> > On Tue, Nov 21, 2023 at 06:55:35AM +1030, Qu Wenruo wrote:
> >>>> @@ -3562,6 +3563,14 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
> > No, what I say is that alloc_pages would be the fast path and
> > optimization if there's enough memory, otherwise allocation page-by-page
> > would happen as a fallback in case of fragmentation.
> 
> That's also my understanding.
> 
> But the counter argument is still there, if you really think after some
> uptime there would be no contig pages, then the fast path will never
> trigger, and all fall back to page-by-page routine, thus defeating any
> changes to introduce any patch like this one.

Such a state is not permanent: memory management tries to coalesce the
freed pages under the costly order back into bigger chunks. On a system
under heavy load the fragmentation can become bad, and we would be ready
for that. It would take very bad luck over a long time not to be able
to get any higher order allocation at all. Process stacks depend on
contiguous 16K allocations and slabs are backed by 2x4K pages, so it
wouldn't be just us depending on that.

> > The idea is to try hard when allocating the extent buffers, with
> > fallbacks and potentially slower code but with the same guarantees as
> > now at least.
> >
> > But as it is now, alloc_pages can't be used as replacement due to how
> > the pages are obtained, find_or_create_page(). Currently I don't see a
> > way how to convince folios to allocate the full nodesize range (with a
> > given order) and then get the individual pages.
> 
> I'm doing a preparation to separate the memory allocation for metadata
> page allocation.
> 
> The current patch is to do the allocation first, then attach the new
> pages to the address space of btree inode.
> 
> By this, we can easily add new method of allocation, like trying higher
> order folio first, if failed then page by page allocation.

Right, that would work and would probably be the easiest way to switch
to folios.
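
To make the allocation order concrete, here is a rough userspace sketch,
not the actual patch. It tries one contiguous allocation first and falls
back to page-by-page allocation if that fails. The names eb_mem,
eb_mem_alloc, eb_mem_free and simulate_fragmentation are hypothetical,
aligned_alloc() stands in for the kernel allocators, and the 4K page
size and 16-page limit are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SZ   4096u	/* assumed page size */
#define MAX_PAGES 16u	/* assumed limit: 64K nodesize with 4K pages */

/* Hypothetical stand-in for the memory backing one extent buffer. */
struct eb_mem {
	void *pages[MAX_PAGES];	/* per-page pointers, filled on success */
	void *contig;		/* base of the contiguous range, NULL after fallback */
	unsigned int num_pages;
};

/* Hypothetical hook: when true, the contiguous attempt is skipped as if it failed. */
static bool simulate_fragmentation;

static void eb_mem_free(struct eb_mem *m)
{
	if (m->contig) {
		/* One allocation backs all pages. */
		free(m->contig);
	} else {
		for (unsigned int i = 0; i < m->num_pages; i++)
			free(m->pages[i]);
	}
	m->contig = NULL;
	m->num_pages = 0;
}

/* Returns 0 on success, -1 if even the page-by-page fallback fails. */
static int eb_mem_alloc(struct eb_mem *m, unsigned int num_pages)
{
	assert(num_pages > 0 && num_pages <= MAX_PAGES);
	m->contig = NULL;
	m->num_pages = 0;

	/* Fast path: a single contiguous, page-aligned range. */
	if (!simulate_fragmentation)
		m->contig = aligned_alloc(PAGE_SZ, (size_t)num_pages * PAGE_SZ);
	if (m->contig) {
		for (unsigned int i = 0; i < num_pages; i++)
			m->pages[i] = (char *)m->contig + (size_t)i * PAGE_SZ;
		m->num_pages = num_pages;
		return 0;
	}

	/* Fallback: allocate page by page; the pages may be scattered. */
	for (unsigned int i = 0; i < num_pages; i++) {
		m->pages[i] = aligned_alloc(PAGE_SZ, PAGE_SZ);
		if (!m->pages[i]) {
			eb_mem_free(m);	/* releases the pages obtained so far */
			return -1;
		}
		m->num_pages = i + 1;
	}
	return 0;
}
```

Either way the caller gets a filled pages[] array, so the existing
per-page code keeps working, and a non-NULL contig pointer tells the
helpers they can skip the cross-page handling.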

> But unfortunately there is a big problem here, even if we can forget the
> argument on whether we can get contig pages after some uptime:
> 
> - My local tests can still sometimes lockup due to some phantom locked
>    pages

I don't know how your patches do it, but it could be that the page is
not attached the same way as
find_or_create_page/pagecache_get_page/__filemap_get_folio would do.

So do you want this patch (adding the contig detection and eb::addr)
applied?
