From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Zi Yan <ziy@nvidia.com>
Cc: Muhammad Usama Anjum <usama.anjum@arm.com>,
Ryan.Roberts@arm.com, Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Uladzislau Rezki <urezki@gmail.com>,
Nick Terrell <terrelln@fb.com>, David Sterba <dsterba@suse.com>,
"Vishal Moola (Oracle)" <vishal.moola@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
bpf@vger.kernel.org, david.hildenbrand@arm.com
Subject: Re: [PATCH v2 1/3] mm/page_alloc: Optimize free_contig_range()
Date: Tue, 17 Mar 2026 19:48:15 +0100 [thread overview]
Message-ID: <3883abcd-426d-4211-ae76-b5601ee728de@kernel.org> (raw)
In-Reply-To: <2AE5B748-C57B-4BC1-BF9E-6299CCBF295B@nvidia.com>
On 3/17/26 16:17, Zi Yan wrote:
> On 16 Mar 2026, at 12:19, Vlastimil Babka (SUSE) wrote:
>
>> On 3/16/26 17:02, Zi Yan wrote:
>>> On 16 Mar 2026, at 11:21, Vlastimil Babka wrote:
>>>
>>>>> +/*
>>>>> + * free_pages_prepare() has already been called for page(s) being freed.
>>>>> + * TODO: Perform per-subpage free_pages_prepare() checks for order > 0 pages
>>>>> + * (HWPoison, PageNetpp, bad free page).
>>>>> + */
>>>>
>>>> I'm confused, and reading the v1 thread didn't help either. Where would the
>>>> subpages to check come from? AFAICS we start from order-0 pages always.
>>>> __free_contig_range calls free_pages_prepare on every page with order 0
>>>> unconditionally, so we check every page as an order-0 page. If we then free
>>>> the bunch of individually checked pages as a high-order page, there's no
>>>> reason to check those subpages again, no? Am I missing something?
>>>
>>> There are two kinds of order > 0 pages, compound and not compound.
>>> free_pages_prepare() checks all tail pages of a compound order > 0 pages too.
>>> For non compound ones, free_pages_prepare() only has free_page_is_bad()
>>> check on tail ones.
>>>
>>> So my guess is that the TODO is to check all subpages on a non compound
>>> order > 0 one in the same manner. This is based on the assumption that
>>
>> OK but:
>>
>> 1) Why put that TODO specifically on FPI_PREPARED definition, which is for
>> the case we skip the prepare/check?
>> 2) Why add it in this series which AFAICS doesn't handle non-compound
>> order>0 anywhere.
>> 3) We'd better work on eliminating the non-compound order>0 usages
>> altogether, rather than work on support them better.
>
> I agreed with you when I first saw this. After I think about it again,
> the issue might not be directly related to the allocation but is the free path.
> Like the patch title said, it is an optimization of free contiguous pages.
> These physically contiguous pages happen to come from alloc non-compound order>0
> and this leads to this optimization.
Sure and this use-case doesn't need the TODO to be solved, or am I mistaken?
That TODO seems to be about a hypothetical other use case with order>0
non-compound pages. Because AFAICS the use-cases in this series are not
about order>0 non-compound pages. Maybe they exist for a brief moment
between allocation and split_page() (in vmalloc() case?), but when we are
freeing them, we start with a contiguous series of order-0 pages (refcounted
or not).
So by my definition we are not freeing an order>0 non-compound page. By
"freeing order>0 non-compound page" I mean specifically what ___free_pages()
is handling in the "else if (!head) {" path, which I'd love to get rid of.
That TODO to me seems like about supporting that case.
> The problem they want to solve is to speed up page free path by freeing
> a group of pages together. They are optimizing for a special situation
> where a group of pages that are physically contiguous, so that these pages
> can be freed via free_pages(page, order /* > 0 */). If we take away
I don't think we want that as that leads to the case I described above. It
assumes head is refcounted and tail are not. I'd rather not overload it with
a case where we have contiguous order-0 pages where each is refcounted (or
none are). Yeah we can optimize the freeing like this series does, but I'd
not do it via something like "free_pages(page, order /* > 0 */)"
> the allocation of non-compound order>0, like you suggested in 3, we basically
I suggested we'd take it away in the sense of not producing order>0 where
head is refcounted, tails are not, and it's not a compound page. I'd rather
have an API that applies split_page() before and returns it as order-0
refcounted pages, but not the intermediate order>0 non-compound anymore.
> remove the optimization opportunity from them. I am not sure that what
> people want.
>
> To think about the problem broadly, how can we optimize free_page_bulk(),
> if that exists? Sorting input pages based on PFNs, so that we can them in
> high orders instead of individual order-0s. This patch basically says,
> hey, the group of pages we are freeing are all contiguous, since that is
> how we allocate them, freeing them as a whole is much quicker than freeing
> them individually.
Yes we can have generalized, perhaps stacked support for the cases used by
the converted callers in this series, but not using a generic API that would
try e.g. sorting pfns even when we know they are already sorted. That means:
- given as contiguous range, frozen (patch 3)
- given as contiguous range, not frozen (patch 1)
- probably contiguous, needs checking, given as array of pages (patch 2)
>>
>>> all non compound order > 0 page users use split_page() after the allocation,
>>> treat each page individually, and free them back altogether. But I am not
>>> sure if this is true for all users allocating non compound order > 0 pages.
>>
>> Maybe as part of the elimination (point 3 above) we should combine the
>> allocation+split so it's never the first without the second anymore.
I elaborated on this above.
>>> And free_pages_prepare_bulk() might be a better name for such functions.
>>>
>>> The above confusion is also a reason I asked Ryan to try adding a unsplit_page()
>>> function to fuse back non compound order > 0 pages and free the fused one
>>> as we are currently doing. But that looks like a pain to implment. Maybe an
>>
>> Yeah not sure it's worth it either.
>>
>>> alternative to this FPI_PREPARED is to add FPI_FREE_BULK and loop through all
>>> subpages if FPI_FREE_BULK is set with
>>> __free_pages_prepare(page + i, 0, fpi_flags & ~FPI_FREE_BULK) in
>>> __free_pages_ok().
>>
>> Hmm, maybe...
>
> Let me know if my reasoning above moves your opinion on FPI_FREE_BULK towards
> a positive direction. :)
If you can make it work to support the three cases above, without doing
unnecessary work, and with no "free_pages(page, order /* > 0 */)" like API?
> Best Regards,
> Yan, Zi
next prev parent reply other threads:[~2026-03-17 18:48 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-16 11:31 [PATCH v2 0/3] mm: Free contiguous order-0 pages efficiently Muhammad Usama Anjum
2026-03-16 11:31 ` [PATCH v2 1/3] mm/page_alloc: Optimize free_contig_range() Muhammad Usama Anjum
2026-03-16 15:21 ` Vlastimil Babka
2026-03-16 16:02 ` Zi Yan
2026-03-16 16:19 ` Vlastimil Babka (SUSE)
2026-03-17 15:17 ` Zi Yan
2026-03-17 18:48 ` Vlastimil Babka (SUSE) [this message]
2026-03-19 22:07 ` David Hildenbrand (Arm)
2026-03-20 8:20 ` Vlastimil Babka (SUSE)
2026-03-20 12:46 ` Zi Yan
2026-03-16 16:11 ` Muhammad Usama Anjum
2026-03-16 11:31 ` [PATCH v2 2/3] vmalloc: Optimize vfree Muhammad Usama Anjum
2026-03-16 15:49 ` Vlastimil Babka
2026-03-17 9:36 ` Muhammad Usama Anjum
2026-03-20 8:39 ` David Hildenbrand (Arm)
2026-03-20 14:33 ` Vlastimil Babka (SUSE)
2026-03-23 11:28 ` Muhammad Usama Anjum
2026-03-16 11:31 ` [PATCH v2 3/3] mm/page_alloc: Optimize __free_contig_frozen_range() Muhammad Usama Anjum
2026-03-16 16:22 ` Vlastimil Babka
2026-03-20 14:26 ` Zi Yan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3883abcd-426d-4211-ae76-b5601ee728de@kernel.org \
--to=vbabka@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=Ryan.Roberts@arm.com \
--cc=akpm@linux-foundation.org \
--cc=bpf@vger.kernel.org \
--cc=david.hildenbrand@arm.com \
--cc=david@kernel.org \
--cc=dsterba@suse.com \
--cc=hannes@cmpxchg.org \
--cc=jackmanb@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ljs@kernel.org \
--cc=mhocko@suse.com \
--cc=rppt@kernel.org \
--cc=surenb@google.com \
--cc=terrelln@fb.com \
--cc=urezki@gmail.com \
--cc=usama.anjum@arm.com \
--cc=vishal.moola@gmail.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.