From: "Zi Yan" <ziy@nvidia.com>
To: "Barry Song" <baohua@kernel.org>,
"David Hildenbrand (Arm)" <david@kernel.org>
Cc: "Qi Zheng" <qi.zheng@linux.dev>, <akpm@linux-foundation.org>,
<ljs@kernel.org>, <baolin.wang@linux.alibaba.com>,
<liam@infradead.org>, <npache@redhat.com>, <ryan.roberts@arm.com>,
<dev.jain@arm.com>, <lance.yang@linux.dev>,
<muchun.song@linux.dev>, <osalvador@suse.de>, <chrisl@kernel.org>,
<kasong@tencent.com>, <shikemeng@huaweicloud.com>,
<nphamcs@gmail.com>, <baoquan.he@linux.dev>,
<youngjun.park@lge.com>, <peterx@redhat.com>,
<usama.arif@linux.dev>, <willy@infradead.org>,
<vbabka@kernel.org>, <surenb@google.com>, <mhocko@suse.com>,
<jackmanb@google.com>, <hannes@cmpxchg.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>,
"Qi Zheng" <zhengqi.arch@bytedance.com>
Subject: Re: [RFC PATCH 0/8] Introducte Reserved THP
Date: Tue, 30 Jun 2026 19:34:11 -0400
Message-ID: <DJMRZRG70UYR.MOB3ASPBCBAB@nvidia.com>
In-Reply-To: <CAGsJ_4zuxaqXQgrg_SiGY4gV4t3FQHdeM91dH4Y3-RknE7mg-g@mail.gmail.com>
On Tue Jun 30, 2026 at 6:59 PM EDT, Barry Song wrote:
> On Mon, Jun 29, 2026 at 8:20 PM David Hildenbrand (Arm)
> <david@kernel.org> wrote:
> [...]
>> >
>> > 2. Implementation
>> > =================
>> >
>> > In 2024, Yu Zhao proposed a similar idea:
>> >
>> > Link: https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.com/
>> >
>> > The idea was to introduce two virt zones: ZONE_NOSPLIT and ZONE_NOMERGE to
>> > guarantee the allocation success rate of THP, achieving an effect similar to
>> > reservation. However, it seems there was no further progress, perhaps because of
>> > reluctance to introduce more virt zones like ZONE_MOVABLE.
>> >
>> > This RFC wants to discuss another implementation:
>> >
>> > 1. Introduce a new migratetype: MIGRATE_RESERVED_THP.
>> > 2. Introduce two new hugetlb-like kernel boot parameters: `thp_reserved_size`
>> > and `thp_reserved_nr`. When set, the required memory is marked as
>> > MIGRATE_RESERVED_THP and put back into the buddy allocator.
>>
>> I'm all for some mechanism to make runtime allocation of large chunks of memory
>> easier, by adding a pool from where multiple consumers (THP, guest_memfd,
>> hugetlb, whatever) can allocate memory.
>>
>> Call me very skeptical of getting the page allocator involved like this. (I hate it)
>
> One thing we've been thinking about for a while is whether we can
> introduce something at the pageblock level to let memory "remember"
> which allocation order is preferred within that pageblock.
>
> For example, if we ever allocate an order-0 page from pageblock 100,
> that pageblock would later prefer order-0 allocations. Similarly, if
> we allocate a large folio from pageblock 200, we would avoid using
> pageblock 200 for order-0 allocations as long as there is still
> memory available in pageblock 100 for order-0.
>
> Since order-0 allocations are often the main source of fragmentation,
> if we already have both pagecache and anonymous large folios, we may
> care more about containing or quarantining order-0 allocations in
> certain areas, rather than trying to maintain a large-folio pool or
> similar strategy.

Isn't it unmovable pages that cause fragmentation? Movable pages,
regardless of their order, can always be migrated as long as no
additional pin is present.

If we use per-order pageblocks, how do we use the pageblocks of rarely
used orders? Do we allow lower-order allocations to fall back to
higher-order pageblocks?

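To make the fallback question concrete, here is a toy userspace model in
C. It is only a sketch, not kernel code: struct pageblock,
alloc_from_blocks() and the three-pass policy are names and choices I
made up for illustration, and the model only counts free base pages, so
it ignores fragmentation inside a pageblock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in the kernel. */
#define PAGES_PER_BLOCK 512   /* assumed: 2 MiB pageblock, 4 KiB pages */
#define ORDER_NONE      (-1)  /* pageblock has no preference yet */

struct pageblock {
	int preferred_order;  /* order "remembered" by this pageblock */
	int free_pages;       /* free base pages left in this pageblock */
};

static int has_room(const struct pageblock *pb, int order)
{
	return pb->free_pages >= (1 << order);
}

/*
 * Pick a pageblock for an allocation of 2^order pages.
 *   pass 0: a pageblock that already prefers this order
 *   pass 1: an untouched pageblock, which then adopts this order
 *   pass 2: fall back to a pageblock that prefers a higher order
 * Returns the pageblock index, or -1 if nothing fits.
 */
int alloc_from_blocks(struct pageblock *blocks, size_t n, int order)
{
	int pass;
	size_t i;

	assert(order >= 0 && (1 << order) <= PAGES_PER_BLOCK);

	for (pass = 0; pass < 3; pass++) {
		for (i = 0; i < n; i++) {
			struct pageblock *pb = &blocks[i];
			int match;

			if (!has_room(pb, order))
				continue;
			if (pass == 0)
				match = pb->preferred_order == order;
			else if (pass == 1)
				match = pb->preferred_order == ORDER_NONE;
			else
				match = pb->preferred_order > order;
			if (!match)
				continue;

			/* Only an untouched pageblock changes its preference. */
			if (pb->preferred_order == ORDER_NONE)
				pb->preferred_order = order;
			pb->free_pages -= 1 << order;
			return (int)i;
		}
	}
	return -1;
}
```

In this model a single order-0 fallback into an order-9 pageblock is
enough to make that pageblock unable to serve another order-9 request,
which is the cost the fallback question is about.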
>
> Chris’s de-fragmentation of swap slots[1] seems to be a big success
> based on my observations, where he provides a similar memory-order
> preference for swap clusters. There is no reservation mechanism, no
> sysfs knob, and no need to split swap into two areas—everything
> just works automatically.
>
> I wonder if you would be interested in something similar at the
> pageblock level. If so, I’d be happy to work on a prototype in
> August. I’m completely booked in July.
>
> [1] https://lore.kernel.org/all/20240730-swap-allocator-v5-0-cb9c148b9297@kernel.org/
>
I feel that swap and page allocation have a fundamental distinction:
swap slots are not movable, but pages are. Memory compaction can move
pages around to make space for high-order allocations, but does swap
support something similar? How would page mobility work in this
swap-slot-defragmentation world?
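
As a rough illustration of the mobility point, the toy C check below
asks whether a pageblock could be emptied by migration. It is a sketch
under my own assumptions: enum page_state and block_can_be_evacuated()
are made-up names, every movable page is treated as a single base page,
and a pinned page is treated like an unmovable one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-page state; not a kernel type. */
enum page_state { PAGE_FREE, PAGE_MOVABLE, PAGE_UNMOVABLE, PAGE_PINNED };

/*
 * Return 1 if every in-use page of the block could be migrated away,
 * given free_elsewhere free base pages outside the block to migrate to.
 * A single unmovable or pinned page makes the whole block unusable for
 * a block-sized allocation, no matter how much memory is free elsewhere.
 */
int block_can_be_evacuated(const enum page_state *block, size_t nr_pages,
			   size_t free_elsewhere)
{
	size_t i, to_move = 0;

	assert(block != NULL);

	for (i = 0; i < nr_pages; i++) {
		switch (block[i]) {
		case PAGE_FREE:
			break;
		case PAGE_MOVABLE:
			to_move++;
			break;
		case PAGE_UNMOVABLE:
		case PAGE_PINNED:
			return 0;
		}
	}
	return to_move <= free_elsewhere;
}
```

Swap slots have no counterpart to the PAGE_MOVABLE case in this model,
which is why I am not sure the swap cluster result carries over directly.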

In addition, when swap space is full, or when only order-0 swap slots
are available but higher-order folios need to be swapped out, folio
swap-out might simply stop (unless the folios are split to fill the
order-0 slots). For page allocation, however, some pages can be
reclaimed or swapped out to make space, and this adds complexity.
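
A toy C model of the swap side, to show how few outcomes there are. This
is a sketch only: struct swap_space and swap_out_folio() are made-up
names, the per-order free counters are my assumption, and the model does
not break a free higher-order slot group into smaller ones.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SWAP_ORDER 9  /* assumed upper bound for this model */

/* Hypothetical bookkeeping: free aligned slot groups per order. */
struct swap_space {
	int free_slots[MAX_SWAP_ORDER + 1];
};

/*
 * Try to swap out one folio of the given order.
 * Returns 1 if it went out whole, 2^order if it had to be split into
 * base pages that filled order-0 slots, and 0 if swap-out has to stop.
 * There is no "reclaim something to make room" branch here, which is
 * the difference from page allocation.
 */
int swap_out_folio(struct swap_space *s, int order)
{
	int nr;

	assert(s != NULL && order >= 0 && order <= MAX_SWAP_ORDER);
	nr = 1 << order;

	if (s->free_slots[order] > 0) {
		s->free_slots[order]--;
		return 1;
	}
	if (order > 0 && s->free_slots[0] >= nr) {
		s->free_slots[0] -= nr;
		return nr;
	}
	return 0;
}
```

A page allocator model would need at least one more branch (reclaim or
compact, then retry), and that retry loop is where the interaction with
an order preference gets complicated.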
--
Best Regards,
Yan, Zi