From: Brendan Jackman <jackmanb@google.com>
To: Gregory Price <gourry@gourry.net>, Brendan Jackman <jackmanb@google.com>
Cc: Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Vlastimil Babka <vbabka@kernel.org>, Wei Xu <weixugc@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
Lorenzo Stoakes <ljs@kernel.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <x86@kernel.org>,
<rppt@kernel.org>, Sumit Garg <sumit.garg@oss.qualcomm.com>,
<derkling@google.com>, <reijiw@google.com>,
Will Deacon <will@kernel.org>, <rientjes@google.com>,
"Kalyazin, Nikita" <kalyazin@amazon.co.uk>,
<patrick.roy@linux.dev>,
"Itazuri, Takahiro" <itazur@amazon.co.uk>,
Andy Lutomirski <luto@kernel.org>,
David Kaplan <david.kaplan@amd.com>,
Thomas Gleixner <tglx@kernel.org>, Yosry Ahmed <yosry@kernel.org>
Subject: Re: [PATCH v2 00/22] mm: Add __GFP_UNMAPPED
Date: Wed, 13 May 2026 17:14:20 +0000
Message-ID: <DIHPURUMTESI.1LP37OYJN1N31@google.com>
In-Reply-To: <agSkMV26nhYukbnK@gourry-fedora-PF4VCD3F>
On Wed May 13, 2026 at 4:17 PM UTC, Gregory Price wrote:
> On Fri, Mar 20, 2026 at 06:23:24PM +0000, Brendan Jackman wrote:
>>
>> Because of these ambitious usecases, it's core to this proposal that the
>> feature
>> overloading the concept of a migratetype, this extension is done by
>> adding a new concept on top of migratetype: the _freetype_. A freetype
>> is basically just a migratetype plus some flags, and it replaces
>> migratetypes wherever the latter is currently used to index free
>> pages.
>>
>
> I'm a bit confused why the need for additional level of indirection
> instead of just adding a new migratetype. You still end up increasing
> the migratetype matrix, just with a new dimension.
>
> (apologies if this was covered in prior work or discussions, just now
> plugging myself into the series).
>
> Why not simply have an unmapped migratetype, for example, and on steal
> you convert it to movable or whatever the preference is?
Because the fact that only one migratetype currently supports being
unmapped is a temporary happenstance of the guest_memfd usecase. In
general, this needs to support having unmapped variants of ~arbitrary
migratetypes.
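
As a rough illustration of "a migratetype plus some flags", here is a minimal userspace sketch. Every name in it (the sk_ prefix, the enum values, the flag bit, the index layout) is hypothetical and chosen for this example; it is not the freetype_t definition from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for migratetypes; not the kernel's enum. */
enum sk_migratetype {
	SK_UNMOVABLE,
	SK_MOVABLE,
	SK_RECLAIMABLE,
	SK_NR_MIGRATETYPES,
};

/* Orthogonal property bits layered on top of the migratetype. */
#define SK_FREETYPE_UNMAPPED		(1u << 0)
#define SK_NR_FREETYPE_FLAG_COMBOS	2u

/* A freetype is a migratetype plus flags, wrapped for type safety. */
typedef struct {
	unsigned int migratetype;
	unsigned int flags;
} sk_freetype_t;

static inline sk_freetype_t sk_make_freetype(enum sk_migratetype mt,
					     unsigned int flags)
{
	assert(mt < SK_NR_MIGRATETYPES);
	assert(flags < SK_NR_FREETYPE_FLAG_COMBOS);
	return (sk_freetype_t){ .migratetype = mt, .flags = flags };
}

static inline bool sk_freetype_unmapped(sk_freetype_t ft)
{
	return ft.flags & SK_FREETYPE_UNMAPPED;
}

/* Flatten to a free-list index: one list per (flags, migratetype) pair. */
static inline unsigned int sk_freetype_index(sk_freetype_t ft)
{
	return ft.flags * SK_NR_MIGRATETYPES + ft.migratetype;
}
```

In this sketch any migratetype can be combined with the unmapped flag, so sk_make_freetype(SK_MOVABLE, SK_FREETYPE_UNMAPPED) and sk_make_freetype(SK_RECLAIMABLE, SK_FREETYPE_UNMAPPED) land on distinct free lists without a new migratetype being defined for each.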
>> .:::: Hacky bits: simplistic secretmem integration
>>
>> The secretmem integration leaves the main optimisations on the table;
>> the security-required flushes of the mermap areas are implemented via
>> distinct tlb_flush_mm() calls. It should be possible to amortize the
>> mermap TLB flushes completely into the normal VMA flushing. However, as
>> far as I know there is no performance-sensitive usecase for secretmem.
>> So, I've just implemented the minimal adoption. This will at least avoid
>> fragmentation of the direct map, even if it doesn't reduce TLB flushing.
>> If anyone knows of a workload that might benefit from dropping that
>> flushing, let me know!
>
> Crossing a couple streams here, I wonder if there's some mechanisms
> introduced by MST's latest multi-zeroing-avoidance [1] code that might
> help deal with the problem here.
>
> MST wired up an optional user_addr into the buddy that allows us to sink
> the zeroing step for folio_zero_user (or folio_user_zero or whatever)
> into the post_alloc_hook - which includes some cache flushing.
>
> That conveniently gives you what you need for a TLB flush AND an
> indicator that the allocation is intended for userland.
>
> Unless I'm fundamentally misunderstanding something, the pattern at least
> seems similar.
Yeah, I actually only noticed that yesterday due to your posts on that
thread! I need to investigate it further. My assumption has always been
that this isn't a general solution because we don't always _have_ a user
address (e.g. for guest_memfd it's important that we can populate the
memory via write(), so there's no user address), but it's pretty likely
I'm missing something there.
> In that sense, does this just become a post_alloc_hook that unmaps the
> memory after zeroing and allocation?
>
> I get the intent is to have the majority of memory unmapped by default,
> and then steal those blocks and map them as the kernel requires more
> memory, but I wonder if it's cleaner to do it the other way and simply
> have the buddy unmap on alloc after zeroing, and remap on free.
That would indeed be cleaner, but the key question isn't the default
state of memory, it's batching.

The reason we need to do it at the block granularity is that a TLB flush
every time we allocate one of these pages is a performance nonstarter -
that's actually the entire point of this series. If you can afford a TLB
flush per allocation then you don't need __GFP_UNMAPPED for the
guest_memfd usecase; the existing direct map removal series [0] is
already fine.
[0] https://lore.kernel.org/all/20260410151746.61150-1-kalyazin@amazon.com/
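
To put rough numbers on the batching argument, the following back-of-the-envelope sketch compares flush counts under the two schemes. It assumes 4 KiB pages and 512-page (order-9) pageblocks, and it assumes that each unmapped block is converted exactly once; the function names are made up for this example and nothing here is taken from the series.

```c
#include <assert.h>

/* Assumed for illustration: order-9 pageblocks of 4 KiB pages. */
#define SK_PAGES_PER_BLOCK 512ul

/* Unmap on every allocation: each page costs one TLB flush. */
static unsigned long sk_flushes_per_alloc(unsigned long nr_pages)
{
	return nr_pages;
}

/* Unmap whole blocks at once: one flush per block converted. */
static unsigned long sk_flushes_per_block(unsigned long nr_pages,
					  unsigned long pages_per_block)
{
	assert(pages_per_block > 0);
	/* Round up: a partially used block still had to be converted. */
	return (nr_pages + pages_per_block - 1) / pages_per_block;
}
```

Under these assumptions, populating 1 GiB (262144 pages) costs 262144 flushes when unmapping per allocation, versus 512 when unmapping per block.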
> Seems like the free path would be trivial, check if the page is in the
> direct map and if not, remap it and move on. Entirely hidden from
> existing users.
>
> So, maybe a stupid question: Was the opposite mechanism considered
> (unmap on alloc sunk into the buddy), and if so was it rejected for some
> other reason?
Hopefully my previous paragraph explained that it's not viable anyway,
but: if we _did_ do the [un]mapping on a per-allocation basis, the
disadvantage of unmap-on-alloc is just that we expect most pages in the
system to be unmapped. So the majority of map-unmap cycles are pointless
(map on free, but we're probably gonna unmap again on alloc).
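
The "pointless cycles" point can be seen in a toy model of a single page that is allocated and freed repeatedly. This is only a sketch: the struct, the two policies and their names are invented for illustration, and the second policy is a hypothetical per-page alternative, not the block-granularity scheme the series implements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Direct-map state of one physical page in this toy model. */
struct sk_page {
	bool mapped;
	unsigned long transitions;	/* map or unmap operations performed */
};

static void sk_set_mapped(struct sk_page *p, bool mapped)
{
	if (p->mapped != mapped) {
		p->mapped = mapped;
		p->transitions++;
	}
}

/* Policy A: unmap on alloc when requested, always remap on free. */
static unsigned long sk_eager_remap(const bool *want_unmapped, size_t n)
{
	struct sk_page p = { .mapped = true, .transitions = 0 };

	assert(want_unmapped != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		sk_set_mapped(&p, !want_unmapped[i]);	/* alloc */
		sk_set_mapped(&p, true);		/* free */
	}
	return p.transitions;
}

/* Policy B: leave the page as-is on free; change state only on demand. */
static unsigned long sk_keep_state(const bool *want_unmapped, size_t n)
{
	struct sk_page p = { .mapped = true, .transitions = 0 };

	assert(want_unmapped != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		sk_set_mapped(&p, !want_unmapped[i]);	/* alloc */
	return p.transitions;
}
```

For an allocation sequence in which four out of five requests want the page unmapped (unmapped, unmapped, unmapped, mapped, unmapped), policy A performs 8 direct-map transitions while policy B performs 3.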
Cheers,
Brendan