From: Brendan Jackman <jackmanb@google.com>
To: Gregory Price <gourry@gourry.net>, Brendan Jackman <jackmanb@google.com>
Cc: Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Vlastimil Babka <vbabka@kernel.org>, Wei Xu <weixugc@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
Lorenzo Stoakes <ljs@kernel.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <x86@kernel.org>,
<rppt@kernel.org>, Sumit Garg <sumit.garg@oss.qualcomm.com>,
<derkling@google.com>, <reijiw@google.com>,
Will Deacon <will@kernel.org>, <rientjes@google.com>,
"Kalyazin, Nikita" <kalyazin@amazon.co.uk>,
<patrick.roy@linux.dev>,
"Itazuri, Takahiro" <itazur@amazon.co.uk>,
Andy Lutomirski <luto@kernel.org>,
David Kaplan <david.kaplan@amd.com>,
Thomas Gleixner <tglx@kernel.org>, Yosry Ahmed <yosry@kernel.org>
Subject: Re: [PATCH v2 00/22] mm: Add __GFP_UNMAPPED
Date: Wed, 13 May 2026 17:14:20 +0000
Message-ID: <DIHPURUMTESI.1LP37OYJN1N31@google.com>
In-Reply-To: <agSkMV26nhYukbnK@gourry-fedora-PF4VCD3F>
On Wed May 13, 2026 at 4:17 PM UTC, Gregory Price wrote:
> On Fri, Mar 20, 2026 at 06:23:24PM +0000, Brendan Jackman wrote:
>>
>> Because of these ambitious usecases, it's core to this proposal that the
>> feature
>> overloading the concept of a migratetype, this extension is done by
>> adding a new concept on top of migratetype: the _freetype_. A freetype
>> is basically just a migratetype plus some flags, and it replaces
>> migratetypes wherever the latter is currently used to index free
>> pages.
>>
>
> I'm a bit confused why the need for additional level of indirection
> instead of just adding a new migratetype. You still end up increasing
> the migratetype matrix, just with a new dimension.
>
> (apologies if this was covered in prior work or discussions, just now
> plugging myself into the series).
>
> Why not simply have an unmapped migratetype, for example, and on steal
> you convert it to movable or whatever the preference is?

Because the fact that only one migratetype currently supports being
unmapped is a temporary happenstance of the guest_memfd usecase. In
general, this needs to support having unmapped variants of ~arbitrary
migratetypes.

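To illustrate the "migratetype plus flags" idea, here is a rough sketch. It is not the series' actual freetype_t definition: every name below (sketch_freetype_t, SKETCH_FREETYPE_UNMAPPED, sketch_freetype_index() and so on) is hypothetical, and the migratetype list is a shortened stand-in for the kernel's.

```c
#include <assert.h>

/* Illustrative subset of migratetypes; the kernel has more. */
enum sketch_migratetype {
	SKETCH_UNMOVABLE,
	SKETCH_MOVABLE,
	SKETCH_RECLAIMABLE,
	SKETCH_NR_MIGRATETYPES,
};

/* Hypothetical flag: pages on this list are absent from the direct map. */
#define SKETCH_FREETYPE_UNMAPPED (1u << 0)
/* Number of distinct flag values with a single flag bit. */
#define SKETCH_NR_FLAG_COMBOS    2u

/* A freetype is a migratetype plus flags (assumed layout). */
typedef struct {
	enum sketch_migratetype migratetype;
	unsigned int flags;
} sketch_freetype_t;

static sketch_freetype_t sketch_make_freetype(enum sketch_migratetype mt,
					      unsigned int flags)
{
	sketch_freetype_t ft = { mt, flags };
	return ft;
}

/* Map a freetype to a free-list index: one row of lists per flag value. */
static unsigned int sketch_freetype_index(sketch_freetype_t ft)
{
	assert(ft.migratetype < SKETCH_NR_MIGRATETYPES);
	assert(ft.flags < SKETCH_NR_FLAG_COMBOS);
	return ft.flags * SKETCH_NR_MIGRATETYPES + ft.migratetype;
}
```

With a layout like this, every migratetype gets an unmapped variant without any new migratetype values: movable and movable+unmapped simply index different free lists.
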
>> .:::: Hacky bits: simplistic secretmem integration
>>
>> The secretmem integration leaves the main optimisations on the table;
>> the security-required flushes of the mermap areas are implemented via
>> distinct tlb_flush_mm() calls. It should be possible to amortize the
>> mermap TLB flushes completely into the normal VMA flushing. However, as
>> far as I know there is no performance-sensitive usecase for secretmem.
>> So, I've just implemented the minimal adoption. This will at least avoid
>> fragmentation of the direct map, even if it doesn't reduce TLB flushing.
>> If anyone knows of a workload that might benefit from dropping that
>> flushing, let me know!
>
> Crossing a couple streams here, I wonder if there's some mechanisms
> introduced by MST's latest multi-zeroing-avoidance [1] code that might
> help deal with the problem here.
>
> MST wired up an optional user_addr into the buddy that allows us to sink
> the zeroing step for folio_zero_user (or folio_user_zero or whatever)
> into the post_alloc_hook - which includes some cache flushing.
>
> That conveniently gives you what you need for a TLB flush AND an
> indicator that the allocation is intended for userland.
>
> Unless I'm fundamentally misunderstanding something, the pattern at least
> seems similar.

Yeah, I actually only noticed that yesterday due to your posts on that
thread! I need to investigate it further. My assumption has always been
that this isn't a general solution because we don't always _have_ a user
address (e.g. for guest_memfd it's important that we can populate the
memory via write(), so there's no user address), but it's pretty likely
I'm missing something there.

> In that sense, does this just become a post_alloc_hook that unmaps the
> memory after zeroing and allocation?
>
> I get the intent is to have the majority of memory unmapped by default,
> and then steal those blocks and map them as the kernel requires more
> memory, but I wonder if it's cleaner to do it the other way and simply
> have the buddy unmap on alloc after zeroing, and remap on free.

That would be cleaner indeed, but the key question isn't about the
default state of memory, it's about batching.

The reason we need to do it at the block granularity is that a TLB flush
every time we allocate one of these pages is a performance nonstarter -
that's actually the entire point of this series. If you can afford a TLB
flush per allocation then you don't need __GFP_UNMAPPED for the
guest_memfd usecase, the existing direct map removal series [0] is
already fine.

[0] https://lore.kernel.org/all/20260410151746.61150-1-kalyazin@amazon.com/

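To put rough numbers on the batching argument, here is a toy count of TLB flushes. It assumes one flush per pageblock converted, and the usage below assumes a pageblock of 512 base pages (order 9, common on x86-64 with 4 KiB pages but configuration-dependent). The function names are made up for illustration and don't correspond to anything in the series.

```c
#include <assert.h>
#include <stddef.h>

/* Flush on every unmapped page allocation: one flush per page. */
static size_t sketch_flushes_per_alloc(size_t nr_pages)
{
	return nr_pages;
}

/*
 * Flush only when a whole pageblock is converted to unmapped:
 * one flush covers every page later allocated from that block.
 */
static size_t sketch_flushes_per_block(size_t nr_pages, size_t pages_per_block)
{
	assert(pages_per_block > 0);
	/* Round up: a partially used block still had to be converted. */
	return (nr_pages + pages_per_block - 1) / pages_per_block;
}
```

Under those assumptions, allocating 1024 unmapped pages costs 1024 flushes in the per-allocation scheme (sketch_flushes_per_alloc(1024)) versus 2 in the per-block scheme (sketch_flushes_per_block(1024, 512)).
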
> Seems like the free path would be trivial, check if the page is in the
> direct map and if not, remap it and move on. Entirely hidden from
> existing users.
>
> So, maybe a stupid question: Was the opposite mechanism considered
> (unmap on alloc sunk into the buddy), and if so was it rejected for some
> other reason?

Hopefully my previous paragraph explained that it's not viable anyway,
but: if we _did_ do the [un]mapping on a per-allocation basis, the
disadvantage of unmap-on-alloc is just that we expect most pages in the
system to be unmapped. So the majority of map-unmap cycles are pointless
(map on free, but we're probably gonna unmap again on alloc).

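As a toy model of that hypothetical per-allocation scheme, the sketch below follows a single page through a sequence of alloc/free cycles and counts how often its direct-map state changes under two policies. The policy names and the function are invented for illustration; nothing here is code from the series.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_policy {
	SKETCH_REMAP_ON_FREE,	/* free pages are always mapped again */
	SKETCH_KEEP_STATE,	/* free pages keep whatever state they had */
};

/*
 * want_unmapped[i] says whether the i-th allocation of this page
 * asks for it to be absent from the direct map.
 * Returns the number of map/unmap operations performed.
 */
static size_t sketch_count_transitions(const bool *want_unmapped, size_t n,
				       enum sketch_policy policy)
{
	bool mapped = true;
	size_t changes = 0;

	for (size_t i = 0; i < n; i++) {
		/* Allocation: bring the page into the requested state. */
		if (want_unmapped[i] == mapped) {
			mapped = !mapped;
			changes++;
		}
		/* Free: remap-on-free restores the mapping unconditionally. */
		if (policy == SKETCH_REMAP_ON_FREE && !mapped) {
			mapped = true;
			changes++;
		}
	}
	return changes;
}
```

For four consecutive unmapped allocations, remap-on-free performs eight state changes while keeping the state across frees performs one, which is the "pointless cycles" effect described above.
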
Cheers,
Brendan