From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Brendan Jackman <jackmanb@google.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>, Wei Xu <weixugc@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
Lorenzo Stoakes <ljs@kernel.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
rppt@kernel.org, Sumit Garg <sumit.garg@oss.qualcomm.com>,
derkling@google.com, reijiw@google.com,
Will Deacon <will@kernel.org>,
rientjes@google.com, patrick.roy@linux.dev,
"Itazuri, Takahiro" <itazur@amazon.co.uk>,
Andy Lutomirski <luto@kernel.org>,
David Kaplan <david.kaplan@amd.com>,
Thomas Gleixner <tglx@kernel.org>, Yosry Ahmed <yosry@kernel.org>
Subject: Re: [PATCH v2 20/22] mm/page_alloc: implement __GFP_UNMAPPED|__GFP_ZERO allocations
Date: Wed, 13 May 2026 19:00:49 +0200
Message-ID: <3453825f-2bc0-4d63-8731-3eaf9fc716a4@kernel.org>
In-Reply-To: <20260320-page_alloc-unmapped-v2-20-28bf1bd54f41@google.com>
On 3/20/26 19:23, Brendan Jackman wrote:
> The pages being zeroed here are unmapped, so they can't be zeroed via
> the direct map. Temporarily mapping them in the direct map is not
> possible because:
>
> - In general this requires allocating pagetables,
>
> - Unmapping them would require a TLB shootdown, which can't be done in
> general from the allocator (x86 requires IRQs on).
>
> Therefore, use the new mermap mechanism to zero these pages.
>
> The main mermap API is expected to fail very often. In order to avoid
> needing to fail allocations when that happens, instead fall back to the
> special mermap_get_reserved() variant, which is less efficient.
>
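The fast-path/fallback scheme described in the commit message can be sketched
in userspace as below. This is only an illustration under stated assumptions:
struct mapper, map_bulk(), map_one() and zero_pages() are hypothetical
stand-ins invented for this example, not the mermap API, and the "mapping"
just returns a pointer into an ordinary buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PG 4096u /* page size assumed for this sketch */

/* Hypothetical stand-in for a mapping facility whose bulk path may fail. */
struct mapper {
	bool bulk_ok;          /* whether the bulk attempt succeeds */
	unsigned bulk_calls;   /* number of bulk attempts made */
	unsigned single_calls; /* number of per-page fallback mappings made */
};

/* Bulk path: map the whole range at once, or return NULL on failure. */
static void *map_bulk(struct mapper *m, unsigned char *mem, size_t npages)
{
	(void)npages;
	m->bulk_calls++;
	return m->bulk_ok ? mem : NULL;
}

/* Fallback path: map a single page; assumed never to fail in this sketch. */
static void *map_one(struct mapper *m, unsigned char *mem, size_t idx)
{
	m->single_calls++;
	return mem + idx * PG;
}

/* Zero npages pages: try one bulk mapping first, else go page by page. */
static void zero_pages(struct mapper *m, unsigned char *mem, size_t npages)
{
	void *va = map_bulk(m, mem, npages);

	if (va) {
		memset(va, 0, npages * PG);
		return;
	}

	for (size_t i = 0; i < npages; i++)
		memset(map_one(m, mem, i), 0, PG);
}
```

The caller never sees the bulk failure; it only costs one mapping per page
instead of one per range.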
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
> arch/x86/include/asm/pgtable_types.h | 2 +
> mm/Kconfig | 11 +++++-
> mm/page_alloc.c | 76 +++++++++++++++++++++++++++++++-----
> 3 files changed, 78 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index 2ec250ba467e2..c3d73bdfff1fa 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -223,6 +223,7 @@ enum page_cache_mode {
> #define __PAGE_KERNEL_RO (__PP| 0| 0|___A|__NX| 0| 0|___G)
> #define __PAGE_KERNEL_ROX (__PP| 0| 0|___A| 0| 0| 0|___G)
> #define __PAGE_KERNEL (__PP|__RW| 0|___A|__NX|___D| 0|___G)
> +#define __PAGE_KERNEL_NOGLOBAL (__PP|__RW| 0|___A|__NX|___D| 0| 0)
> #define __PAGE_KERNEL_EXEC (__PP|__RW| 0|___A| 0|___D| 0|___G)
> #define __PAGE_KERNEL_NOCACHE (__PP|__RW| 0|___A|__NX|___D| 0|___G| __NC)
> #define __PAGE_KERNEL_VVAR (__PP| 0|_USR|___A|__NX| 0| 0|___G)
> @@ -245,6 +246,7 @@ enum page_cache_mode {
> #define __pgprot_mask(x) __pgprot((x) & __default_kernel_pte_mask)
>
> #define PAGE_KERNEL __pgprot_mask(__PAGE_KERNEL | _ENC)
> +#define PAGE_KERNEL_NOGLOBAL __pgprot_mask(__PAGE_KERNEL_NOGLOBAL | _ENC)
> #define PAGE_KERNEL_NOENC __pgprot_mask(__PAGE_KERNEL | 0)
> #define PAGE_KERNEL_RO __pgprot_mask(__PAGE_KERNEL_RO | _ENC)
> #define PAGE_KERNEL_EXEC __pgprot_mask(__PAGE_KERNEL_EXEC | _ENC)
Should this be part of one of the earlier mermap x86 patches?
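For reference, the new protection value differs from __PAGE_KERNEL only in
the Global bit. A minimal userspace sketch, assuming the architectural x86-64
PTE bit positions (P=0, RW=1, A=5, D=6, G=8, NX=63); the PTE_* and KERNEL_*
names are local to this example and are not kernel identifiers:

```c
#include <stdint.h>

/* Architectural x86-64 PTE bits; the macro names are local to this sketch. */
#define PTE_P  (UINT64_C(1) << 0)  /* present */
#define PTE_RW (UINT64_C(1) << 1)  /* writable */
#define PTE_A  (UINT64_C(1) << 5)  /* accessed */
#define PTE_D  (UINT64_C(1) << 6)  /* dirty */
#define PTE_G  (UINT64_C(1) << 8)  /* global */
#define PTE_NX (UINT64_C(1) << 63) /* no-execute */

/* Mirrors the bit columns of __PAGE_KERNEL in the quoted hunk. */
#define KERNEL_PROT          (PTE_P | PTE_RW | PTE_A | PTE_NX | PTE_D | PTE_G)
/* Mirrors __PAGE_KERNEL_NOGLOBAL: same columns, Global bit left clear. */
#define KERNEL_NOGLOBAL_PROT (PTE_P | PTE_RW | PTE_A | PTE_NX | PTE_D)

/* Return the set of bits in which two protection values differ. */
static inline uint64_t prot_diff(uint64_t a, uint64_t b)
{
	return a ^ b;
}
```

prot_diff(KERNEL_PROT, KERNEL_NOGLOBAL_PROT) yields exactly PTE_G.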
> diff --git a/mm/Kconfig b/mm/Kconfig
> index e4cb52149acad..05b2bb841d0e0 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -1506,7 +1506,14 @@ config MERMAP_KUNIT_TEST
> If unsure, say N.
>
> config PAGE_ALLOC_UNMAPPED
> - bool "Support allocating pages that aren't in the direct map" if COMPILE_TEST
> - default COMPILE_TEST
> + bool "Support allocating pages that aren't in the direct map"
> + depends on MERMAP
> +
> +config PAGE_ALLOC_KUNIT_TESTS
> + tristate "KUnit tests for the page allocator" if !KUNIT_ALL_TESTS
> + depends on KUNIT
> + default KUNIT_ALL_TESTS
> + help
> + Builds KUnit tests for the page allocator.
Doesn't this belong in the next patch?
> endmenu
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 710ee9f46d467..7c91dcbe32576 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -14,6 +14,7 @@
> * (lots of bits borrowed from Ingo Molnar & Andrew Morton)
> */
>
> +#include <linux/mermap.h>
> #include <linux/stddef.h>
> #include <linux/mm.h>
> #include <linux/highmem.h>
> @@ -1327,15 +1328,72 @@ static inline bool should_skip_kasan_poison(struct page *page)
> return page_kasan_tag(page) == KASAN_TAG_KERNEL;
> }
>
> -static void kernel_init_pages(struct page *page, int numpages)
> +#ifdef CONFIG_PAGE_ALLOC_UNMAPPED
> +static inline bool pageblock_unmapped(struct page *page)
> {
> - int i;
> + return freetype_flags(get_pageblock_freetype(page)) & FREETYPE_UNMAPPED;
> +}
>
> - /* s390's use of memset() could override KASAN redzones. */
> - kasan_disable_current();
> - for (i = 0; i < numpages; i++)
> - clear_highpage_kasan_tagged(page + i);
> - kasan_enable_current();
> +static inline void clear_page_mermap(struct page *page, unsigned int numpages)
> +{
> + void *mermap;
> +
> + BUILD_BUG_ON(IS_ENABLED(CONFIG_HIGHMEM));
> +
> + /* Fast path: single mapping (may fail under preemption). */
> + mermap = mermap_get(page, numpages << PAGE_SHIFT, PAGE_KERNEL_NOGLOBAL);
> + if (mermap) {
> + void *buf = kasan_reset_tag(mermap_addr(mermap));
> +
> + for (int i = 0; i < numpages; i++)
> + clear_page(buf + (i << PAGE_SHIFT));
> + mermap_put(mermap);
> + return;
> + }
> +
> + /* Slow path, map each page individually (always succeeds). */
> + for (int i = 0; i < numpages; i++) {
> + unsigned long flags;
> +
> + local_irq_save(flags);
> + mermap = mermap_get_reserved(page + i, PAGE_KERNEL_NOGLOBAL);
> + clear_page(kasan_reset_tag(mermap_addr(mermap)));
> + mermap_put(mermap);
> + local_irq_restore(flags);
> + }
> +}
> +#else
> +static inline bool pageblock_unmapped(struct page *page)
> +{
> + return false;
> +}
> +
> +static inline void clear_page_mermap(struct page *page, unsigned int numpages)
> +{
> + BUG();
> +}
> +#endif
> +
> +static void kernel_init_pages(struct page *page, unsigned int numpages)
> +{
> + int num_blocks = DIV_ROUND_UP(numpages, pageblock_nr_pages);
> +
> + for (int block = 0; block < num_blocks; block++) {
> + struct page *block_page = page + (block << pageblock_order);
> + bool unmapped = pageblock_unmapped(block_page);
> +
> + /* s390's use of memset() could override KASAN redzones. */
> + kasan_disable_current();
> + if (unmapped) {
> + clear_page_mermap(block_page, numpages);
> + } else {
> + for (int i = 0; i < min(numpages, pageblock_nr_pages); i++)
> + clear_highpage_kasan_tagged(block_page + i);
> + }
> + kasan_enable_current();
> +
> + numpages -= pageblock_nr_pages;
> + }
> }
>
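The per-pageblock walk in kernel_init_pages() can be modelled in userspace as
below. This is a rough sketch with assumptions: BLOCK_PAGES is a made-up
constant standing in for pageblock_nr_pages, pages are plain indices, and
for_each_block_chunk() and count_pages() are hypothetical helpers. Unlike the
quoted code, the sketch also accepts a start that is not block-aligned and
clamps every chunk to its block; both are choices of the sketch.

```c
#include <stddef.h>

#define BLOCK_PAGES 512u /* assumed pages per block; hypothetical value */

/* Callback invoked once per chunk: first page index and page count. */
typedef void (*chunk_fn)(size_t first, size_t count, void *ctx);

/*
 * Walk [start, start + npages) in chunks that never cross a block
 * boundary, so a per-block property can be checked once per chunk.
 * Returns the number of chunks visited.
 */
static size_t for_each_block_chunk(size_t start, size_t npages,
				   chunk_fn fn, void *ctx)
{
	size_t chunks = 0;

	while (npages > 0) {
		/* Pages left before the next block boundary. */
		size_t room = BLOCK_PAGES - (start % BLOCK_PAGES);
		size_t count = npages < room ? npages : room;

		fn(start, count, ctx);
		start += count;
		npages -= count;
		chunks++;
	}
	return chunks;
}

/* Example callback: accumulate the number of pages seen into *ctx. */
static void count_pages(size_t first, size_t count, void *ctx)
{
	(void)first;
	*(size_t *)ctx += count;
}
```

Because the remaining count is reduced by the chunk size actually handled, it
reaches zero exactly and never needs to go below it.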
> #ifdef CONFIG_MEM_ALLOC_PROFILING
> @@ -5250,8 +5308,8 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
> ac->nodemask = nodemask;
> ac->freetype = gfp_freetype(gfp_mask);
>
> - /* Not implemented yet. */
> - if (freetype_flags(ac->freetype) & FREETYPE_UNMAPPED && gfp_mask & __GFP_ZERO)
> + if (freetype_flags(ac->freetype) & FREETYPE_UNMAPPED &&
> + WARN_ON(!mermap_ready()))
> return false;
>
> if (cpusets_enabled()) {
>