Linux-mm Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/16] mm: Some cleanups for page allocator APIs
@ 2026-06-29 13:11 Brendan Jackman
  2026-06-29 13:11 ` [PATCH v3 01/16] mm/page_alloc: rename ALLOC_TRYLOCK -> ALLOC_NOLOCK Brendan Jackman
                   ` (16 more replies)
  0 siblings, 17 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, Sean Christopherson, Paolo Bonzini, kvm,
	Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Robin Holt, Steve Wahl, Arnd Bergmann,
	Greg Kroah-Hartman, Dimitris Michailidis, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni

This is based on mm-new, it depends on moving alloc_tag to mm/ [0].

[0] https://lore.kernel.org/all/aj5QBtJcphPElczI@lucifer/

Some tweaks and cleanups for page allocator entrypoint and flags. This
is motivated by preparation for __GFP_UNMAPPED [1] (which will probably
become ALLOC_UNMAPPED in its next iteration), but all this is supposed
to be an improvement to the codebase in its own right: unifying code
paths, reducing API surface, and removing GFP flags.

[1] https://lore.kernel.org/all/20260320-page_alloc-unmapped-v2-0-28bf1bd54f41@google.com/

This started with unifying  __alloc_frozen_pages[_nolock]_noprof() and
expanded from there.

Unifying the nolock allocator entrypoint with the normal allocator
entrypoint means adding an alloc_flags argument to the later (only
exposed within mm/). This presents an opportunity to take advantage of
that arg to remove some GFP flags, if we add that alloc_flags arg a bit
more broadly to allocator entrypoints.

To distinguish between mm-internal and "public" allocator entrypoints,
it makes sense to use the __ prefix. There are already some public APIs
with that prefix. For *alloc_pages*, just removing those variants seems
like a nice cleanup anyway, so do that. For get_free_pages, the "__"
variant is the _only_ variant and it's very widely used, so it doesn't
seem worthwhile to modify that. Therefore, scope this "__" change
specifically to the *alloc_pages* API, which means we leave the
*folio_alloc* API untouched too, even though that could probably be
cleaned up if so desired.

Tested:

- KVM, mm, and BPF selftests in a QEMU VM

- kunit.py on x86_64

- For the ALLOC_NO_CODETAG bits I just booted a VM and read
  /proc/allocinfo. I confirmed that if I remove ALLOC_NO_CODETAG, the
  kernel crashes in early boot, so I was at least booting code that
  depends on this logic.

I used Google's internal version of Antigravity (AI coding harness) to
do the repetitive bits, those commits are marked with Assisted-by, the
rest is manual.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
Changes in v3:
- Created mm/page_alloc.h
- Fixed EXPORT_SYMBOL() issues
- Reworded commit messages per Sashiko's pointers
- Dropped rename of alloc_flags arg in prepare_alloc_pages() (Suren)
- Renamed gfp_to_alloc_flags_nonblocking() too after rebasing onto:
  https://lore.kernel.org/all/20260623004600.113347-1-jp.kobryn@linux.dev/
- Link to v2: https://patch.msgid.link/20260622-alloc-trylock-v2-0-31f31367d420@google.com

Changes in v2:
- Fixed up whitespace in nolock unification patch
- Introduced ALLOC_DEFAULT to replace literal 0 for alloc_flags
- All other patches are new
- Link to v1: https://patch.msgid.link/20260617-alloc-trylock-v1-1-83fd7858832e@google.com

---
Brendan Jackman (16):
      mm/page_alloc: rename ALLOC_TRYLOCK -> ALLOC_NOLOCK
      mm/page_alloc: some renames to clarify alloc_flags scopes
      mm: name some args in a function declaration
      mm: Split out internal page_alloc.h
      mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof()
      mm/page_alloc: relax GFP WARN in nolock allocs
      mm: move some stuff to mm/page_alloc.h
      perf/x86/intel: Use higher-level allocator API
      KVM: VMX: Use higher-level allocator API
      x86/virt: Use higher-level allocator API
      sgi-xp: Use higher-level allocator API
      net/funeth: Switch to higher-level allocator API
      mm: Remove __alloc_pages_node()
      mm: Move __alloc_pages() to mm/page_alloc.h
      mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG
      mm: remove the __GFP_NO_OBJ_EXT flag

 Documentation/admin-guide/cgroup-v1/cpusets.rst  |   2 +-
 Documentation/admin-guide/mm/transhuge.rst       |   2 +-
 MAINTAINERS                                      |   1 +
 arch/x86/events/intel/ds.c                       |   6 +-
 arch/x86/kvm/vmx/vmx.c                           |   2 +-
 arch/x86/virt/hw.c                               |   2 +-
 drivers/misc/sgi-xp/xpc_uv.c                     |   2 +-
 drivers/net/ethernet/fungible/funeth/funeth_rx.c |   2 +-
 include/linux/gfp.h                              |  54 +---
 mm/alloc_tag.c                                   |  22 +-
 mm/compaction.c                                  |   5 +-
 mm/hugetlb.c                                     |   4 +-
 mm/internal.h                                    | 253 ------------------
 mm/khugepaged.c                                  |   1 +
 mm/memory-failure.c                              |   1 +
 mm/memory_hotplug.c                              |   1 +
 mm/mempolicy.c                                   |  11 +-
 mm/mm_init.c                                     |   1 +
 mm/page_alloc.c                                  | 259 ++++++++++---------
 mm/page_alloc.h                                  | 316 +++++++++++++++++++++++
 mm/page_frag_cache.c                             |   6 +-
 mm/page_isolation.c                              |   1 +
 mm/page_owner.c                                  |   2 +-
 mm/show_mem.c                                    |   1 +
 mm/slub.c                                        |   7 +-
 mm/swap.c                                        |   1 +
 mm/vmscan.c                                      |   1 +
 mm/vmstat.c                                      |   1 +
 tools/include/linux/gfp_types.h                  |   7 -
 29 files changed, 506 insertions(+), 468 deletions(-)
---
base-commit: ec005628e9dbcef26b761fa860f264fe6e5fd690
change-id: 20260617-alloc-trylock-14ad37dab337

Best regards,
--  
Brendan Jackman <jackmanb@google.com>



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v3 01/16] mm/page_alloc: rename ALLOC_TRYLOCK -> ALLOC_NOLOCK
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-30 12:27   ` Vlastimil Babka (SUSE)
  2026-06-29 13:11 ` [PATCH v3 02/16] mm/page_alloc: some renames to clarify alloc_flags scopes Brendan Jackman
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

It's confusing that the function is called "nolock" but the flag is
called "trylock", align them.

The function's terminology is more visible and has more mindshare so use that.

Suggested-by: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Link: https://lore.kernel.org/linux-mm/2399b3ad-4eac-4a14-94c3-27e9f07972a1@kernel.org/
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 mm/internal.h   |  2 +-
 mm/page_alloc.c | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 430aa72a45758..2237eee030cba 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1480,7 +1480,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 #define ALLOC_NOFRAGMENT	  0x0
 #endif
 #define ALLOC_HIGHATOMIC	0x200 /* Allows access to MIGRATE_HIGHATOMIC */
-#define ALLOC_TRYLOCK		0x400 /* Only use spin_trylock in allocation path */
+#define ALLOC_NOLOCK		0x400 /* Only use spin_trylock in allocation path */
 #define ALLOC_KSWAPD		0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */
 
 /* Flags that allow allocations below the min watermark. */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 62f71ece7ca17..421271849f291 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2530,7 +2530,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 	unsigned long flags;
 	int i;
 
-	if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
+	if (unlikely(alloc_flags & ALLOC_NOLOCK)) {
 		if (!spin_trylock_irqsave(&zone->lock, flags))
 			return 0;
 	} else {
@@ -3218,7 +3218,7 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 
 	do {
 		page = NULL;
-		if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
+		if (unlikely(alloc_flags & ALLOC_NOLOCK)) {
 			if (!spin_trylock_irqsave(&zone->lock, flags))
 				return NULL;
 		} else {
@@ -5059,7 +5059,7 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
 	 * Don't invoke should_fail logic, since it may call
 	 * get_random_u32() and printk() which need to spin_lock.
 	 */
-	if (!(*alloc_flags & ALLOC_TRYLOCK) &&
+	if (!(*alloc_flags & ALLOC_NOLOCK) &&
 	    should_fail_alloc_page(gfp_mask, order))
 		return false;
 
@@ -7804,7 +7804,7 @@ static bool cond_accept_memory(struct zone *zone, unsigned int order,
 		return false;
 
 	/* Bailout, since try_to_accept_memory_one() needs to take a lock */
-	if (alloc_flags & ALLOC_TRYLOCK)
+	if (alloc_flags & ALLOC_NOLOCK)
 		return false;
 
 	wmark = promo_wmark_pages(zone);
@@ -7896,7 +7896,7 @@ struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned
 	 */
 	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC | __GFP_COMP
 			| gfp_flags;
-	unsigned int alloc_flags = ALLOC_TRYLOCK;
+	unsigned int alloc_flags = ALLOC_NOLOCK;
 	struct alloc_context ac = { };
 	struct page *page;
 

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 02/16] mm/page_alloc: some renames to clarify alloc_flags scopes
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
  2026-06-29 13:11 ` [PATCH v3 01/16] mm/page_alloc: rename ALLOC_TRYLOCK -> ALLOC_NOLOCK Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-30 12:38   ` Vlastimil Babka (SUSE)
  2026-06-29 13:11 ` [PATCH v3 03/16] mm: name some args in a function declaration Brendan Jackman
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

It's pretty confusing that:

- The slowpath and fastpath have a totally distinct set of alloc_flags.

- gfp_to_alloc_flags() sounds generic but it only influences the
  slowpath.

Rename some variables to highlight which alloc_flags are
fastpath-specific. Rename gfp_to_alloc_flags() to highlight that it's
slowpath-specific.

gfp_to_alloc_flags_cma() and gfp_to_alloc_flags_nonblocking() currently
have perfectly harmless names, but to keep the naming consistent also
rename those to the alloc_flags_*() pattern (which already exists for
alloc_flags_nofragment()).

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 mm/page_alloc.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 421271849f291..6010693861ec2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3774,8 +3774,8 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
 }
 
 /* Must be called after current_gfp_context() which can change gfp_mask */
-static inline unsigned int gfp_to_alloc_flags_cma(gfp_t gfp_mask,
-						  unsigned int alloc_flags)
+static inline unsigned int alloc_flags_cma(gfp_t gfp_mask,
+					   unsigned int alloc_flags)
 {
 #ifdef CONFIG_CMA
 	if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
@@ -4474,7 +4474,7 @@ static void wake_all_kswapds(unsigned int order, gfp_t gfp_mask,
 }
 
 static inline unsigned int
-gfp_to_alloc_flags_nonblocking(gfp_t gfp_mask, unsigned int order)
+alloc_flags_nonblocking(gfp_t gfp_mask, unsigned int order)
 {
 	unsigned int alloc_flags = 0;
 
@@ -4497,7 +4497,7 @@ gfp_to_alloc_flags_nonblocking(gfp_t gfp_mask, unsigned int order)
 }
 
 static inline unsigned int
-gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
+alloc_flags_slowpath(gfp_t gfp_mask, unsigned int order)
 {
 	unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
 
@@ -4512,7 +4512,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
 	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
 		alloc_flags |= ALLOC_KSWAPD;
 
-	alloc_flags |= gfp_to_alloc_flags_nonblocking(gfp_mask, order);
+	alloc_flags |= alloc_flags_nonblocking(gfp_mask, order);
 
 	if (!(gfp_mask & __GFP_DIRECT_RECLAIM)) {
 		/*
@@ -4525,7 +4525,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
 	} else if (unlikely(rt_or_dl_task(current)) && in_task())
 		alloc_flags |= ALLOC_MIN_RESERVE;
 
-	alloc_flags = gfp_to_alloc_flags_cma(gfp_mask, alloc_flags);
+	alloc_flags = alloc_flags_cma(gfp_mask, alloc_flags);
 
 	if (defrag_mode)
 		alloc_flags |= ALLOC_NOFRAGMENT;
@@ -4791,7 +4791,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * kswapd needs to be woken up, and to avoid the cost of setting up
 	 * alloc_flags precisely. So we do that now.
 	 */
-	alloc_flags = gfp_to_alloc_flags(gfp_mask, order);
+	alloc_flags = alloc_flags_slowpath(gfp_mask, order);
 
 	/*
 	 * We need to recalculate the starting point for the zonelist iterator
@@ -4832,7 +4832,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
 	if (reserve_flags)
-		alloc_flags = gfp_to_alloc_flags_cma(gfp_mask, reserve_flags) |
+		alloc_flags = alloc_flags_cma(gfp_mask, reserve_flags) |
 					  (alloc_flags & ALLOC_KSWAPD);
 
 	/*
@@ -5063,7 +5063,7 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
 	    should_fail_alloc_page(gfp_mask, order))
 		return false;
 
-	*alloc_flags = gfp_to_alloc_flags_cma(gfp_mask, *alloc_flags);
+	*alloc_flags = alloc_flags_cma(gfp_mask, *alloc_flags);
 
 	/* Dirty zone balancing only done in the fast path */
 	ac->spread_dirty_pages = (gfp_mask & __GFP_WRITE);
@@ -5277,7 +5277,7 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
 		int preferred_nid, nodemask_t *nodemask)
 {
 	struct page *page;
-	unsigned int alloc_flags = ALLOC_WMARK_LOW;
+	unsigned int fastpath_alloc_flags = ALLOC_WMARK_LOW;
 	gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
 	struct alloc_context ac = { };
 
@@ -5299,18 +5299,18 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
 	gfp = current_gfp_context(gfp);
 	alloc_gfp = gfp;
 	if (!prepare_alloc_pages(gfp, order, preferred_nid, nodemask, &ac,
-			&alloc_gfp, &alloc_flags))
+			&alloc_gfp, &fastpath_alloc_flags))
 		return NULL;
 
 	/*
 	 * Forbid the first pass from falling back to types that fragment
 	 * memory until all local zones are considered.
 	 */
-	alloc_flags |= alloc_flags_nofragment(zonelist_zone(ac.preferred_zoneref), gfp);
-	alloc_flags |= gfp_to_alloc_flags_nonblocking(gfp, order) & ALLOC_HIGHATOMIC;
+	fastpath_alloc_flags |= alloc_flags_nofragment(zonelist_zone(ac.preferred_zoneref), gfp);
+	fastpath_alloc_flags |= alloc_flags_nonblocking(gfp, order) & ALLOC_HIGHATOMIC;
 
 	/* First allocation attempt */
-	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
+	page = get_page_from_freelist(alloc_gfp, order, fastpath_alloc_flags, &ac);
 	if (likely(page))
 		goto out;
 

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 03/16] mm: name some args in a function declaration
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
  2026-06-29 13:11 ` [PATCH v3 01/16] mm/page_alloc: rename ALLOC_TRYLOCK -> ALLOC_NOLOCK Brendan Jackman
  2026-06-29 13:11 ` [PATCH v3 02/16] mm/page_alloc: some renames to clarify alloc_flags scopes Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-30 12:43   ` Vlastimil Babka (SUSE)
  2026-06-29 13:11 ` [PATCH v3 04/16] mm: Split out internal page_alloc.h Brendan Jackman
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

Checkpatch complains about this, a later patch will move the code, fix
it so that checkpatch doesn't complain about that patch. Do it in a
separate patch so the "move the code" patch is trivial to review using
Git's diff colouring.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 mm/internal.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 2237eee030cba..8ce59c5664497 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -919,8 +919,8 @@ extern bool free_pages_prepare(struct page *page, unsigned int order);
 
 extern int user_min_free_kbytes;
 
-struct page *__alloc_frozen_pages_noprof(gfp_t, unsigned int order, int nid,
-		nodemask_t *);
+struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order, int nid,
+		nodemask_t *nodemask);
 #define __alloc_frozen_pages(...) \
 	alloc_hooks(__alloc_frozen_pages_noprof(__VA_ARGS__))
 void free_frozen_pages(struct page *page, unsigned int order);

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 04/16] mm: Split out internal page_alloc.h
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (2 preceding siblings ...)
  2026-06-29 13:11 ` [PATCH v3 03/16] mm: name some args in a function declaration Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-30 13:54   ` Vlastimil Babka (SUSE)
  2026-06-29 13:11 ` [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof() Brendan Jackman
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

internal.h is a bit bloated, seems like time for a page_alloc.h.

Where it wasn't obvious, the heuristic for deciding what goes into this
new header was "does it support/correspond to a definition in
mm/page_alloc.c?"

Only need to include it from 15 .c files out of 164 so this does seem
like a genuine reduction in scopes, which is nice. And there's no
circular internal.h<->page_alloc.h dependency, so it seems worthwhile to
split this up before that inevitably emerges!

Suggested-by: "David Hildenbrand (Arm)" <david@kernel.org>
Link: https://lore.kernel.org/all/41e92bab-6882-401a-8de9-154adbdcfb36@kernel.org/
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 MAINTAINERS          |   1 +
 mm/compaction.c      |   1 +
 mm/hugetlb.c         |   1 +
 mm/internal.h        | 252 -----------------------------------------------
 mm/khugepaged.c      |   1 +
 mm/memory-failure.c  |   1 +
 mm/memory_hotplug.c  |   1 +
 mm/mempolicy.c       |   1 +
 mm/mm_init.c         |   1 +
 mm/page_alloc.c      |   1 +
 mm/page_alloc.h      | 269 +++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/page_frag_cache.c |   2 +-
 mm/page_isolation.c  |   1 +
 mm/page_owner.c      |   2 +-
 mm/show_mem.c        |   1 +
 mm/slub.c            |   1 +
 mm/swap.c            |   1 +
 mm/vmscan.c          |   1 +
 18 files changed, 285 insertions(+), 254 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f55cc75801f4c..978a04e1f7cc3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17171,6 +17171,7 @@ F:	mm/debug_page_alloc.c
 F:	mm/debug_page_ref.c
 F:	mm/fail_page_alloc.c
 F:	mm/page_alloc.c
+F:	mm/page_alloc.h
 F:	mm/page_ext.c
 F:	mm/page_frag_cache.c
 F:	mm/page_isolation.c
diff --git a/mm/compaction.c b/mm/compaction.c
index f08765ade014c..7d80735502d9a 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -24,6 +24,7 @@
 #include <linux/page_owner.h>
 #include <linux/psi.h>
 #include <linux/cpuset.h>
+#include "page_alloc.h"
 #include "internal.h"
 
 #ifdef CONFIG_COMPACTION
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fb7ad2a4a26b4..f7925624c4d2e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -47,6 +47,7 @@
 #include <linux/node.h>
 #include <linux/page_owner.h>
 #include "internal.h"
+#include "page_alloc.h"
 #include "hugetlb_vmemmap.h"
 #include "hugetlb_cma.h"
 #include "hugetlb_internal.h"
diff --git a/mm/internal.h b/mm/internal.h
index 8ce59c5664497..c22284f04fc9e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -658,165 +658,6 @@ extern int defrag_mode;
 void setup_per_zone_wmarks(void);
 void calculate_min_free_kbytes(void);
 int __meminit init_per_zone_wmark_min(void);
-void page_alloc_sysctl_init(void);
-
-/*
- * Structure for holding the mostly immutable allocation parameters passed
- * between functions involved in allocations, including the alloc_pages*
- * family of functions.
- *
- * nodemask, migratetype and highest_zoneidx are initialized only once in
- * __alloc_pages() and then never change.
- *
- * zonelist, preferred_zone and highest_zoneidx are set first in
- * __alloc_pages() for the fast path, and might be later changed
- * in __alloc_pages_slowpath(). All other functions pass the whole structure
- * by a const pointer.
- */
-struct alloc_context {
-	struct zonelist *zonelist;
-	const nodemask_t *nodemask;
-	struct zoneref *preferred_zoneref;
-	int migratetype;
-
-	/*
-	 * highest_zoneidx represents highest usable zone index of
-	 * the allocation request. Due to the nature of the zone,
-	 * memory on lower zone than the highest_zoneidx will be
-	 * protected by lowmem_reserve[highest_zoneidx].
-	 *
-	 * highest_zoneidx is also used by reclaim/compaction to limit
-	 * the target zone since higher zone than this index cannot be
-	 * usable for this allocation request.
-	 */
-	enum zone_type highest_zoneidx;
-	bool spread_dirty_pages;
-};
-
-/*
- * This function returns the order of a free page in the buddy system. In
- * general, page_zone(page)->lock must be held by the caller to prevent the
- * page from being allocated in parallel and returning garbage as the order.
- * If a caller does not hold page_zone(page)->lock, it must guarantee that the
- * page cannot be allocated or merged in parallel. Alternatively, it must
- * handle invalid values gracefully, and use buddy_order_unsafe() below.
- */
-static inline unsigned int buddy_order(struct page *page)
-{
-	/* PageBuddy() must be checked by the caller */
-	return page_private(page);
-}
-
-/*
- * Like buddy_order(), but for callers who cannot afford to hold the zone lock.
- * PageBuddy() should be checked first by the caller to minimize race window,
- * and invalid values must be handled gracefully.
- *
- * READ_ONCE is used so that if the caller assigns the result into a local
- * variable and e.g. tests it for valid range before using, the compiler cannot
- * decide to remove the variable and inline the page_private(page) multiple
- * times, potentially observing different values in the tests and the actual
- * use of the result.
- */
-#define buddy_order_unsafe(page)	READ_ONCE(page_private(page))
-
-/*
- * This function checks whether a page is free && is the buddy
- * we can coalesce a page and its buddy if
- * (a) the buddy is not in a hole (check before calling!) &&
- * (b) the buddy is in the buddy system &&
- * (c) a page and its buddy have the same order &&
- * (d) a page and its buddy are in the same zone.
- *
- * For recording whether a page is in the buddy system, we set PageBuddy.
- * Setting, clearing, and testing PageBuddy is serialized by zone->lock.
- *
- * For recording page's order, we use page_private(page).
- */
-static inline bool page_is_buddy(struct page *page, struct page *buddy,
-				 unsigned int order)
-{
-	if (!page_is_guard(buddy) && !PageBuddy(buddy))
-		return false;
-
-	if (buddy_order(buddy) != order)
-		return false;
-
-	/*
-	 * zone check is done late to avoid uselessly calculating
-	 * zone/node ids for pages that could never merge.
-	 */
-	if (page_zone_id(page) != page_zone_id(buddy))
-		return false;
-
-	VM_BUG_ON_PAGE(page_count(buddy) != 0, buddy);
-
-	return true;
-}
-
-/*
- * Locate the struct page for both the matching buddy in our
- * pair (buddy1) and the combined O(n+1) page they form (page).
- *
- * 1) Any buddy B1 will have an order O twin B2 which satisfies
- * the following equation:
- *     B2 = B1 ^ (1 << O)
- * For example, if the starting buddy (buddy2) is #8 its order
- * 1 buddy is #10:
- *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
- *
- * 2) Any buddy B will have an order O+1 parent P which
- * satisfies the following equation:
- *     P = B & ~(1 << O)
- *
- * Assumption: *_mem_map is contiguous at least up to MAX_PAGE_ORDER
- */
-static inline unsigned long
-__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
-{
-	return page_pfn ^ (1 << order);
-}
-
-/*
- * Find the buddy of @page and validate it.
- * @page: The input page
- * @pfn: The pfn of the page, it saves a call to page_to_pfn() when the
- *       function is used in the performance-critical __free_one_page().
- * @order: The order of the page
- * @buddy_pfn: The output pointer to the buddy pfn, it also saves a call to
- *             page_to_pfn().
- *
- * The found buddy can be a non PageBuddy, out of @page's zone, or its order is
- * not the same as @page. The validation is necessary before use it.
- *
- * Return: the found buddy page or NULL if not found.
- */
-static inline struct page *find_buddy_page_pfn(struct page *page,
-			unsigned long pfn, unsigned int order, unsigned long *buddy_pfn)
-{
-	unsigned long __buddy_pfn = __find_buddy_pfn(pfn, order);
-	struct page *buddy;
-
-	buddy = page + (__buddy_pfn - pfn);
-	if (buddy_pfn)
-		*buddy_pfn = __buddy_pfn;
-
-	if (page_is_buddy(page, buddy, order))
-		return buddy;
-	return NULL;
-}
-
-extern struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
-				unsigned long end_pfn, struct zone *zone);
-
-static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
-				unsigned long end_pfn, struct zone *zone)
-{
-	if (zone->contiguous)
-		return pfn_to_page(start_pfn);
-
-	return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
-}
 
 void set_zone_contiguous(struct zone *zone);
 bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
@@ -831,8 +672,6 @@ extern int __isolate_free_page(struct page *page, unsigned int order);
 extern void __putback_isolated_page(struct page *page, unsigned int order,
 				    int mt);
 extern void memblock_free_pages(unsigned long pfn, unsigned int order);
-extern void __free_pages_core(struct page *page, unsigned int order,
-		enum meminit_context context);
 
 /*
  * This will have no effect, other than possibly generating a warning, if the
@@ -914,40 +753,6 @@ static inline void init_compound_tail(struct page *tail,
 	prep_compound_tail(tail, head, order);
 }
 
-void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags);
-extern bool free_pages_prepare(struct page *page, unsigned int order);
-
-extern int user_min_free_kbytes;
-
-struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order, int nid,
-		nodemask_t *nodemask);
-#define __alloc_frozen_pages(...) \
-	alloc_hooks(__alloc_frozen_pages_noprof(__VA_ARGS__))
-void free_frozen_pages(struct page *page, unsigned int order);
-void free_unref_folios(struct folio_batch *fbatch);
-
-#ifdef CONFIG_NUMA
-struct page *alloc_frozen_pages_noprof(gfp_t, unsigned int order);
-#else
-static inline struct page *alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order)
-{
-	return __alloc_frozen_pages_noprof(gfp, order, numa_node_id(), NULL);
-}
-#endif
-
-#define alloc_frozen_pages(...) \
-	alloc_hooks(alloc_frozen_pages_noprof(__VA_ARGS__))
-
-struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned int order);
-#define alloc_frozen_pages_nolock(...) \
-	alloc_hooks(alloc_frozen_pages_nolock_noprof(__VA_ARGS__))
-void free_frozen_pages_nolock(struct page *page, unsigned int order);
-
-extern void zone_pcp_reset(struct zone *zone);
-extern void zone_pcp_disable(struct zone *zone);
-extern void zone_pcp_enable(struct zone *zone);
-extern void zone_pcp_init(struct zone *zone);
-
 extern void *memmap_alloc(phys_addr_t size, phys_addr_t align,
 			  phys_addr_t min_addr,
 			  int nid, bool exact_nid);
@@ -1101,23 +906,6 @@ static inline void init_cma_pageblock(struct page *page)
 }
 #endif
 
-enum fallback_result {
-	/* Found suitable migratetype, *mt_out is valid. */
-	FALLBACK_FOUND,
-	/* No fallback found in requested order. */
-	FALLBACK_EMPTY,
-	/* Passed @claimable, but claiming whole block is a bad idea. */
-	FALLBACK_NOCLAIM,
-};
-enum fallback_result
-find_suitable_fallback(struct free_area *area, unsigned int order,
-		       int migratetype, bool claimable, int *mt_out);
-
-static inline bool free_area_empty(struct free_area *area, int migratetype)
-{
-	return list_empty(&area->free_list[migratetype]);
-}
-
 /* mm/util.c */
 struct anon_vma *folio_anon_vma(const struct folio *folio);
 
@@ -1445,46 +1233,6 @@ extern unsigned long  __must_check vm_mmap_pgoff(struct file *, unsigned long,
 unsigned long reclaim_pages(struct list_head *folio_list);
 unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 					    struct list_head *folio_list);
-/* The ALLOC_WMARK bits are used as an index to zone->watermark */
-#define ALLOC_WMARK_MIN		WMARK_MIN
-#define ALLOC_WMARK_LOW		WMARK_LOW
-#define ALLOC_WMARK_HIGH	WMARK_HIGH
-#define ALLOC_NO_WATERMARKS	0x04 /* don't check watermarks at all */
-
-/* Mask to get the watermark bits */
-#define ALLOC_WMARK_MASK	(ALLOC_NO_WATERMARKS-1)
-
-/*
- * Only MMU archs have async oom victim reclaim - aka oom_reaper so we
- * cannot assume a reduced access to memory reserves is sufficient for
- * !MMU
- */
-#ifdef CONFIG_MMU
-#define ALLOC_OOM		0x08
-#else
-#define ALLOC_OOM		ALLOC_NO_WATERMARKS
-#endif
-
-#define ALLOC_NON_BLOCK		 0x10 /* Caller cannot block. Allow access
-				       * to 25% of the min watermark or
-				       * 62.5% if __GFP_HIGH is set.
-				       */
-#define ALLOC_MIN_RESERVE	 0x20 /* __GFP_HIGH set. Allow access to 50%
-				       * of the min watermark.
-				       */
-#define ALLOC_CPUSET		 0x40 /* check for correct cpuset */
-#define ALLOC_CMA		 0x80 /* allow allocations from CMA areas */
-#ifdef CONFIG_ZONE_DMA32
-#define ALLOC_NOFRAGMENT	0x100 /* avoid mixing pageblock types */
-#else
-#define ALLOC_NOFRAGMENT	  0x0
-#endif
-#define ALLOC_HIGHATOMIC	0x200 /* Allows access to MIGRATE_HIGHATOMIC */
-#define ALLOC_NOLOCK		0x400 /* Only use spin_trylock in allocation path */
-#define ALLOC_KSWAPD		0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */
-
-/* Flags that allow allocations below the min watermark. */
-#define ALLOC_RESERVES (ALLOC_NON_BLOCK|ALLOC_MIN_RESERVE|ALLOC_HIGHATOMIC|ALLOC_OOM)
 
 enum ttu_flags;
 struct tlbflush_unmap_batch;
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 617bca76db49b..58e14d1543ecb 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -26,6 +26,7 @@
 
 #include <asm/tlb.h>
 #include "internal.h"
+#include "page_alloc.h"
 #include "mm_slot.h"
 
 enum scan_result {
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a09d85142da46..49edc37ad4324 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -66,6 +66,7 @@
 #include <trace/events/memory-failure.h>
 
 #include "swap.h"
+#include "page_alloc.h"
 #include "internal.h"
 
 static int sysctl_memory_failure_early_kill __read_mostly;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7ac19fab22632..9539e40c478ed 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -40,6 +40,7 @@
 #include <asm/tlbflush.h>
 
 #include "internal.h"
+#include "page_alloc.h"
 #include "shuffle.h"
 
 enum {
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 36699fabd3c22..9c740324f9160 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -119,6 +119,7 @@
 #include <linux/memory.h>
 
 #include "internal.h"
+#include "page_alloc.h"
 
 /* Internal flags */
 #define MPOL_MF_DISCONTIG_OK (MPOL_MF_INTERNAL << 0)	/* Skip checks for continuous vmas */
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 4026b084bd4bf..32593cca124f8 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -33,6 +33,7 @@
 #include <linux/kexec_handover.h>
 #include <linux/hugetlb.h>
 #include "internal.h"
+#include "page_alloc.h"
 #include "slab.h"
 #include "shuffle.h"
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6010693861ec2..a3ba63c7f9199 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -56,6 +56,7 @@
 #include <linux/pgalloc_tag.h>
 #include <asm/div64.h>
 #include "internal.h"
+#include "page_alloc.h"
 #include "shuffle.h"
 #include "page_reporting.h"
 
diff --git a/mm/page_alloc.h b/mm/page_alloc.h
new file mode 100644
index 0000000000000..3250d44f96457
--- /dev/null
+++ b/mm/page_alloc.h
@@ -0,0 +1,269 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * mm-internal API for the page (buddy) allocator. Public API lives in
+ * include/linux/gfp.h.
+ */
+#ifndef __MM_PAGE_ALLOC_H
+#define __MM_PAGE_ALLOC_H
+
+#include <linux/mm.h>
+#include <linux/mmzone.h>
+#include <linux/nodemask.h>
+#include <linux/types.h>
+
+/* The ALLOC_WMARK bits are used as an index to zone->watermark */
+#define ALLOC_WMARK_MIN		WMARK_MIN
+#define ALLOC_WMARK_LOW		WMARK_LOW
+#define ALLOC_WMARK_HIGH	WMARK_HIGH
+#define ALLOC_NO_WATERMARKS	0x04 /* don't check watermarks at all */
+
+/* Mask to get the watermark bits */
+#define ALLOC_WMARK_MASK	(ALLOC_NO_WATERMARKS-1)
+
+/*
+ * Only MMU archs have async oom victim reclaim - aka oom_reaper so we
+ * cannot assume a reduced access to memory reserves is sufficient for
+ * !MMU
+ */
+#ifdef CONFIG_MMU
+#define ALLOC_OOM		0x08
+#else
+#define ALLOC_OOM		ALLOC_NO_WATERMARKS
+#endif
+
+#define ALLOC_NON_BLOCK		 0x10 /* Caller cannot block. Allow access
+				       * to 25% of the min watermark or
+				       * 62.5% if __GFP_HIGH is set.
+				       */
+#define ALLOC_MIN_RESERVE	 0x20 /* __GFP_HIGH set. Allow access to 50%
+				       * of the min watermark.
+				       */
+#define ALLOC_CPUSET		 0x40 /* check for correct cpuset */
+#define ALLOC_CMA		 0x80 /* allow allocations from CMA areas */
+#ifdef CONFIG_ZONE_DMA32
+#define ALLOC_NOFRAGMENT	0x100 /* avoid mixing pageblock types */
+#else
+#define ALLOC_NOFRAGMENT	  0x0
+#endif
+#define ALLOC_HIGHATOMIC	0x200 /* Allows access to MIGRATE_HIGHATOMIC */
+#define ALLOC_NOLOCK		0x400 /* Only use spin_trylock in allocation path */
+#define ALLOC_KSWAPD		0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */
+
+/* Flags that allow allocations below the min watermark. */
+#define ALLOC_RESERVES (ALLOC_NON_BLOCK|ALLOC_MIN_RESERVE|ALLOC_HIGHATOMIC|ALLOC_OOM)
+
+/*
+ * Structure for holding the mostly immutable allocation parameters passed
+ * between functions involved in allocations, including the alloc_pages*
+ * family of functions.
+ *
+ * nodemask, migratetype and highest_zoneidx are initialized only once in
+ * __alloc_pages() and then never change.
+ *
+ * zonelist, preferred_zone and highest_zoneidx are set first in
+ * __alloc_pages() for the fast path, and might be later changed
+ * in __alloc_pages_slowpath(). All other functions pass the whole structure
+ * by a const pointer.
+ */
+struct alloc_context {
+	struct zonelist *zonelist;
+	const nodemask_t *nodemask;
+	struct zoneref *preferred_zoneref;
+	int migratetype;
+
+	/*
+	 * highest_zoneidx represents highest usable zone index of
+	 * the allocation request. Due to the nature of the zone,
+	 * memory on lower zone than the highest_zoneidx will be
+	 * protected by lowmem_reserve[highest_zoneidx].
+	 *
+	 * highest_zoneidx is also used by reclaim/compaction to limit
+	 * the target zone since higher zone than this index cannot be
+	 * usable for this allocation request.
+	 */
+	enum zone_type highest_zoneidx;
+	bool spread_dirty_pages;
+};
+
+/*
+ * This function returns the order of a free page in the buddy system. In
+ * general, page_zone(page)->lock must be held by the caller to prevent the
+ * page from being allocated in parallel and returning garbage as the order.
+ * If a caller does not hold page_zone(page)->lock, it must guarantee that the
+ * page cannot be allocated or merged in parallel. Alternatively, it must
+ * handle invalid values gracefully, and use buddy_order_unsafe() below.
+ */
+static inline unsigned int buddy_order(struct page *page)
+{
+	/* PageBuddy() must be checked by the caller */
+	return page_private(page);
+}
+
+/*
+ * Like buddy_order(), but for callers who cannot afford to hold the zone lock.
+ * PageBuddy() should be checked first by the caller to minimize race window,
+ * and invalid values must be handled gracefully.
+ *
+ * READ_ONCE is used so that if the caller assigns the result into a local
+ * variable and e.g. tests it for valid range before using, the compiler cannot
+ * decide to remove the variable and inline the page_private(page) multiple
+ * times, potentially observing different values in the tests and the actual
+ * use of the result.
+ */
+#define buddy_order_unsafe(page)	READ_ONCE(page_private(page))
+
+/*
+ * This function checks whether a page is free && is the buddy
+ * we can coalesce a page and its buddy if
+ * (a) the buddy is not in a hole (check before calling!) &&
+ * (b) the buddy is in the buddy system &&
+ * (c) a page and its buddy have the same order &&
+ * (d) a page and its buddy are in the same zone.
+ *
+ * For recording whether a page is in the buddy system, we set PageBuddy.
+ * Setting, clearing, and testing PageBuddy is serialized by zone->lock.
+ *
+ * For recording page's order, we use page_private(page).
+ */
+static inline bool page_is_buddy(struct page *page, struct page *buddy,
+				 unsigned int order)
+{
+	if (!page_is_guard(buddy) && !PageBuddy(buddy))
+		return false;
+
+	if (buddy_order(buddy) != order)
+		return false;
+
+	/*
+	 * zone check is done late to avoid uselessly calculating
+	 * zone/node ids for pages that could never merge.
+	 */
+	if (page_zone_id(page) != page_zone_id(buddy))
+		return false;
+
+	VM_BUG_ON_PAGE(page_count(buddy) != 0, buddy);
+
+	return true;
+}
+
+/*
+ * Locate the struct page for both the matching buddy in our
+ * pair (buddy1) and the combined O(n+1) page they form (page).
+ *
+ * 1) Any buddy B1 will have an order O twin B2 which satisfies
+ * the following equation:
+ *     B2 = B1 ^ (1 << O)
+ * For example, if the starting buddy (buddy2) is #8 its order
+ * 1 buddy is #10:
+ *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
+ *
+ * 2) Any buddy B will have an order O+1 parent P which
+ * satisfies the following equation:
+ *     P = B & ~(1 << O)
+ *
+ * Assumption: *_mem_map is contiguous at least up to MAX_PAGE_ORDER
+ */
+static inline unsigned long
+__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
+{
+	return page_pfn ^ (1 << order);
+}
+
+/*
+ * Find the buddy of @page and validate it.
+ * @page: The input page
+ * @pfn: The pfn of the page, it saves a call to page_to_pfn() when the
+ *       function is used in the performance-critical __free_one_page().
+ * @order: The order of the page
+ * @buddy_pfn: The output pointer to the buddy pfn, it also saves a call to
+ *             page_to_pfn().
+ *
+ * The found buddy can be a non PageBuddy, out of @page's zone, or its order is
+ * not the same as @page. The validation is necessary before use it.
+ *
+ * Return: the found buddy page or NULL if not found.
+ */
+static inline struct page *find_buddy_page_pfn(struct page *page,
+			unsigned long pfn, unsigned int order, unsigned long *buddy_pfn)
+{
+	unsigned long __buddy_pfn = __find_buddy_pfn(pfn, order);
+	struct page *buddy;
+
+	buddy = page + (__buddy_pfn - pfn);
+	if (buddy_pfn)
+		*buddy_pfn = __buddy_pfn;
+
+	if (page_is_buddy(page, buddy, order))
+		return buddy;
+	return NULL;
+}
+
+extern struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
+				unsigned long end_pfn, struct zone *zone);
+
+static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
+				unsigned long end_pfn, struct zone *zone)
+{
+	if (zone->contiguous)
+		return pfn_to_page(start_pfn);
+
+	return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
+}
+
+extern void __free_pages_core(struct page *page, unsigned int order,
+		enum meminit_context context);
+
+void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags);
+extern bool free_pages_prepare(struct page *page, unsigned int order);
+
+extern int user_min_free_kbytes;
+
+struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order, int nid,
+		nodemask_t *nodemask);
+#define __alloc_frozen_pages(...) \
+	alloc_hooks(__alloc_frozen_pages_noprof(__VA_ARGS__))
+void free_frozen_pages(struct page *page, unsigned int order);
+void free_unref_folios(struct folio_batch *fbatch);
+
+#ifdef CONFIG_NUMA
+struct page *alloc_frozen_pages_noprof(gfp_t, unsigned int order);
+#else
+static inline struct page *alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order)
+{
+	return __alloc_frozen_pages_noprof(gfp, order, numa_node_id(), NULL);
+}
+#endif
+
+#define alloc_frozen_pages(...) \
+	alloc_hooks(alloc_frozen_pages_noprof(__VA_ARGS__))
+
+struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned int order);
+#define alloc_frozen_pages_nolock(...) \
+	alloc_hooks(alloc_frozen_pages_nolock_noprof(__VA_ARGS__))
+void free_frozen_pages_nolock(struct page *page, unsigned int order);
+
+extern void zone_pcp_reset(struct zone *zone);
+extern void zone_pcp_disable(struct zone *zone);
+extern void zone_pcp_enable(struct zone *zone);
+extern void zone_pcp_init(struct zone *zone);
+
+enum fallback_result {
+	/* Found suitable migratetype, *mt_out is valid. */
+	FALLBACK_FOUND,
+	/* No fallback found in requested order. */
+	FALLBACK_EMPTY,
+	/* Passed @claimable, but claiming whole block is a bad idea. */
+	FALLBACK_NOCLAIM,
+};
+enum fallback_result
+find_suitable_fallback(struct free_area *area, unsigned int order,
+		       int migratetype, bool claimable, int *mt_out);
+
+static inline bool free_area_empty(struct free_area *area, int migratetype)
+{
+	return list_empty(&area->free_list[migratetype]);
+}
+
+void page_alloc_sysctl_init(void);
+
+#endif /* __MM_PAGE_ALLOC_H */
diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
index d2423f30577e4..a1077cef3a791 100644
--- a/mm/page_frag_cache.c
+++ b/mm/page_frag_cache.c
@@ -18,7 +18,7 @@
 #include <linux/init.h>
 #include <linux/mm.h>
 #include <linux/page_frag_cache.h>
-#include "internal.h"
+#include "page_alloc.h"
 
 static unsigned long encoded_page_create(struct page *page, unsigned int order,
 					 bool pfmemalloc)
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 32ce8a7d9df35..e5dfc7bf49446 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -11,6 +11,7 @@
 #include <linux/page_owner.h>
 #include <linux/migrate.h>
 #include "internal.h"
+#include "page_alloc.h"
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/page_isolation.h>
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 74a844a86441e..6f580a64bdba3 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -13,7 +13,7 @@
 #include <linux/memcontrol.h>
 #include <linux/sched/clock.h>
 
-#include "internal.h"
+#include "page_alloc.h"
 
 /*
  * TODO: teach PAGE_OWNER_STACK_DEPTH (__dump_page_owner and save_stack)
diff --git a/mm/show_mem.c b/mm/show_mem.c
index 1b721a8ade67d..d1288b4c2b640 100644
--- a/mm/show_mem.c
+++ b/mm/show_mem.c
@@ -16,6 +16,7 @@
 #include <linux/vmstat.h>
 
 #include "internal.h"
+#include "page_alloc.h"
 #include "swap.h"
 
 atomic_long_t _totalram_pages __read_mostly;
diff --git a/mm/slub.c b/mm/slub.c
index 9ec774dc70096..877021e69cc41 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -53,6 +53,7 @@
 #include <trace/events/kmem.h>
 
 #include "internal.h"
+#include "page_alloc.h"
 
 /*
  * Lock order:
diff --git a/mm/swap.c b/mm/swap.c
index 0132ed0fb76b6..5e389bcc073a9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -39,6 +39,7 @@
 #include <linux/buffer_head.h>
 
 #include "internal.h"
+#include "page_alloc.h"
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/pagemap.h>
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 754c5f5d716aa..de1879db39160 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -66,6 +66,7 @@
 #include <linux/sched/sysctl.h>
 
 #include "internal.h"
+#include "page_alloc.h"
 #include "swap.h"
 
 #define CREATE_TRACE_POINTS

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof()
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (3 preceding siblings ...)
  2026-06-29 13:11 ` [PATCH v3 04/16] mm: Split out internal page_alloc.h Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-30 13:36   ` Harry Yoo
  2026-06-30 16:16   ` Vlastimil Babka (SUSE)
  2026-06-29 13:11 ` [PATCH v3 06/16] mm/page_alloc: relax GFP WARN in nolock allocs Brendan Jackman
                   ` (11 subsequent siblings)
  16 siblings, 2 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

Currently the core allocator code is controlled by ALLOC_NOLOCK, but the
main entry point function is significantly different from the normal
__alloc_frozen_pages_nolock(), this is tiring when reading the code.

Plumb the ALLOC_NOLOCK control one layer up in the call stack: create
an alloc_flags argument to __alloc_frozen_pages_nolock() (which is only
exposed to mm/) and then turn the nolock variant into a thin wrapper
that just sets that flag (as well as handling NUMA_NO_NODE, similar to
how some of the wrappers in gfp.h do).

Rationale that this doesn't change anything:

1. Simple bits: A bunch of the nolock-specific handling is just moved to
   the new alloc_order_allowed(), alloc_trylock_allowed() and
   gfp_trylock.

2. __alloc_frozen_pages_noprof() has some extra logic that wasn't
   previously in the nolock variant:

   a. Application of gfp_allowed_mask; this only affects early boot, and
      only flags that affect the slowpath get changed here.

   b. Application of current_gfp_context() - also only affects the
      slowpath

3. The slowpath itself: this is now just explicitly skipped under
   !ALLOC_TRYLOCK.

Ulterior motive: adding an alloc_flags arg to the allocator's
mm-internal entrypoint can later be used to do more allocation
customisation without needing to create new GFP flags.

While adding this flag to a bunch of places, create ALLOC_DEFAULT to
avoid a mysterious literal 0 in most places. alloc_frozen_pages_noprof()
is defined above the alloc flags so just leave that as a slightly messy
exception instead of trying to fully reorder mm/internal.h for that one
case.

No functional change intended.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 mm/hugetlb.c    |   3 +-
 mm/mempolicy.c  |  10 ++--
 mm/page_alloc.c | 178 +++++++++++++++++++++++++++++---------------------------
 mm/page_alloc.h |   6 +-
 mm/slub.c       |   6 +-
 5 files changed, 108 insertions(+), 95 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f7925624c4d2e..dfcfcfa4715bf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1806,7 +1806,8 @@ static struct folio *alloc_buddy_frozen_folio(int order, gfp_t gfp_mask,
 	if (alloc_try_hard)
 		gfp_mask |= __GFP_RETRY_MAYFAIL;
 
-	folio = (struct folio *)__alloc_frozen_pages(gfp_mask, order, nid, nmask);
+	folio = (struct folio *)__alloc_frozen_pages(gfp_mask, order, nid, nmask,
+						     ALLOC_DEFAULT);
 
 	/*
 	 * If we did not specify __GFP_RETRY_MAYFAIL, but still got a
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 9c740324f9160..41d630f0ea821 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2426,9 +2426,11 @@ static struct page *alloc_pages_preferred_many(gfp_t gfp, unsigned int order,
 	 */
 	preferred_gfp = gfp | __GFP_NOWARN;
 	preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
-	page = __alloc_frozen_pages_noprof(preferred_gfp, order, nid, nodemask);
+	page = __alloc_frozen_pages_noprof(preferred_gfp, order, nid, nodemask,
+					   ALLOC_DEFAULT);
 	if (!page)
-		page = __alloc_frozen_pages_noprof(gfp, order, nid, NULL);
+		page = __alloc_frozen_pages_noprof(gfp, order, nid, NULL,
+						   ALLOC_DEFAULT);
 
 	return page;
 }
@@ -2476,7 +2478,7 @@ static struct page *alloc_pages_mpol(gfp_t gfp, unsigned int order,
 			 */
 			page = __alloc_frozen_pages_noprof(
 				gfp | __GFP_THISNODE | __GFP_NORETRY, order,
-				nid, NULL);
+				nid, NULL, ALLOC_DEFAULT);
 			if (page || !(gfp & __GFP_DIRECT_RECLAIM))
 				return page;
 			/*
@@ -2488,7 +2490,7 @@ static struct page *alloc_pages_mpol(gfp_t gfp, unsigned int order,
 		}
 	}
 
-	page = __alloc_frozen_pages_noprof(gfp, order, nid, nodemask);
+	page = __alloc_frozen_pages_noprof(gfp, order, nid, nodemask, ALLOC_DEFAULT);
 
 	if (unlikely(pol->mode == MPOL_INTERLEAVE ||
 		     pol->mode == MPOL_WEIGHTED_INTERLEAVE) && page) {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a3ba63c7f9199..8d409d075e3e9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5222,7 +5222,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid,
 		}
 		nr_account++;
 
-		prep_new_page(page, 0, gfp, 0);
+		prep_new_page(page, 0, gfp, ALLOC_DEFAULT);
 		set_page_refcounted(page);
 		page_array[nr_populated++] = page;
 	}
@@ -5271,24 +5271,98 @@ void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
 	}
 }
 
-/*
- * This is the 'heart' of the zoned buddy allocator.
- */
-struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
-		int preferred_nid, nodemask_t *nodemask)
+static inline bool alloc_order_allowed(gfp_t gfp, unsigned int order,
+				       unsigned int alloc_flags)
 {
-	struct page *page;
-	unsigned int fastpath_alloc_flags = ALLOC_WMARK_LOW;
-	gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
-	struct alloc_context ac = { };
+	if (alloc_flags & ALLOC_NOLOCK)
+		return pcp_allowed_order(order);
 
 	/*
 	 * There are several places where we assume that the order value is sane
 	 * so bail out early if the request is out of bound.
 	 */
-	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
+	return !(WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp));
+}
+
+static inline bool alloc_trylock_allowed(void)
+{
+	/*
+	 * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
+	 * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
+	 * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
+	 * mark the task as the owner of another rt_spin_lock which will
+	 * confuse PI logic, so return immediately if called from hard IRQ or
+	 * NMI.
+	 *
+	 * Note, irqs_disabled() case is ok. This function can be called
+	 * from raw_spin_lock_irqsave region.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
+		return false;
+
+	/* On UP, spin_trylock() always succeeds even when it is locked */
+	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
+		return false;
+
+	/* Bailout, since _deferred_grow_zone() needs to take a lock */
+	if (deferred_pages_enabled())
+		return false;
+
+	return true;
+}
+
+/*
+ * GFP flags to set for ALLOC_NOLOCK i.e. alloc_pages_nolock().
+ *
+ * Do not specify __GFP_DIRECT_RECLAIM, since direct claim is not allowed.
+ * Do not specify __GFP_KSWAPD_RECLAIM either, since wake up of kswapd
+ * is not safe in arbitrary context.
+ *
+ * These two are the conditions for gfpflags_allow_spinning() being true.
+ *
+ * Specify __GFP_NOWARN since failing alloc_pages_nolock() is not a reason
+ * to warn. Also warn would trigger printk() which is unsafe from
+ * various contexts. We cannot use printk_deferred_enter() to mitigate,
+ * since the running context is unknown.
+ *
+ * Specify __GFP_ZERO to make sure that call to kmsan_alloc_page() below
+ * is safe in any context. Also zeroing the page is mandatory for
+ * BPF use cases.
+ *
+ * Though __GFP_NOMEMALLOC is not checked in the code path below,
+ * specify it here to highlight that alloc_pages_nolock()
+ * doesn't want to deplete reserves.
+ */
+static const gfp_t gfp_nolock = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC |
+				__GFP_COMP;
+
+/*
+ * This is the 'heart' of the zoned buddy allocator.
+ */
+struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
+		int preferred_nid, nodemask_t *nodemask, unsigned int alloc_flags)
+{
+	struct page *page;
+	gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
+	struct alloc_context ac = { };
+	unsigned int fastpath_alloc_flags = alloc_flags;
+
+	/* Other flags could be supported later if needed. */
+	if (WARN_ON(alloc_flags & ~ALLOC_NOLOCK))
 		return NULL;
 
+	if (!alloc_order_allowed(gfp, order, alloc_flags))
+		return NULL;
+
+	if (alloc_flags & ALLOC_NOLOCK) {
+		VM_WARN_ON_ONCE(gfp & ~__GFP_ACCOUNT);
+		if (!alloc_trylock_allowed())
+			return NULL;
+		gfp |= gfp_nolock;
+	} else {
+		fastpath_alloc_flags |= ALLOC_WMARK_LOW;
+	}
+
 	gfp &= gfp_allowed_mask;
 	/*
 	 * Apply scoped allocation constraints. This is mainly about GFP_NOFS
@@ -5310,9 +5384,9 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
 	fastpath_alloc_flags |= alloc_flags_nofragment(zonelist_zone(ac.preferred_zoneref), gfp);
 	fastpath_alloc_flags |= alloc_flags_nonblocking(gfp, order) & ALLOC_HIGHATOMIC;
 
-	/* First allocation attempt */
+	/* First allocation attempt (or, for nolock, only attempt) */
 	page = get_page_from_freelist(alloc_gfp, order, fastpath_alloc_flags, &ac);
-	if (likely(page))
+	if (likely(page) || (alloc_flags & ALLOC_NOLOCK))
 		goto out;
 
 	alloc_gfp = gfp;
@@ -5329,7 +5403,8 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
 out:
 	if (memcg_kmem_online() && (gfp & __GFP_ACCOUNT) && page &&
 	    unlikely(__memcg_kmem_charge_page(page, gfp, order) != 0)) {
-		free_frozen_pages(page, order);
+		__free_frozen_pages(page, order,
+				    alloc_flags & ALLOC_NOLOCK ? FPI_TRYLOCK : 0);
 		page = NULL;
 	}
 
@@ -5345,7 +5420,8 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
 {
 	struct page *page;
 
-	page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid, nodemask);
+	page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid, nodemask,
+					   ALLOC_DEFAULT);
 	if (page)
 		set_page_refcounted(page);
 	return page;
@@ -7875,80 +7951,10 @@ static bool __free_unaccepted(struct page *page)
 
 struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned int order)
 {
-	/*
-	 * Do not specify __GFP_DIRECT_RECLAIM, since direct claim is not allowed.
-	 * Do not specify __GFP_KSWAPD_RECLAIM either, since wake up of kswapd
-	 * is not safe in arbitrary context.
-	 *
-	 * These two are the conditions for gfpflags_allow_spinning() being true.
-	 *
-	 * Specify __GFP_NOWARN since failing alloc_pages_nolock() is not a reason
-	 * to warn. Also warn would trigger printk() which is unsafe from
-	 * various contexts. We cannot use printk_deferred_enter() to mitigate,
-	 * since the running context is unknown.
-	 *
-	 * Specify __GFP_ZERO to make sure that call to kmsan_alloc_page() below
-	 * is safe in any context. Also zeroing the page is mandatory for
-	 * BPF use cases.
-	 *
-	 * Though __GFP_NOMEMALLOC is not checked in the code path below,
-	 * specify it here to highlight that alloc_pages_nolock()
-	 * doesn't want to deplete reserves.
-	 */
-	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC | __GFP_COMP
-			| gfp_flags;
-	unsigned int alloc_flags = ALLOC_NOLOCK;
-	struct alloc_context ac = { };
-	struct page *page;
-
-	VM_WARN_ON_ONCE(gfp_flags & ~__GFP_ACCOUNT);
-	/*
-	 * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
-	 * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
-	 * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
-	 * mark the task as the owner of another rt_spin_lock which will
-	 * confuse PI logic, so return immediately if called from hard IRQ or
-	 * NMI.
-	 *
-	 * Note, irqs_disabled() case is ok. This function can be called
-	 * from raw_spin_lock_irqsave region.
-	 */
-	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
-		return NULL;
-
-	/* On UP, spin_trylock() always succeeds even when it is locked */
-	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
-		return NULL;
-
-	if (!pcp_allowed_order(order))
-		return NULL;
-
-	/* Bailout, since _deferred_grow_zone() needs to take a lock */
-	if (deferred_pages_enabled())
-		return NULL;
-
 	if (nid == NUMA_NO_NODE)
 		nid = numa_node_id();
 
-	prepare_alloc_pages(alloc_gfp, order, nid, NULL, &ac,
-			    &alloc_gfp, &alloc_flags);
-
-	/*
-	 * Best effort allocation from percpu free list.
-	 * If it's empty attempt to spin_trylock zone->lock.
-	 */
-	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
-
-	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
-
-	if (memcg_kmem_online() && page && (gfp_flags & __GFP_ACCOUNT) &&
-	    unlikely(__memcg_kmem_charge_page(page, alloc_gfp, order) != 0)) {
-		__free_frozen_pages(page, order, FPI_TRYLOCK);
-		page = NULL;
-	}
-	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
-	kmsan_alloc_page(page, order, alloc_gfp);
-	return page;
+	return __alloc_frozen_pages_noprof(gfp_flags, order, nid, NULL, ALLOC_NOLOCK);
 }
 /**
  * alloc_pages_nolock - opportunistic reentrant allocation from any context
diff --git a/mm/page_alloc.h b/mm/page_alloc.h
index 3250d44f96457..e16f905f859a7 100644
--- a/mm/page_alloc.h
+++ b/mm/page_alloc.h
@@ -11,6 +11,7 @@
 #include <linux/nodemask.h>
 #include <linux/types.h>
 
+#define ALLOC_DEFAULT		0
 /* The ALLOC_WMARK bits are used as an index to zone->watermark */
 #define ALLOC_WMARK_MIN		WMARK_MIN
 #define ALLOC_WMARK_LOW		WMARK_LOW
@@ -219,7 +220,7 @@ extern bool free_pages_prepare(struct page *page, unsigned int order);
 extern int user_min_free_kbytes;
 
 struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order, int nid,
-		nodemask_t *nodemask);
+		nodemask_t *nodemask, unsigned int alloc_flags);
 #define __alloc_frozen_pages(...) \
 	alloc_hooks(__alloc_frozen_pages_noprof(__VA_ARGS__))
 void free_frozen_pages(struct page *page, unsigned int order);
@@ -230,7 +231,8 @@ struct page *alloc_frozen_pages_noprof(gfp_t, unsigned int order);
 #else
 static inline struct page *alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order)
 {
-	return __alloc_frozen_pages_noprof(gfp, order, numa_node_id(), NULL);
+	return __alloc_frozen_pages_noprof(gfp, order, numa_node_id(), NULL,
+					   0 /* ALLOC_DEFAULT */);
 }
 #endif
 
diff --git a/mm/slub.c b/mm/slub.c
index 877021e69cc41..3989b4758ae0a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3292,7 +3292,8 @@ static inline struct slab *alloc_slab_page(gfp_t flags, int node,
 	else if (node == NUMA_NO_NODE)
 		page = alloc_frozen_pages(flags, order);
 	else
-		page = __alloc_frozen_pages(flags, order, node, NULL);
+		page = __alloc_frozen_pages(flags, order, node, NULL,
+					    ALLOC_DEFAULT);
 
 	if (!page)
 		return NULL;
@@ -5302,7 +5303,8 @@ static void *___kmalloc_large_node(size_t size, gfp_t flags, int node)
 	if (node == NUMA_NO_NODE)
 		page = alloc_frozen_pages_noprof(flags, order);
 	else
-		page = __alloc_frozen_pages_noprof(flags, order, node, NULL);
+		page = __alloc_frozen_pages_noprof(flags, order, node, NULL,
+						   ALLOC_DEFAULT);
 
 	if (page) {
 		ptr = page_address(page);

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 06/16] mm/page_alloc: relax GFP WARN in nolock allocs
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (4 preceding siblings ...)
  2026-06-29 13:11 ` [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof() Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-30 13:52   ` Harry Yoo
  2026-06-30 16:42   ` Vlastimil Babka (SUSE)
  2026-06-29 13:11 ` [PATCH v3 07/16] mm: move some stuff to mm/page_alloc.h Brendan Jackman
                   ` (10 subsequent siblings)
  16 siblings, 2 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

This WARN forbids setting other flags than __GFP_ACCOUNT but we
unconditionally set the ones in gfp_nolock so they are certainly fine
for the caller to set.

There are other GFP flags that are almost certainly fine to set here;
Willy noted GFP_HIGHMEM, GFP_DMA, GFP_MOVABLE and GFP_HARDWALL. But,
nolock allocation is rather special, so be conservative to try and
ensure we have a chance to think carefully before nontrivial new
usecases arise.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/linux-mm/ajS96fWbG4dzP3u3@casper.infradead.org/
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 mm/page_alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8d409d075e3e9..9cb3f1665b41b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5355,7 +5355,8 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
 		return NULL;
 
 	if (alloc_flags & ALLOC_NOLOCK) {
-		VM_WARN_ON_ONCE(gfp & ~__GFP_ACCOUNT);
+		/* Certain other flags could be supported later if needed. */
+		VM_WARN_ON_ONCE(gfp & ~(__GFP_ACCOUNT | gfp_nolock));
 		if (!alloc_trylock_allowed())
 			return NULL;
 		gfp |= gfp_nolock;

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 07/16] mm: move some stuff to mm/page_alloc.h
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (5 preceding siblings ...)
  2026-06-29 13:11 ` [PATCH v3 06/16] mm/page_alloc: relax GFP WARN in nolock allocs Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-30 16:42   ` Vlastimil Babka (SUSE)
  2026-06-29 13:11 ` [PATCH v3 08/16] perf/x86/intel: Use higher-level allocator API Brendan Jackman
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

Some of this stuff in the public header is only used internally so
shrink the scope to avoid silently growing new users.

drain_local_pages() is still used from kernel/power/snapshot.c so that
needs to stay behind.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 include/linux/gfp.h | 26 --------------------------
 mm/page_alloc.h     | 28 ++++++++++++++++++++++++++++
 mm/vmstat.c         |  1 +
 3 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index cdf95a9f0b87c..01d6d2591f49e 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -17,28 +17,6 @@ struct mempolicy;
 #define __default_gfp(a,b,...) b
 #define default_gfp(...) __default_gfp(,##__VA_ARGS__,GFP_KERNEL)
 
-/* Convert GFP flags to their corresponding migrate type */
-#define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
-#define GFP_MOVABLE_SHIFT 3
-
-static inline int gfp_migratetype(const gfp_t gfp_flags)
-{
-	VM_WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
-	BUILD_BUG_ON((1UL << GFP_MOVABLE_SHIFT) != ___GFP_MOVABLE);
-	BUILD_BUG_ON((___GFP_MOVABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_MOVABLE);
-	BUILD_BUG_ON((___GFP_RECLAIMABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_RECLAIMABLE);
-	BUILD_BUG_ON(((___GFP_MOVABLE | ___GFP_RECLAIMABLE) >>
-		      GFP_MOVABLE_SHIFT) != MIGRATE_HIGHATOMIC);
-
-	if (unlikely(page_group_by_mobility_disabled))
-		return MIGRATE_UNMOVABLE;
-
-	/* Group based on mobility */
-	return (__force unsigned long)(gfp_flags & GFP_MOVABLE_MASK) >> GFP_MOVABLE_SHIFT;
-}
-#undef GFP_MOVABLE_MASK
-#undef GFP_MOVABLE_SHIFT
-
 static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
 {
 	return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
@@ -395,10 +373,6 @@ extern void free_pages(unsigned long addr, unsigned int order);
 #define __free_page(page) __free_pages((page), 0)
 #define free_page(addr) free_pages((addr), 0)
 
-void page_alloc_init_cpuhp(void);
-bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp);
-void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
-void drain_all_pages(struct zone *zone);
 void drain_local_pages(struct zone *zone);
 
 void page_alloc_init_late(void);
diff --git a/mm/page_alloc.h b/mm/page_alloc.h
index e16f905f859a7..af83764788b96 100644
--- a/mm/page_alloc.h
+++ b/mm/page_alloc.h
@@ -266,6 +266,34 @@ static inline bool free_area_empty(struct free_area *area, int migratetype)
 	return list_empty(&area->free_list[migratetype]);
 }
 
+/* Convert GFP flags to their corresponding migrate type */
+#define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
+#define GFP_MOVABLE_SHIFT 3
+
+static inline int gfp_migratetype(const gfp_t gfp_flags)
+{
+	VM_WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
+	BUILD_BUG_ON((1UL << GFP_MOVABLE_SHIFT) != ___GFP_MOVABLE);
+	BUILD_BUG_ON((___GFP_MOVABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_MOVABLE);
+	BUILD_BUG_ON((___GFP_RECLAIMABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_RECLAIMABLE);
+	BUILD_BUG_ON(((___GFP_MOVABLE | ___GFP_RECLAIMABLE) >>
+		      GFP_MOVABLE_SHIFT) != MIGRATE_HIGHATOMIC);
+
+	if (unlikely(page_group_by_mobility_disabled))
+		return MIGRATE_UNMOVABLE;
+
+	/* Group based on mobility */
+	return (__force unsigned long)(gfp_flags & GFP_MOVABLE_MASK) >> GFP_MOVABLE_SHIFT;
+}
+#undef GFP_MOVABLE_MASK
+#undef GFP_MOVABLE_SHIFT
+
+bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp);
+void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
+void drain_all_pages(struct zone *zone);
+void drain_local_pages(struct zone *zone);
+
+void page_alloc_init_cpuhp(void);
 void page_alloc_sysctl_init(void);
 
 #endif /* __MM_PAGE_ALLOC_H */
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 7b93fbf9af092..3b5cb1031f720 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -30,6 +30,7 @@
 #include <linux/sched/isolation.h>
 
 #include "internal.h"
+#include "page_alloc.h"
 
 #ifdef CONFIG_PROC_FS
 #ifdef CONFIG_NUMA

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 08/16] perf/x86/intel: Use higher-level allocator API
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (6 preceding siblings ...)
  2026-06-29 13:11 ` [PATCH v3 07/16] mm: move some stuff to mm/page_alloc.h Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-29 13:11 ` [PATCH v3 09/16] KVM: VMX: " Brendan Jackman
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark

The difference between __alloc_pages_node() and alloc_pages_node() is
that the latter allows you to pass NUMA_NO_NODE.

The former is going away and the latter works fine here so switch over.

No functional change intended.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: James Clark <james.clark@linaro.org>
Assisted-by: Gemini:unknown-version
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 arch/x86/events/intel/ds.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 91a093d8cf2e7..70be80211d823 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -832,7 +832,7 @@ static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)
 	int node = cpu_to_node(cpu);
 	struct page *page;
 
-	page = __alloc_pages_node(node, flags | __GFP_ZERO, order);
+	page = alloc_pages_node(node, flags | __GFP_ZERO, order);
 	return page ? page_address(page) : NULL;
 }
 
@@ -1088,9 +1088,9 @@ void init_arch_pebs_on_cpu(int cpu)
 
 	/*
 	 * 4KB-aligned pointer of the output buffer
-	 * (__alloc_pages_node() return page aligned address)
+	 * (alloc_pages_node() returns page aligned address)
 	 * Buffer Size = 4KB * 2^SIZE
-	 * contiguous physical buffer (__alloc_pages_node() with order)
+	 * contiguous physical buffer (alloc_pages_node() with order)
 	 */
 	arch_pebs_base = virt_to_phys(cpuc->pebs_vaddr) | PEBS_BUFFER_SHIFT;
 	wrmsrq_on_cpu(cpu, MSR_IA32_PEBS_BASE, arch_pebs_base);

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 09/16] KVM: VMX: Use higher-level allocator API
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (7 preceding siblings ...)
  2026-06-29 13:11 ` [PATCH v3 08/16] perf/x86/intel: Use higher-level allocator API Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-29 15:31   ` -EXT-[PATCH " Soderlund, David
  2026-06-29 13:11 ` [PATCH v3 10/16] x86/virt: " Brendan Jackman
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman,
	Sean Christopherson, Paolo Bonzini, kvm

The difference between __alloc_pages_node() and alloc_pages_node() is
that the latter allows you to pass NUMA_NO_NODE.

The former is going away and the latter works fine here so switch over.

No functional change intended.

Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Assisted-by: Gemini:unknown-version
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2325be57d3d75..ad6a7fc6a54da 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3028,7 +3028,7 @@ struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags)
 	struct page *pages;
 	struct vmcs *vmcs;
 
-	pages = __alloc_pages_node(node, flags, 0);
+	pages = alloc_pages_node(node, flags, 0);
 	if (!pages)
 		return NULL;
 	vmcs = page_address(pages);

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 10/16] x86/virt: Use higher-level allocator API
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (8 preceding siblings ...)
  2026-06-29 13:11 ` [PATCH v3 09/16] KVM: VMX: " Brendan Jackman
@ 2026-06-29 13:11 ` Brendan Jackman
  2026-06-29 13:12 ` [PATCH v3 11/16] sgi-xp: " Brendan Jackman
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:11 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin

The difference between __alloc_pages_node() and alloc_pages_node() is
that the latter allows you to pass NUMA_NO_NODE.

The former is going away and the latter works fine here so switch over.

No functional change intended.

Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Assisted-by: Gemini:unknown-version
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 arch/x86/virt/hw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/virt/hw.c b/arch/x86/virt/hw.c
index 7e9091c640be0..a236447ac7a26 100644
--- a/arch/x86/virt/hw.c
+++ b/arch/x86/virt/hw.c
@@ -196,7 +196,7 @@ static __init int __x86_vmx_init(void)
 		struct page *page;
 		struct vmcs *vmcs;
 
-		page = __alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, 0);
+		page = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, 0);
 		if (WARN_ON_ONCE(!page)) {
 			x86_vmx_exit();
 			return -ENOMEM;

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 11/16] sgi-xp: Use higher-level allocator API
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (9 preceding siblings ...)
  2026-06-29 13:11 ` [PATCH v3 10/16] x86/virt: " Brendan Jackman
@ 2026-06-29 13:12 ` Brendan Jackman
  2026-06-29 18:47   ` Steve Wahl
  2026-06-29 13:12 ` [PATCH v3 12/16] net/funeth: Switch to " Brendan Jackman
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:12 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman, Robin Holt,
	Steve Wahl, Arnd Bergmann, Greg Kroah-Hartman

The difference between __alloc_pages_node() and alloc_pages_node() is
that the latter allows you to pass NUMA_NO_NODE.

The former is going away and the latter works fine here so switch over.

No functional change intended.

Cc: Robin Holt <robinmholt@gmail.com>
Cc: Steve Wahl <steve.wahl@hpe.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Assisted-by: Gemini:unknown-model
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 drivers/misc/sgi-xp/xpc_uv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/sgi-xp/xpc_uv.c b/drivers/misc/sgi-xp/xpc_uv.c
index 772c787268932..aacff70204241 100644
--- a/drivers/misc/sgi-xp/xpc_uv.c
+++ b/drivers/misc/sgi-xp/xpc_uv.c
@@ -170,7 +170,7 @@ xpc_create_gru_mq_uv(unsigned int mq_size, int cpu, char *irq_name,
 	mq->mmr_blade = uv_cpu_to_blade_id(cpu);
 
 	nid = cpu_to_node(cpu);
-	page = __alloc_pages_node(nid,
+	page = alloc_pages_node(nid,
 				      GFP_KERNEL | __GFP_ZERO | __GFP_THISNODE,
 				      pg_order);
 	if (page == NULL) {

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 12/16] net/funeth: Switch to higher-level allocator API
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (10 preceding siblings ...)
  2026-06-29 13:12 ` [PATCH v3 11/16] sgi-xp: " Brendan Jackman
@ 2026-06-29 13:12 ` Brendan Jackman
  2026-06-29 13:12 ` [PATCH v3 13/16] mm: Remove __alloc_pages_node() Brendan Jackman
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:12 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman,
	Dimitris Michailidis, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni

The difference between __alloc_pages_node() and alloc_pages_node() is
that the latter allows you to pass NUMA_NO_NODE.

The former is going away and the latter works fine here so switch over.

No functional change intended.

Cc: Dimitris Michailidis <dmichail@fungible.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Assisted-by: Gemini:unknown-version
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 drivers/net/ethernet/fungible/funeth/funeth_rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/fungible/funeth/funeth_rx.c b/drivers/net/ethernet/fungible/funeth/funeth_rx.c
index 7e2584895de39..d7000017ac2bd 100644
--- a/drivers/net/ethernet/fungible/funeth/funeth_rx.c
+++ b/drivers/net/ethernet/fungible/funeth/funeth_rx.c
@@ -103,7 +103,7 @@ static int funeth_alloc_page(struct funeth_rxq *q, struct funeth_rxbuf *rb,
 	if (cache_get(q, rb))
 		return 0;
 
-	p = __alloc_pages_node(node, gfp | __GFP_NOWARN, 0);
+	p = alloc_pages_node(node, gfp | __GFP_NOWARN, 0);
 	if (unlikely(!p))
 		return -ENOMEM;
 

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 13/16] mm: Remove __alloc_pages_node()
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (11 preceding siblings ...)
  2026-06-29 13:12 ` [PATCH v3 12/16] net/funeth: Switch to " Brendan Jackman
@ 2026-06-29 13:12 ` Brendan Jackman
  2026-06-29 13:12 ` [PATCH v3 14/16] mm: Move __alloc_pages() to mm/page_alloc.h Brendan Jackman
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:12 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

There were only a few users, which have been removed. The only advantage
of this API over alloc_pages_node() is avoiding a single conditional
branch. The disadvantages are:

1. More API surface, more sources of confusion, more maintenance.

2. Worse impact of CPU hotplug bugs: most users of __alloc_pages_node()
   were using the result of cpu_to_node(); if the CPU gets hotplugged
   out this will return NUMA_NO_NODE. If one of these paths fails to
   protect against a concurrent hotplug then page_alloc.c will use
   NUMA_NO_NODE as an index into NODE_DATA() and cause some horrible
   memory corruption or other. With alloc_pages_node(), the code might
   just work fine.

Ulterior motive: this frees up the __* variants of the allocator APIs to
serve specifically for use as mm-internal API.

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 include/linux/gfp.h | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 01d6d2591f49e..3bf55a5f9143e 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -256,21 +256,6 @@ static inline void warn_if_node_offline(int this_node, gfp_t gfp_mask)
 	dump_stack();
 }
 
-/*
- * Allocate pages, preferring the node given as nid. The node must be valid and
- * online. For more general interface, see alloc_pages_node().
- */
-static inline struct page *
-__alloc_pages_node_noprof(int nid, gfp_t gfp_mask, unsigned int order)
-{
-	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
-	warn_if_node_offline(nid, gfp_mask);
-
-	return __alloc_pages_noprof(gfp_mask, order, nid, NULL);
-}
-
-#define  __alloc_pages_node(...)		alloc_hooks(__alloc_pages_node_noprof(__VA_ARGS__))
-
 static inline
 struct folio *__folio_alloc_node_noprof(gfp_t gfp, unsigned int order, int nid)
 {
@@ -293,7 +278,10 @@ static inline struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask,
 	if (nid == NUMA_NO_NODE)
 		nid = numa_mem_id();
 
-	return __alloc_pages_node_noprof(nid, gfp_mask, order);
+	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
+	warn_if_node_offline(nid, gfp_mask);
+
+	return __alloc_pages_noprof(gfp_mask, order, nid, NULL);
 }
 
 #define  alloc_pages_node(...)			alloc_hooks(alloc_pages_node_noprof(__VA_ARGS__))

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 14/16] mm: Move __alloc_pages() to mm/page_alloc.h
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (12 preceding siblings ...)
  2026-06-29 13:12 ` [PATCH v3 13/16] mm: Remove __alloc_pages_node() Brendan Jackman
@ 2026-06-29 13:12 ` Brendan Jackman
  2026-06-29 13:12 ` [PATCH v3 15/16] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG Brendan Jackman
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:12 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

It's no longer used outside of mm/.

Since this means __alloc_pages_noprof() is no longer visible from gfp.h,
this also means moving the definition of alloc_pages_node_noprof into
the .c file.

Also remove references to this API from the documentation tree -
referring to the specific function name was already questionable but
now the function is not even public it definitely seems wrong.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 Documentation/admin-guide/cgroup-v1/cpusets.rst |  2 +-
 Documentation/admin-guide/mm/transhuge.rst      |  2 +-
 include/linux/gfp.h                             | 16 +---------------
 mm/page_alloc.c                                 | 13 ++++++++++++-
 mm/page_alloc.h                                 |  4 ++++
 5 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/cpusets.rst b/Documentation/admin-guide/cgroup-v1/cpusets.rst
index c7909e5ac1361..52a213aff04e5 100644
--- a/Documentation/admin-guide/cgroup-v1/cpusets.rst
+++ b/Documentation/admin-guide/cgroup-v1/cpusets.rst
@@ -284,7 +284,7 @@ take action.
 ==>
     Unless this feature is enabled by writing "1" to the special file
     /dev/cpuset/memory_pressure_enabled, the hook in the rebalance
-    code of __alloc_pages() for this metric reduces to simply noticing
+    code of the page allocator for this metric reduces to simply noticing
     that the cpuset_memory_pressure_enabled flag is zero.  So only
     systems that enable this feature will compute the metric.
 
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 23f8d13c2629d..16f37135ed80d 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -761,7 +761,7 @@ compact_fail
 	but failed.
 
 It is possible to establish how long the stalls were using the function
-tracer to record how long was spent in __alloc_pages() and
+tracer to record how long was spent in the page allocator and
 using the mm_page_alloc tracepoint to identify which allocations were
 for huge pages.
 
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 3bf55a5f9143e..4d57e9c0bf204 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -204,10 +204,6 @@ static inline void arch_free_page(struct page *page, int order) { }
 static inline void arch_alloc_page(struct page *page, int order) { }
 #endif
 
-struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
-		nodemask_t *nodemask);
-#define __alloc_pages(...)			alloc_hooks(__alloc_pages_noprof(__VA_ARGS__))
-
 struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
 		nodemask_t *nodemask);
 #define __folio_alloc(...)			alloc_hooks(__folio_alloc_noprof(__VA_ARGS__))
@@ -272,17 +268,7 @@ struct folio *__folio_alloc_node_noprof(gfp_t gfp, unsigned int order, int nid)
  * prefer the current CPU's closest node. Otherwise node must be valid and
  * online.
  */
-static inline struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask,
-						   unsigned int order)
-{
-	if (nid == NUMA_NO_NODE)
-		nid = numa_mem_id();
-
-	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
-	warn_if_node_offline(nid, gfp_mask);
-
-	return __alloc_pages_noprof(gfp_mask, order, nid, NULL);
-}
+struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask, unsigned int order);
 
 #define  alloc_pages_node(...)			alloc_hooks(alloc_pages_node_noprof(__VA_ARGS__))
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9cb3f1665b41b..026f33f217036 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5427,7 +5427,18 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
 		set_page_refcounted(page);
 	return page;
 }
-EXPORT_SYMBOL(__alloc_pages_noprof);
+
+struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask, unsigned int order)
+{
+	if (nid == NUMA_NO_NODE)
+		nid = numa_mem_id();
+
+	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
+	warn_if_node_offline(nid, gfp_mask);
+
+	return __alloc_pages_noprof(gfp_mask, order, nid, NULL);
+}
+EXPORT_SYMBOL(alloc_pages_node_noprof);
 
 struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
 		nodemask_t *nodemask)
diff --git a/mm/page_alloc.h b/mm/page_alloc.h
index af83764788b96..2058cbdca56e7 100644
--- a/mm/page_alloc.h
+++ b/mm/page_alloc.h
@@ -244,6 +244,10 @@ struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned
 	alloc_hooks(alloc_frozen_pages_nolock_noprof(__VA_ARGS__))
 void free_frozen_pages_nolock(struct page *page, unsigned int order);
 
+struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
+		nodemask_t *nodemask);
+#define __alloc_pages(...)			alloc_hooks(__alloc_pages_noprof(__VA_ARGS__))
+
 extern void zone_pcp_reset(struct zone *zone);
 extern void zone_pcp_disable(struct zone *zone);
 extern void zone_pcp_enable(struct zone *zone);

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 15/16] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (13 preceding siblings ...)
  2026-06-29 13:12 ` [PATCH v3 14/16] mm: Move __alloc_pages() to mm/page_alloc.h Brendan Jackman
@ 2026-06-29 13:12 ` Brendan Jackman
  2026-06-30  1:55   ` Hao Ge
  2026-06-29 13:12 ` [PATCH v3 16/16] mm: remove the __GFP_NO_OBJ_EXT flag Brendan Jackman
  2026-06-29 14:00 ` [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Mike Rapoport
  16 siblings, 1 reply; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:12 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

Now that alloc_pages has an entrypoint that allows passing alloc_flags,
we can take advantage of this to start removing GFP flags that are only
used for mm-internal stuff.

This requires also plumbing the alloc_flags into some more of the
allocator code, in particular __alloc_pages[_noprof]() gets an
alloc_flags arg to go along with its callees, and we now need to pass
those flags deeper into the allocator so they can reach the alloc_tag
code.

No functional change intended.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 mm/alloc_tag.c       | 22 ++++++----------------
 mm/compaction.c      |  4 ++--
 mm/internal.h        |  1 -
 mm/page_alloc.c      | 42 ++++++++++++++++++++++++------------------
 mm/page_alloc.h      | 17 +++++++++++++++--
 mm/page_frag_cache.c |  4 ++--
 6 files changed, 49 insertions(+), 41 deletions(-)

diff --git a/mm/alloc_tag.c b/mm/alloc_tag.c
index d9be1cf5187d9..a32a94e759b94 100644
--- a/mm/alloc_tag.c
+++ b/mm/alloc_tag.c
@@ -15,6 +15,8 @@
 #include <linux/vmalloc.h>
 #include <linux/kmemleak.h>
 
+#include "internal.h"
+
 #define ALLOCINFO_FILE_NAME		"allocinfo"
 #define MODULE_ALLOC_TAG_VMAP_SIZE	(100000UL * sizeof(struct alloc_tag))
 #define SECTION_START(NAME)		(CODETAG_SECTION_START_PREFIX NAME)
@@ -783,19 +785,6 @@ struct pfn_pool {
 
 #define PFN_POOL_SIZE			((PAGE_SIZE - offsetof(struct pfn_pool, pfns)) / \
 					 sizeof(unsigned long))
-
-/*
- * Skip early PFN recording for a page allocation.  Reuses the
- * %__GFP_NO_OBJ_EXT bit.  Used by __alloc_tag_add_early_pfn() to avoid
- * recursion when allocating pages for the early PFN tracking list
- * itself.
- *
- * Codetags of the pages allocated with __GFP_NO_CODETAG should be
- * cleared (via clear_page_tag_ref()) before freeing the pages to prevent
- * alloc_tag_sub_check() from triggering a warning.
- */
-#define __GFP_NO_CODETAG		__GFP_NO_OBJ_EXT
-
 static struct pfn_pool *current_pfn_pool __initdata;
 
 static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
@@ -806,7 +795,8 @@ static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
 	do {
 		pool = READ_ONCE(current_pfn_pool);
 		if (!pool || atomic_read(&pool->count) >= PFN_POOL_SIZE) {
-			struct page *new_page = alloc_page(__GFP_HIGH | __GFP_NO_CODETAG);
+			struct page *new_page = __alloc_pages(__GFP_HIGH, 0, numa_mem_id(),
+							      NULL, ALLOC_NO_CODETAG);
 			struct pfn_pool *new;
 
 			if (!new_page) {
@@ -837,7 +827,7 @@ typedef void alloc_tag_add_func(unsigned long pfn);
 static alloc_tag_add_func __rcu *alloc_tag_add_early_pfn_ptr __refdata =
 	RCU_INITIALIZER(__alloc_tag_add_early_pfn);
 
-void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags)
+void alloc_tag_add_early_pfn(unsigned long pfn, unsigned int alloc_flags)
 {
 	alloc_tag_add_func *alloc_tag_add;
 
@@ -845,7 +835,7 @@ void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags)
 		return;
 
 	/* Skip allocations for the tracking list itself to avoid recursion. */
-	if (gfp_flags & __GFP_NO_CODETAG)
+	if (alloc_flags & ALLOC_NO_CODETAG)
 		return;
 
 	rcu_read_lock();
diff --git a/mm/compaction.c b/mm/compaction.c
index 7d80735502d9a..4b2318fad4eb5 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -83,7 +83,7 @@ static inline bool is_via_compact_memory(int order) { return false; }
 
 static struct page *mark_allocated_noprof(struct page *page, unsigned int order, gfp_t gfp_flags)
 {
-	post_alloc_hook(page, order, __GFP_MOVABLE);
+	post_alloc_hook(page, order, __GFP_MOVABLE, ALLOC_DEFAULT);
 	set_page_refcounted(page);
 	return page;
 }
@@ -1851,7 +1851,7 @@ static struct folio *compaction_alloc_noprof(struct folio *src, unsigned long da
 	}
 	dst = (struct folio *)freepage;
 
-	post_alloc_hook(&dst->page, order, __GFP_MOVABLE);
+	post_alloc_hook(&dst->page, order, __GFP_MOVABLE, ALLOC_DEFAULT);
 	set_page_refcounted(&dst->page);
 	if (order)
 		prep_compound_page(&dst->page, order);
diff --git a/mm/internal.h b/mm/internal.h
index c22284f04fc9e..369c656c63fa8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1237,7 +1237,6 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 enum ttu_flags;
 struct tlbflush_unmap_batch;
 
-
 /*
  * only for MM internal work items which do not depend on
  * any allocations or locks which might depend on allocations
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 026f33f217036..803b32e5a5e47 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1249,7 +1249,7 @@ void __clear_page_tag_ref(struct page *page)
 /* Should be called only if mem_alloc_profiling_enabled() */
 static noinline
 void __pgalloc_tag_add(struct page *page, struct task_struct *task,
-		       unsigned int nr, gfp_t gfp_flags)
+		       unsigned int nr, unsigned int alloc_flags)
 {
 	union pgtag_ref_handle handle;
 	union codetag_ref ref;
@@ -1263,17 +1263,17 @@ void __pgalloc_tag_add(struct page *page, struct task_struct *task,
 		 * page_ext is not available yet, record the pfn so we can
 		 * clear the tag ref later when page_ext is initialized.
 		 */
-		alloc_tag_add_early_pfn(page_to_pfn(page), gfp_flags);
+		alloc_tag_add_early_pfn(page_to_pfn(page), alloc_flags);
 		if (task->alloc_tag)
 			alloc_tag_set_inaccurate(task->alloc_tag);
 	}
 }
 
 static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
-				   unsigned int nr, gfp_t gfp_flags)
+				   unsigned int nr, unsigned int alloc_flags)
 {
 	if (mem_alloc_profiling_enabled())
-		__pgalloc_tag_add(page, task, nr, gfp_flags);
+		__pgalloc_tag_add(page, task, nr, alloc_flags);
 }
 
 /* Should be called only if mem_alloc_profiling_enabled() */
@@ -1810,7 +1810,7 @@ static inline bool should_skip_init(gfp_t flags)
 }
 
 inline void post_alloc_hook(struct page *page, unsigned int order,
-				gfp_t gfp_flags)
+				gfp_t gfp_flags, unsigned int alloc_flags)
 {
 	const bool zero_tags = gfp_flags & __GFP_ZEROTAGS;
 	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
@@ -1861,13 +1861,13 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 
 	set_page_owner(page, order, gfp_flags);
 	page_table_check_alloc(page, order);
-	pgalloc_tag_add(page, current, 1 << order, gfp_flags);
+	pgalloc_tag_add(page, current, 1 << order, alloc_flags);
 }
 
 static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
 							unsigned int alloc_flags)
 {
-	post_alloc_hook(page, order, gfp_flags);
+	post_alloc_hook(page, order, gfp_flags, alloc_flags);
 
 	if (order && (gfp_flags & __GFP_COMP))
 		prep_compound_page(page, order);
@@ -4791,8 +4791,12 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * The fast path uses conservative alloc_flags to succeed only until
 	 * kswapd needs to be woken up, and to avoid the cost of setting up
 	 * alloc_flags precisely. So we do that now.
+	 *
+	 * Can't just or alloc_flags if it contains WMARK bits, but those flags
+	 * shouldn't be set in ac->alloc_flags.
 	 */
-	alloc_flags = alloc_flags_slowpath(gfp_mask, order);
+	VM_WARN_ON(ac->alloc_flags & ALLOC_WMARK_MASK);
+	alloc_flags = ac->alloc_flags | alloc_flags_slowpath(gfp_mask, order);
 
 	/*
 	 * We need to recalculate the starting point for the zonelist iterator
@@ -4834,7 +4838,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
 	if (reserve_flags)
 		alloc_flags = alloc_flags_cma(gfp_mask, reserve_flags) |
-					  (alloc_flags & ALLOC_KSWAPD);
+				ac->alloc_flags | (alloc_flags & ALLOC_KSWAPD);
 
 	/*
 	 * Reset the nodemask and zonelist iterators if memory policies can be
@@ -5236,7 +5240,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid,
 	return nr_populated;
 
 failed:
-	page = __alloc_pages_noprof(gfp, 0, preferred_nid, nodemask);
+	page = __alloc_pages_noprof(gfp, 0, preferred_nid, nodemask, ALLOC_DEFAULT);
 	if (page)
 		page_array[nr_populated++] = page;
 	goto out;
@@ -5344,11 +5348,13 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
 {
 	struct page *page;
 	gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
-	struct alloc_context ac = { };
+	struct alloc_context ac = {
+		.alloc_flags = alloc_flags,
+	};
 	unsigned int fastpath_alloc_flags = alloc_flags;
 
 	/* Other flags could be supported later if needed. */
-	if (WARN_ON(alloc_flags & ~ALLOC_NOLOCK))
+	if (WARN_ON(alloc_flags & ~(ALLOC_NOLOCK | ALLOC_NO_CODETAG)))
 		return NULL;
 
 	if (!alloc_order_allowed(gfp, order, alloc_flags))
@@ -5417,12 +5423,12 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
 EXPORT_SYMBOL(__alloc_frozen_pages_noprof);
 
 struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
-		int preferred_nid, nodemask_t *nodemask)
+		int preferred_nid, nodemask_t *nodemask, unsigned int alloc_flags)
 {
 	struct page *page;
 
 	page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid, nodemask,
-					   ALLOC_DEFAULT);
+					   alloc_flags);
 	if (page)
 		set_page_refcounted(page);
 	return page;
@@ -5436,7 +5442,7 @@ struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask, unsigned int order
 	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
 	warn_if_node_offline(nid, gfp_mask);
 
-	return __alloc_pages_noprof(gfp_mask, order, nid, NULL);
+	return __alloc_pages_noprof(gfp_mask, order, nid, NULL, ALLOC_DEFAULT);
 }
 EXPORT_SYMBOL(alloc_pages_node_noprof);
 
@@ -5444,7 +5450,7 @@ struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int preferred_
 		nodemask_t *nodemask)
 {
 	struct page *page = __alloc_pages_noprof(gfp | __GFP_COMP, order,
-					preferred_nid, nodemask);
+					preferred_nid, nodemask, ALLOC_DEFAULT);
 	return page_rmappable_folio(page);
 }
 EXPORT_SYMBOL(__folio_alloc_noprof);
@@ -7126,7 +7132,7 @@ static void split_free_frozen_pages(struct list_head *list, gfp_t gfp_mask)
 		list_for_each_entry_safe(page, next, &list[order], lru) {
 			int i;
 
-			post_alloc_hook(page, order, gfp_mask);
+			post_alloc_hook(page, order, gfp_mask, ALLOC_DEFAULT);
 			if (!order)
 				continue;
 
@@ -7331,7 +7337,7 @@ int alloc_contig_frozen_range_noprof(unsigned long start, unsigned long end,
 		struct page *head = pfn_to_page(start);
 
 		check_new_pages(head, order);
-		prep_new_page(head, order, gfp_mask, 0);
+		prep_new_page(head, order, gfp_mask, ALLOC_DEFAULT);
 	} else {
 		ret = -EINVAL;
 		WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
diff --git a/mm/page_alloc.h b/mm/page_alloc.h
index 2058cbdca56e7..2614bff6795b0 100644
--- a/mm/page_alloc.h
+++ b/mm/page_alloc.h
@@ -49,6 +49,16 @@
 #define ALLOC_HIGHATOMIC	0x200 /* Allows access to MIGRATE_HIGHATOMIC */
 #define ALLOC_NOLOCK		0x400 /* Only use spin_trylock in allocation path */
 #define ALLOC_KSWAPD		0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */
+/*
+ * Skip early PFN recording for a page allocation.  Used by
+ * __alloc_tag_add_early_pfn() to avoid recursion when allocating pages for the
+ * early PFN tracking list itself.
+ *
+ * Codetags of the pages allocated with __GFP_NO_CODETAG should be
+ * cleared (via clear_page_tag_ref()) before freeing the pages to prevent
+ * alloc_tag_sub_check() from triggering a warning.
+ */
+#define ALLOC_NO_CODETAG       0x1000
 
 /* Flags that allow allocations below the min watermark. */
 #define ALLOC_RESERVES (ALLOC_NON_BLOCK|ALLOC_MIN_RESERVE|ALLOC_HIGHATOMIC|ALLOC_OOM)
@@ -84,6 +94,8 @@ struct alloc_context {
 	 */
 	enum zone_type highest_zoneidx;
 	bool spread_dirty_pages;
+	/* Only flags that are global to the whole allocation go here. */
+	unsigned int alloc_flags;
 };
 
 /*
@@ -214,7 +226,8 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
 extern void __free_pages_core(struct page *page, unsigned int order,
 		enum meminit_context context);
 
-void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags);
+void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags,
+		     unsigned int alloc_flags);
 extern bool free_pages_prepare(struct page *page, unsigned int order);
 
 extern int user_min_free_kbytes;
@@ -245,7 +258,7 @@ struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned
 void free_frozen_pages_nolock(struct page *page, unsigned int order);
 
 struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
-		nodemask_t *nodemask);
+		nodemask_t *nodemask, unsigned int alloc_flags);
 #define __alloc_pages(...)			alloc_hooks(__alloc_pages_noprof(__VA_ARGS__))
 
 extern void zone_pcp_reset(struct zone *zone);
diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
index a1077cef3a791..e63efe78b7d4b 100644
--- a/mm/page_frag_cache.c
+++ b/mm/page_frag_cache.c
@@ -57,10 +57,10 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
 	gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) |  __GFP_COMP |
 		   __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
 	page = __alloc_pages(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER,
-			     numa_mem_id(), NULL);
+			     numa_mem_id(), NULL, ALLOC_DEFAULT);
 #endif
 	if (unlikely(!page)) {
-		page = __alloc_pages(gfp, 0, numa_mem_id(), NULL);
+		page = __alloc_pages(gfp, 0, numa_mem_id(), NULL, ALLOC_DEFAULT);
 		order = 0;
 	}
 

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 16/16] mm: remove the __GFP_NO_OBJ_EXT flag
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (14 preceding siblings ...)
  2026-06-29 13:12 ` [PATCH v3 15/16] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG Brendan Jackman
@ 2026-06-29 13:12 ` Brendan Jackman
  2026-06-29 14:00 ` [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Mike Rapoport
  16 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 13:12 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Brendan Jackman

All users of the flag are converted to SLAB_ALLOC_NO_RECURSE or
ALLOC_NO_CODETAG (from __GFP_NO_CODETAG which reused the NO_OBJ_EXT bit).
Free up the flag bit.

Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
[Rebased onto __GFP_NO_CODETAG removal]
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 tools/include/linux/gfp_types.h | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/tools/include/linux/gfp_types.h b/tools/include/linux/gfp_types.h
index 6c75df30a281d..a93b8bd200b76 100644
--- a/tools/include/linux/gfp_types.h
+++ b/tools/include/linux/gfp_types.h
@@ -55,7 +55,6 @@ enum {
 #ifdef CONFIG_LOCKDEP
 	___GFP_NOLOCKDEP_BIT,
 #endif
-	___GFP_NO_OBJ_EXT_BIT,
 	___GFP_LAST_BIT
 };
 
@@ -96,7 +95,6 @@ enum {
 #else
 #define ___GFP_NOLOCKDEP	0
 #endif
-#define ___GFP_NO_OBJ_EXT       BIT(___GFP_NO_OBJ_EXT_BIT)
 
 /*
  * Physical address zone modifiers (see linux/mmzone.h - low four bits)
@@ -137,17 +135,12 @@ enum {
  * node with no fallbacks or placement policy enforcements.
  *
  * %__GFP_ACCOUNT causes the allocation to be accounted to kmemcg.
- *
- * %__GFP_NO_OBJ_EXT causes slab allocation to have no object extension.
- * mark_obj_codetag_empty() should be called upon freeing for objects allocated
- * with this flag to indicate that their NULL tags are expected and normal.
  */
 #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE)
 #define __GFP_WRITE	((__force gfp_t)___GFP_WRITE)
 #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL)
 #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)
 #define __GFP_ACCOUNT	((__force gfp_t)___GFP_ACCOUNT)
-#define __GFP_NO_OBJ_EXT   ((__force gfp_t)___GFP_NO_OBJ_EXT)
 
 /**
  * DOC: Watermark modifiers

-- 
2.54.0



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 00/16] mm: Some cleanups for page allocator APIs
  2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
                   ` (15 preceding siblings ...)
  2026-06-29 13:12 ` [PATCH v3 16/16] mm: remove the __GFP_NO_OBJ_EXT flag Brendan Jackman
@ 2026-06-29 14:00 ` Mike Rapoport
  2026-06-29 14:30   ` Brendan Jackman
  16 siblings, 1 reply; 39+ messages in thread
From: Mike Rapoport @ 2026-06-29 14:00 UTC (permalink / raw)
  To: Brendan Jackman
  Cc: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park, Ying Huang,
	Alistair Popple, Hao Li, Christoph Lameter, David Rientjes,
	Roman Gushchin, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt, Harry Yoo (Oracle), Gregory Price,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, Sean Christopherson, Paolo Bonzini, kvm,
	Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Robin Holt, Steve Wahl, Arnd Bergmann,
	Greg Kroah-Hartman, Dimitris Michailidis, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni

Hi Brendan,

On Mon, Jun 29, 2026 at 01:11:49PM +0000, Brendan Jackman wrote:
> 
> Some tweaks and cleanups for page allocator entrypoint and flags. This
> is motivated by preparation for __GFP_UNMAPPED [1] (which will probably
> become ALLOC_UNMAPPED in its next iteration), but all this is supposed
> to be an improvement to the codebase in its own right: unifying code
> paths, reducing API surface, and removing GFP flags.
> 
> Tested:
> 
> - KVM, mm, and BPF selftests in a QEMU VM
> 
> - kunit.py on x86_64
> 
> - For the ALLOC_NO_CODETAG bits I just booted a VM and read
>   /proc/allocinfo. I confirmed that if I remove ALLOC_NO_CODETAG, the
>   kernel crashes in early boot, so I was at least booting code that
>   depends on this logic.

Heads up before the full kbuild report:

CI has tested the following submission:
Status:     FAILURE
Name:       [v3,00/16] mm: Some cleanups for page allocator APIs
Patchwork:  https://patchwork.kernel.org/project/linux-mm/list/?series=1118244&state=*
Matrix:     https://github.com/linux-mm/linux-mm/actions/runs/28375636866

> I used Google's internal version of Antigravity (AI coding harness) to
> do the repetitive bits, those commits are marked with Assisted-by, the
> rest is manual.
> 
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 00/16] mm: Some cleanups for page allocator APIs
  2026-06-29 14:00 ` [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Mike Rapoport
@ 2026-06-29 14:30   ` Brendan Jackman
  2026-06-29 15:05     ` Brendan Jackman
  0 siblings, 1 reply; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 14:30 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park, Ying Huang,
	Alistair Popple, Hao Li, Christoph Lameter, David Rientjes,
	Roman Gushchin, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt, Harry Yoo (Oracle), Gregory Price,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, Sean Christopherson, Paolo Bonzini, kvm,
	Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Robin Holt, Steve Wahl, Arnd Bergmann,
	Greg Kroah-Hartman, Dimitris Michailidis, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni

On Mon, 29 Jun 2026 at 16:00, Mike Rapoport <rppt@kernel.org> wrote:
>
> Hi Brendan,
>
> On Mon, Jun 29, 2026 at 01:11:49PM +0000, Brendan Jackman wrote:
> >
> > Some tweaks and cleanups for page allocator entrypoint and flags. This
> > is motivated by preparation for __GFP_UNMAPPED [1] (which will probably
> > become ALLOC_UNMAPPED in its next iteration), but all this is supposed
> > to be an improvement to the codebase in its own right: unifying code
> > paths, reducing API surface, and removing GFP flags.
> >
> > Tested:
> >
> > - KVM, mm, and BPF selftests in a QEMU VM
> >
> > - kunit.py on x86_64
> >
> > - For the ALLOC_NO_CODETAG bits I just booted a VM and read
> >   /proc/allocinfo. I confirmed that if I remove ALLOC_NO_CODETAG, the
> >   kernel crashes in early boot, so I was at least booting code that
> >   depends on this logic.
>
> Heads up before the full kbuild report:
>
> CI has tested the following submission:
> Status:     FAILURE
> Name:       [v3,00/16] mm: Some cleanups for page allocator APIs
> Patchwork:  https://patchwork.kernel.org/project/linux-mm/list/?series=1118244&state=*
> Matrix:     https://github.com/linux-mm/linux-mm/actions/runs/28375636866

Agh, thanks, I broke the build for CMA.

I thought I had this covered in my local test scripts. I will fix that
first then I'll send a fixup for the patch.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 00/16] mm: Some cleanups for page allocator APIs
  2026-06-29 14:30   ` Brendan Jackman
@ 2026-06-29 15:05     ` Brendan Jackman
  0 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-29 15:05 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park, Ying Huang,
	Alistair Popple, Hao Li, Christoph Lameter, David Rientjes,
	Roman Gushchin, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt, Harry Yoo (Oracle), Gregory Price,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm,
	linux-kernel, linux-rt-devel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, Sean Christopherson, Paolo Bonzini, kvm,
	Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Robin Holt, Steve Wahl, Arnd Bergmann,
	Greg Kroah-Hartman, Dimitris Michailidis, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni

On Mon, 29 Jun 2026 at 16:30, Brendan Jackman <jackmanb@google.com> wrote:
>
> On Mon, 29 Jun 2026 at 16:00, Mike Rapoport <rppt@kernel.org> wrote:
> >
> > Hi Brendan,
> >
> > On Mon, Jun 29, 2026 at 01:11:49PM +0000, Brendan Jackman wrote:
> > >
> > > Some tweaks and cleanups for page allocator entrypoint and flags. This
> > > is motivated by preparation for __GFP_UNMAPPED [1] (which will probably
> > > become ALLOC_UNMAPPED in its next iteration), but all this is supposed
> > > to be an improvement to the codebase in its own right: unifying code
> > > paths, reducing API surface, and removing GFP flags.
> > >
> > > Tested:
> > >
> > > - KVM, mm, and BPF selftests in a QEMU VM
> > >
> > > - kunit.py on x86_64
> > >
> > > - For the ALLOC_NO_CODETAG bits I just booted a VM and read
> > >   /proc/allocinfo. I confirmed that if I remove ALLOC_NO_CODETAG, the
> > >   kernel crashes in early boot, so I was at least booting code that
> > >   depends on this logic.
> >
> > Heads up before the full kbuild report:
> >
> > CI has tested the following submission:
> > Status:     FAILURE
> > Name:       [v3,00/16] mm: Some cleanups for page allocator APIs
> > Patchwork:  https://patchwork.kernel.org/project/linux-mm/list/?series=1118244&state=*
> > Matrix:     https://github.com/linux-mm/linux-mm/actions/runs/28375636866
>
> Agh, thanks, I broke the build for CMA.
>
> I thought I had this covered in my local test scripts. I will fix that
> first then I'll send a fixup for the patch.

OK my scripts are indeed checking CMA, the issue is that I didn't
build with NUMA_BALANCING.

I guess Suren was right[0] and I really should build allmodconfig (at
least vmlinux) before sending patches. I was a bit skeptical that this
was an especially useful config to build, but now I realise it just
maximises coverage, even if it does so under a pretty arbitrary
configuration.

[0] https://lore.kernel.org/all/CAJuCfpHAMaK2sZYSgS750CvgksCKEfOmLzZasXdBup+CrS-3Pg@mail.gmail.com/

Here are the fixes:

diff --git i/mm/migrate.c w/mm/migrate.c
index d9b23909d716c..8e0a6fb3f6618 100644
--- i/mm/migrate.c
+++ w/mm/migrate.c
@@ -49,6 +49,7 @@
 #include <trace/events/migrate.h>

 #include "internal.h"
+#include "page_alloc.h"
 #include "swap.h"

 static const struct movable_operations *offline_movable_ops;
diff --git i/mm/page_reporting.c w/mm/page_reporting.c
index 7418f2e500bb4..c7325704c3202 100644
--- i/mm/page_reporting.c
+++ w/mm/page_reporting.c
@@ -8,6 +8,7 @@
 #include <linux/delay.h>
 #include <linux/scatterlist.h>

+#include "page_alloc.h"
 #include "page_reporting.h"
 #include "internal.h"

diff --git i/mm/shuffle.c w/mm/shuffle.c
index fb1393b8b3a9d..82a2c7725a08a 100644
--- i/mm/shuffle.c
+++ w/mm/shuffle.c
@@ -7,6 +7,7 @@
 #include <linux/random.h>
 #include <linux/moduleparam.h>
 #include "internal.h"
+#include "page_alloc.h"
 #include "shuffle.h"

 DEFINE_STATIC_KEY_FALSE(page_alloc_shuffle_key);


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* RE: -EXT-[PATCH v3 09/16] KVM: VMX: Use higher-level allocator API
  2026-06-29 13:11 ` [PATCH v3 09/16] KVM: VMX: " Brendan Jackman
@ 2026-06-29 15:31   ` Soderlund, David
  0 siblings, 0 replies; 39+ messages in thread
From: Soderlund, David @ 2026-06-29 15:31 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Zi Yan,
	Muchun Song, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
	Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Johannes Weiner,
	Alexei Starovoitov, Matthew Wilcox, Hao Ge, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-rt-devel@lists.linux.dev,
	Sean Christopherson, Paolo Bonzini, kvm@vger.kernel.org

Can someone please remove me from this list.  I've requested a couple of times.

  Thanks,
     David

-----Original Message-----
From: Brendan Jackman <jackmanb@google.com> 
Sent: Monday, June 29, 2026 7:12 AM
To: Andrew Morton <akpm@linux-foundation.org>; Vlastimil Babka <vbabka@kernel.org>; Suren Baghdasaryan <surenb@google.com>; Michal Hocko <mhocko@suse.com>; Johannes Weiner <hannes@cmpxchg.org>; Zi Yan <ziy@nvidia.com>; Muchun Song <muchun.song@linux.dev>; Oscar Salvador <osalvador@suse.de>; David Hildenbrand <david@kernel.org>; Lorenzo Stoakes <ljs@kernel.org>; Liam R. Howlett <liam@infradead.org>; Mike Rapoport <rppt@kernel.org>; Matthew Brost <matthew.brost@intel.com>; Joshua Hahn <joshua.hahnjy@gmail.com>; Rakie Kim <rakie.kim@sk.com>; Byungchul Park <byungchul@sk.com>; Ying Huang <ying.huang@linux.alibaba.com>; Alistair Popple <apopple@nvidia.com>; Hao Li <hao.li@linux.dev>; Christoph Lameter <cl@gentwo.org>; David Rientjes <rientjes@google.com>; Roman Gushchin <roman.gushchin@linux.dev>; Sebastian Andrzej Siewior <bigeasy@linutronix.de>; Clark Williams <clrkwllms@kernel.org>; Steven Rostedt <rostedt@goodmis.org>
Cc: Harry Yoo (Oracle) <harry@kernel.org>; Gregory Price <gourry@gourry.net>; Johannes Weiner <hannes@cmpxchg.org>; Alexei Starovoitov <ast@kernel.org>; Matthew Wilcox <willy@infradead.org>; Hao Ge <hao.ge@linux.dev>; linux-mm@kvack.org; linux-kernel@vger.kernel.org; linux-rt-devel@lists.linux.dev; Brendan Jackman <jackmanb@google.com>; Sean Christopherson <seanjc@google.com>; Paolo Bonzini <pbonzini@redhat.com>; kvm@vger.kernel.org
Subject: -EXT-[PATCH v3 09/16] KVM: VMX: Use higher-level allocator API

WARNING:  This message is from an external source.  Evaluate the message carefully BEFORE clicking on links or opening attachments.

The difference between __alloc_pages_node() and alloc_pages_node() is that the latter allows you to pass NUMA_NO_NODE.

The former is going away and the latter works fine here so switch over.

No functional change intended.

Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Assisted-by: Gemini:unknown-version
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 2325be57d3d75..ad6a7fc6a54da 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3028,7 +3028,7 @@ struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags)
        struct page *pages;
        struct vmcs *vmcs;

-       pages = __alloc_pages_node(node, flags, 0);
+       pages = alloc_pages_node(node, flags, 0);
        if (!pages)
                return NULL;
        vmcs = page_address(pages);

--
2.54.0



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 11/16] sgi-xp: Use higher-level allocator API
  2026-06-29 13:12 ` [PATCH v3 11/16] sgi-xp: " Brendan Jackman
@ 2026-06-29 18:47   ` Steve Wahl
  0 siblings, 0 replies; 39+ messages in thread
From: Steve Wahl @ 2026-06-29 18:47 UTC (permalink / raw)
  To: Brendan Jackman
  Cc: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
	Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel,
	Robin Holt, Steve Wahl, Arnd Bergmann, Greg Kroah-Hartman

Acked-by: Steve Wahl <steve.wahl@hpe.com>

On Mon, Jun 29, 2026 at 01:12:00PM +0000, Brendan Jackman wrote:
> The difference between __alloc_pages_node() and alloc_pages_node() is
> that the latter allows you to pass NUMA_NO_NODE.
> 
> The former is going away and the latter works fine here so switch over.
> 
> No functional change intended.
> 
> Cc: Robin Holt <robinmholt@gmail.com>
> Cc: Steve Wahl <steve.wahl@hpe.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Assisted-by: Gemini:unknown-model
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
>  drivers/misc/sgi-xp/xpc_uv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/sgi-xp/xpc_uv.c b/drivers/misc/sgi-xp/xpc_uv.c
> index 772c787268932..aacff70204241 100644
> --- a/drivers/misc/sgi-xp/xpc_uv.c
> +++ b/drivers/misc/sgi-xp/xpc_uv.c
> @@ -170,7 +170,7 @@ xpc_create_gru_mq_uv(unsigned int mq_size, int cpu, char *irq_name,
>  	mq->mmr_blade = uv_cpu_to_blade_id(cpu);
>  
>  	nid = cpu_to_node(cpu);
> -	page = __alloc_pages_node(nid,
> +	page = alloc_pages_node(nid,
>  				      GFP_KERNEL | __GFP_ZERO | __GFP_THISNODE,
>  				      pg_order);
>  	if (page == NULL) {
> 
> -- 
> 2.54.0
> 

-- 
Steve Wahl, Hewlett Packard Enterprise


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 15/16] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG
  2026-06-29 13:12 ` [PATCH v3 15/16] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG Brendan Jackman
@ 2026-06-30  1:55   ` Hao Ge
  2026-06-30 10:10     ` Brendan Jackman
  2026-06-30 12:01     ` Brendan Jackman
  0 siblings, 2 replies; 39+ messages in thread
From: Hao Ge @ 2026-06-30  1:55 UTC (permalink / raw)
  To: Brendan Jackman
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, linux-mm, linux-kernel, linux-rt-devel,
	Vlastimil Babka, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, David Hildenbrand,
	Oscar Salvador, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport,
	Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park, Ying Huang,
	Alistair Popple, Hao Li, Christoph Lameter, David Rientjes,
	Roman Gushchin, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt

Hi Brendan


On 2026/6/29 21:12, Brendan Jackman wrote:
> Now that alloc_pages has an entrypoint that allows passing alloc_flags,
> we can take advantage of this to start removing GFP flags that are only
> used for mm-internal stuff.
>
> This requires also plumbing the alloc_flags into some more of the
> allocator code, in particular __alloc_pages[_noprof]() gets an
> alloc_flags arg to go along with its callees, and we now need to pass
> those flags deeper into the allocator so they can reach the alloc_tag
> code.
>
> No functional change intended.
>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
>   mm/alloc_tag.c       | 22 ++++++----------------
>   mm/compaction.c      |  4 ++--
>   mm/internal.h        |  1 -
>   mm/page_alloc.c      | 42 ++++++++++++++++++++++++------------------
>   mm/page_alloc.h      | 17 +++++++++++++++--
>   mm/page_frag_cache.c |  4 ++--
>   6 files changed, 49 insertions(+), 41 deletions(-)
>
> diff --git a/mm/alloc_tag.c b/mm/alloc_tag.c
> index d9be1cf5187d9..a32a94e759b94 100644
> --- a/mm/alloc_tag.c
> +++ b/mm/alloc_tag.c
> @@ -15,6 +15,8 @@
>   #include <linux/vmalloc.h>
>   #include <linux/kmemleak.h>
>   
> +#include "internal.h"


Should we include page_alloc.h here, as we call __alloc_pages later in 
this file?


> +
>   #define ALLOCINFO_FILE_NAME		"allocinfo"
>   #define MODULE_ALLOC_TAG_VMAP_SIZE	(100000UL * sizeof(struct alloc_tag))
>   #define SECTION_START(NAME)		(CODETAG_SECTION_START_PREFIX NAME)
> @@ -783,19 +785,6 @@ struct pfn_pool {
>   
>   #define PFN_POOL_SIZE			((PAGE_SIZE - offsetof(struct pfn_pool, pfns)) / \
>   					 sizeof(unsigned long))
> -
> -/*
> - * Skip early PFN recording for a page allocation.  Reuses the
> - * %__GFP_NO_OBJ_EXT bit.  Used by __alloc_tag_add_early_pfn() to avoid
> - * recursion when allocating pages for the early PFN tracking list
> - * itself.
> - *
> - * Codetags of the pages allocated with __GFP_NO_CODETAG should be
> - * cleared (via clear_page_tag_ref()) before freeing the pages to prevent
> - * alloc_tag_sub_check() from triggering a warning.
> - */
> -#define __GFP_NO_CODETAG		__GFP_NO_OBJ_EXT
> -
>   static struct pfn_pool *current_pfn_pool __initdata;
>   
>   static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
> @@ -806,7 +795,8 @@ static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
>   	do {
>   		pool = READ_ONCE(current_pfn_pool);
>   		if (!pool || atomic_read(&pool->count) >= PFN_POOL_SIZE) {
> -			struct page *new_page = alloc_page(__GFP_HIGH | __GFP_NO_CODETAG);
> +			struct page *new_page = __alloc_pages(__GFP_HIGH, 0, numa_mem_id(),
> +							      NULL, ALLOC_NO_CODETAG);
>   			struct pfn_pool *new;
>   
>   			if (!new_page) {
> @@ -837,7 +827,7 @@ typedef void alloc_tag_add_func(unsigned long pfn);
>   static alloc_tag_add_func __rcu *alloc_tag_add_early_pfn_ptr __refdata =
>   	RCU_INITIALIZER(__alloc_tag_add_early_pfn);
>   
> -void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags)
> +void alloc_tag_add_early_pfn(unsigned long pfn, unsigned int alloc_flags)


alloc_tag_add_early_pfn() has three occurrences across the codebase:

1. Definition in mm/alloc_tag.c:830:

void alloc_tag_add_early_pfn(unsigned long pfn, unsigned int alloc_flags)

2. Declaration in include/linux/alloc_tag.h:166:

void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags)

3. Static inline stub in include/linux/alloc_tag.h:170:

static inline void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t 
gfp_flags) {}

This patch updates the definition in alloc_tag.c to take unsigned int 
alloc_flags,

but the two declarations in alloc_tag.h are left with the old gfp_t 
gfp_flags signature

These should be updated to match.


>   {
>   	alloc_tag_add_func *alloc_tag_add;
>   
> @@ -845,7 +835,7 @@ void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags)
>   		return;
>   
>   	/* Skip allocations for the tracking list itself to avoid recursion. */
> -	if (gfp_flags & __GFP_NO_CODETAG)
> +	if (alloc_flags & ALLOC_NO_CODETAG)
>   		return;
>   
>   	rcu_read_lock();
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 7d80735502d9a..4b2318fad4eb5 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -83,7 +83,7 @@ static inline bool is_via_compact_memory(int order) { return false; }
>   
>   static struct page *mark_allocated_noprof(struct page *page, unsigned int order, gfp_t gfp_flags)
>   {
> -	post_alloc_hook(page, order, __GFP_MOVABLE);
> +	post_alloc_hook(page, order, __GFP_MOVABLE, ALLOC_DEFAULT);
>   	set_page_refcounted(page);
>   	return page;
>   }
> @@ -1851,7 +1851,7 @@ static struct folio *compaction_alloc_noprof(struct folio *src, unsigned long da
>   	}
>   	dst = (struct folio *)freepage;
>   
> -	post_alloc_hook(&dst->page, order, __GFP_MOVABLE);
> +	post_alloc_hook(&dst->page, order, __GFP_MOVABLE, ALLOC_DEFAULT);
>   	set_page_refcounted(&dst->page);
>   	if (order)
>   		prep_compound_page(&dst->page, order);
> diff --git a/mm/internal.h b/mm/internal.h
> index c22284f04fc9e..369c656c63fa8 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1237,7 +1237,6 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
>   enum ttu_flags;
>   struct tlbflush_unmap_batch;
>   
> -
>   /*
>    * only for MM internal work items which do not depend on
>    * any allocations or locks which might depend on allocations
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 026f33f217036..803b32e5a5e47 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1249,7 +1249,7 @@ void __clear_page_tag_ref(struct page *page)
>   /* Should be called only if mem_alloc_profiling_enabled() */
>   static noinline
>   void __pgalloc_tag_add(struct page *page, struct task_struct *task,
> -		       unsigned int nr, gfp_t gfp_flags)
> +		       unsigned int nr, unsigned int alloc_flags)
>   {
>   	union pgtag_ref_handle handle;
>   	union codetag_ref ref;
> @@ -1263,17 +1263,17 @@ void __pgalloc_tag_add(struct page *page, struct task_struct *task,
>   		 * page_ext is not available yet, record the pfn so we can
>   		 * clear the tag ref later when page_ext is initialized.
>   		 */
> -		alloc_tag_add_early_pfn(page_to_pfn(page), gfp_flags);
> +		alloc_tag_add_early_pfn(page_to_pfn(page), alloc_flags);
>   		if (task->alloc_tag)
>   			alloc_tag_set_inaccurate(task->alloc_tag);
>   	}
>   }
>   
>   static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
> -				   unsigned int nr, gfp_t gfp_flags)
> +				   unsigned int nr, unsigned int alloc_flags)


Same situation as alloc_tag_add_early_pfn() — the #else stub

at mm/page_alloc.c:1309 still uses gfp_t gfp_flags instead of unsigned int

alloc_flags.


>   {
>   	if (mem_alloc_profiling_enabled())
> -		__pgalloc_tag_add(page, task, nr, gfp_flags);
> +		__pgalloc_tag_add(page, task, nr, alloc_flags);
>   }
>   
>   /* Should be called only if mem_alloc_profiling_enabled() */
> @@ -1810,7 +1810,7 @@ static inline bool should_skip_init(gfp_t flags)
>   }
>   
>   inline void post_alloc_hook(struct page *page, unsigned int order,
> -				gfp_t gfp_flags)
> +				gfp_t gfp_flags, unsigned int alloc_flags)
>   {
>   	const bool zero_tags = gfp_flags & __GFP_ZEROTAGS;
>   	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
> @@ -1861,13 +1861,13 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>   
>   	set_page_owner(page, order, gfp_flags);
>   	page_table_check_alloc(page, order);
> -	pgalloc_tag_add(page, current, 1 << order, gfp_flags);
> +	pgalloc_tag_add(page, current, 1 << order, alloc_flags);
>   }
>   
>   static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
>   							unsigned int alloc_flags)
>   {
> -	post_alloc_hook(page, order, gfp_flags);
> +	post_alloc_hook(page, order, gfp_flags, alloc_flags);
>   
>   	if (order && (gfp_flags & __GFP_COMP))
>   		prep_compound_page(page, order);
> @@ -4791,8 +4791,12 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>   	 * The fast path uses conservative alloc_flags to succeed only until
>   	 * kswapd needs to be woken up, and to avoid the cost of setting up
>   	 * alloc_flags precisely. So we do that now.
> +	 *
> +	 * Can't just or alloc_flags if it contains WMARK bits, but those flags
> +	 * shouldn't be set in ac->alloc_flags.
>   	 */
> -	alloc_flags = alloc_flags_slowpath(gfp_mask, order);
> +	VM_WARN_ON(ac->alloc_flags & ALLOC_WMARK_MASK);
> +	alloc_flags = ac->alloc_flags | alloc_flags_slowpath(gfp_mask, order);
>   
>   	/*
>   	 * We need to recalculate the starting point for the zonelist iterator
> @@ -4834,7 +4838,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>   	reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
>   	if (reserve_flags)
>   		alloc_flags = alloc_flags_cma(gfp_mask, reserve_flags) |
> -					  (alloc_flags & ALLOC_KSWAPD);
> +				ac->alloc_flags | (alloc_flags & ALLOC_KSWAPD);
>   
>   	/*
>   	 * Reset the nodemask and zonelist iterators if memory policies can be
> @@ -5236,7 +5240,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid,
>   	return nr_populated;
>   
>   failed:
> -	page = __alloc_pages_noprof(gfp, 0, preferred_nid, nodemask);
> +	page = __alloc_pages_noprof(gfp, 0, preferred_nid, nodemask, ALLOC_DEFAULT);
>   	if (page)
>   		page_array[nr_populated++] = page;
>   	goto out;
> @@ -5344,11 +5348,13 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
>   {
>   	struct page *page;
>   	gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
> -	struct alloc_context ac = { };
> +	struct alloc_context ac = {
> +		.alloc_flags = alloc_flags,
> +	};
>   	unsigned int fastpath_alloc_flags = alloc_flags;
>   
>   	/* Other flags could be supported later if needed. */
> -	if (WARN_ON(alloc_flags & ~ALLOC_NOLOCK))
> +	if (WARN_ON(alloc_flags & ~(ALLOC_NOLOCK | ALLOC_NO_CODETAG)))
>   		return NULL;
>   
>   	if (!alloc_order_allowed(gfp, order, alloc_flags))
> @@ -5417,12 +5423,12 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
>   EXPORT_SYMBOL(__alloc_frozen_pages_noprof);
>   
>   struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
> -		int preferred_nid, nodemask_t *nodemask)
> +		int preferred_nid, nodemask_t *nodemask, unsigned int alloc_flags)
>   {
>   	struct page *page;
>   
>   	page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid, nodemask,
> -					   ALLOC_DEFAULT);
> +					   alloc_flags);
>   	if (page)
>   		set_page_refcounted(page);
>   	return page;
> @@ -5436,7 +5442,7 @@ struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask, unsigned int order
>   	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
>   	warn_if_node_offline(nid, gfp_mask);
>   
> -	return __alloc_pages_noprof(gfp_mask, order, nid, NULL);
> +	return __alloc_pages_noprof(gfp_mask, order, nid, NULL, ALLOC_DEFAULT);
>   }
>   EXPORT_SYMBOL(alloc_pages_node_noprof);
>   
> @@ -5444,7 +5450,7 @@ struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int preferred_
>   		nodemask_t *nodemask)
>   {
>   	struct page *page = __alloc_pages_noprof(gfp | __GFP_COMP, order,
> -					preferred_nid, nodemask);
> +					preferred_nid, nodemask, ALLOC_DEFAULT);
>   	return page_rmappable_folio(page);
>   }
>   EXPORT_SYMBOL(__folio_alloc_noprof);
> @@ -7126,7 +7132,7 @@ static void split_free_frozen_pages(struct list_head *list, gfp_t gfp_mask)
>   		list_for_each_entry_safe(page, next, &list[order], lru) {
>   			int i;
>   
> -			post_alloc_hook(page, order, gfp_mask);
> +			post_alloc_hook(page, order, gfp_mask, ALLOC_DEFAULT);
>   			if (!order)
>   				continue;
>   
> @@ -7331,7 +7337,7 @@ int alloc_contig_frozen_range_noprof(unsigned long start, unsigned long end,
>   		struct page *head = pfn_to_page(start);
>   
>   		check_new_pages(head, order);
> -		prep_new_page(head, order, gfp_mask, 0);
> +		prep_new_page(head, order, gfp_mask, ALLOC_DEFAULT);
>   	} else {
>   		ret = -EINVAL;
>   		WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
> diff --git a/mm/page_alloc.h b/mm/page_alloc.h
> index 2058cbdca56e7..2614bff6795b0 100644
> --- a/mm/page_alloc.h
> +++ b/mm/page_alloc.h
> @@ -49,6 +49,16 @@
>   #define ALLOC_HIGHATOMIC	0x200 /* Allows access to MIGRATE_HIGHATOMIC */
>   #define ALLOC_NOLOCK		0x400 /* Only use spin_trylock in allocation path */
>   #define ALLOC_KSWAPD		0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */
> +/*
> + * Skip early PFN recording for a page allocation.  Used by
> + * __alloc_tag_add_early_pfn() to avoid recursion when allocating pages for the
> + * early PFN tracking list itself.
> + *
> + * Codetags of the pages allocated with __GFP_NO_CODETAG should be
> + * cleared (via clear_page_tag_ref()) before freeing the pages to prevent
> + * alloc_tag_sub_check() from triggering a warning.
> + */


I originally wrote this lengthy comment because the logic lives inside 
alloc_tag.c.

I wanted to document all the context to avoid confusion when revisiting 
this code later on.

We've since replaced __GFP_NO_CODETAG with ALLOC_NO_CODETAG, a generic 
alloc_flags bit defined in page_alloc.h.

The original long comment is no longer accurate:

Given that, I suggest updating it to the following:

/*

  * Avoid alloc_tag recursion for internal allocations.

  * Callers must clear_page_tag_ref() before
  * freeing to avoid warnings from alloc_tag_sub_check().

  */


Thanks

Best Regards

Hao


> +#define ALLOC_NO_CODETAG       0x1000
>   
>   /* Flags that allow allocations below the min watermark. */
>   #define ALLOC_RESERVES (ALLOC_NON_BLOCK|ALLOC_MIN_RESERVE|ALLOC_HIGHATOMIC|ALLOC_OOM)
> @@ -84,6 +94,8 @@ struct alloc_context {
>   	 */
>   	enum zone_type highest_zoneidx;
>   	bool spread_dirty_pages;
> +	/* Only flags that are global to the whole allocation go here. */
> +	unsigned int alloc_flags;
>   };
>   
>   /*
> @@ -214,7 +226,8 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
>   extern void __free_pages_core(struct page *page, unsigned int order,
>   		enum meminit_context context);
>   
> -void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags);
> +void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags,
> +		     unsigned int alloc_flags);
>   extern bool free_pages_prepare(struct page *page, unsigned int order);
>   
>   extern int user_min_free_kbytes;
> @@ -245,7 +258,7 @@ struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned
>   void free_frozen_pages_nolock(struct page *page, unsigned int order);
>   
>   struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
> -		nodemask_t *nodemask);
> +		nodemask_t *nodemask, unsigned int alloc_flags);
>   #define __alloc_pages(...)			alloc_hooks(__alloc_pages_noprof(__VA_ARGS__))
>   
>   extern void zone_pcp_reset(struct zone *zone);
> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
> index a1077cef3a791..e63efe78b7d4b 100644
> --- a/mm/page_frag_cache.c
> +++ b/mm/page_frag_cache.c
> @@ -57,10 +57,10 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
>   	gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) |  __GFP_COMP |
>   		   __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
>   	page = __alloc_pages(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER,
> -			     numa_mem_id(), NULL);
> +			     numa_mem_id(), NULL, ALLOC_DEFAULT);
>   #endif
>   	if (unlikely(!page)) {
> -		page = __alloc_pages(gfp, 0, numa_mem_id(), NULL);
> +		page = __alloc_pages(gfp, 0, numa_mem_id(), NULL, ALLOC_DEFAULT);
>   		order = 0;
>   	}
>   
>


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 15/16] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG
  2026-06-30  1:55   ` Hao Ge
@ 2026-06-30 10:10     ` Brendan Jackman
  2026-06-30 12:01     ` Brendan Jackman
  1 sibling, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-30 10:10 UTC (permalink / raw)
  To: Hao Ge, Brendan Jackman
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, linux-mm, linux-kernel, linux-rt-devel,
	Vlastimil Babka, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, David Hildenbrand,
	Oscar Salvador, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport,
	Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park, Ying Huang,
	Alistair Popple, Hao Li, Christoph Lameter, David Rientjes,
	Roman Gushchin, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt

On Tue Jun 30, 2026 at 1:55 AM UTC, Hao Ge wrote:
> Hi Brendan
>
>
> On 2026/6/29 21:12, Brendan Jackman wrote:
>> Now that alloc_pages has an entrypoint that allows passing alloc_flags,
>> we can take advantage of this to start removing GFP flags that are only
>> used for mm-internal stuff.
>>
>> This requires also plumbing the alloc_flags into some more of the
>> allocator code, in particular __alloc_pages[_noprof]() gets an
>> alloc_flags arg to go along with its callees, and we now need to pass
>> those flags deeper into the allocator so they can reach the alloc_tag
>> code.
>>
>> No functional change intended.
>>
>> Signed-off-by: Brendan Jackman <jackmanb@google.com>
>> ---
>>   mm/alloc_tag.c       | 22 ++++++----------------
>>   mm/compaction.c      |  4 ++--
>>   mm/internal.h        |  1 -
>>   mm/page_alloc.c      | 42 ++++++++++++++++++++++++------------------
>>   mm/page_alloc.h      | 17 +++++++++++++++--
>>   mm/page_frag_cache.c |  4 ++--
>>   6 files changed, 49 insertions(+), 41 deletions(-)
>>
>> diff --git a/mm/alloc_tag.c b/mm/alloc_tag.c
>> index d9be1cf5187d9..a32a94e759b94 100644
>> --- a/mm/alloc_tag.c
>> +++ b/mm/alloc_tag.c
>> @@ -15,6 +15,8 @@
>>   #include <linux/vmalloc.h>
>>   #include <linux/kmemleak.h>
>>   
>> +#include "internal.h"
>
>
> Should we include page_alloc.h here, as we call __alloc_pages later in 
> this file?

Yeah, there are a few build failures due to me not doing a broad enough
build. From now on I will just wait for allmodconfig instead of trying
to be clever with my build tests, sorry about this.

Also, this suggests that I have not actually re-tested the alloc_tag
code since v3 so I must repeat the test described in my cover letter (I
just manually enable the feature and check the kernel boots) for v4.

>> +
>>   #define ALLOCINFO_FILE_NAME		"allocinfo"
>>   #define MODULE_ALLOC_TAG_VMAP_SIZE	(100000UL * sizeof(struct alloc_tag))
>>   #define SECTION_START(NAME)		(CODETAG_SECTION_START_PREFIX NAME)
>> @@ -783,19 +785,6 @@ struct pfn_pool {
>>   
>>   #define PFN_POOL_SIZE			((PAGE_SIZE - offsetof(struct pfn_pool, pfns)) / \
>>   					 sizeof(unsigned long))
>> -
>> -/*
>> - * Skip early PFN recording for a page allocation.  Reuses the
>> - * %__GFP_NO_OBJ_EXT bit.  Used by __alloc_tag_add_early_pfn() to avoid
>> - * recursion when allocating pages for the early PFN tracking list
>> - * itself.
>> - *
>> - * Codetags of the pages allocated with __GFP_NO_CODETAG should be
>> - * cleared (via clear_page_tag_ref()) before freeing the pages to prevent
>> - * alloc_tag_sub_check() from triggering a warning.
>> - */
>> -#define __GFP_NO_CODETAG		__GFP_NO_OBJ_EXT
>> -
>>   static struct pfn_pool *current_pfn_pool __initdata;
>>   
>>   static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
>> @@ -806,7 +795,8 @@ static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
>>   	do {
>>   		pool = READ_ONCE(current_pfn_pool);
>>   		if (!pool || atomic_read(&pool->count) >= PFN_POOL_SIZE) {
>> -			struct page *new_page = alloc_page(__GFP_HIGH | __GFP_NO_CODETAG);
>> +			struct page *new_page = __alloc_pages(__GFP_HIGH, 0, numa_mem_id(),
>> +							      NULL, ALLOC_NO_CODETAG);
>>   			struct pfn_pool *new;
>>   
>>   			if (!new_page) {
>> @@ -837,7 +827,7 @@ typedef void alloc_tag_add_func(unsigned long pfn);
>>   static alloc_tag_add_func __rcu *alloc_tag_add_early_pfn_ptr __refdata =
>>   	RCU_INITIALIZER(__alloc_tag_add_early_pfn);
>>   
>> -void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags)
>> +void alloc_tag_add_early_pfn(unsigned long pfn, unsigned int alloc_flags)
>
>
> alloc_tag_add_early_pfn() has three occurrences across the codebase:
>
> 1. Definition in mm/alloc_tag.c:830:
>
> void alloc_tag_add_early_pfn(unsigned long pfn, unsigned int alloc_flags)
>
> 2. Declaration in include/linux/alloc_tag.h:166:
>
> void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags)
>
> 3. Static inline stub in include/linux/alloc_tag.h:170:
>
> static inline void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t 
> gfp_flags) {}
>
> This patch updates the definition in alloc_tag.c to take unsigned int 
> alloc_flags,
>
> but the two declarations in alloc_tag.h are left with the old gfp_t 
> gfp_flags signature
>
> These should be updated to match.

Yeah ditto, sorry about this and thanks for the review.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 15/16] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG
  2026-06-30  1:55   ` Hao Ge
  2026-06-30 10:10     ` Brendan Jackman
@ 2026-06-30 12:01     ` Brendan Jackman
  1 sibling, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-30 12:01 UTC (permalink / raw)
  To: Hao Ge, Brendan Jackman
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, linux-mm, linux-kernel, linux-rt-devel,
	Vlastimil Babka, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, David Hildenbrand,
	Oscar Salvador, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport,
	Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park, Ying Huang,
	Alistair Popple, Hao Li, Christoph Lameter, David Rientjes,
	Roman Gushchin, Sebastian Andrzej Siewior, Clark Williams,
	Steven Rostedt

On Tue Jun 30, 2026 at 1:55 AM UTC, Hao Ge wrote:
>> +/*
>> + * Skip early PFN recording for a page allocation.  Used by
>> + * __alloc_tag_add_early_pfn() to avoid recursion when allocating pages for the
>> + * early PFN tracking list itself.
>> + *
>> + * Codetags of the pages allocated with __GFP_NO_CODETAG should be
>> + * cleared (via clear_page_tag_ref()) before freeing the pages to prevent
>> + * alloc_tag_sub_check() from triggering a warning.
>> + */
>
>
> I originally wrote this lengthy comment because the logic lives inside 
> alloc_tag.c.
>
> I wanted to document all the context to avoid confusion when revisiting 
> this code later on.
>
> We've since replaced __GFP_NO_CODETAG with ALLOC_NO_CODETAG, a generic 
> alloc_flags bit defined in page_alloc.h.
>
> The original long comment is no longer accurate:
>
> Given that, I suggest updating it to the following:
>
> /*
>
>   * Avoid alloc_tag recursion for internal allocations.
>
>   * Callers must clear_page_tag_ref() before
>   * freeing to avoid warnings from alloc_tag_sub_check().
>
>   */

Thanks for the context, pasting this in for v4!


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 01/16] mm/page_alloc: rename ALLOC_TRYLOCK -> ALLOC_NOLOCK
  2026-06-29 13:11 ` [PATCH v3 01/16] mm/page_alloc: rename ALLOC_TRYLOCK -> ALLOC_NOLOCK Brendan Jackman
@ 2026-06-30 12:27   ` Vlastimil Babka (SUSE)
  0 siblings, 0 replies; 39+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-30 12:27 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel

On 6/29/26 15:11, Brendan Jackman wrote:
> It's confusing that the function is called "nolock" but the flag is
> called "trylock", align them.
> 
> The function's terminology is more visible and has more mindshare so use that.
> 
> Suggested-by: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
> Link: https://lore.kernel.org/linux-mm/2399b3ad-4eac-4a14-94c3-27e9f07972a1@kernel.org/
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>

Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 02/16] mm/page_alloc: some renames to clarify alloc_flags scopes
  2026-06-29 13:11 ` [PATCH v3 02/16] mm/page_alloc: some renames to clarify alloc_flags scopes Brendan Jackman
@ 2026-06-30 12:38   ` Vlastimil Babka (SUSE)
  2026-06-30 17:25     ` Brendan Jackman
  0 siblings, 1 reply; 39+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-30 12:38 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel

On 6/29/26 15:11, Brendan Jackman wrote:
> It's pretty confusing that:
> 
> - The slowpath and fastpath have a totally distinct set of alloc_flags.
> 
> - gfp_to_alloc_flags() sounds generic but it only influences the
>   slowpath.
> 
> Rename some variables to highlight which alloc_flags are
> fastpath-specific. Rename gfp_to_alloc_flags() to highlight that it's
> slowpath-specific.
> 
> gfp_to_alloc_flags_cma() and gfp_to_alloc_flags_nonblocking() currently
> have perfectly harmless names, but to keep the naming consistent also
> rename those to the alloc_flags_*() pattern (which already exists for
> alloc_flags_nofragment()).

How annoying that alloc_flags_nofragment() doesn't have gfp as the first
parameter, unlike others.
Oh well, must resist too much OCD :)

Uh, more annoyingly, alloc_flags_cma() takes alloc_flags and returns
augmented alloc flags, so there's stuff like

*alloc_flags = alloc_flags_cma(gfp_mask, *alloc_flags);

Since we're unifying, it could be make to work additively like others? Then:

*alloc_flags |= alloc_flags_cma(gfp_mask);

> Signed-off-by: Brendan Jackman <jackmanb@google.com>

Otherwise, LGTM.
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/16] mm: name some args in a function declaration
  2026-06-29 13:11 ` [PATCH v3 03/16] mm: name some args in a function declaration Brendan Jackman
@ 2026-06-30 12:43   ` Vlastimil Babka (SUSE)
  0 siblings, 0 replies; 39+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-30 12:43 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel

On 6/29/26 15:11, Brendan Jackman wrote:
> Checkpatch complains about this, a later patch will move the code, fix
> it so that checkpatch doesn't complain about that patch. Do it in a
> separate patch so the "move the code" patch is trivial to review using
> Git's diff colouring.
> 
> Signed-off-by: Brendan Jackman <jackmanb@google.com>

Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

> ---
>  mm/internal.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 2237eee030cba..8ce59c5664497 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -919,8 +919,8 @@ extern bool free_pages_prepare(struct page *page, unsigned int order);
>  
>  extern int user_min_free_kbytes;
>  
> -struct page *__alloc_frozen_pages_noprof(gfp_t, unsigned int order, int nid,
> -		nodemask_t *);
> +struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order, int nid,
> +		nodemask_t *nodemask);
>  #define __alloc_frozen_pages(...) \
>  	alloc_hooks(__alloc_frozen_pages_noprof(__VA_ARGS__))
>  void free_frozen_pages(struct page *page, unsigned int order);
> 



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof()
  2026-06-29 13:11 ` [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof() Brendan Jackman
@ 2026-06-30 13:36   ` Harry Yoo
  2026-06-30 15:34     ` Vlastimil Babka (SUSE)
  2026-06-30 17:04     ` Brendan Jackman
  2026-06-30 16:16   ` Vlastimil Babka (SUSE)
  1 sibling, 2 replies; 39+ messages in thread
From: Harry Yoo @ 2026-06-30 13:36 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Zi Yan,
	Muchun Song, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
	Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Gregory Price, Alexei Starovoitov, Matthew Wilcox, Hao Ge,
	linux-mm, linux-kernel, linux-rt-devel


[-- Attachment #1.1: Type: text/plain, Size: 4198 bytes --]



On 6/29/26 10:11 PM, Brendan Jackman wrote:
> Currently the core allocator code is controlled by ALLOC_NOLOCK, but the
> main entry point function is significantly different from the normal
> __alloc_frozen_pages_nolock(), this is tiring when reading the code.
> 
> Plumb the ALLOC_NOLOCK control one layer up in the call stack: create
> an alloc_flags argument to __alloc_frozen_pages_nolock() (which is only
> exposed to mm/) and then turn the nolock variant into a thin wrapper
> that just sets that flag (as well as handling NUMA_NO_NODE, similar to
> how some of the wrappers in gfp.h do).
> 
> Rationale that this doesn't change anything:
>
> 1. Simple bits: A bunch of the nolock-specific handling is just moved to
>    the new alloc_order_allowed(), alloc_trylock_allowed() and
>    gfp_trylock.

Right.

> 2. __alloc_frozen_pages_noprof() has some extra logic that wasn't
>    previously in the nolock variant:
> 
>    a. Application of gfp_allowed_mask; this only affects early boot, and
>       only flags that affect the slowpath get changed here.

gfp_allowed_mask clears __GFP_RECLAIM, and that means now allocations
with GFP_KERNEL during early boot would see
gfpflags_allow_spinning() = false.

The helper is not used in in the page allocator, but used in
memcg/stackdepot/page_owner.

>    b. Application of current_gfp_context() - also only affects the
>       slowpath

PF_MEMALLOC_PIN affects the fast path, but ALLOC_NOLOCK users
won't be affected.

What about alloc_flags_nofragment/nonblocking()?

> 3. The slowpath itself: this is now just explicitly skipped under
>    !ALLOC_TRYLOCK.

Right.

> Ulterior motive: adding an alloc_flags arg to the allocator's
> mm-internal entrypoint can later be used to do more allocation
> customisation without needing to create new GFP flags.
> 
> While adding this flag to a bunch of places, create ALLOC_DEFAULT to
> avoid a mysterious literal 0 in most places.
>
> alloc_frozen_pages_noprof() is defined above the alloc flags

The function is defined below the alloc flags, no?

> so just leave that as a slightly messy
> exception instead of trying to fully reorder mm/internal.h for that one
> case.
> 
> No functional change intended.
> 
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
>  mm/hugetlb.c    |   3 +-
>  mm/mempolicy.c  |  10 ++--
>  mm/page_alloc.c | 178 +++++++++++++++++++++++++++++---------------------------
>  mm/page_alloc.h |   6 +-
>  mm/slub.c       |   6 +-
>  5 files changed, 108 insertions(+), 95 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a3ba63c7f9199..8d409d075e3e9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5271,24 +5271,98 @@ void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
>  	}
>  }
>  
> +static inline bool alloc_trylock_allowed(void)
> +{
> +	/*
> +	 * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
> +	 * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
> +	 * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
> +	 * mark the task as the owner of another rt_spin_lock which will
> +	 * confuse PI logic, so return immediately if called from hard IRQ or
> +	 * NMI.
> +	 *
> +	 * Note, irqs_disabled() case is ok. This function can be called
> +	 * from raw_spin_lock_irqsave region.
> +	 */
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> +		return false;
> +
> +	/* On UP, spin_trylock() always succeeds even when it is locked */
> +	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
> +		return false;

Except for deferred_pages_enabled(), it's not specific to the page
allocator. SLUB has

	/*
	 * See the comment for the same check in
	 * alloc_frozen_pages_nolock_noprof()
	 */

... and repeats the same thing as above.

Perhaps let's factor it out into a helper
rather than trying not to forget to update the other place?

> +	/* Bailout, since _deferred_grow_zone() needs to take a lock */
> +	if (deferred_pages_enabled())
> +		return false;
> +
> +	return true;
> +}


-- 
Cheers,
Harry / Hyeonggon

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 06/16] mm/page_alloc: relax GFP WARN in nolock allocs
  2026-06-29 13:11 ` [PATCH v3 06/16] mm/page_alloc: relax GFP WARN in nolock allocs Brendan Jackman
@ 2026-06-30 13:52   ` Harry Yoo
  2026-06-30 16:42   ` Vlastimil Babka (SUSE)
  1 sibling, 0 replies; 39+ messages in thread
From: Harry Yoo @ 2026-06-30 13:52 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Zi Yan,
	Muchun Song, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
	Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Gregory Price, Alexei Starovoitov, Matthew Wilcox, Hao Ge,
	linux-mm, linux-kernel, linux-rt-devel


[-- Attachment #1.1: Type: text/plain, Size: 864 bytes --]



On 6/29/26 10:11 PM, Brendan Jackman wrote:
> This WARN forbids setting other flags than __GFP_ACCOUNT but we
> unconditionally set the ones in gfp_nolock so they are certainly fine
> for the caller to set.
> 
> There are other GFP flags that are almost certainly fine to set here;
> Willy noted GFP_HIGHMEM, GFP_DMA, GFP_MOVABLE and GFP_HARDWALL. But,
> nolock allocation is rather special, so be conservative to try and
> ensure we have a chance to think carefully before nontrivial new
> usecases arise.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Link: https://lore.kernel.org/linux-mm/ajS96fWbG4dzP3u3@casper.infradead.org/
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---

Acked-by: Harry Yoo (Oracle) <harry@kernel.org>

-- 
Cheers,
Harry / Hyeonggon

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 04/16] mm: Split out internal page_alloc.h
  2026-06-29 13:11 ` [PATCH v3 04/16] mm: Split out internal page_alloc.h Brendan Jackman
@ 2026-06-30 13:54   ` Vlastimil Babka (SUSE)
  0 siblings, 0 replies; 39+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-30 13:54 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel

On 6/29/26 15:11, Brendan Jackman wrote:
> internal.h is a bit bloated, seems like time for a page_alloc.h.
> 
> Where it wasn't obvious, the heuristic for deciding what goes into this
> new header was "does it support/correspond to a definition in
> mm/page_alloc.c?"
> 
> Only need to include it from 15 .c files out of 164 so this does seem
> like a genuine reduction in scopes, which is nice. And there's no
> circular internal.h<->page_alloc.h dependency, so it seems worthwhile to
> split this up before that inevitably emerges!
> 
> Suggested-by: "David Hildenbrand (Arm)" <david@kernel.org>
> Link: https://lore.kernel.org/all/41e92bab-6882-401a-8de9-154adbdcfb36@kernel.org/
> Signed-off-by: Brendan Jackman <jackmanb@google.com>

Cool.

Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof()
  2026-06-30 13:36   ` Harry Yoo
@ 2026-06-30 15:34     ` Vlastimil Babka (SUSE)
  2026-06-30 16:56       ` Brendan Jackman
  2026-06-30 17:04     ` Brendan Jackman
  1 sibling, 1 reply; 39+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-30 15:34 UTC (permalink / raw)
  To: Harry Yoo, Brendan Jackman, Andrew Morton, Suren Baghdasaryan,
	Michal Hocko, Johannes Weiner, Zi Yan, Muchun Song,
	Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
	Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Gregory Price, Alexei Starovoitov, Matthew Wilcox, Hao Ge,
	linux-mm, linux-kernel, linux-rt-devel

On 6/30/26 15:36, Harry Yoo wrote:
> 
> 
> On 6/29/26 10:11 PM, Brendan Jackman wrote:
>> Currently the core allocator code is controlled by ALLOC_NOLOCK, but the
>> main entry point function is significantly different from the normal
>> __alloc_frozen_pages_nolock(), this is tiring when reading the code.
>> 
>> Plumb the ALLOC_NOLOCK control one layer up in the call stack: create
>> an alloc_flags argument to __alloc_frozen_pages_nolock() (which is only
>> exposed to mm/) and then turn the nolock variant into a thin wrapper
>> that just sets that flag (as well as handling NUMA_NO_NODE, similar to
>> how some of the wrappers in gfp.h do).
>> 
>> Rationale that this doesn't change anything:
>>
>> 1. Simple bits: A bunch of the nolock-specific handling is just moved to
>>    the new alloc_order_allowed(), alloc_trylock_allowed() and
>>    gfp_trylock.
> 
> Right.
> 
>> 2. __alloc_frozen_pages_noprof() has some extra logic that wasn't
>>    previously in the nolock variant:
>> 
>>    a. Application of gfp_allowed_mask; this only affects early boot, and
>>       only flags that affect the slowpath get changed here.
> 
> gfp_allowed_mask clears __GFP_RECLAIM, and that means now allocations
> with GFP_KERNEL during early boot would see
> gfpflags_allow_spinning() = false.

Is it a problem though? non-nolock allocations were affected before (the
masking existed for those already) and will be affected now the same, and
_nolock() allocations don't pass __GFP_RECLAIM in the first place, so the
masking can't affect them?

> The helper is not used in in the page allocator, but used in
> memcg/stackdepot/page_owner.
> 
>>    b. Application of current_gfp_context() - also only affects the
>>       slowpath
> 
> PF_MEMALLOC_PIN affects the fast path, but ALLOC_NOLOCK users
> won't be affected.

And it wouldn't be wrong if they were? It only clears __GFP_MOVABLE?

> What about alloc_flags_nofragment/nonblocking()?

ALLOC_NOFRAGMENT due to e.g. defrag_mode could be a problem indeed, if
there's no slowpath. Make ALLOC_NOLOCK override it?

nonblocking() is probably fine?

> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof()
  2026-06-29 13:11 ` [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof() Brendan Jackman
  2026-06-30 13:36   ` Harry Yoo
@ 2026-06-30 16:16   ` Vlastimil Babka (SUSE)
  2026-06-30 18:47     ` Brendan Jackman
  1 sibling, 1 reply; 39+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-30 16:16 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel

On 6/29/26 15:11, Brendan Jackman wrote:
> Currently the core allocator code is controlled by ALLOC_NOLOCK, but the
> main entry point function is significantly different from the normal

Let's mention it explicitly, alloc_frozen_pages_nolock_noprof().

> __alloc_frozen_pages_nolock(), this is tiring when reading the code.

You mean __alloc_frozen_pages_noprof()?

> 
> Plumb the ALLOC_NOLOCK control one layer up in the call stack: create
> an alloc_flags argument to __alloc_frozen_pages_nolock() (which is only

Again __alloc_frozen_pages_noprof()

> exposed to mm/) and then turn the nolock variant into a thin wrapper
> that just sets that flag (as well as handling NUMA_NO_NODE, similar to
> how some of the wrappers in gfp.h do).
> 
> Rationale that this doesn't change anything:
> 
> 1. Simple bits: A bunch of the nolock-specific handling is just moved to
>    the new alloc_order_allowed(), alloc_trylock_allowed() and
>    gfp_trylock.

Should be alloc_nolock_allowed() and gfp_nolock

> 2. __alloc_frozen_pages_noprof() has some extra logic that wasn't
>    previously in the nolock variant:
> 
>    a. Application of gfp_allowed_mask; this only affects early boot, and
>       only flags that affect the slowpath get changed here.

As discussed in reply to Harry, I'd mention the flags excluded by
GFP_BOOT_MASK are not usable by _nolock() anyway.

>    b. Application of current_gfp_context() - also only affects the
>       slowpath
> 
> 3. The slowpath itself: this is now just explicitly skipped under
>    !ALLOC_TRYLOCK.

ALLOC_NOLOCK.

> 
> Ulterior motive: adding an alloc_flags arg to the allocator's
> mm-internal entrypoint can later be used to do more allocation
> customisation without needing to create new GFP flags.
> 
> While adding this flag to a bunch of places, create ALLOC_DEFAULT to
> avoid a mysterious literal 0 in most places.


> alloc_frozen_pages_noprof()
> is defined above the alloc flags so just leave that as a slightly messy
> exception instead of trying to fully reorder mm/internal.h for that one
> case.

This no longer applies in v3?

> No functional change intended.
> 
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
>  mm/hugetlb.c    |   3 +-
>  mm/mempolicy.c  |  10 ++--
>  mm/page_alloc.c | 178 +++++++++++++++++++++++++++++---------------------------
>  mm/page_alloc.h |   6 +-
>  mm/slub.c       |   6 +-
>  5 files changed, 108 insertions(+), 95 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f7925624c4d2e..dfcfcfa4715bf 100644

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a3ba63c7f9199..8d409d075e3e9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5222,7 +5222,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid,
>  		}
>  		nr_account++;
>  
> -		prep_new_page(page, 0, gfp, 0);
> +		prep_new_page(page, 0, gfp, ALLOC_DEFAULT);
>  		set_page_refcounted(page);
>  		page_array[nr_populated++] = page;
>  	}
> @@ -5271,24 +5271,98 @@ void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
>  	}
>  }
>  
> -/*
> - * This is the 'heart' of the zoned buddy allocator.
> - */
> -struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
> -		int preferred_nid, nodemask_t *nodemask)
> +static inline bool alloc_order_allowed(gfp_t gfp, unsigned int order,
> +				       unsigned int alloc_flags)
>  {
> -	struct page *page;
> -	unsigned int fastpath_alloc_flags = ALLOC_WMARK_LOW;
> -	gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
> -	struct alloc_context ac = { };
> +	if (alloc_flags & ALLOC_NOLOCK)
> +		return pcp_allowed_order(order);
>  
>  	/*
>  	 * There are several places where we assume that the order value is sane
>  	 * so bail out early if the request is out of bound.
>  	 */
> -	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
> +	return !(WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp));
> +}
> +
> +static inline bool alloc_trylock_allowed(void)

alloc_nolock_allowed()

> +{
> +	/*
> +	 * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
> +	 * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
> +	 * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
> +	 * mark the task as the owner of another rt_spin_lock which will
> +	 * confuse PI logic, so return immediately if called from hard IRQ or
> +	 * NMI.
> +	 *
> +	 * Note, irqs_disabled() case is ok. This function can be called
> +	 * from raw_spin_lock_irqsave region.
> +	 */
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> +		return false;
> +
> +	/* On UP, spin_trylock() always succeeds even when it is locked */
> +	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
> +		return false;
> +
> +	/* Bailout, since _deferred_grow_zone() needs to take a lock */
> +	if (deferred_pages_enabled())
> +		return false;
> +
> +	return true;
> +}
> +
> +/*
> + * GFP flags to set for ALLOC_NOLOCK i.e. alloc_pages_nolock().
> + *
> + * Do not specify __GFP_DIRECT_RECLAIM, since direct claim is not allowed.
> + * Do not specify __GFP_KSWAPD_RECLAIM either, since wake up of kswapd
> + * is not safe in arbitrary context.
> + *
> + * These two are the conditions for gfpflags_allow_spinning() being true.
> + *
> + * Specify __GFP_NOWARN since failing alloc_pages_nolock() is not a reason
> + * to warn. Also warn would trigger printk() which is unsafe from
> + * various contexts. We cannot use printk_deferred_enter() to mitigate,
> + * since the running context is unknown.
> + *
> + * Specify __GFP_ZERO to make sure that call to kmsan_alloc_page() below
> + * is safe in any context. Also zeroing the page is mandatory for
> + * BPF use cases.
> + *
> + * Though __GFP_NOMEMALLOC is not checked in the code path below,
> + * specify it here to highlight that alloc_pages_nolock()
> + * doesn't want to deplete reserves.
> + */
> +static const gfp_t gfp_nolock = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC |
> +				__GFP_COMP;
> +
> +/*
> + * This is the 'heart' of the zoned buddy allocator.
> + */
> +struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
> +		int preferred_nid, nodemask_t *nodemask, unsigned int alloc_flags)
> +{
> +	struct page *page;
> +	gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
> +	struct alloc_context ac = { };
> +	unsigned int fastpath_alloc_flags = alloc_flags;
> +
> +	/* Other flags could be supported later if needed. */
> +	if (WARN_ON(alloc_flags & ~ALLOC_NOLOCK))
>  		return NULL;
>  
> +	if (!alloc_order_allowed(gfp, order, alloc_flags))
> +		return NULL;
> +
> +	if (alloc_flags & ALLOC_NOLOCK) {
> +		VM_WARN_ON_ONCE(gfp & ~__GFP_ACCOUNT);
> +		if (!alloc_trylock_allowed())
> +			return NULL;
> +		gfp |= gfp_nolock;

I think we could do a
		fastpath_alloc_flags |= ALLOC_WMARK_MIN;

to make it explicit, even though it's a no-op (the value is 0) and
alloc_frozen_pages_nolock_noprof() didn't do it.

> +	} else {
> +		fastpath_alloc_flags |= ALLOC_WMARK_LOW;
> +	}
> +
>  	gfp &= gfp_allowed_mask;
>  	/*
>  	 * Apply scoped allocation constraints. This is mainly about GFP_NOFS
> @@ -5310,9 +5384,9 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
>  	fastpath_alloc_flags |= alloc_flags_nofragment(zonelist_zone(ac.preferred_zoneref), gfp);
>  	fastpath_alloc_flags |= alloc_flags_nonblocking(gfp, order) & ALLOC_HIGHATOMIC;
>  
> -	/* First allocation attempt */
> +	/* First allocation attempt (or, for nolock, only attempt) */
>  	page = get_page_from_freelist(alloc_gfp, order, fastpath_alloc_flags, &ac);
> -	if (likely(page))
> +	if (likely(page) || (alloc_flags & ALLOC_NOLOCK))
>  		goto out;
>  
>  	alloc_gfp = gfp;
> @@ -5329,7 +5403,8 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
>  out:
>  	if (memcg_kmem_online() && (gfp & __GFP_ACCOUNT) && page &&
>  	    unlikely(__memcg_kmem_charge_page(page, gfp, order) != 0)) {
> -		free_frozen_pages(page, order);
> +		__free_frozen_pages(page, order,
> +				    alloc_flags & ALLOC_NOLOCK ? FPI_TRYLOCK : 0);
>  		page = NULL;
>  	}
>  
> @@ -5345,7 +5420,8 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
>  {
>  	struct page *page;
>  
> -	page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid, nodemask);
> +	page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid, nodemask,
> +					   ALLOC_DEFAULT);
>  	if (page)
>  		set_page_refcounted(page);
>  	return page;
> @@ -7875,80 +7951,10 @@ static bool __free_unaccepted(struct page *page)
>  
>  struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned int order)
>  {
> -	/*
> -	 * Do not specify __GFP_DIRECT_RECLAIM, since direct claim is not allowed.
> -	 * Do not specify __GFP_KSWAPD_RECLAIM either, since wake up of kswapd
> -	 * is not safe in arbitrary context.
> -	 *
> -	 * These two are the conditions for gfpflags_allow_spinning() being true.
> -	 *
> -	 * Specify __GFP_NOWARN since failing alloc_pages_nolock() is not a reason
> -	 * to warn. Also warn would trigger printk() which is unsafe from
> -	 * various contexts. We cannot use printk_deferred_enter() to mitigate,
> -	 * since the running context is unknown.
> -	 *
> -	 * Specify __GFP_ZERO to make sure that call to kmsan_alloc_page() below
> -	 * is safe in any context. Also zeroing the page is mandatory for
> -	 * BPF use cases.
> -	 *
> -	 * Though __GFP_NOMEMALLOC is not checked in the code path below,
> -	 * specify it here to highlight that alloc_pages_nolock()
> -	 * doesn't want to deplete reserves.
> -	 */
> -	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC | __GFP_COMP
> -			| gfp_flags;
> -	unsigned int alloc_flags = ALLOC_NOLOCK;
> -	struct alloc_context ac = { };
> -	struct page *page;
> -
> -	VM_WARN_ON_ONCE(gfp_flags & ~__GFP_ACCOUNT);
> -	/*
> -	 * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
> -	 * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
> -	 * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
> -	 * mark the task as the owner of another rt_spin_lock which will
> -	 * confuse PI logic, so return immediately if called from hard IRQ or
> -	 * NMI.
> -	 *
> -	 * Note, irqs_disabled() case is ok. This function can be called
> -	 * from raw_spin_lock_irqsave region.
> -	 */
> -	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> -		return NULL;
> -
> -	/* On UP, spin_trylock() always succeeds even when it is locked */
> -	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
> -		return NULL;
> -
> -	if (!pcp_allowed_order(order))
> -		return NULL;
> -
> -	/* Bailout, since _deferred_grow_zone() needs to take a lock */
> -	if (deferred_pages_enabled())
> -		return NULL;
> -
>  	if (nid == NUMA_NO_NODE)
>  		nid = numa_node_id();
>  
> -	prepare_alloc_pages(alloc_gfp, order, nid, NULL, &ac,
> -			    &alloc_gfp, &alloc_flags);
> -
> -	/*
> -	 * Best effort allocation from percpu free list.
> -	 * If it's empty attempt to spin_trylock zone->lock.
> -	 */
> -	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
> -
> -	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
> -
> -	if (memcg_kmem_online() && page && (gfp_flags & __GFP_ACCOUNT) &&
> -	    unlikely(__memcg_kmem_charge_page(page, alloc_gfp, order) != 0)) {
> -		__free_frozen_pages(page, order, FPI_TRYLOCK);
> -		page = NULL;
> -	}
> -	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
> -	kmsan_alloc_page(page, order, alloc_gfp);
> -	return page;
> +	return __alloc_frozen_pages_noprof(gfp_flags, order, nid, NULL, ALLOC_NOLOCK);
>  }
>  /**
>   * alloc_pages_nolock - opportunistic reentrant allocation from any context
> diff --git a/mm/page_alloc.h b/mm/page_alloc.h
> index 3250d44f96457..e16f905f859a7 100644
> --- a/mm/page_alloc.h
> +++ b/mm/page_alloc.h
> @@ -11,6 +11,7 @@
>  #include <linux/nodemask.h>
>  #include <linux/types.h>
>  
> +#define ALLOC_DEFAULT		0
>  /* The ALLOC_WMARK bits are used as an index to zone->watermark */
>  #define ALLOC_WMARK_MIN		WMARK_MIN
>  #define ALLOC_WMARK_LOW		WMARK_LOW
> @@ -219,7 +220,7 @@ extern bool free_pages_prepare(struct page *page, unsigned int order);
>  extern int user_min_free_kbytes;
>  
>  struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order, int nid,
> -		nodemask_t *nodemask);
> +		nodemask_t *nodemask, unsigned int alloc_flags);
>  #define __alloc_frozen_pages(...) \
>  	alloc_hooks(__alloc_frozen_pages_noprof(__VA_ARGS__))
>  void free_frozen_pages(struct page *page, unsigned int order);
> @@ -230,7 +231,8 @@ struct page *alloc_frozen_pages_noprof(gfp_t, unsigned int order);
>  #else
>  static inline struct page *alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order)
>  {
> -	return __alloc_frozen_pages_noprof(gfp, order, numa_node_id(), NULL);
> +	return __alloc_frozen_pages_noprof(gfp, order, numa_node_id(), NULL,
> +					   0 /* ALLOC_DEFAULT */);

Can use ALLOC_DEFAULT now.

>  }
>  #endif
>  


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 06/16] mm/page_alloc: relax GFP WARN in nolock allocs
  2026-06-29 13:11 ` [PATCH v3 06/16] mm/page_alloc: relax GFP WARN in nolock allocs Brendan Jackman
  2026-06-30 13:52   ` Harry Yoo
@ 2026-06-30 16:42   ` Vlastimil Babka (SUSE)
  1 sibling, 0 replies; 39+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-30 16:42 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel

On 6/29/26 15:11, Brendan Jackman wrote:
> This WARN forbids setting other flags than __GFP_ACCOUNT but we
> unconditionally set the ones in gfp_nolock so they are certainly fine
> for the caller to set.
> 
> There are other GFP flags that are almost certainly fine to set here;
> Willy noted GFP_HIGHMEM, GFP_DMA, GFP_MOVABLE and GFP_HARDWALL. But,
> nolock allocation is rather special, so be conservative to try and
> ensure we have a chance to think carefully before nontrivial new
> usecases arise.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Link: https://lore.kernel.org/linux-mm/ajS96fWbG4dzP3u3@casper.infradead.org/
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>

Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

> ---
>  mm/page_alloc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 8d409d075e3e9..9cb3f1665b41b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5355,7 +5355,8 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
>  		return NULL;
>  
>  	if (alloc_flags & ALLOC_NOLOCK) {
> -		VM_WARN_ON_ONCE(gfp & ~__GFP_ACCOUNT);
> +		/* Certain other flags could be supported later if needed. */
> +		VM_WARN_ON_ONCE(gfp & ~(__GFP_ACCOUNT | gfp_nolock));
>  		if (!alloc_trylock_allowed())
>  			return NULL;
>  		gfp |= gfp_nolock;
> 



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 07/16] mm: move some stuff to mm/page_alloc.h
  2026-06-29 13:11 ` [PATCH v3 07/16] mm: move some stuff to mm/page_alloc.h Brendan Jackman
@ 2026-06-30 16:42   ` Vlastimil Babka (SUSE)
  0 siblings, 0 replies; 39+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-30 16:42 UTC (permalink / raw)
  To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
	Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
	Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel

On 6/29/26 15:11, Brendan Jackman wrote:
> Some of this stuff in the public header is only used internally so
> shrink the scope to avoid silently growing new users.
> 
> drain_local_pages() is still used from kernel/power/snapshot.c so that
> needs to stay behind.
> 
> Signed-off-by: Brendan Jackman <jackmanb@google.com>

Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

> ---
>  include/linux/gfp.h | 26 --------------------------
>  mm/page_alloc.h     | 28 ++++++++++++++++++++++++++++
>  mm/vmstat.c         |  1 +
>  3 files changed, 29 insertions(+), 26 deletions(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index cdf95a9f0b87c..01d6d2591f49e 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -17,28 +17,6 @@ struct mempolicy;
>  #define __default_gfp(a,b,...) b
>  #define default_gfp(...) __default_gfp(,##__VA_ARGS__,GFP_KERNEL)
>  
> -/* Convert GFP flags to their corresponding migrate type */
> -#define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
> -#define GFP_MOVABLE_SHIFT 3
> -
> -static inline int gfp_migratetype(const gfp_t gfp_flags)
> -{
> -	VM_WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
> -	BUILD_BUG_ON((1UL << GFP_MOVABLE_SHIFT) != ___GFP_MOVABLE);
> -	BUILD_BUG_ON((___GFP_MOVABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_MOVABLE);
> -	BUILD_BUG_ON((___GFP_RECLAIMABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_RECLAIMABLE);
> -	BUILD_BUG_ON(((___GFP_MOVABLE | ___GFP_RECLAIMABLE) >>
> -		      GFP_MOVABLE_SHIFT) != MIGRATE_HIGHATOMIC);
> -
> -	if (unlikely(page_group_by_mobility_disabled))
> -		return MIGRATE_UNMOVABLE;
> -
> -	/* Group based on mobility */
> -	return (__force unsigned long)(gfp_flags & GFP_MOVABLE_MASK) >> GFP_MOVABLE_SHIFT;
> -}
> -#undef GFP_MOVABLE_MASK
> -#undef GFP_MOVABLE_SHIFT
> -
>  static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
>  {
>  	return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
> @@ -395,10 +373,6 @@ extern void free_pages(unsigned long addr, unsigned int order);
>  #define __free_page(page) __free_pages((page), 0)
>  #define free_page(addr) free_pages((addr), 0)
>  
> -void page_alloc_init_cpuhp(void);
> -bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp);
> -void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> -void drain_all_pages(struct zone *zone);
>  void drain_local_pages(struct zone *zone);
>  
>  void page_alloc_init_late(void);
> diff --git a/mm/page_alloc.h b/mm/page_alloc.h
> index e16f905f859a7..af83764788b96 100644
> --- a/mm/page_alloc.h
> +++ b/mm/page_alloc.h
> @@ -266,6 +266,34 @@ static inline bool free_area_empty(struct free_area *area, int migratetype)
>  	return list_empty(&area->free_list[migratetype]);
>  }
>  
> +/* Convert GFP flags to their corresponding migrate type */
> +#define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
> +#define GFP_MOVABLE_SHIFT 3
> +
> +static inline int gfp_migratetype(const gfp_t gfp_flags)
> +{
> +	VM_WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
> +	BUILD_BUG_ON((1UL << GFP_MOVABLE_SHIFT) != ___GFP_MOVABLE);
> +	BUILD_BUG_ON((___GFP_MOVABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_MOVABLE);
> +	BUILD_BUG_ON((___GFP_RECLAIMABLE >> GFP_MOVABLE_SHIFT) != MIGRATE_RECLAIMABLE);
> +	BUILD_BUG_ON(((___GFP_MOVABLE | ___GFP_RECLAIMABLE) >>
> +		      GFP_MOVABLE_SHIFT) != MIGRATE_HIGHATOMIC);
> +
> +	if (unlikely(page_group_by_mobility_disabled))
> +		return MIGRATE_UNMOVABLE;
> +
> +	/* Group based on mobility */
> +	return (__force unsigned long)(gfp_flags & GFP_MOVABLE_MASK) >> GFP_MOVABLE_SHIFT;
> +}
> +#undef GFP_MOVABLE_MASK
> +#undef GFP_MOVABLE_SHIFT
> +
> +bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp);
> +void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> +void drain_all_pages(struct zone *zone);
> +void drain_local_pages(struct zone *zone);
> +
> +void page_alloc_init_cpuhp(void);
>  void page_alloc_sysctl_init(void);
>  
>  #endif /* __MM_PAGE_ALLOC_H */
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 7b93fbf9af092..3b5cb1031f720 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -30,6 +30,7 @@
>  #include <linux/sched/isolation.h>
>  
>  #include "internal.h"
> +#include "page_alloc.h"
>  
>  #ifdef CONFIG_PROC_FS
>  #ifdef CONFIG_NUMA
> 



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof()
  2026-06-30 15:34     ` Vlastimil Babka (SUSE)
@ 2026-06-30 16:56       ` Brendan Jackman
  0 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-30 16:56 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE), Harry Yoo, Brendan Jackman, Andrew Morton,
	Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Zi Yan,
	Muchun Song, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
	Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Gregory Price, Alexei Starovoitov, Matthew Wilcox, Hao Ge,
	linux-mm, linux-kernel, linux-rt-devel

On Tue Jun 30, 2026 at 3:34 PM UTC, Vlastimil Babka (SUSE) wrote:
> On 6/30/26 15:36, Harry Yoo wrote:
>> 
>> 
>> On 6/29/26 10:11 PM, Brendan Jackman wrote:
>>> Currently the core allocator code is controlled by ALLOC_NOLOCK, but the
>>> main entry point function is significantly different from the normal
>>> __alloc_frozen_pages_nolock(), this is tiring when reading the code.
>>> 
>>> Plumb the ALLOC_NOLOCK control one layer up in the call stack: create
>>> an alloc_flags argument to __alloc_frozen_pages_nolock() (which is only
>>> exposed to mm/) and then turn the nolock variant into a thin wrapper
>>> that just sets that flag (as well as handling NUMA_NO_NODE, similar to
>>> how some of the wrappers in gfp.h do).
>>> 
>>> Rationale that this doesn't change anything:
>>>
>>> 1. Simple bits: A bunch of the nolock-specific handling is just moved to
>>>    the new alloc_order_allowed(), alloc_trylock_allowed() and
>>>    gfp_trylock.
>> 
>> Right.
>> 
>>> 2. __alloc_frozen_pages_noprof() has some extra logic that wasn't
>>>    previously in the nolock variant:
>>> 
>>>    a. Application of gfp_allowed_mask; this only affects early boot, and
>>>       only flags that affect the slowpath get changed here.
>> 
>> gfp_allowed_mask clears __GFP_RECLAIM, and that means now allocations
>> with GFP_KERNEL during early boot would see
>> gfpflags_allow_spinning() = false.
>
> Is it a problem though? non-nolock allocations were affected before (the
> masking existed for those already) and will be affected now the same, and
> _nolock() allocations don't pass __GFP_RECLAIM in the first place, so the
> masking can't affect them?

This was my thinking too.

>> The helper is not used in in the page allocator, but used in
>> memcg/stackdepot/page_owner.
>> 
>>>    b. Application of current_gfp_context() - also only affects the
>>>       slowpath
>> 
>> PF_MEMALLOC_PIN affects the fast path, but ALLOC_NOLOCK users
>> won't be affected.
>
> And it wouldn't be wrong if they were? It only clears __GFP_MOVABLE?
>
>> What about alloc_flags_nofragment/nonblocking()?
>
> ALLOC_NOFRAGMENT due to e.g. defrag_mode could be a problem indeed, if
> there's no slowpath. Make ALLOC_NOLOCK override it?

Yeah calling alloc_flags_nofragment() here is a bug in the patch,
and Sashiko also complained: 

https://lore.kernel.org/all/20260629142921.9A05A1F000E9@smtp.kernel.org/

Like I said in the reply to that thread I think maybe we _do_ want to
set ALLOC_NOFRAGMENT for nolock allocations? But, that is a functional
change, it doesn't belong in this series.

> nonblocking() is probably fine?

Yeah, I believe this is fine.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof()
  2026-06-30 13:36   ` Harry Yoo
  2026-06-30 15:34     ` Vlastimil Babka (SUSE)
@ 2026-06-30 17:04     ` Brendan Jackman
  1 sibling, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-30 17:04 UTC (permalink / raw)
  To: Harry Yoo, Brendan Jackman, Andrew Morton, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Zi Yan,
	Muchun Song, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
	Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Gregory Price, Alexei Starovoitov, Matthew Wilcox, Hao Ge,
	linux-mm, linux-kernel, linux-rt-devel

On Tue Jun 30, 2026 at 1:36 PM UTC, Harry Yoo wrote:
>> Ulterior motive: adding an alloc_flags arg to the allocator's
>> mm-internal entrypoint can later be used to do more allocation
>> customisation without needing to create new GFP flags.
>> 
>> While adding this flag to a bunch of places, create ALLOC_DEFAULT to
>> avoid a mysterious literal 0 in most places.
>>
>> alloc_frozen_pages_noprof() is defined above the alloc flags
>
> The function is defined below the alloc flags, no?

Yep this paragraph is stale since I created mm/page_alloc.h, will remove
it.

>> so just leave that as a slightly messy
>> exception instead of trying to fully reorder mm/internal.h for that one
>> case.
>> 
>> No functional change intended.
>> 
>> Signed-off-by: Brendan Jackman <jackmanb@google.com>
>> ---
>>  mm/hugetlb.c    |   3 +-
>>  mm/mempolicy.c  |  10 ++--
>>  mm/page_alloc.c | 178 +++++++++++++++++++++++++++++---------------------------
>>  mm/page_alloc.h |   6 +-
>>  mm/slub.c       |   6 +-
>>  5 files changed, 108 insertions(+), 95 deletions(-)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index a3ba63c7f9199..8d409d075e3e9 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5271,24 +5271,98 @@ void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
>>  	}
>>  }
>>  
>> +static inline bool alloc_trylock_allowed(void)
>> +{
>> +	/*
>> +	 * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
>> +	 * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
>> +	 * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
>> +	 * mark the task as the owner of another rt_spin_lock which will
>> +	 * confuse PI logic, so return immediately if called from hard IRQ or
>> +	 * NMI.
>> +	 *
>> +	 * Note, irqs_disabled() case is ok. This function can be called
>> +	 * from raw_spin_lock_irqsave region.
>> +	 */
>> +	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
>> +		return false;
>> +
>> +	/* On UP, spin_trylock() always succeeds even when it is locked */
>> +	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
>> +		return false;
>
> Except for deferred_pages_enabled(), it's not specific to the page
> allocator. SLUB has
>
> 	/*
> 	 * See the comment for the same check in
> 	 * alloc_frozen_pages_nolock_noprof()
> 	 */
>
> ... and repeats the same thing as above.
>
> Perhaps let's factor it out into a helper
> rather than trying not to forget to update the other place?

Hm, not sure about this. I think I would say it's a "coincidence" that
these two bits of code look the same? Like, page_alloc.c uses
spin_trylock() so you can't do alloc_pages_nolock() from IRQ on
PREEMPT_RT. slub.c ALSO uses spin_trylock(), so you ALSO can't use
kmalloc_nolock() in those scenarios. But those are two different facts
that just happen to be isomorphic? Putting them into a shared helper
would kinda imply that these are part of a single system with inherently
coupled constraints.

I dunno I'm being a bit of a ponderous philosopher there, I don't have
particularly strong feelings. But I'd lean towards leaving this out of
the patchset since the potential deduplication isn't really related to
the other cleanups anyway.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 02/16] mm/page_alloc: some renames to clarify alloc_flags scopes
  2026-06-30 12:38   ` Vlastimil Babka (SUSE)
@ 2026-06-30 17:25     ` Brendan Jackman
  0 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-30 17:25 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE), Brendan Jackman, Andrew Morton,
	Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Zi Yan,
	Muchun Song, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
	Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel

On Tue Jun 30, 2026 at 12:38 PM UTC, Vlastimil Babka (SUSE) wrote:
> On 6/29/26 15:11, Brendan Jackman wrote:
>> It's pretty confusing that:
>> 
>> - The slowpath and fastpath have a totally distinct set of alloc_flags.
>> 
>> - gfp_to_alloc_flags() sounds generic but it only influences the
>>   slowpath.
>> 
>> Rename some variables to highlight which alloc_flags are
>> fastpath-specific. Rename gfp_to_alloc_flags() to highlight that it's
>> slowpath-specific.
>> 
>> gfp_to_alloc_flags_cma() and gfp_to_alloc_flags_nonblocking() currently
>> have perfectly harmless names, but to keep the naming consistent also
>> rename those to the alloc_flags_*() pattern (which already exists for
>> alloc_flags_nofragment()).
>
> How annoying that alloc_flags_nofragment() doesn't have gfp as the first
> parameter, unlike others.
> Oh well, must resist too much OCD :)
>
> Uh, more annoyingly, alloc_flags_cma() takes alloc_flags and returns
> augmented alloc flags, so there's stuff like
>
> *alloc_flags = alloc_flags_cma(gfp_mask, *alloc_flags);
>
> Since we're unifying, it could be make to work additively like others? Then:
>
> *alloc_flags |= alloc_flags_cma(gfp_mask);

Sure, I can chuck this on as an extra patch.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof()
  2026-06-30 16:16   ` Vlastimil Babka (SUSE)
@ 2026-06-30 18:47     ` Brendan Jackman
  0 siblings, 0 replies; 39+ messages in thread
From: Brendan Jackman @ 2026-06-30 18:47 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE), Brendan Jackman, Andrew Morton,
	Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Zi Yan,
	Muchun Song, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
	Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
  Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
	Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel

On Tue Jun 30, 2026 at 4:16 PM UTC, Vlastimil Babka (SUSE) wrote:
> On 6/29/26 15:11, Brendan Jackman wrote:
>> Currently the core allocator code is controlled by ALLOC_NOLOCK, but the
>> main entry point function is significantly different from the normal
>
> Let's mention it explicitly, alloc_frozen_pages_nolock_noprof().
>
>> __alloc_frozen_pages_nolock(), this is tiring when reading the code.
>
> You mean __alloc_frozen_pages_noprof()?
>
>> 
>> Plumb the ALLOC_NOLOCK control one layer up in the call stack: create
>> an alloc_flags argument to __alloc_frozen_pages_nolock() (which is only
>
> Again __alloc_frozen_pages_noprof()
>
>> exposed to mm/) and then turn the nolock variant into a thin wrapper
>> that just sets that flag (as well as handling NUMA_NO_NODE, similar to
>> how some of the wrappers in gfp.h do).
>> 
>> Rationale that this doesn't change anything:
>> 
>> 1. Simple bits: A bunch of the nolock-specific handling is just moved to
>>    the new alloc_order_allowed(), alloc_trylock_allowed() and
>>    gfp_trylock.
>
> Should be alloc_nolock_allowed() and gfp_nolock
>
>> 2. __alloc_frozen_pages_noprof() has some extra logic that wasn't
>>    previously in the nolock variant:
>> 
>>    a. Application of gfp_allowed_mask; this only affects early boot, and
>>       only flags that affect the slowpath get changed here.
>
> As discussed in reply to Harry, I'd mention the flags excluded by
> GFP_BOOT_MASK are not usable by _nolock() anyway.
>
>>    b. Application of current_gfp_context() - also only affects the
>>       slowpath
>> 
>> 3. The slowpath itself: this is now just explicitly skipped under
>>    !ALLOC_TRYLOCK.
>
> ALLOC_NOLOCK.
>
>> 
>> Ulterior motive: adding an alloc_flags arg to the allocator's
>> mm-internal entrypoint can later be used to do more allocation
>> customisation without needing to create new GFP flags.
>> 
>> While adding this flag to a bunch of places, create ALLOC_DEFAULT to
>> avoid a mysterious literal 0 in most places.
>
>
>> alloc_frozen_pages_noprof()
>> is defined above the alloc flags so just leave that as a slightly messy
>> exception instead of trying to fully reorder mm/internal.h for that one
>> case.
>
> This no longer applies in v3?
>
>> No functional change intended.
>> 
>> Signed-off-by: Brendan Jackman <jackmanb@google.com>
>> ---
>>  mm/hugetlb.c    |   3 +-
>>  mm/mempolicy.c  |  10 ++--
>>  mm/page_alloc.c | 178 +++++++++++++++++++++++++++++---------------------------
>>  mm/page_alloc.h |   6 +-
>>  mm/slub.c       |   6 +-
>>  5 files changed, 108 insertions(+), 95 deletions(-)
>> 
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index f7925624c4d2e..dfcfcfa4715bf 100644
>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index a3ba63c7f9199..8d409d075e3e9 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5222,7 +5222,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid,
>>  		}
>>  		nr_account++;
>>  
>> -		prep_new_page(page, 0, gfp, 0);
>> +		prep_new_page(page, 0, gfp, ALLOC_DEFAULT);
>>  		set_page_refcounted(page);
>>  		page_array[nr_populated++] = page;
>>  	}
>> @@ -5271,24 +5271,98 @@ void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
>>  	}
>>  }
>>  
>> -/*
>> - * This is the 'heart' of the zoned buddy allocator.
>> - */
>> -struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
>> -		int preferred_nid, nodemask_t *nodemask)
>> +static inline bool alloc_order_allowed(gfp_t gfp, unsigned int order,
>> +				       unsigned int alloc_flags)
>>  {
>> -	struct page *page;
>> -	unsigned int fastpath_alloc_flags = ALLOC_WMARK_LOW;
>> -	gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
>> -	struct alloc_context ac = { };
>> +	if (alloc_flags & ALLOC_NOLOCK)
>> +		return pcp_allowed_order(order);
>>  
>>  	/*
>>  	 * There are several places where we assume that the order value is sane
>>  	 * so bail out early if the request is out of bound.
>>  	 */
>> -	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
>> +	return !(WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp));
>> +}
>> +
>> +static inline bool alloc_trylock_allowed(void)
>
> alloc_nolock_allowed()
>
>> +{
>> +	/*
>> +	 * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
>> +	 * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
>> +	 * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
>> +	 * mark the task as the owner of another rt_spin_lock which will
>> +	 * confuse PI logic, so return immediately if called from hard IRQ or
>> +	 * NMI.
>> +	 *
>> +	 * Note, irqs_disabled() case is ok. This function can be called
>> +	 * from raw_spin_lock_irqsave region.
>> +	 */
>> +	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
>> +		return false;
>> +
>> +	/* On UP, spin_trylock() always succeeds even when it is locked */
>> +	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
>> +		return false;
>> +
>> +	/* Bailout, since _deferred_grow_zone() needs to take a lock */
>> +	if (deferred_pages_enabled())
>> +		return false;
>> +
>> +	return true;
>> +}
>> +
>> +/*
>> + * GFP flags to set for ALLOC_NOLOCK i.e. alloc_pages_nolock().
>> + *
>> + * Do not specify __GFP_DIRECT_RECLAIM, since direct claim is not allowed.
>> + * Do not specify __GFP_KSWAPD_RECLAIM either, since wake up of kswapd
>> + * is not safe in arbitrary context.
>> + *
>> + * These two are the conditions for gfpflags_allow_spinning() being true.
>> + *
>> + * Specify __GFP_NOWARN since failing alloc_pages_nolock() is not a reason
>> + * to warn. Also warn would trigger printk() which is unsafe from
>> + * various contexts. We cannot use printk_deferred_enter() to mitigate,
>> + * since the running context is unknown.
>> + *
>> + * Specify __GFP_ZERO to make sure that call to kmsan_alloc_page() below
>> + * is safe in any context. Also zeroing the page is mandatory for
>> + * BPF use cases.
>> + *
>> + * Though __GFP_NOMEMALLOC is not checked in the code path below,
>> + * specify it here to highlight that alloc_pages_nolock()
>> + * doesn't want to deplete reserves.
>> + */
>> +static const gfp_t gfp_nolock = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC |
>> +				__GFP_COMP;
>> +
>> +/*
>> + * This is the 'heart' of the zoned buddy allocator.
>> + */
>> +struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
>> +		int preferred_nid, nodemask_t *nodemask, unsigned int alloc_flags)
>> +{
>> +	struct page *page;
>> +	gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
>> +	struct alloc_context ac = { };
>> +	unsigned int fastpath_alloc_flags = alloc_flags;
>> +
>> +	/* Other flags could be supported later if needed. */
>> +	if (WARN_ON(alloc_flags & ~ALLOC_NOLOCK))
>>  		return NULL;
>>  
>> +	if (!alloc_order_allowed(gfp, order, alloc_flags))
>> +		return NULL;
>> +
>> +	if (alloc_flags & ALLOC_NOLOCK) {
>> +		VM_WARN_ON_ONCE(gfp & ~__GFP_ACCOUNT);
>> +		if (!alloc_trylock_allowed())
>> +			return NULL;
>> +		gfp |= gfp_nolock;
>
> I think we could do a
> 		fastpath_alloc_flags |= ALLOC_WMARK_MIN;
>
> to make it explicit, even though it's a no-op (the value is 0) and
> alloc_frozen_pages_nolock_noprof() didn't do it.
>
>> +	} else {
>> +		fastpath_alloc_flags |= ALLOC_WMARK_LOW;
>> +	}
>> +
>>  	gfp &= gfp_allowed_mask;
>>  	/*
>>  	 * Apply scoped allocation constraints. This is mainly about GFP_NOFS
>> @@ -5310,9 +5384,9 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
>>  	fastpath_alloc_flags |= alloc_flags_nofragment(zonelist_zone(ac.preferred_zoneref), gfp);
>>  	fastpath_alloc_flags |= alloc_flags_nonblocking(gfp, order) & ALLOC_HIGHATOMIC;
>>  
>> -	/* First allocation attempt */
>> +	/* First allocation attempt (or, for nolock, only attempt) */
>>  	page = get_page_from_freelist(alloc_gfp, order, fastpath_alloc_flags, &ac);
>> -	if (likely(page))
>> +	if (likely(page) || (alloc_flags & ALLOC_NOLOCK))
>>  		goto out;
>>  
>>  	alloc_gfp = gfp;
>> @@ -5329,7 +5403,8 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
>>  out:
>>  	if (memcg_kmem_online() && (gfp & __GFP_ACCOUNT) && page &&
>>  	    unlikely(__memcg_kmem_charge_page(page, gfp, order) != 0)) {
>> -		free_frozen_pages(page, order);
>> +		__free_frozen_pages(page, order,
>> +				    alloc_flags & ALLOC_NOLOCK ? FPI_TRYLOCK : 0);
>>  		page = NULL;
>>  	}
>>  
>> @@ -5345,7 +5420,8 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
>>  {
>>  	struct page *page;
>>  
>> -	page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid, nodemask);
>> +	page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid, nodemask,
>> +					   ALLOC_DEFAULT);
>>  	if (page)
>>  		set_page_refcounted(page);
>>  	return page;
>> @@ -7875,80 +7951,10 @@ static bool __free_unaccepted(struct page *page)
>>  
>>  struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned int order)
>>  {
>> -	/*
>> -	 * Do not specify __GFP_DIRECT_RECLAIM, since direct claim is not allowed.
>> -	 * Do not specify __GFP_KSWAPD_RECLAIM either, since wake up of kswapd
>> -	 * is not safe in arbitrary context.
>> -	 *
>> -	 * These two are the conditions for gfpflags_allow_spinning() being true.
>> -	 *
>> -	 * Specify __GFP_NOWARN since failing alloc_pages_nolock() is not a reason
>> -	 * to warn. Also warn would trigger printk() which is unsafe from
>> -	 * various contexts. We cannot use printk_deferred_enter() to mitigate,
>> -	 * since the running context is unknown.
>> -	 *
>> -	 * Specify __GFP_ZERO to make sure that call to kmsan_alloc_page() below
>> -	 * is safe in any context. Also zeroing the page is mandatory for
>> -	 * BPF use cases.
>> -	 *
>> -	 * Though __GFP_NOMEMALLOC is not checked in the code path below,
>> -	 * specify it here to highlight that alloc_pages_nolock()
>> -	 * doesn't want to deplete reserves.
>> -	 */
>> -	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC | __GFP_COMP
>> -			| gfp_flags;
>> -	unsigned int alloc_flags = ALLOC_NOLOCK;
>> -	struct alloc_context ac = { };
>> -	struct page *page;
>> -
>> -	VM_WARN_ON_ONCE(gfp_flags & ~__GFP_ACCOUNT);
>> -	/*
>> -	 * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
>> -	 * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
>> -	 * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
>> -	 * mark the task as the owner of another rt_spin_lock which will
>> -	 * confuse PI logic, so return immediately if called from hard IRQ or
>> -	 * NMI.
>> -	 *
>> -	 * Note, irqs_disabled() case is ok. This function can be called
>> -	 * from raw_spin_lock_irqsave region.
>> -	 */
>> -	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
>> -		return NULL;
>> -
>> -	/* On UP, spin_trylock() always succeeds even when it is locked */
>> -	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
>> -		return NULL;
>> -
>> -	if (!pcp_allowed_order(order))
>> -		return NULL;
>> -
>> -	/* Bailout, since _deferred_grow_zone() needs to take a lock */
>> -	if (deferred_pages_enabled())
>> -		return NULL;
>> -
>>  	if (nid == NUMA_NO_NODE)
>>  		nid = numa_node_id();
>>  
>> -	prepare_alloc_pages(alloc_gfp, order, nid, NULL, &ac,
>> -			    &alloc_gfp, &alloc_flags);
>> -
>> -	/*
>> -	 * Best effort allocation from percpu free list.
>> -	 * If it's empty attempt to spin_trylock zone->lock.
>> -	 */
>> -	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
>> -
>> -	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
>> -
>> -	if (memcg_kmem_online() && page && (gfp_flags & __GFP_ACCOUNT) &&
>> -	    unlikely(__memcg_kmem_charge_page(page, alloc_gfp, order) != 0)) {
>> -		__free_frozen_pages(page, order, FPI_TRYLOCK);
>> -		page = NULL;
>> -	}
>> -	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
>> -	kmsan_alloc_page(page, order, alloc_gfp);
>> -	return page;
>> +	return __alloc_frozen_pages_noprof(gfp_flags, order, nid, NULL, ALLOC_NOLOCK);
>>  }
>>  /**
>>   * alloc_pages_nolock - opportunistic reentrant allocation from any context
>> diff --git a/mm/page_alloc.h b/mm/page_alloc.h
>> index 3250d44f96457..e16f905f859a7 100644
>> --- a/mm/page_alloc.h
>> +++ b/mm/page_alloc.h
>> @@ -11,6 +11,7 @@
>>  #include <linux/nodemask.h>
>>  #include <linux/types.h>
>>  
>> +#define ALLOC_DEFAULT		0
>>  /* The ALLOC_WMARK bits are used as an index to zone->watermark */
>>  #define ALLOC_WMARK_MIN		WMARK_MIN
>>  #define ALLOC_WMARK_LOW		WMARK_LOW
>> @@ -219,7 +220,7 @@ extern bool free_pages_prepare(struct page *page, unsigned int order);
>>  extern int user_min_free_kbytes;
>>  
>>  struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order, int nid,
>> -		nodemask_t *nodemask);
>> +		nodemask_t *nodemask, unsigned int alloc_flags);
>>  #define __alloc_frozen_pages(...) \
>>  	alloc_hooks(__alloc_frozen_pages_noprof(__VA_ARGS__))
>>  void free_frozen_pages(struct page *page, unsigned int order);
>> @@ -230,7 +231,8 @@ struct page *alloc_frozen_pages_noprof(gfp_t, unsigned int order);
>>  #else
>>  static inline struct page *alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order)
>>  {
>> -	return __alloc_frozen_pages_noprof(gfp, order, numa_node_id(), NULL);
>> +	return __alloc_frozen_pages_noprof(gfp, order, numa_node_id(), NULL,
>> +					   0 /* ALLOC_DEFAULT */);
>
> Can use ALLOC_DEFAULT now.

Thanks and ack to all of these.

Will mention the ALLOC_WMARK_MIN thing in the commit message too.


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2026-06-30 18:48 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-29 13:11 [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Brendan Jackman
2026-06-29 13:11 ` [PATCH v3 01/16] mm/page_alloc: rename ALLOC_TRYLOCK -> ALLOC_NOLOCK Brendan Jackman
2026-06-30 12:27   ` Vlastimil Babka (SUSE)
2026-06-29 13:11 ` [PATCH v3 02/16] mm/page_alloc: some renames to clarify alloc_flags scopes Brendan Jackman
2026-06-30 12:38   ` Vlastimil Babka (SUSE)
2026-06-30 17:25     ` Brendan Jackman
2026-06-29 13:11 ` [PATCH v3 03/16] mm: name some args in a function declaration Brendan Jackman
2026-06-30 12:43   ` Vlastimil Babka (SUSE)
2026-06-29 13:11 ` [PATCH v3 04/16] mm: Split out internal page_alloc.h Brendan Jackman
2026-06-30 13:54   ` Vlastimil Babka (SUSE)
2026-06-29 13:11 ` [PATCH v3 05/16] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof() Brendan Jackman
2026-06-30 13:36   ` Harry Yoo
2026-06-30 15:34     ` Vlastimil Babka (SUSE)
2026-06-30 16:56       ` Brendan Jackman
2026-06-30 17:04     ` Brendan Jackman
2026-06-30 16:16   ` Vlastimil Babka (SUSE)
2026-06-30 18:47     ` Brendan Jackman
2026-06-29 13:11 ` [PATCH v3 06/16] mm/page_alloc: relax GFP WARN in nolock allocs Brendan Jackman
2026-06-30 13:52   ` Harry Yoo
2026-06-30 16:42   ` Vlastimil Babka (SUSE)
2026-06-29 13:11 ` [PATCH v3 07/16] mm: move some stuff to mm/page_alloc.h Brendan Jackman
2026-06-30 16:42   ` Vlastimil Babka (SUSE)
2026-06-29 13:11 ` [PATCH v3 08/16] perf/x86/intel: Use higher-level allocator API Brendan Jackman
2026-06-29 13:11 ` [PATCH v3 09/16] KVM: VMX: " Brendan Jackman
2026-06-29 15:31   ` -EXT-[PATCH " Soderlund, David
2026-06-29 13:11 ` [PATCH v3 10/16] x86/virt: " Brendan Jackman
2026-06-29 13:12 ` [PATCH v3 11/16] sgi-xp: " Brendan Jackman
2026-06-29 18:47   ` Steve Wahl
2026-06-29 13:12 ` [PATCH v3 12/16] net/funeth: Switch to " Brendan Jackman
2026-06-29 13:12 ` [PATCH v3 13/16] mm: Remove __alloc_pages_node() Brendan Jackman
2026-06-29 13:12 ` [PATCH v3 14/16] mm: Move __alloc_pages() to mm/page_alloc.h Brendan Jackman
2026-06-29 13:12 ` [PATCH v3 15/16] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG Brendan Jackman
2026-06-30  1:55   ` Hao Ge
2026-06-30 10:10     ` Brendan Jackman
2026-06-30 12:01     ` Brendan Jackman
2026-06-29 13:12 ` [PATCH v3 16/16] mm: remove the __GFP_NO_OBJ_EXT flag Brendan Jackman
2026-06-29 14:00 ` [PATCH v3 00/16] mm: Some cleanups for page allocator APIs Mike Rapoport
2026-06-29 14:30   ` Brendan Jackman
2026-06-29 15:05     ` Brendan Jackman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox