* Re: [PATCH v4 15/18] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG
From: Hao Ge @ 2026-07-03 2:29 UTC
To: Brendan Jackman
Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
Matthew Wilcox, linux-mm, linux-kernel, linux-rt-devel, derkling,
reijiw, Yosry Ahmed, Andrew Morton, Vlastimil Babka,
Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Muchun Song,
Zi Yan, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
Christoph Lameter, David Rientjes, Roman Gushchin,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
Hi Brendan
On 2026/7/2 17:49, Brendan Jackman wrote:
> Now that alloc_pages has an entrypoint that allows passing alloc_flags,
> we can take advantage of this to start removing GFP flags that are only
> used for mm-internal stuff.
>
> This requires also plumbing the alloc_flags into some more of the
> allocator code, in particular __alloc_pages[_noprof]() gets an
> alloc_flags arg to go along with its callees, and we now need to pass
> those flags deeper into the allocator so they can reach the alloc_tag
> code.
>
> While moving the flag definition into page_alloc.h, also update the
> comment per Hao's suggestion.
>
> No functional change intended.
>
> Link: https://lore.kernel.org/all/b4916118-3537-4e19-8bc8-1d103dd0d225@linux.dev/
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
Tested in the same environment as commit 6b1842775a46; the boot-time
"alloc_tag was not set" warning does not regress.
Aside from the minor nit below, this patch looks correct to me.
Tested-by: Hao Ge <hao.ge@linux.dev>
Acked-by: Hao Ge <hao.ge@linux.dev>
> ---
> include/linux/alloc_tag.h | 4 ++--
> mm/alloc_tag.c | 23 +++++++--------------
> mm/compaction.c | 4 ++--
> mm/internal.h | 1 -
> mm/page_alloc.c | 52 +++++++++++++++++++++++++++--------------------
> mm/page_alloc.h | 14 +++++++++++--
> mm/page_frag_cache.c | 4 ++--
> 7 files changed, 55 insertions(+), 47 deletions(-)
>
> diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
> index 068ba2e77c5d6..fcf90e6b24204 100644
> --- a/include/linux/alloc_tag.h
> +++ b/include/linux/alloc_tag.h
> @@ -163,11 +163,11 @@ static inline void alloc_tag_sub_check(union codetag_ref *ref)
> {
> WARN_ONCE(ref && !ref->ct, "alloc_tag was not set\n");
> }
> -void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags);
> +void alloc_tag_add_early_pfn(unsigned long pfn, unsigned int alloc_flags);
> #else
> static inline void alloc_tag_add_check(union codetag_ref *ref, struct alloc_tag *tag) {}
> static inline void alloc_tag_sub_check(union codetag_ref *ref) {}
> -static inline void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags) {}
> +static inline void alloc_tag_add_early_pfn(unsigned long pfn, unsigned int alloc_flags) {}
> #endif
>
> /* Caller should verify both ref and tag to be valid */
> diff --git a/mm/alloc_tag.c b/mm/alloc_tag.c
> index d9be1cf5187d9..cf65e9992fda3 100644
> --- a/mm/alloc_tag.c
> +++ b/mm/alloc_tag.c
> @@ -15,6 +15,9 @@
> #include <linux/vmalloc.h>
> #include <linux/kmemleak.h>
>
> +#include "internal.h"
> +#include "page_alloc.h"
> +
> #define ALLOCINFO_FILE_NAME "allocinfo"
> #define MODULE_ALLOC_TAG_VMAP_SIZE (100000UL * sizeof(struct alloc_tag))
> #define SECTION_START(NAME) (CODETAG_SECTION_START_PREFIX NAME)
> @@ -783,19 +786,6 @@ struct pfn_pool {
>
> #define PFN_POOL_SIZE ((PAGE_SIZE - offsetof(struct pfn_pool, pfns)) / \
> sizeof(unsigned long))
> -
> -/*
> - * Skip early PFN recording for a page allocation. Reuses the
> - * %__GFP_NO_OBJ_EXT bit. Used by __alloc_tag_add_early_pfn() to avoid
> - * recursion when allocating pages for the early PFN tracking list
> - * itself.
> - *
> - * Codetags of the pages allocated with __GFP_NO_CODETAG should be
> - * cleared (via clear_page_tag_ref()) before freeing the pages to prevent
> - * alloc_tag_sub_check() from triggering a warning.
> - */
> -#define __GFP_NO_CODETAG __GFP_NO_OBJ_EXT
> -
> static struct pfn_pool *current_pfn_pool __initdata;
>
> static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
> @@ -806,7 +796,8 @@ static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
> do {
> pool = READ_ONCE(current_pfn_pool);
> if (!pool || atomic_read(&pool->count) >= PFN_POOL_SIZE) {
> - struct page *new_page = alloc_page(__GFP_HIGH | __GFP_NO_CODETAG);
> + struct page *new_page = __alloc_pages(__GFP_HIGH, 0, numa_mem_id(),
> + NULL, ALLOC_NO_CODETAG);
> struct pfn_pool *new;
>
> if (!new_page) {
> @@ -837,7 +828,7 @@ typedef void alloc_tag_add_func(unsigned long pfn);
> static alloc_tag_add_func __rcu *alloc_tag_add_early_pfn_ptr __refdata =
> RCU_INITIALIZER(__alloc_tag_add_early_pfn);
>
> -void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags)
> +void alloc_tag_add_early_pfn(unsigned long pfn, unsigned int alloc_flags)
> {
> alloc_tag_add_func *alloc_tag_add;
>
> @@ -845,7 +836,7 @@ void alloc_tag_add_early_pfn(unsigned long pfn, gfp_t gfp_flags)
> return;
>
> /* Skip allocations for the tracking list itself to avoid recursion. */
> - if (gfp_flags & __GFP_NO_CODETAG)
> + if (alloc_flags & ALLOC_NO_CODETAG)
> return;
>
> rcu_read_lock();
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 7d80735502d9a..4b2318fad4eb5 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -83,7 +83,7 @@ static inline bool is_via_compact_memory(int order) { return false; }
>
> static struct page *mark_allocated_noprof(struct page *page, unsigned int order, gfp_t gfp_flags)
> {
> - post_alloc_hook(page, order, __GFP_MOVABLE);
> + post_alloc_hook(page, order, __GFP_MOVABLE, ALLOC_DEFAULT);
> set_page_refcounted(page);
> return page;
> }
> @@ -1851,7 +1851,7 @@ static struct folio *compaction_alloc_noprof(struct folio *src, unsigned long da
> }
> dst = (struct folio *)freepage;
>
> - post_alloc_hook(&dst->page, order, __GFP_MOVABLE);
> + post_alloc_hook(&dst->page, order, __GFP_MOVABLE, ALLOC_DEFAULT);
> set_page_refcounted(&dst->page);
> if (order)
> prep_compound_page(&dst->page, order);
> diff --git a/mm/internal.h b/mm/internal.h
> index 7e3b2386e274b..3c00eaf5f45a4 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1237,7 +1237,6 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
> enum ttu_flags;
> struct tlbflush_unmap_batch;
>
> -
Nit: The blank line removal in mm/internal.h looks like an unrelated
cleanup.
Thanks
Best Regards
Hao
> /*
> * only for MM internal work items which do not depend on
> * any allocations or locks which might depend on allocations
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f68b2b138a2e8..cfaf16244f56d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1249,7 +1249,7 @@ void __clear_page_tag_ref(struct page *page)
> /* Should be called only if mem_alloc_profiling_enabled() */
> static noinline
> void __pgalloc_tag_add(struct page *page, struct task_struct *task,
> - unsigned int nr, gfp_t gfp_flags)
> + unsigned int nr, unsigned int alloc_flags)
> {
> union pgtag_ref_handle handle;
> union codetag_ref ref;
> @@ -1263,17 +1263,17 @@ void __pgalloc_tag_add(struct page *page, struct task_struct *task,
> * page_ext is not available yet, record the pfn so we can
> * clear the tag ref later when page_ext is initialized.
> */
> - alloc_tag_add_early_pfn(page_to_pfn(page), gfp_flags);
> + alloc_tag_add_early_pfn(page_to_pfn(page), alloc_flags);
> if (task->alloc_tag)
> alloc_tag_set_inaccurate(task->alloc_tag);
> }
> }
>
> static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
> - unsigned int nr, gfp_t gfp_flags)
> + unsigned int nr, unsigned int alloc_flags)
> {
> if (mem_alloc_profiling_enabled())
> - __pgalloc_tag_add(page, task, nr, gfp_flags);
> + __pgalloc_tag_add(page, task, nr, alloc_flags);
> }
>
> /* Should be called only if mem_alloc_profiling_enabled() */
> @@ -1306,7 +1306,7 @@ static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, unsigned int nr)
> #else /* CONFIG_MEM_ALLOC_PROFILING */
>
> static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
> - unsigned int nr, gfp_t gfp_flags) {}
> + unsigned int nr, unsigned int alloc_flags) {}
> static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {}
> static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, unsigned int nr) {}
>
> @@ -1810,7 +1810,7 @@ static inline bool should_skip_init(gfp_t flags)
> }
>
> inline void post_alloc_hook(struct page *page, unsigned int order,
> - gfp_t gfp_flags)
> + gfp_t gfp_flags, unsigned int alloc_flags)
> {
> const bool zero_tags = gfp_flags & __GFP_ZEROTAGS;
> bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
> @@ -1861,13 +1861,13 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>
> set_page_owner(page, order, gfp_flags);
> page_table_check_alloc(page, order);
> - pgalloc_tag_add(page, current, 1 << order, gfp_flags);
> + pgalloc_tag_add(page, current, 1 << order, alloc_flags);
> }
>
> static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
> unsigned int alloc_flags)
> {
> - post_alloc_hook(page, order, gfp_flags);
> + post_alloc_hook(page, order, gfp_flags, alloc_flags);
>
> if (order && (gfp_flags & __GFP_COMP))
> prep_compound_page(page, order);
> @@ -4078,7 +4078,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> */
> page = get_page_from_freelist((gfp_mask | __GFP_HARDWALL) &
> ~__GFP_DIRECT_RECLAIM, order,
> - ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);
> + ac->alloc_flags|ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);
> if (page)
> goto out;
>
> @@ -4124,7 +4124,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> */
> if (gfp_mask & __GFP_NOFAIL)
> page = __alloc_pages_cpuset_fallback(gfp_mask, order,
> - ALLOC_NO_WATERMARKS, ac);
> + ac->alloc_flags|ALLOC_NO_WATERMARKS, ac);
> }
> out:
> mutex_unlock(&oom_lock);
> @@ -4791,8 +4791,12 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> * The fast path uses conservative alloc_flags to succeed only until
> * kswapd needs to be woken up, and to avoid the cost of setting up
> * alloc_flags precisely. So we do that now.
> + *
> + * Can't just or alloc_flags if it contains WMARK bits, but those flags
> + * shouldn't be set in ac->alloc_flags.
> */
> - alloc_flags = alloc_flags_slowpath(gfp_mask, order);
> + VM_WARN_ON(ac->alloc_flags & ALLOC_WMARK_MASK);
> + alloc_flags = ac->alloc_flags | alloc_flags_slowpath(gfp_mask, order);
>
> /*
> * We need to recalculate the starting point for the zonelist iterator
> @@ -4834,7 +4838,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
> if (reserve_flags)
> alloc_flags = alloc_flags_cma(gfp_mask, reserve_flags) |
> - (alloc_flags & ALLOC_KSWAPD);
> + ac->alloc_flags | (alloc_flags & ALLOC_KSWAPD);
>
> /*
> * Reset the nodemask and zonelist iterators if memory policies can be
> @@ -5003,6 +5007,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> * we always retry
> */
> if (unlikely(nofail)) {
> + unsigned int alloc_flags = ac->alloc_flags | ALLOC_MIN_RESERVE;
> +
> /*
> * Lacking direct_reclaim we can't do anything to reclaim memory,
> * we disregard these unreasonable nofail requests and still
> @@ -5018,7 +5024,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> * could deplete whole memory reserves which would just make
> * the situation worse.
> */
> - page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE, ac);
> + page = __alloc_pages_cpuset_fallback(gfp_mask, order, alloc_flags, ac);
> if (page)
> goto got_pg;
>
> @@ -5236,7 +5242,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid,
> return nr_populated;
>
> failed:
> - page = __alloc_pages_noprof(gfp, 0, preferred_nid, nodemask);
> + page = __alloc_pages_noprof(gfp, 0, preferred_nid, nodemask, ALLOC_DEFAULT);
> if (page)
> page_array[nr_populated++] = page;
> goto out;
> @@ -5344,11 +5350,13 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
> {
> struct page *page;
> gfp_t alloc_gfp; /* The gfp_t that was actually used for allocation */
> - struct alloc_context ac = { };
> + struct alloc_context ac = {
> + .alloc_flags = alloc_flags,
> + };
> unsigned int fastpath_alloc_flags = alloc_flags;
>
> /* Other flags could be supported later if needed. */
> - if (WARN_ON(alloc_flags & ~ALLOC_NOLOCK))
> + if (WARN_ON(alloc_flags & ~(ALLOC_NOLOCK | ALLOC_NO_CODETAG)))
> return NULL;
>
> if (!alloc_order_allowed(gfp, order, alloc_flags))
> @@ -5421,12 +5429,12 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
> EXPORT_SYMBOL(__alloc_frozen_pages_noprof);
>
> struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
> - int preferred_nid, nodemask_t *nodemask)
> + int preferred_nid, nodemask_t *nodemask, unsigned int alloc_flags)
> {
> struct page *page;
>
> page = __alloc_frozen_pages_noprof(gfp, order, preferred_nid, nodemask,
> - ALLOC_DEFAULT);
> + alloc_flags);
> if (page)
> set_page_refcounted(page);
> return page;
> @@ -5440,7 +5448,7 @@ struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask, unsigned int order
> VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
> warn_if_node_offline(nid, gfp_mask);
>
> - return __alloc_pages_noprof(gfp_mask, order, nid, NULL);
> + return __alloc_pages_noprof(gfp_mask, order, nid, NULL, ALLOC_DEFAULT);
> }
> EXPORT_SYMBOL(alloc_pages_node_noprof);
>
> @@ -5448,7 +5456,7 @@ struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int preferred_
> nodemask_t *nodemask)
> {
> struct page *page = __alloc_pages_noprof(gfp | __GFP_COMP, order,
> - preferred_nid, nodemask);
> + preferred_nid, nodemask, ALLOC_DEFAULT);
> return page_rmappable_folio(page);
> }
> EXPORT_SYMBOL(__folio_alloc_noprof);
> @@ -7130,7 +7138,7 @@ static void split_free_frozen_pages(struct list_head *list, gfp_t gfp_mask)
> list_for_each_entry_safe(page, next, &list[order], lru) {
> int i;
>
> - post_alloc_hook(page, order, gfp_mask);
> + post_alloc_hook(page, order, gfp_mask, ALLOC_DEFAULT);
> if (!order)
> continue;
>
> @@ -7335,7 +7343,7 @@ int alloc_contig_frozen_range_noprof(unsigned long start, unsigned long end,
> struct page *head = pfn_to_page(start);
>
> check_new_pages(head, order);
> - prep_new_page(head, order, gfp_mask, 0);
> + prep_new_page(head, order, gfp_mask, ALLOC_DEFAULT);
> } else {
> ret = -EINVAL;
> WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
> diff --git a/mm/page_alloc.h b/mm/page_alloc.h
> index 3b8a4709b1497..06f8b6f150cdf 100644
> --- a/mm/page_alloc.h
> +++ b/mm/page_alloc.h
> @@ -49,6 +49,13 @@
> #define ALLOC_HIGHATOMIC 0x200 /* Allows access to MIGRATE_HIGHATOMIC */
> #define ALLOC_NOLOCK 0x400 /* Only use spin_trylock in allocation path */
> #define ALLOC_KSWAPD 0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */
> +/*
> + * Avoid alloc_tag recursion for internal allocations.
> + *
> + * Callers must clear_page_tag_ref() before freeing to avoid warnings from
> + * alloc_tag_sub_check().
> + */
> +#define ALLOC_NO_CODETAG 0x1000
>
> /* Flags that allow allocations below the min watermark. */
> #define ALLOC_RESERVES (ALLOC_NON_BLOCK|ALLOC_MIN_RESERVE|ALLOC_HIGHATOMIC|ALLOC_OOM)
> @@ -84,6 +91,8 @@ struct alloc_context {
> */
> enum zone_type highest_zoneidx;
> bool spread_dirty_pages;
> + /* Only flags that are global to the whole allocation go here. */
> + unsigned int alloc_flags;
> };
>
> /*
> @@ -214,7 +223,8 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
> extern void __free_pages_core(struct page *page, unsigned int order,
> enum meminit_context context);
>
> -void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags);
> +void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags,
> + unsigned int alloc_flags);
> extern bool free_pages_prepare(struct page *page, unsigned int order);
>
> extern int user_min_free_kbytes;
> @@ -245,7 +255,7 @@ struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned
> void free_frozen_pages_nolock(struct page *page, unsigned int order);
>
> struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
> - nodemask_t *nodemask);
> + nodemask_t *nodemask, unsigned int alloc_flags);
> #define __alloc_pages(...) alloc_hooks(__alloc_pages_noprof(__VA_ARGS__))
>
> extern void zone_pcp_reset(struct zone *zone);
> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
> index a1077cef3a791..e63efe78b7d4b 100644
> --- a/mm/page_frag_cache.c
> +++ b/mm/page_frag_cache.c
> @@ -57,10 +57,10 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
> gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) | __GFP_COMP |
> __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
> page = __alloc_pages(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER,
> - numa_mem_id(), NULL);
> + numa_mem_id(), NULL, ALLOC_DEFAULT);
> #endif
> if (unlikely(!page)) {
> - page = __alloc_pages(gfp, 0, numa_mem_id(), NULL);
> + page = __alloc_pages(gfp, 0, numa_mem_id(), NULL, ALLOC_DEFAULT);
> order = 0;
> }
>
>
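The recursion that ALLOC_NO_CODETAG guards against can be modelled in user space. This is a minimal sketch, not the kernel code: only the flag value 0x1000 comes from the quoted patch, while ALLOC_DEFAULT being 0, the tiny POOL_SIZE, model_alloc_page(), record_early_pfn() and the two counters are hypothetical stand-ins. Every allocation runs the recording hook, the hook sometimes has to allocate a new pool page, and that inner allocation carries the flag so the hook returns early instead of recursing.

```c
#include <stddef.h>
#include <stdlib.h>

/* ALLOC_NO_CODETAG's value mirrors the patch; ALLOC_DEFAULT == 0 is assumed. */
#define ALLOC_DEFAULT     0x0u
#define ALLOC_NO_CODETAG  0x1000u

#define POOL_SIZE 4	/* tiny on purpose so pools fill up quickly */

struct pfn_pool {
	struct pfn_pool *next;
	size_t count;
	unsigned long pfns[POOL_SIZE];
};

static struct pfn_pool *current_pool;
static unsigned long next_pfn = 1;
static unsigned int recorded;	/* pfns stored in a pool */
static unsigned int skipped;	/* hook calls cut short by the flag */

static void record_early_pfn(unsigned long pfn, unsigned int alloc_flags);

/* Hypothetical allocator: hands out memory plus a fake pfn, then runs the hook. */
static void *model_alloc_page(unsigned int alloc_flags, unsigned long *pfn_out)
{
	void *page = calloc(1, sizeof(struct pfn_pool));

	if (!page)
		return NULL;
	*pfn_out = next_pfn++;
	record_early_pfn(*pfn_out, alloc_flags);
	return page;
}

static void record_early_pfn(unsigned long pfn, unsigned int alloc_flags)
{
	/* Skip allocations for the tracking list itself to avoid recursion. */
	if (alloc_flags & ALLOC_NO_CODETAG) {
		skipped++;
		return;
	}

	if (!current_pool || current_pool->count >= POOL_SIZE) {
		unsigned long pool_pfn;
		struct pfn_pool *fresh = model_alloc_page(ALLOC_NO_CODETAG, &pool_pfn);

		if (!fresh)
			return;
		fresh->next = current_pool;
		current_pool = fresh;
	}
	current_pool->pfns[current_pool->count++] = pfn;
	recorded++;
}
```

Dropping ALLOC_NO_CODETAG from the inner model_alloc_page() call would make the hook re-enter itself for every pool page while the pool is still full or missing, which is the loop the flag exists to break.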
* Re: [PATCH v4 05/18] mm/page_alloc: unify __alloc_frozen_pages[_nolock]_noprof()
From: Vlastimil Babka (SUSE) @ 2026-07-03 9:20 UTC
To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
Christoph Lameter, David Rientjes, Roman Gushchin,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel,
derkling, reijiw, Yosry Ahmed
On 7/2/26 11:49, Brendan Jackman wrote:
> Currently the core allocator code is controlled by ALLOC_NOLOCK, but the
> main entry point function is significantly different from the normal
> __alloc_frozen_pages_nolock(), this is tiring when reading the code.
>
> Plumb the ALLOC_NOLOCK control one layer up in the call stack: create
> an alloc_flags argument to __alloc_frozen_pages_nolock() (which is only
> exposed to mm/) and then turn the nolock variant into a thin wrapper
> that just sets that flag (as well as handling NUMA_NO_NODE, similar to
> how some of the wrappers in gfp.h do).
>
> For consistency, set ALLOC_WMARK_MIN explicitly in fastpath_alloc_flags
> for the new ALLOC_NOLOCK path. This was already "done" silently in
> __alloc_frozen_pages_nolock_noprof(): ALLOC_WMARK_MIN is 0.
>
> Rationale that this doesn't change anything:
>
> 1. Simple bits: A bunch of the nolock-specific handling is just moved to
> the new alloc_order_allowed(), alloc_nolock_allowed() and
> gfp_nolock.
>
> 2. __alloc_frozen_pages_noprof() has some extra logic that wasn't
> previously in the nolock variant:
>
> a. Application of gfp_allowed_mask; this only affects early boot,
> only flags that affect the slowpath get changed here, and the
> nolock allocation path isn't allowed to the GFP_BOOT_MASK flags.
>
> b. Application of current_gfp_context() - also only affects the
> slowpath
>
> 3. The slowpath itself: this is now just explicitly skipped under
> !ALLOC_TRYLOCK.
>
> Ulterior motive: adding an alloc_flags arg to the allocator's
> mm-internal entrypoint can later be used to do more allocation
> customisation without needing to create new GFP flags.
>
> No functional change intended.
>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
LGTM
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
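The shape described in the commit message (one entry point steered by alloc_flags, with the nolock variant reduced to a thin wrapper) can be sketched in user space as follows. This is an illustration under stated assumptions: ALLOC_NOLOCK's value is taken from the quoted patch 15 diff and ALLOC_WMARK_MIN being 0 from the commit message, while ALLOC_DEFAULT being 0 and all model_* functions are hypothetical.

```c
#include <stdbool.h>

/* ALLOC_NOLOCK is from the quoted diff; ALLOC_DEFAULT == 0 is an assumption. */
#define ALLOC_DEFAULT   0x0u
#define ALLOC_WMARK_MIN 0x0u	/* "ALLOC_WMARK_MIN is 0", per the commit message */
#define ALLOC_NOLOCK    0x400u

static unsigned int slowpath_calls;

/* Hypothetical fast path: succeeds unless memory is tight. */
static bool model_fastpath(bool memory_is_tight)
{
	return !memory_is_tight;
}

/* Hypothetical slow path: may block, so it is off limits for nolock callers. */
static bool model_slowpath(void)
{
	slowpath_calls++;
	return true;	/* pretend reclaim always frees something */
}

/* Unified entry point: one function, behaviour selected by alloc_flags. */
static bool model_alloc(unsigned int alloc_flags, bool memory_is_tight)
{
	/* OR-ing in ALLOC_WMARK_MIN is explicit but changes nothing: it is 0. */
	unsigned int fastpath_alloc_flags = alloc_flags | ALLOC_WMARK_MIN;

	(void)fastpath_alloc_flags;
	if (model_fastpath(memory_is_tight))
		return true;

	/* The slowpath is skipped outright for nolock allocations. */
	if (alloc_flags & ALLOC_NOLOCK)
		return false;

	return model_slowpath();
}

/* The nolock variant becomes a thin wrapper that only sets the flag. */
static bool model_alloc_nolock(bool memory_is_tight)
{
	return model_alloc(ALLOC_NOLOCK, memory_is_tight);
}
```

With this layout a later behaviour switch only needs a new alloc_flags bit checked inside model_alloc(), rather than another copy of the entry point.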
* Re: [PATCH v4 15/18] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG
From: Vlastimil Babka (SUSE) @ 2026-07-03 9:24 UTC
To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
Christoph Lameter, David Rientjes, Roman Gushchin,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel,
derkling, reijiw, Yosry Ahmed
On 7/2/26 11:49, Brendan Jackman wrote:
> Now that alloc_pages has an entrypoint that allows passing alloc_flags,
> we can take advantage of this to start removing GFP flags that are only
> used for mm-internal stuff.
>
> This requires also plumbing the alloc_flags into some more of the
> allocator code, in particular __alloc_pages[_noprof]() gets an
> alloc_flags arg to go along with its callees, and we now need to pass
> those flags deeper into the allocator so they can reach the alloc_tag
> code.
>
> While moving the flag definition into page_alloc.h, also update the
> comment per Hao's suggestion.
>
> No functional change intended.
>
> Link: https://lore.kernel.org/all/b4916118-3537-4e19-8bc8-1d103dd0d225@linux.dev/
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
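Why the flags have to be passed "deeper into the allocator" can be shown with a small model. While the marker travelled in gfp_mask, call sites that hard-coded their alloc_flags did no harm; once it lives in alloc_flags, each such site has to forward the per-allocation flags or the marker is lost before it reaches the tagging hook. The sketch below assumes the values of ALLOC_WMARK_HIGH and ALLOC_CPUSET (only ALLOC_NO_CODETAG's value appears in the quoted diff), and the struct and function names are hypothetical.

```c
#include <stdbool.h>

#define ALLOC_WMARK_HIGH  0x2u		/* assumed value */
#define ALLOC_CPUSET      0x40u		/* assumed value */
#define ALLOC_NO_CODETAG  0x1000u	/* value from the quoted diff */

/* Stand-in for struct alloc_context: only the field the patch adds. */
struct model_alloc_context {
	unsigned int alloc_flags;	/* flags global to the whole allocation */
};

/* Hard-coded flags: fine while the marker lived in gfp_mask. */
static unsigned int oom_retry_flags_hardcoded(const struct model_alloc_context *ac)
{
	(void)ac;
	return ALLOC_WMARK_HIGH | ALLOC_CPUSET;
}

/* Forwarding variant: the global flags are OR-ed into the local ones. */
static unsigned int oom_retry_flags_forwarded(const struct model_alloc_context *ac)
{
	return ac->alloc_flags | ALLOC_WMARK_HIGH | ALLOC_CPUSET;
}

/* What the tagging hook at the end of the call chain would decide. */
static bool would_record_codetag(unsigned int alloc_flags)
{
	return !(alloc_flags & ALLOC_NO_CODETAG);
}
```

The forwarding variant corresponds to the `ac->alloc_flags|ALLOC_WMARK_HIGH|ALLOC_CPUSET` change in the quoted __alloc_pages_may_oom() hunk.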
* Re: [PATCH v4 17/18] mm/page_alloc: drop alloc_flags arg from alloc_flags_cma()
From: Vlastimil Babka (SUSE) @ 2026-07-03 9:28 UTC
To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
Christoph Lameter, David Rientjes, Roman Gushchin,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel,
derkling, reijiw, Yosry Ahmed
On 7/2/26 11:49, Brendan Jackman wrote:
> To align the style with other alloc_flags_*() functions, drop this
> additive argument and just have the callers do that themselves.
>
> Note you can't always freely or alloc_flags like these callers do
> (because of the WMARK bits that encode an enum) but this is fine for
> ALLOC_CMA, just like it's fine for e.g. ALLOC_NON_BLOCK returned by
> alloc_flags_nonblocking() and or'd by its caller.
>
> Suggested-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
> Link: https://lore.kernel.org/all/5dcdd1ef-21ad-4ed0-9e8a-0e5cf96b4392@kernel.org/
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
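The caveat about the WMARK bits can be made concrete with a simplified model in which the low two bits of alloc_flags hold a watermark index rather than independent flags. The numeric values below are assumptions for illustration (the thread only states that ALLOC_WMARK_MIN is 0), and wmark_index(), safe_to_or() and set_wmark() are hypothetical helpers.

```c
#include <stdbool.h>

/* Assumed layout: the low two bits are an enum index, not separate flags. */
#define ALLOC_WMARK_MIN   0x0u
#define ALLOC_WMARK_LOW   0x1u
#define ALLOC_WMARK_HIGH  0x2u
#define ALLOC_WMARK_MASK  0x3u
#define ALLOC_CMA         0x80u	/* assumed value; a plain single-bit flag */

/* Extract the watermark index encoded in the flags. */
static unsigned int wmark_index(unsigned int alloc_flags)
{
	return alloc_flags & ALLOC_WMARK_MASK;
}

/* A value may be OR-ed in freely only if it leaves the enum field alone. */
static bool safe_to_or(unsigned int extra)
{
	return (extra & ALLOC_WMARK_MASK) == 0;
}

/* Changing the watermark means replacing the field, not OR-ing into it. */
static unsigned int set_wmark(unsigned int alloc_flags, unsigned int wmark)
{
	return (alloc_flags & ~ALLOC_WMARK_MASK) | (wmark & ALLOC_WMARK_MASK);
}
```

OR-ing ALLOC_WMARK_HIGH into flags that already select ALLOC_WMARK_LOW yields index 3 in this model, which is neither of the two; OR-ing ALLOC_CMA leaves the index untouched, which is why callers can combine it freely.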
* Re: [PATCH v4 18/18] mm: factor out can_spin_trylock()
From: Vlastimil Babka (SUSE) @ 2026-07-03 9:32 UTC
To: Brendan Jackman, Andrew Morton, Suren Baghdasaryan, Michal Hocko,
Johannes Weiner, Zi Yan, Muchun Song, Oscar Salvador,
David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
Mike Rapoport, Matthew Brost, Joshua Hahn, Rakie Kim,
Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
Christoph Lameter, David Rientjes, Roman Gushchin,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel,
derkling, reijiw, Yosry Ahmed
On 7/2/26 11:49, Brendan Jackman wrote:
> Deduplicate checks for whether the current context is safe for
> spin_trylock().
>
> Does this function really belong in mm/internal.h or is it generic? Not
> sure. If someone ends up duplicating this logic elsewhere in the kernel,
> that would be a shame. But if it goes in some generic header, someone treats
> it as documentation about where it's guaranteed safe to spin_trylock(),
> and then it emerges that there are other subtle preconditions that
> didn't affect the mm usecase, that would be worse. So, just be
> conservative and keep it local.
Agreed. We could even use page_alloc.h?
> Suggested-by: Harry Yoo <harry@kernel.org>
> Link: https://lore.kernel.org/all/397859cb-b127-4cc6-9c71-044afc99bf0c@kernel.org/
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
> ---
> mm/internal.h | 23 +++++++++++++++++++++++
> mm/page_alloc.c | 17 +----------------
> mm/slub.c | 10 +---------
> 3 files changed, 25 insertions(+), 25 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 3c00eaf5f45a4..e6f300693ffd7 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1715,4 +1715,27 @@ static inline void mm_prepare_for_swap_entries(struct mm_struct *mm)
> }
> }
>
> +static inline bool can_spin_trylock(void)
> +{
> + /*
> + * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
> + * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
> + * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
> + * mark the task as the owner of another rt_spin_lock which will
> + * confuse PI logic, so return immediately if called from hard IRQ or
> + * NMI.
> + *
> + * Note, irqs_disabled() case is ok. spin_trylock() can be called
> + * from raw_spin_lock_irqsave region.
> + */
> + if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> + return false;
> +
> + /* On UP, spin_trylock() always succeeds even when it is locked */
> + if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
> + return false;
> +
> + return true;
> +}
> +
> #endif /* __MM_INTERNAL_H */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c3b246e67ed14..a63733dac659e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5291,22 +5291,7 @@ static inline bool alloc_order_allowed(gfp_t gfp, unsigned int order,
>
> static inline bool alloc_nolock_allowed(void)
> {
> - /*
> - * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
> - * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
> - * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
> - * mark the task as the owner of another rt_spin_lock which will
> - * confuse PI logic, so return immediately if called from hard IRQ or
> - * NMI.
> - *
> - * Note, irqs_disabled() case is ok. This function can be called
> - * from raw_spin_lock_irqsave region.
> - */
> - if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> - return false;
> -
> - /* On UP, spin_trylock() always succeeds even when it is locked */
> - if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
> + if (!can_spin_trylock())
> return false;
>
> /* Bailout, since _deferred_grow_zone() needs to take a lock */
> diff --git a/mm/slub.c b/mm/slub.c
> index 3989b4758ae0a..b19dc46de73c5 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5408,15 +5408,7 @@ static void *__kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_f
> if (unlikely(!size))
> return ZERO_SIZE_PTR;
>
> - /*
> - * See the comment for the same check in
> - * alloc_frozen_pages_nolock_noprof()
> - */
> - if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> - return NULL;
> -
> - /* On UP, spin_trylock() always succeeds even when it is locked */
> - if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
> + if (!can_spin_trylock())
> return NULL;
>
> retry:
>
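The decision made by can_spin_trylock() can be tabulated with a user-space model in which the kernel's config options and context predicates become plain booleans. The conditions follow the quoted helper; struct model_ctx and model_can_spin_trylock() are hypothetical names, and the sketch says nothing about preconditions outside the two checks shown in the patch.

```c
#include <stdbool.h>

/* Inputs that the kernel reads via IS_ENABLED(), in_nmi() and in_hardirq(). */
struct model_ctx {
	bool preempt_rt;	/* CONFIG_PREEMPT_RT */
	bool smp;		/* CONFIG_SMP */
	bool in_nmi;
	bool in_hardirq;
};

static bool model_can_spin_trylock(const struct model_ctx *c)
{
	/* On PREEMPT_RT, trylock is unsafe from NMI and hard IRQ context. */
	if (c->preempt_rt && (c->in_nmi || c->in_hardirq))
		return false;

	/* On UP, spin_trylock() always succeeds even when it is locked. */
	if (!c->smp && c->in_nmi)
		return false;

	return true;
}
```

Both callers in the patch then reduce to a single early return: alloc_nolock_allowed() returns false and __kmalloc_nolock_noprof() returns NULL when the helper says no.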
* Re: [PATCH v4 18/18] mm: factor out can_spin_trylock()
From: Harry Yoo @ 2026-07-03 9:38 UTC
To: Brendan Jackman, Andrew Morton, Vlastimil Babka,
Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Zi Yan,
Muchun Song, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
Christoph Lameter, David Rientjes, Roman Gushchin,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
Cc: Gregory Price, Alexei Starovoitov, Matthew Wilcox, Hao Ge,
linux-mm, linux-kernel, linux-rt-devel, derkling, reijiw,
Yosry Ahmed
On 7/2/26 6:49 PM, Brendan Jackman wrote:
> Deduplicate checks for whether the current context is safe for
> spin_trylock().
>
> Does this function really belong in mm/internal.h or is it generic? Not
> sure. If someone ends up duplicating this logic elsewhere in the kernel,
> that would be a shame.
Wondering what BPF has been doing about it...
> But if it goes in some generic header, someone treats
> it as documentation about where it's guaranteed safe to spin_trylock(),
> and then it emerges that there are other subtle preconditions that
> didn't affect the mm usecase, that would be worse. So, just be
> conservative and keep it local.
But yeah, agreed.
> Suggested-by: Harry Yoo <harry@kernel.org>
> Link: https://lore.kernel.org/all/397859cb-b127-4cc6-9c71-044afc99bf0c@kernel.org/
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Thanks!
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH v4 18/18] mm: factor out can_spin_trylock()
From: Brendan Jackman @ 2026-07-03 11:43 UTC
To: Vlastimil Babka (SUSE), Brendan Jackman, Andrew Morton,
Suren Baghdasaryan, Michal Hocko, Johannes Weiner, Zi Yan,
Muchun Song, Oscar Salvador, David Hildenbrand, Lorenzo Stoakes,
Liam R. Howlett, Mike Rapoport, Matthew Brost, Joshua Hahn,
Rakie Kim, Byungchul Park, Ying Huang, Alistair Popple, Hao Li,
Christoph Lameter, David Rientjes, Roman Gushchin,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt
Cc: Harry Yoo (Oracle), Gregory Price, Alexei Starovoitov,
Matthew Wilcox, Hao Ge, linux-mm, linux-kernel, linux-rt-devel,
derkling, reijiw, Yosry Ahmed
On Fri Jul 3, 2026 at 9:32 AM UTC, Vlastimil Babka (SUSE) wrote:
> On 7/2/26 11:49, Brendan Jackman wrote:
>> Deduplicate checks for whether the current context is safe for
>> spin_trylock().
>>
>> Does this function really belong in mm/internal.h or is it generic? Not
>> sure. If someone ends up duplicating this logic elsewhere in the kernel,
>> that would be a shame. But if it goes in some generic header, someone treats
>> it as documentation about where it's guaranteed safe to spin_trylock(),
>> and then it emerges that there are other subtle preconditions that
>> didn't affect the mm usecase, that would be worse. So, just be
>> conservative and keep it local.
>
> Agreed. We could even use page_alloc.h?
Would be nice to shrink the scope but I think it would be "wrong" for
slub.c to get it from there. It's not to do with the page allocator.