From: Hao Ge <hao.ge@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>,
Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] mm/alloc_tag: replace fixed-size early PFN array with dynamic linked list
Date: Wed, 6 May 2026 10:23:09 +0800 [thread overview]
Message-ID: <0b9969e2-b208-46c2-a9a5-bf620239275a@linux.dev> (raw)
In-Reply-To: <20260430075239.efcceac3d83ede2a2d22158c@linux-foundation.org>
Hi Andrew and Suren
Sorry for the late reply, I was on holiday.
On 2026/4/30 22:52, Andrew Morton wrote:
> On Thu, 30 Apr 2026 10:02:26 +0800 Hao Ge <hao.ge@linux.dev> wrote:
>
>> Pages allocated before page_ext is available have their codetag left
>> uninitialized. Track these early PFNs and clear their codetag in
>> clear_early_alloc_pfn_tag_refs() to avoid "alloc_tag was not set"
>> warnings when they are freed later.
>>
>> Currently a fixed-size array of 8192 entries is used, with a warning if
>> the limit is exceeded. However, the number of early allocations depends
>> on the number of CPUs and can be larger than 8192.
>>
>> Replace the fixed-size array with a dynamically allocated linked list
>> of pfn_pool structs. Each node is allocated via alloc_page() and mapped
>> to a pfn_pool containing a next pointer, an atomic slot counter, and a
>> PFN array that fills the remainder of the page.
>>
>> The tracking pages themselves are allocated via alloc_page(), which
>> would trigger __pgalloc_tag_add() -> alloc_tag_add_early_pfn() and
>> recurse indefinitely. Introduce __GFP_NO_CODETAG (reuses the
>> %__GFP_NO_OBJ_EXT bit) and pass gfp_flags through pgalloc_tag_add()
>> so that the early path can skip recording allocations that carry this
>> flag.
> Thanks. AI review asked a couple of questions:
>
> https://sashiko.dev/#/patchset/20260430020226.34116-1-hao.ge@linux.dev
Yes, Sashiko is an excellent tool. Let me carefully go through the
questions raised by its review.
About question 1 (About question 1 (__GFP_NO_CODETAG aliasing to
__GFP_NO_OBJ_EXT may cause
SLUB page allocations to be erroneously skipped):
No. When SLUB allocates a new slab page via new_slab(),
it filters the flags with (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK),
https://elixir.bootlin.com/linux/v7.0.1/source/mm/slub.c#L3539
which does not include __GFP_NO_OBJ_EXT. So __GFP_NO_OBJ_EXT is never
passed
through to the page allocator for slab page allocations.
About question 2 (retaining caller's zone/placement flags may cause NULL
pointer dereference or violate mobility constraints):
Sashiko's analysis is correct. As Suren also suggested, it will be
fixed in
v5 by masking out GFP_ZONEMASK when allocating tracking pages,
so the internal allocation is no longer constrained by the original
caller's zone
requirements.
> Please check to see if there's anything legitimate there?
>
> It also asked some different questions of the v3 patch:
> https://sashiko.dev/#/patchset/20260423083756.157902-1-hao.ge@linux.dev
For the v3 patch:
About question 3 (race in the cmpxchg failure path: page_ext becomes
available between alloc_page() and __free_page(), triggering "alloc_tag
was not set"):
Good catch. This race was identified during the v3 review and the fix
(calling clear_page_tag_ref() before __free_page()) was added in v4.
Apologies for not documenting it in the v4 changelogs.
CPU A (__alloc_tag_add_early_pfn) CPU B
(clear_early_alloc_pfn_tag_refs)
alloc_page() -> codetag uninitialized
page_ext becomes available
cmpxchg() fails
__free_page -> WARN
Setting the codetag to CODETAG_EMPTY via clear_page_tag_ref() prevents
this warning.
About question 4 (accounting leak: tracking page gets fully accounted,
then clear_page_tag_ref() overwrites to CODETAG_EMPTY, __free_page()
skips subtraction):
Yes, this can happen. However, the tracking pages are allocated with
__GFP_NO_CODETAG and are only used during early boot. The leaked
accounting is at most one page per tracking node, which is negligible
for this debug-only code path.
Thanks
Best Regards
Hao
next prev parent reply other threads:[~2026-05-06 2:24 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-30 2:02 [PATCH v4] mm/alloc_tag: replace fixed-size early PFN array with dynamic linked list Hao Ge
2026-04-30 14:52 ` Andrew Morton
2026-05-06 2:23 ` Hao Ge [this message]
2026-05-01 20:32 ` Suren Baghdasaryan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0b9969e2-b208-46c2-a9a5-bf620239275a@linux.dev \
--to=hao.ge@linux.dev \
--cc=akpm@linux-foundation.org \
--cc=kent.overstreet@linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=surenb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.