From: Hao Ge <hao.ge@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>,
Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] mm/alloc_tag: replace fixed-size early PFN array with dynamic linked list
Date: Wed, 6 May 2026 10:23:09 +0800
Message-ID: <0b9969e2-b208-46c2-a9a5-bf620239275a@linux.dev>
In-Reply-To: <20260430075239.efcceac3d83ede2a2d22158c@linux-foundation.org>
Hi Andrew and Suren,
Sorry for the late reply; I was on holiday.
On 2026/4/30 22:52, Andrew Morton wrote:
> On Thu, 30 Apr 2026 10:02:26 +0800 Hao Ge <hao.ge@linux.dev> wrote:
>
>> Pages allocated before page_ext is available have their codetag left
>> uninitialized. Track these early PFNs and clear their codetag in
>> clear_early_alloc_pfn_tag_refs() to avoid "alloc_tag was not set"
>> warnings when they are freed later.
>>
>> Currently a fixed-size array of 8192 entries is used, with a warning if
>> the limit is exceeded. However, the number of early allocations depends
>> on the number of CPUs and can be larger than 8192.
>>
>> Replace the fixed-size array with a dynamically allocated linked list
>> of pfn_pool structs. Each node is allocated via alloc_page() and mapped
>> to a pfn_pool containing a next pointer, an atomic slot counter, and a
>> PFN array that fills the remainder of the page.
>>
>> The tracking pages themselves are allocated via alloc_page(), which
>> would trigger __pgalloc_tag_add() -> alloc_tag_add_early_pfn() and
>> recurse indefinitely. Introduce __GFP_NO_CODETAG (reuses the
>> %__GFP_NO_OBJ_EXT bit) and pass gfp_flags through pgalloc_tag_add()
>> so that the early path can skip recording allocations that carry this
>> flag.
> Thanks. AI review asked a couple of questions:
>
> https://sashiko.dev/#/patchset/20260430020226.34116-1-hao.ge@linux.dev
Yes, Sashiko is an excellent tool. Let me carefully go through the
questions raised by its review.
About question 1 (__GFP_NO_CODETAG aliasing to __GFP_NO_OBJ_EXT may cause
SLUB page allocations to be erroneously skipped):
No. When SLUB allocates a new slab page via new_slab(), it filters the
flags with (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK),
https://elixir.bootlin.com/linux/v7.0.1/source/mm/slub.c#L3539
which does not include __GFP_NO_OBJ_EXT. So __GFP_NO_OBJ_EXT is never
passed through to the page allocator for slab page allocations.
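To illustrate the masking argument, here is a rough userspace model. The
bit values and all MODEL_/model_ names are made up for illustration and do
not match the kernel's gfp definitions; only the "flags & mask" pattern
mirrors what new_slab() does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit layout, for illustration only (not the kernel's values). */
typedef unsigned int model_gfp_t;

#define MODEL_GFP_RECLAIM_MASK    0x00ffu  /* stand-in for GFP_RECLAIM_MASK */
#define MODEL_GFP_CONSTRAINT_MASK 0x0f00u  /* stand-in for GFP_CONSTRAINT_MASK */
#define MODEL_GFP_NO_OBJ_EXT      0x1000u  /* stand-in for __GFP_NO_OBJ_EXT */
#define MODEL_GFP_NO_CODETAG      MODEL_GFP_NO_OBJ_EXT  /* alias, as in the patch */

/* Models the flag filtering applied before a slab page allocation. */
static model_gfp_t model_slab_page_flags(model_gfp_t flags)
{
	model_gfp_t mask = MODEL_GFP_RECLAIM_MASK | MODEL_GFP_CONSTRAINT_MASK;

	/* The argument relies on the aliased bit lying outside the mask. */
	assert((mask & MODEL_GFP_NO_OBJ_EXT) == 0);
	return flags & mask;
}

/* Models the early-path check: skip recording when the aliased bit is set. */
static bool model_early_path_skips(model_gfp_t flags)
{
	return (flags & MODEL_GFP_NO_CODETAG) != 0;
}
```

In this model, a slab caller that sets the NO_OBJ_EXT bit loses it in
model_slab_page_flags(), so model_early_path_skips() returns false for the
resulting page allocation.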
About question 2 (retaining caller's zone/placement flags may cause NULL
pointer dereference or violate mobility constraints):
Sashiko's analysis is correct. As Suren also suggested, it will be fixed
in v5 by masking out GFP_ZONEMASK when allocating tracking pages, so the
internal allocation is no longer constrained by the original caller's
zone requirements.
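A minimal sketch of the planned masking, again as a userspace model. The
bit values and the model_tracking_page_flags() helper are hypothetical;
the actual v5 code may look different. Only the composition of
GFP_ZONEMASK from the DMA, HIGHMEM, DMA32 and MOVABLE bits follows the
kernel's definition.

```c
/* Hypothetical bit values, for illustration only (not the kernel's). */
typedef unsigned int model_gfp_t;

#define MODEL_GFP_DMA        0x0001u
#define MODEL_GFP_HIGHMEM    0x0002u
#define MODEL_GFP_DMA32      0x0004u
#define MODEL_GFP_MOVABLE    0x0008u
#define MODEL_GFP_ZONEMASK   (MODEL_GFP_DMA | MODEL_GFP_HIGHMEM | \
			      MODEL_GFP_DMA32 | MODEL_GFP_MOVABLE)
#define MODEL_GFP_NO_CODETAG 0x1000u

/*
 * Derive the flags for an internal tracking page from the caller's flags:
 * drop the caller's zone/placement bits and mark the allocation so the
 * early path does not record it. (Assumed shape of the fix, not the patch.)
 */
static model_gfp_t model_tracking_page_flags(model_gfp_t caller_flags)
{
	return (caller_flags & ~MODEL_GFP_ZONEMASK) | MODEL_GFP_NO_CODETAG;
}
```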
> Please check to see if there's anything legitimate there?
>
> It also asked some different questions of the v3 patch:
> https://sashiko.dev/#/patchset/20260423083756.157902-1-hao.ge@linux.dev
For the v3 patch:
About question 3 (race in the cmpxchg failure path: page_ext becomes
available between alloc_page() and __free_page(), triggering "alloc_tag
was not set"):
Good catch. This race was identified during the v3 review and the fix
(calling clear_page_tag_ref() before __free_page()) was added in v4.
Apologies for not documenting it in the v4 changelog.
CPU A (__alloc_tag_add_early_pfn)        CPU B (clear_early_alloc_pfn_tag_refs)
alloc_page() -> codetag uninitialized
                                         page_ext becomes available
cmpxchg() fails
__free_page() -> WARN
Setting the codetag to CODETAG_EMPTY via clear_page_tag_ref() prevents
this warning.
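The ordering of the fix can be modelled in userspace roughly as follows.
This is only a sketch with C11 atomics: the model_* types and functions
are hypothetical stand-ins, and the real cmpxchg site in the patch is
structured differently.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page's codetag state. */
enum model_tag { MODEL_TAG_UNSET, MODEL_TAG_EMPTY };

struct model_page {
	enum model_tag tag;
};

/* Counts how often the "alloc_tag was not set" condition would fire. */
static int model_warnings;

/* Stand-in for __free_page(): complains if the tag was never set. */
static void model_free_page(struct model_page *page, bool page_ext_ready)
{
	if (page_ext_ready && page->tag == MODEL_TAG_UNSET)
		model_warnings++;
}

/* Stand-in for clear_page_tag_ref(): mark the tag as empty. */
static void model_clear_page_tag_ref(struct model_page *page)
{
	page->tag = MODEL_TAG_EMPTY;
}

/*
 * Try to publish new_pool as the list head. If another CPU won the race,
 * clear the tag first and only then free the page, so that the free path
 * does not see an unset tag even if page_ext became available meanwhile.
 */
static bool model_install_pool(_Atomic(struct model_page *) *head,
			       struct model_page *expected,
			       struct model_page *new_pool,
			       bool page_ext_ready)
{
	if (atomic_compare_exchange_strong(head, &expected, new_pool))
		return true;

	model_clear_page_tag_ref(new_pool);
	model_free_page(new_pool, page_ext_ready);
	return false;
}
```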
About question 4 (accounting leak: tracking page gets fully accounted,
then clear_page_tag_ref() overwrites to CODETAG_EMPTY, __free_page()
skips subtraction):
Yes, this can happen. However, the tracking pages are allocated with
__GFP_NO_CODETAG and are only used during early boot. The leaked
accounting is at most one page per tracking node, which is negligible
for this debug-only code path.
Thanks
Best Regards
Hao