From: Suren Baghdasaryan <surenb@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Hao Ge <hao.ge@linux.dev>,
Kent Overstreet <kent.overstreet@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/alloc_tag: clear codetag for pages allocated before page_ext initialization
Date: Thu, 19 Mar 2026 16:48:52 -0700
Message-ID: <CAJuCfpHD4SB30oP7Bdm=ey+2W7CAsFPruedEbqgR5OumrQErxQ@mail.gmail.com>
In-Reply-To: <CAJuCfpG5OG66mgfn3ez5ZXTjj7VC8yuK3L2u83uOZhyBbRH2Lw@mail.gmail.com>
On Thu, Mar 19, 2026 at 4:44 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Thu, Mar 19, 2026 at 3:28 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Thu, 19 Mar 2026 16:31:53 +0800 Hao Ge <hao.ge@linux.dev> wrote:
> >
> > > Due to initialization ordering, page_ext is allocated and initialized
> > > relatively late during boot. Some pages have already been allocated
> > > and freed before page_ext becomes available, leaving their codetag
> > > uninitialized.
>
> Hi Hao,
> Thanks for the report.
> Hmm. So, we are allocating pages before page_ext is initialized...
>
> > >
> > > A clear example is in init_section_page_ext(): alloc_page_ext() calls
> > > kmemleak_alloc().

Forgot to ask. The example you are using here is for page_ext
allocation itself. Do you have any other examples where page
allocation happens before page_ext initialization? If that's the only
place, then we might be able to fix this in a simpler way by doing
something special for alloc_page_ext().

> > > If the slab cache has no free objects, it falls back
> > > to the buddy allocator to allocate memory. However, at this point page_ext
> > > is not yet fully initialized, so these newly allocated pages have no
> > > codetag set. These pages may later be reclaimed by KASAN, which causes
> > > the warning to trigger when they are freed because their codetag ref is
> > > still empty.
> > >
> > > Use a global array to track pages allocated before page_ext is fully
> > > initialized, similar to how kmemleak tracks early allocations.
> > > When page_ext initialization completes, set their codetag
> > > to empty to avoid warnings when they are freed later.
> > >
> > > ...
> > >
> > > --- a/include/linux/alloc_tag.h
> > > +++ b/include/linux/alloc_tag.h
> > > @@ -74,6 +74,9 @@ static inline void set_codetag_empty(union codetag_ref *ref)
> > >
> > > #ifdef CONFIG_MEM_ALLOC_PROFILING
> > >
> > > +bool mem_profiling_is_available(void);
> > > +void alloc_tag_add_early_pfn(unsigned long pfn);
> > > +
> > > #define ALLOC_TAG_SECTION_NAME "alloc_tags"
> > >
> > > struct codetag_bytes {
> > > diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> > > index 58991ab09d84..a5bf4e72c154 100644
> > > --- a/lib/alloc_tag.c
> > > +++ b/lib/alloc_tag.c
> > > @@ -6,6 +6,7 @@
> > > #include <linux/kallsyms.h>
> > > #include <linux/module.h>
> > > #include <linux/page_ext.h>
> > > +#include <linux/pgalloc_tag.h>
> > > #include <linux/proc_fs.h>
> > > #include <linux/seq_buf.h>
> > > #include <linux/seq_file.h>
> > > @@ -26,6 +27,82 @@ static bool mem_profiling_support;
> > >
> > > static struct codetag_type *alloc_tag_cttype;
> > >
> > > +/*
> > > + * State of the alloc_tag
> > > + *
> > > + * This is used to describe the states of the alloc_tag during bootup.
> > > + *
> > > + * When we need to allocate page_ext to store codetag, we face an
> > > + * initialization timing problem:
> > > + *
> > > + * Due to initialization order, pages may be allocated via buddy system
> > > + * before page_ext is fully allocated and initialized. Although these
> > > + * pages call the allocation hooks, the codetag will not be set because
> > > + * page_ext is not yet available.
> > > + *
> > > + * When these pages are later freed to the buddy system, it triggers
> > > + * warnings because their codetag is actually empty if
> > > + * CONFIG_MEM_ALLOC_PROFILING_DEBUG is enabled.
> > > + *
> > > + * Additionally, in this situation, we cannot record detailed allocation
> > > + * information for these pages.
> > > + */
> > > +enum mem_profiling_state {
> > > + DOWN, /* No mem_profiling functionality yet */
> > > + UP /* Everything is working */
> > > +};
> > > +
> > > +static enum mem_profiling_state mem_profiling_state = DOWN;
> > > +
> > > +bool mem_profiling_is_available(void)
> > > +{
> > > + return mem_profiling_state == UP;
> > > +}
> > > +
> > > +#ifdef CONFIG_MEM_ALLOC_PROFILING_DEBUG
> > > +
> > > +#define EARLY_ALLOC_PFN_MAX 256
> > > +
> > > +static unsigned long early_pfns[EARLY_ALLOC_PFN_MAX];
> >
> > It's unfortunate that this isn't __initdata.
> >
> > > +static unsigned int early_pfn_count;
> > > +static DEFINE_SPINLOCK(early_pfn_lock);
> > > +
> > >
> > > ...
> > >
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -1293,6 +1293,13 @@ void __pgalloc_tag_add(struct page *page, struct task_struct *task,
> > > alloc_tag_add(&ref, task->alloc_tag, PAGE_SIZE * nr);
> > > update_page_tag_ref(handle, &ref);
> > > put_page_tag_ref(handle);
> > > + } else {
>
> This branch can be marked as "unlikely".
>
> > > + /*
> > > + * page_ext is not available yet, record the pfn so we can
> > > + * clear the tag ref later when page_ext is initialized.
> > > + */
> > > + if (!mem_profiling_is_available())
> > > + alloc_tag_add_early_pfn(page_to_pfn(page));
> > > }
> > > }
> >
> > All because of this, I believe. Is this fixable?
> >
> > If we take that `else', we know we're running in __init code, yes? I
> > don't see how `__init pgalloc_tag_add_early()' could be made to work.
> > hrm. Something clever, please.
>
> We can have a pointer to a function that is initialized to point to
> alloc_tag_add_early_pfn, which is defined as __init and uses
> early_pfns which now can be defined as __initdata. After
> clear_early_alloc_pfn_tag_refs() is done we reset that pointer to
> NULL. __pgalloc_tag_add() instead of calling alloc_tag_add_early_pfn()
> directly checks that pointer and if it's not NULL then calls the
> function that it points to. This way __pgalloc_tag_add() which is not
> an __init function will be invoking alloc_tag_add_early_pfn() __init
> function only until we are done with initialization. I haven't tried
> this but I think that should work. This also eliminates the need for
> mem_profiling_state variable since we can use this function pointer
> instead.
>
>
> >
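
To make the function-pointer idea quoted above more concrete, here is a rough
userspace sketch. It is untested against the kernel and only models the
control flow. The names early_pfn_hook and pgalloc_tag_add_no_page_ext() are
made up for illustration; in the kernel, alloc_tag_add_early_pfn() would be
__init and early_pfns would be __initdata, and the locking and
READ_ONCE()/WRITE_ONCE() handling a real patch would need are left out.

```c
#include <stddef.h>

#define EARLY_ALLOC_PFN_MAX 256

/* In the kernel these two would be __initdata; plain statics here. */
static unsigned long early_pfns[EARLY_ALLOC_PFN_MAX];
static unsigned int early_pfn_count;

/* In the kernel this would be an __init function. */
static void alloc_tag_add_early_pfn(unsigned long pfn)
{
	/* Silently drop the pfn if the array is already full. */
	if (early_pfn_count < EARLY_ALLOC_PFN_MAX)
		early_pfns[early_pfn_count++] = pfn;
}

/*
 * Hypothetical hook (assumed name): non-NULL only while page_ext is not
 * yet available. It replaces the mem_profiling_state variable.
 */
static void (*early_pfn_hook)(unsigned long pfn) = alloc_tag_add_early_pfn;

/* Stand-in for the "else" branch of __pgalloc_tag_add() (assumed name). */
static void pgalloc_tag_add_no_page_ext(unsigned long pfn)
{
	void (*hook)(unsigned long) = early_pfn_hook;

	/* Only reach the early-boot code while the hook is still set. */
	if (hook)
		hook(pfn);
}

/*
 * Stand-in for clear_early_alloc_pfn_tag_refs(). Returns how many pfns
 * were recorded, then disables the hook for good.
 */
static unsigned int clear_early_alloc_pfn_tag_refs(void)
{
	unsigned int n = early_pfn_count;

	/* The kernel would set the codetag ref of each recorded page to empty here. */
	early_pfn_count = 0;
	early_pfn_hook = NULL;
	return n;
}
```

Once clear_early_alloc_pfn_tag_refs() has run, pgalloc_tag_add_no_page_ext()
becomes a NULL check and nothing else, so the non-__init caller never
touches the discarded __init code or data again.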