public inbox for linux-mm@kvack.org
From: Suren Baghdasaryan <surenb@google.com>
To: Hao Ge <hao.ge@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Kent Overstreet <kent.overstreet@linux.dev>,
	 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/alloc_tag: clear codetag for pages allocated before page_ext initialization
Date: Wed, 25 Mar 2026 22:04:02 -0700
Message-ID: <CAJuCfpHGmgZX_F7Kae2pZqvLRS4pR9Y3+CRMVZWOxb0XpO9EnQ@mail.gmail.com>
In-Reply-To: <098f53cc-97b5-4647-89dd-0e5820b1e9a0@linux.dev>

On Wed, Mar 25, 2026 at 6:45 PM Hao Ge <hao.ge@linux.dev> wrote:
>
>
> On 2026/3/25 23:17, Suren Baghdasaryan wrote:
> > On Wed, Mar 25, 2026 at 4:21 AM Hao Ge <hao.ge@linux.dev> wrote:
> >>
> >> On 2026/3/25 15:35, Suren Baghdasaryan wrote:
> >>> On Tue, Mar 24, 2026 at 11:25 PM Suren Baghdasaryan <surenb@google.com> wrote:
> >>>> On Tue, Mar 24, 2026 at 7:08 PM Hao Ge <hao.ge@linux.dev> wrote:
> >>>>> On 2026/3/25 08:21, Suren Baghdasaryan wrote:
> >>>>>> On Tue, Mar 24, 2026 at 2:43 AM Hao Ge <hao.ge@linux.dev> wrote:
> >>>>>>> On 2026/3/24 06:47, Suren Baghdasaryan wrote:
> >>>>>>>> On Mon, Mar 23, 2026 at 2:16 AM Hao Ge <hao.ge@linux.dev> wrote:
> >>>>>>>>> On 2026/3/20 10:14, Suren Baghdasaryan wrote:
> >>>>>>>>>> On Thu, Mar 19, 2026 at 6:58 PM Hao Ge <hao.ge@linux.dev> wrote:
> >>>>>>>>>>> On 2026/3/20 07:48, Suren Baghdasaryan wrote:
> >>>>>>>>>>>> On Thu, Mar 19, 2026 at 4:44 PM Suren Baghdasaryan <surenb@google.com> wrote:
> >>>>>>>>>>>>> On Thu, Mar 19, 2026 at 3:28 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >>>>>>>>>>>>>> On Thu, 19 Mar 2026 16:31:53 +0800 Hao Ge <hao.ge@linux.dev> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Due to initialization ordering, page_ext is allocated and initialized
> >>>>>>>>>>>>>>> relatively late during boot. Some pages have already been allocated
> >>>>>>>>>>>>>>> and freed before page_ext becomes available, leaving their codetag
> >>>>>>>>>>>>>>> uninitialized.
> >>>>>>>>>>>>> Hi Hao,
> >>>>>>>>>>>>> Thanks for the report.
> >>>>>>>>>>>>> Hmm. So, we are allocating pages before page_ext is initialized...
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>> A clear example is in init_section_page_ext(): alloc_page_ext() calls
> >>>>>>>>>>>>>>> kmemleak_alloc().
> >>>>>>>>>>>> Forgot to ask. The example you are using here is for page_ext
> >>>>>>>>>>>> allocation itself. Do you have any other examples where page
> >>>>>>>>>>>> allocation happens before page_ext initialization? If that's the only
> >>>>>>>>>>>> place, then we might be able to fix this in a simpler way by doing
> >>>>>>>>>>>> something special for alloc_page_ext().
> >>>>>>>>>>> Hi Suren
> >>>>>>>>>>>
> >>>>>>>>>>> To help illustrate the point, here's the debug log I added:
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >>>>>>>>>>> index 2d4b6f1a554e..ebfe636f5b07 100644
> >>>>>>>>>>> --- a/mm/page_alloc.c
> >>>>>>>>>>> +++ b/mm/page_alloc.c
> >>>>>>>>>>> @@ -1293,6 +1293,9 @@ void __pgalloc_tag_add(struct page *page, struct task_struct *task,
> >>>>>>>>>>>                       alloc_tag_add(&ref, task->alloc_tag, PAGE_SIZE * nr);
> >>>>>>>>>>>                       update_page_tag_ref(handle, &ref);
> >>>>>>>>>>>                       put_page_tag_ref(handle);
> >>>>>>>>>>> +       } else {
> >>>>>>>>>>> +               pr_warn("__pgalloc_tag_add: get_page_tag_ref failed! page=%p pfn=%lu nr=%u\n", page, page_to_pfn(page), nr);
> >>>>>>>>>>> +               dump_stack();
> >>>>>>>>>>>               }
> >>>>>>>>>>>        }
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> And I caught the following logs:
> >>>>>>>>>>>
> >>>>>>>>>>> [    0.296399] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>>>>>> page=ffffea000400c700 pfn=1049372 nr=1
> >>>>>>>>>>> [    0.296400] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted
> >>>>>>>>>>> 7.0.0-rc4-dirty #12 PREEMPT(lazy)
> >>>>>>>>>>> [    0.296402] Hardware name: Red Hat KVM, BIOS
> >>>>>>>>>>> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> >>>>>>>>>>> [    0.296402] Call Trace:
> >>>>>>>>>>> [    0.296403]  <TASK>
> >>>>>>>>>>> [    0.296403]  dump_stack_lvl+0x53/0x70
> >>>>>>>>>>> [    0.296405]  __pgalloc_tag_add+0x3a3/0x6e0
> >>>>>>>>>>> [    0.296406]  ? __pfx___pgalloc_tag_add+0x10/0x10
> >>>>>>>>>>> [    0.296407]  ? kasan_unpoison+0x27/0x60
> >>>>>>>>>>> [    0.296409]  ? __kasan_unpoison_pages+0x2c/0x40
> >>>>>>>>>>> [    0.296411]  get_page_from_freelist+0xa54/0x1310
> >>>>>>>>>>> [    0.296413]  __alloc_frozen_pages_noprof+0x206/0x4c0
> >>>>>>>>>>> [    0.296415]  ? __pfx___alloc_frozen_pages_noprof+0x10/0x10
> >>>>>>>>>>> [    0.296417]  ? stack_depot_save_flags+0x3f/0x680
> >>>>>>>>>>> [    0.296418]  ? ___slab_alloc+0x518/0x530
> >>>>>>>>>>> [    0.296420]  alloc_pages_mpol+0x13a/0x3f0
> >>>>>>>>>>> [    0.296421]  ? __pfx_alloc_pages_mpol+0x10/0x10
> >>>>>>>>>>> [    0.296423]  ? _raw_spin_lock_irqsave+0x8a/0xf0
> >>>>>>>>>>> [    0.296424]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
> >>>>>>>>>>> [    0.296426]  alloc_slab_page+0xc2/0x130
> >>>>>>>>>>> [    0.296427]  allocate_slab+0x77/0x2c0
> >>>>>>>>>>> [    0.296429]  ? syscall_enter_define_fields+0x3bb/0x5f0
> >>>>>>>>>>> [    0.296430]  ___slab_alloc+0x125/0x530
> >>>>>>>>>>> [    0.296432]  ? __trace_define_field+0x252/0x3d0
> >>>>>>>>>>> [    0.296433]  __kmalloc_noprof+0x329/0x630
> >>>>>>>>>>> [    0.296435]  ? syscall_enter_define_fields+0x3bb/0x5f0
> >>>>>>>>>>> [    0.296436]  syscall_enter_define_fields+0x3bb/0x5f0
> >>>>>>>>>>> [    0.296438]  ? __pfx_syscall_enter_define_fields+0x10/0x10
> >>>>>>>>>>> [    0.296440]  event_define_fields+0x326/0x540
> >>>>>>>>>>> [    0.296441]  __trace_early_add_events+0xac/0x3c0
> >>>>>>>>>>> [    0.296443]  trace_event_init+0x24c/0x460
> >>>>>>>>>>> [    0.296445]  trace_init+0x9/0x20
> >>>>>>>>>>> [    0.296446]  start_kernel+0x199/0x3c0
> >>>>>>>>>>> [    0.296448]  x86_64_start_reservations+0x18/0x30
> >>>>>>>>>>> [    0.296449]  x86_64_start_kernel+0xe2/0xf0
> >>>>>>>>>>> [    0.296451]  common_startup_64+0x13e/0x141
> >>>>>>>>>>> [    0.296453]  </TASK>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> [    0.312234] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>>>>>> page=ffffea000400f900 pfn=1049572 nr=1
> >>>>>>>>>>> [    0.312234] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted
> >>>>>>>>>>> 7.0.0-rc4-dirty #12 PREEMPT(lazy)
> >>>>>>>>>>> [    0.312236] Hardware name: Red Hat KVM, BIOS
> >>>>>>>>>>> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> >>>>>>>>>>> [    0.312236] Call Trace:
> >>>>>>>>>>> [    0.312237]  <TASK>
> >>>>>>>>>>> [    0.312237]  dump_stack_lvl+0x53/0x70
> >>>>>>>>>>> [    0.312239]  __pgalloc_tag_add+0x3a3/0x6e0
> >>>>>>>>>>> [    0.312240]  ? __pfx___pgalloc_tag_add+0x10/0x10
> >>>>>>>>>>> [    0.312241]  ? rmqueue.constprop.0+0x4fc/0x1ce0
> >>>>>>>>>>> [    0.312243]  ? kasan_unpoison+0x27/0x60
> >>>>>>>>>>> [    0.312244]  ? __kasan_unpoison_pages+0x2c/0x40
> >>>>>>>>>>> [    0.312246]  get_page_from_freelist+0xa54/0x1310
> >>>>>>>>>>> [    0.312248]  __alloc_frozen_pages_noprof+0x206/0x4c0
> >>>>>>>>>>> [    0.312250]  ? __pfx___alloc_frozen_pages_noprof+0x10/0x10
> >>>>>>>>>>> [    0.312253]  alloc_slab_page+0x39/0x130
> >>>>>>>>>>> [    0.312254]  allocate_slab+0x77/0x2c0
> >>>>>>>>>>> [    0.312255]  ? alloc_cpumask_var_node+0xc7/0x230
> >>>>>>>>>>> [    0.312257]  ___slab_alloc+0x46d/0x530
> >>>>>>>>>>> [    0.312259]  __kmalloc_node_noprof+0x2fa/0x680
> >>>>>>>>>>> [    0.312261]  ? alloc_cpumask_var_node+0xc7/0x230
> >>>>>>>>>>> [    0.312263]  alloc_cpumask_var_node+0xc7/0x230
> >>>>>>>>>>> [    0.312264]  init_desc+0x141/0x6b0
> >>>>>>>>>>> [    0.312266]  alloc_desc+0x108/0x1b0
> >>>>>>>>>>> [    0.312267]  early_irq_init+0xee/0x1c0
> >>>>>>>>>>> [    0.312268]  ? __pfx_early_irq_init+0x10/0x10
> >>>>>>>>>>> [    0.312271]  start_kernel+0x1ab/0x3c0
> >>>>>>>>>>> [    0.312272]  x86_64_start_reservations+0x18/0x30
> >>>>>>>>>>> [    0.312274]  x86_64_start_kernel+0xe2/0xf0
> >>>>>>>>>>> [    0.312275]  common_startup_64+0x13e/0x141
> >>>>>>>>>>> [    0.312277]  </TASK>
> >>>>>>>>>>>
> >>>>>>>>>>> [    0.312834] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>>>>>> page=ffffea000400fc00 pfn=1049584 nr=1
> >>>>>>>>>>> [    0.312835] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted
> >>>>>>>>>>> 7.0.0-rc4-dirty #12 PREEMPT(lazy)
> >>>>>>>>>>> [    0.312836] Hardware name: Red Hat KVM, BIOS
> >>>>>>>>>>> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> >>>>>>>>>>> [    0.312837] Call Trace:
> >>>>>>>>>>> [    0.312837]  <TASK>
> >>>>>>>>>>> [    0.312838]  dump_stack_lvl+0x53/0x70
> >>>>>>>>>>> [    0.312840]  __pgalloc_tag_add+0x3a3/0x6e0
> >>>>>>>>>>> [    0.312841]  ? __pfx___pgalloc_tag_add+0x10/0x10
> >>>>>>>>>>> [    0.312842]  ? rmqueue.constprop.0+0x4fc/0x1ce0
> >>>>>>>>>>> [    0.312844]  ? kasan_unpoison+0x27/0x60
> >>>>>>>>>>> [    0.312845]  ? __kasan_unpoison_pages+0x2c/0x40
> >>>>>>>>>>> [    0.312847]  get_page_from_freelist+0xa54/0x1310
> >>>>>>>>>>> [    0.312849]  __alloc_frozen_pages_noprof+0x206/0x4c0
> >>>>>>>>>>> [    0.312851]  ? __pfx___alloc_frozen_pages_noprof+0x10/0x10
> >>>>>>>>>>> [    0.312853]  alloc_pages_mpol+0x13a/0x3f0
> >>>>>>>>>>> [    0.312855]  ? __pfx_alloc_pages_mpol+0x10/0x10
> >>>>>>>>>>> [    0.312856]  ? xas_find+0x2d8/0x450
> >>>>>>>>>>> [    0.312858]  ? _raw_spin_lock+0x84/0xe0
> >>>>>>>>>>> [    0.312859]  ? __pfx__raw_spin_lock+0x10/0x10
> >>>>>>>>>>> [    0.312861]  alloc_pages_noprof+0xf6/0x2b0
> >>>>>>>>>>> [    0.312862]  __change_page_attr+0x293/0x850
> >>>>>>>>>>> [    0.312864]  ? __pfx___change_page_attr+0x10/0x10
> >>>>>>>>>>> [    0.312865]  ? _vm_unmap_aliases+0x2d0/0x650
> >>>>>>>>>>> [    0.312868]  ? __pfx__vm_unmap_aliases+0x10/0x10
> >>>>>>>>>>> [    0.312869]  __change_page_attr_set_clr+0x16c/0x360
> >>>>>>>>>>> [    0.312871]  ? spp_getpage+0xbb/0x1e0
> >>>>>>>>>>> [    0.312872]  change_page_attr_set_clr+0x220/0x3c0
> >>>>>>>>>>> [    0.312873]  ? flush_tlb_one_kernel+0xf/0x30
> >>>>>>>>>>> [    0.312875]  ? set_pte_vaddr_p4d+0x110/0x180
> >>>>>>>>>>> [    0.312877]  ? __pfx_change_page_attr_set_clr+0x10/0x10
> >>>>>>>>>>> [    0.312878]  ? __pfx_set_pte_vaddr_p4d+0x10/0x10
> >>>>>>>>>>> [    0.312881]  ? __pfx_mtree_load+0x10/0x10
> >>>>>>>>>>> [    0.312883]  ? __pfx_mtree_load+0x10/0x10
> >>>>>>>>>>> [    0.312884]  ? __asan_memcpy+0x3c/0x60
> >>>>>>>>>>> [    0.312886]  ? set_intr_gate+0x10c/0x150
> >>>>>>>>>>> [    0.312888]  set_memory_ro+0x76/0xa0
> >>>>>>>>>>> [    0.312889]  ? __pfx_set_memory_ro+0x10/0x10
> >>>>>>>>>>> [    0.312891]  idt_setup_apic_and_irq_gates+0x2c1/0x390
> >>>>>>>>>>>
> >>>>>>>>>>> and more.
> >>>>>>>>>> Ok, it's not the only place. Got your point.
> >>>>>>>>>>
> >>>>>>>>>>> Off topic: if we were to handle only alloc_page_ext() specifically,
> >>>>>>>>>>> what would be the most straightforward solution in your mind? I'd
> >>>>>>>>>>> really appreciate your insight.
> >>>>>>>>>> I was thinking if it's the only special case maybe we can handle it
> >>>>>>>>>> somehow differently, like we do when we allocate obj_ext vectors for
> >>>>>>>>>> slabs using __GFP_NO_OBJ_EXT. I haven't found a good solution yet but
> >>>>>>>>>> since it's not a special case we would not be able to use it even if I
> >>>>>>>>>> came up with something...
> >>>>>>>>>> I think your way is the most straight-forward but please try my
> >>>>>>>>>> suggestion to see if we can avoid extra overhead.
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Suren.
> >>>>> Hi Suren
> >>>>>>> Hi Suren
> >>>>>>>
> >>>>>>>
> >>>>>>>> Hi Hao,
> >>>>>>>>
> >>>>>>>>> Hi Suren
> >>>>>>>>>
> >>>>>>>>> Thank you for your feedback. After re-examining this issue, I realize
> >>>>>>>>> my previous focus was misplaced.
> >>>>>>>>>
> >>>>>>>>> Upon deeper consideration, I understand that this is not merely a bug,
> >>>>>>>>> but rather a warning that indicates a gap in our memory profiling
> >>>>>>>>> mechanism. Specifically, the current implementation appears to be
> >>>>>>>>> missing memory allocation tracking during the period between the buddy
> >>>>>>>>> system allocation and page_ext initialization.
> >>>>>>>>>
> >>>>>>>>> This profiling gap means we may not be capturing all relevant memory
> >>>>>>>>> allocation events during this critical transition phase.
> >>>>>>>> Correct, this limitation exists because memory profiling relies on
> >>>>>>>> some kernel facilities (page_ext, obj_ext) which might not be
> >>>>>>>> initialized yet at the time of allocation.
> >>>>>>>>
> >>>>>>>>> My approach is to dynamically allocate a codetag_ref when
> >>>>>>>>> get_page_tag_ref fails, and maintain a linked list to track all buddy
> >>>>>>>>> system allocations that occur prior to page_ext initialization.
> >>>>>>>>>
> >>>>>>>>> However, this introduces performance concerns:
> >>>>>>>>>
> >>>>>>>>> 1. Free Path Overhead: When freeing these pages, we would need to
> >>>>>>>>>    traverse the entire linked list to locate the corresponding
> >>>>>>>>>    codetag_ref, resulting in O(n) lookup complexity per free operation.
> >>>>>>>>>
> >>>>>>>>> 2. Initialization Overhead: During init_page_alloc_tagging, iterating
> >>>>>>>>>    through the linked list to assign codetag_ref to page_ext would
> >>>>>>>>>    introduce additional traversal cost.
> >>>>>>>>>
> >>>>>>>>> If the number of pages is substantial, this could incur significant
> >>>>>>>>> overhead. What are your thoughts on this? I look forward to your
> >>>>>>>>> suggestions.
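
For illustration only, the free-path cost described above can be modeled in
user-space C. This is a rough sketch under stated assumptions, not kernel
code: `struct early_ref`, `early_ref_add()`, `early_ref_del()` and the
`early_steps` counter are hypothetical names invented here, and plain
malloc()/free() stand in for whatever allocator an early-boot implementation
would have to use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for one allocation made before page_ext exists. */
struct early_ref {
	unsigned long pfn;	/* first page frame of the allocation */
	unsigned int nr;	/* number of pages in the allocation */
	struct early_ref *next;
};

static struct early_ref *early_list;	/* head of the singly linked list */
static size_t early_steps;		/* nodes visited by all lookups so far */

/* Allocation path: O(1) push of a new record at the list head. */
static int early_ref_add(unsigned long pfn, unsigned int nr)
{
	struct early_ref *ref;

	assert(nr > 0);
	ref = malloc(sizeof(*ref));
	if (!ref)
		return -1;
	ref->pfn = pfn;
	ref->nr = nr;
	ref->next = early_list;
	early_list = ref;
	return 0;
}

/*
 * Free path: O(n) walk to find the record for @pfn.
 * Returns the recorded page count, or 0 if @pfn was never tracked.
 */
static unsigned int early_ref_del(unsigned long pfn)
{
	struct early_ref **link = &early_list;

	while (*link) {
		struct early_ref *ref = *link;

		early_steps++;
		if (ref->pfn == pfn) {
			unsigned int nr = ref->nr;

			*link = ref->next;	/* unlink the record */
			free(ref);
			return nr;
		}
		link = &ref->next;
	}
	return 0;
}
```

In this model, freeing the oldest tracked allocation visits every node in the
list, which is the per-free O(n) cost raised in point 1.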
> >>>>>>>> My thinking is that these early allocations comprise a small portion
> >>>>>>>> of overall memory consumed by the system. So, instead of trying to
> >>>>>>>> record and handle them in some alternative way, we just accept that
> >>>>>>>> some counters might not be exactly accurate and ignore those early
> >>>>>>>> allocations. See how the early slab allocations are marked with the
> >>>>>>>> CODETAG_FLAG_INACCURATE flag and later reported as inaccurate. I think
> >>>>>>>> that's an acceptable alternative to introducing extra complexity and
> >>>>>>>> performance overhead. IOW, the benefits of accounting for these early
> >>>>>>>> allocations are low compared to the effort required to account for
> >>>>>>>> them. Unless you found a simple and performant way to do that...
> >>>>>>> I have been exploring possible solutions to this issue over the past few
> >>>>>>> days, but so far I have not come up with a good approach.
> >>>>>>>
> >>>>>>> I have counted the number of memory allocations that occur earlier than
> >>>>>>> the allocation and initialization of our page_ext, and found that there
> >>>>>>> are actually quite a lot of them.
> >>>>>> Interesting... I wonder if it's because deferred_struct_pages defers
> >>>>>> page_ext initialization. Can you check if setting early_page_ext
> >>>>>> reduces or eliminates these allocations before page_ext init cases?
> >>>>> Yes, you are correct. In my 8-core 16GB virtual machine, I used a global
> >>>>> counter to record these allocations. With early_page_ext enabled, there
> >>>>> were 130 allocations before page_ext initialization. Without
> >>>>> early_page_ext, there were 802 allocations before page_ext initialization.
> >>>>>
> >>>>>
> >>>>>>> Similarly, I have made the following changes and collected the
> >>>>>>> corresponding logs.
> >>>>>>>
> >>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >>>>>>> index 2d4b6f1a554e..6db65b3d52d3 100644
> >>>>>>> --- a/mm/page_alloc.c
> >>>>>>> +++ b/mm/page_alloc.c
> >>>>>>> @@ -1293,6 +1293,8 @@ void __pgalloc_tag_add(struct page *page, struct task_struct *task,
> >>>>>>>                     alloc_tag_add(&ref, task->alloc_tag, PAGE_SIZE * nr);
> >>>>>>>                     update_page_tag_ref(handle, &ref);
> >>>>>>>                     put_page_tag_ref(handle);
> >>>>>>> +       } else {
> >>>>>>> +               pr_warn("__pgalloc_tag_add: get_page_tag_ref failed! page=%p pfn=%lu nr=%u\n", page, page_to_pfn(page), nr);
> >>>>>>>             }
> >>>>>>>      }
> >>>>>>>
> >>>>>>> @@ -1314,6 +1316,8 @@ void __pgalloc_tag_sub(struct page *page, unsigned int nr)
> >>>>>>>                     alloc_tag_sub(&ref, PAGE_SIZE * nr);
> >>>>>>>                     update_page_tag_ref(handle, &ref);
> >>>>>>>                     put_page_tag_ref(handle);
> >>>>>>> +       } else {
> >>>>>>> +               pr_warn("__pgalloc_tag_sub: get_page_tag_ref failed! page=%p pfn=%lu nr=%u\n", page, page_to_pfn(page), nr);
> >>>>>>>             }
> >>>>>>>      }
> >>>>>>>
> >>>>>>> [    0.261699] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001000 pfn=1048640 nr=2
> >>>>>>> [    0.261711] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001100 pfn=1048644 nr=4
> >>>>>>> [    0.261717] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001200 pfn=1048648 nr=4
> >>>>>>> [    0.261721] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001300 pfn=1048652 nr=4
> >>>>>>> [    0.261893] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001080 pfn=1048642 nr=2
> >>>>>>> [    0.261917] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001400 pfn=1048656 nr=4
> >>>>>>> [    0.262018] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001500 pfn=1048660 nr=2
> >>>>>>> [    0.262024] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001600 pfn=1048664 nr=8
> >>>>>>> [    0.262040] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001580 pfn=1048662 nr=1
> >>>>>>> [    0.262048] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea00040015c0 pfn=1048663 nr=1
> >>>>>>> [    0.262056] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001800 pfn=1048672 nr=2
> >>>>>>> [    0.262064] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001880 pfn=1048674 nr=2
> >>>>>>> [    0.262078] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001900 pfn=1048676 nr=2
> >>>>>>> [    0.262196] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=8, Nodes=1
> >>>>>>> [    0.262213] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001980 pfn=1048678 nr=2
> >>>>>>> [    0.262220] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001a00 pfn=1048680 nr=4
> >>>>>>> [    0.262246] ODEBUG: selftest passed
> >>>>>>> [    0.262268] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001b00 pfn=1048684 nr=1
> >>>>>>> [    0.262318] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001b40 pfn=1048685 nr=1
> >>>>>>> [    0.262368] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001b80 pfn=1048686 nr=1
> >>>>>>> [    0.262418] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001bc0 pfn=1048687 nr=1
> >>>>>>> [    0.262469] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001c00 pfn=1048688 nr=1
> >>>>>>> [    0.262519] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001c40 pfn=1048689 nr=1
> >>>>>>> [    0.262569] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001c80 pfn=1048690 nr=1
> >>>>>>> [    0.262620] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001cc0 pfn=1048691 nr=1
> >>>>>>> [    0.262670] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001d00 pfn=1048692 nr=1
> >>>>>>> [    0.262721] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001d40 pfn=1048693 nr=1
> >>>>>>> [    0.262771] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001d80 pfn=1048694 nr=1
> >>>>>>> [    0.262821] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001dc0 pfn=1048695 nr=1
> >>>>>>> [    0.262871] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001e00 pfn=1048696 nr=1
> >>>>>>> [    0.262923] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001e40 pfn=1048697 nr=1
> >>>>>>> [    0.262974] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001e80 pfn=1048698 nr=1
> >>>>>>> [    0.263024] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001ec0 pfn=1048699 nr=1
> >>>>>>> [    0.263074] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001f00 pfn=1048700 nr=1
> >>>>>>> [    0.263124] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001f40 pfn=1048701 nr=1
> >>>>>>> [    0.263174] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001f80 pfn=1048702 nr=1
> >>>>>>> [    0.263224] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004001fc0 pfn=1048703 nr=1
> >>>>>>> [    0.263275] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002000 pfn=1048704 nr=1
> >>>>>>> [    0.263325] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002040 pfn=1048705 nr=1
> >>>>>>> [    0.263375] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002080 pfn=1048706 nr=1
> >>>>>>> [    0.263427] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002400 pfn=1048720 nr=16
> >>>>>>> [    0.263437] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea00040020c0 pfn=1048707 nr=1
> >>>>>>> [    0.263463] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002100 pfn=1048708 nr=1
> >>>>>>> [    0.263465] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002140 pfn=1048709 nr=1
> >>>>>>> [    0.263467] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002180 pfn=1048710 nr=1
> >>>>>>> [    0.263509] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002200 pfn=1048712 nr=4
> >>>>>>> [    0.263512] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002800 pfn=1048736 nr=8
> >>>>>>> [    0.263524] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea00040021c0 pfn=1048711 nr=1
> >>>>>>> [    0.263536] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002300 pfn=1048716 nr=1
> >>>>>>> [    0.263537] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002340 pfn=1048717 nr=1
> >>>>>>> [    0.263539] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002380 pfn=1048718 nr=1
> >>>>>>> [    0.263604] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004004000 pfn=1048832 nr=128
> >>>>>>> [    0.263638] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004003000 pfn=1048768 nr=64
> >>>>>>> [    0.263650] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002c00 pfn=1048752 nr=16
> >>>>>>> [    0.263655] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea00040023c0 pfn=1048719 nr=1
> >>>>>>> [    0.270582] __pgalloc_tag_sub: get_page_tag_ref failed!
> >>>>>>> page=ffffea00040023c0 pfn=1048719 nr=1
> >>>>>>> [    0.270591] ftrace: allocating 52717 entries in 208 pages
> >>>>>>> [    0.270592] ftrace: allocated 208 pages with 3 groups
> >>>>>>> [    0.270620] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004002a00 pfn=1048744 nr=8
> >>>>>>> [    0.270636] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea00040023c0 pfn=1048719 nr=1
> >>>>>>> [    0.270643] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006000 pfn=1048960 nr=1
> >>>>>>> [    0.270649] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006040 pfn=1048961 nr=1
> >>>>>>> [    0.270658] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004007000 pfn=1049024 nr=64
> >>>>>>> [    0.270659] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006080 pfn=1048962 nr=2
> >>>>>>> [    0.270722] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006100 pfn=1048964 nr=1
> >>>>>>> [    0.270730] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006140 pfn=1048965 nr=1
> >>>>>>> [    0.270738] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006180 pfn=1048966 nr=1
> >>>>>>> [    0.270777] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea00040061c0 pfn=1048967 nr=1
> >>>>>>> [    0.270786] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006200 pfn=1048968 nr=1
> >>>>>>> [    0.270792] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006240 pfn=1048969 nr=1
> >>>>>>> [    0.270833] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006300 pfn=1048972 nr=4
> >>>>>>> [    0.270891] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006280 pfn=1048970 nr=1
> >>>>>>> [    0.270980] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea00040062c0 pfn=1048971 nr=1
> >>>>>>> [    0.271071] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006400 pfn=1048976 nr=1
> >>>>>>> [    0.271156] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006440 pfn=1048977 nr=1
> >>>>>>> [    0.271185] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006480 pfn=1048978 nr=2
> >>>>>>> [    0.271301] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006500 pfn=1048980 nr=1
> >>>>>>> [    0.271655] Dynamic Preempt: lazy
> >>>>>>> [    0.271662] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006580 pfn=1048982 nr=2
> >>>>>>> [    0.271752] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006600 pfn=1048984 nr=4
> >>>>>>> [    0.271762] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004010000 pfn=1049600 nr=4
> >>>>>>> [    0.271824] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006540 pfn=1048981 nr=1
> >>>>>>> [    0.271916] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006700 pfn=1048988 nr=2
> >>>>>>> [    0.271964] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006780 pfn=1048990 nr=1
> >>>>>>> [    0.272099] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea00040067c0 pfn=1048991 nr=1
> >>>>>>> [    0.272138] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006800 pfn=1048992 nr=2
> >>>>>>> [    0.272144] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006a00 pfn=1049000 nr=8
> >>>>>>> [    0.272249] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006c00 pfn=1049008 nr=8
> >>>>>>> [    0.272319] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006880 pfn=1048994 nr=2
> >>>>>>> [    0.272351] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006900 pfn=1048996 nr=4
> >>>>>>> [    0.272424] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004006e00 pfn=1049016 nr=8
> >>>>>>> [    0.272485] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008000 pfn=1049088 nr=8
> >>>>>>> [    0.272535] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008200 pfn=1049096 nr=2
> >>>>>>> [    0.272600] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008400 pfn=1049104 nr=8
> >>>>>>> [    0.272663] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008300 pfn=1049100 nr=4
> >>>>>>> [    0.272694] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008280 pfn=1049098 nr=2
> >>>>>>> [    0.272708] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008600 pfn=1049112 nr=8
> >>>>>>>
> >>>>>>> [    0.272924] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008880 pfn=1049122 nr=2
> >>>>>>> [    0.272934] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008900 pfn=1049124 nr=2
> >>>>>>> [    0.272952] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008c00 pfn=1049136 nr=4
> >>>>>>> [    0.273035] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008980 pfn=1049126 nr=2
> >>>>>>> [    0.273062] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008e00 pfn=1049144 nr=8
> >>>>>>> [    0.273674] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008d00 pfn=1049140 nr=1
> >>>>>>> [    0.273884] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008d80 pfn=1049142 nr=2
> >>>>>>> [    0.273943] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009000 pfn=1049152 nr=2
> >>>>>>> [    0.274379] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009080 pfn=1049154 nr=2
> >>>>>>> [    0.274575] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009200 pfn=1049160 nr=8
> >>>>>>> [    0.274617] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009100 pfn=1049156 nr=4
> >>>>>>> [    0.274794] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009400 pfn=1049168 nr=2
> >>>>>>> [    0.274840] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009480 pfn=1049170 nr=2
> >>>>>>> [    0.275057] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009500 pfn=1049172 nr=2
> >>>>>>> [    0.275092] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009580 pfn=1049174 nr=2
> >>>>>>> [    0.275134] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009600 pfn=1049176 nr=8
> >>>>>>> [    0.275211] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009800 pfn=1049184 nr=4
> >>>>>>> [    0.275510] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009900 pfn=1049188 nr=2
> >>>>>>> [    0.275548] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009980 pfn=1049190 nr=2
> >>>>>>> [    0.275976] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009a00 pfn=1049192 nr=8
> >>>>>>> [    0.275987] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009c00 pfn=1049200 nr=2
> >>>>>>> [    0.276139] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009c80 pfn=1049202 nr=2
> >>>>>>> [    0.276152] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004008d40 pfn=1049141 nr=1
> >>>>>>> [    0.276242] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009d00 pfn=1049204 nr=1
> >>>>>>> [    0.276358] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009d40 pfn=1049205 nr=1
> >>>>>>> [    0.276444] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009d80 pfn=1049206 nr=1
> >>>>>>> [    0.276526] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009dc0 pfn=1049207 nr=1
> >>>>>>> [    0.276615] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009e00 pfn=1049208 nr=1
> >>>>>>> [    0.276696] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009e40 pfn=1049209 nr=1
> >>>>>>> [    0.276792] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009e80 pfn=1049210 nr=1
> >>>>>>> [    0.276827] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009f00 pfn=1049212 nr=2
> >>>>>>> [    0.276891] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009ec0 pfn=1049211 nr=1
> >>>>>>> [    0.276999] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009f80 pfn=1049214 nr=1
> >>>>>>> [    0.277082] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea0004009fc0 pfn=1049215 nr=1
> >>>>>>> [    0.277172] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea000400a000 pfn=1049216 nr=1
> >>>>>>> [    0.277257] __pgalloc_tag_add: get_page_tag_ref failed!
> >>>>>>> page=ffffea000400a040 pfn=1049217 nr=1
> >>>>>>>
> >>>>>>> and so on.
> >>>>>>>
> >>>>>>>
> >>>>>>>> I think your earlier patch can effectively detect these early
> >>>>>>>> allocations and suppress the warnings. We should also mark these
> >>>>>>>> allocations with CODETAG_FLAG_INACCURATE.
> >>>>>>> Thanks to an excellent AI review, I realized there are issues with
> >>>>>>> my original patch. One problem is the 256-element array; another
> >>>>>> Yes, if there are lots of such allocations, it's not appropriate.
> >>>>>>
> >>>>>>> is that it involves allocation and free operations, meaning we need
> >>>>>>> to record entries at __pgalloc_tag_add and remove them at
> >>>>>>> __pgalloc_tag_sub, which introduces a noticeable overhead. I'm
> >>>>>>> wondering if we can instead set a flag bit in page flags during the
> >>>>>>> early boot stage, which I'll refer to as EARLY_ALLOC_FLAGS.
> >>>>>>>
> >>>>>>> Then, in __pgalloc_tag_sub, we first check for EARLY_ALLOC_FLAGS. If
> >>>>>>> set, we clear the flag and return immediately; otherwise, we perform
> >>>>>>> the actual subtraction of the tag count.
> >>>>>>>
> >>>>>>> This approach seems somewhat similar to the idea behind
> >>>>>>> mem_profiling_compressed.
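
For readers following the thread, the flag-based free path proposed above
can be modeled in userspace roughly as follows. This is only a sketch:
DEMO_EARLY_ALLOC, struct demo_page, demo_tag_bytes and demo_tag_sub() are
hypothetical stand-ins, not kernel symbols, and the real page flags and
per-tag counters work differently.

```c
#include <stdbool.h>

#define DEMO_EARLY_ALLOC (1u << 0)	/* hypothetical flag bit */

struct demo_page {
	unsigned int flags;		/* stands in for page flags */
};

static long demo_tag_bytes;		/* stands in for a per-tag byte counter */

/*
 * Model of the proposed free-side check. Returns true if the counter was
 * adjusted, false if the page was an untracked early allocation.
 */
static bool demo_tag_sub(struct demo_page *page, long bytes)
{
	if (page->flags & DEMO_EARLY_ALLOC) {
		/* Early allocation: nothing was accounted, just drop the mark. */
		page->flags &= ~DEMO_EARLY_ALLOC;
		return false;
	}

	demo_tag_bytes -= bytes;	/* normal path: undo the accounting */
	return true;
}
```

Note that the flag test sits on every call to demo_tag_sub(), including
calls made long after boot.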
> >>>>>> That seems doable but let's first check if we can make page_ext
> >>>>>> initialization happen before these allocations. That would be the
> >>>>>> ideal path. If it's not possible then we can focus on alternatives
> >>>>>> like the one you propose.
> >>>>> Yes, the ideal scenario would be to have page_ext initialization
> >>>>> complete before these allocations occur. I just did a code walkthrough
> >>>>> and found that this resembles the FLATMEM implementation approach -
> >>>>> FLATMEM allocates page_ext before the buddy system initialization, so
> >>>>> it doesn't seem to encounter the issue we're facing now.
> >>>>>
> >>>>> https://elixir.bootlin.com/linux/v7.0-rc5/source/mm/mm_init.c#L2707
> >>>> Yes, page_ext_init_flatmem() looks like an interesting option but it
> >>>> would not work with sparsemem. TBH I would prefer to find a simple
> >>>> solution that can identify early init allocations, mark them inaccurate
> >>>> and suppress the warning rather than introduce some complex mechanism
> >>>> to account for them which would work only in some cases (flatmem).
> >>>> With your original approach I think the only real issue is the size of
> >>>> the array that might be too small. The other issue you mentioned about
> >>>> an allocated page being freed and then re-allocated after page_ext is
> >>>> initialized but before clear_page_tag_ref() is called is not really a
> >>>> problem. Yes, we will lose that counter's value but it's similar to
> >>>> other early allocations which we just treat as inaccurate. We can also
> >>>> minimize the possibility of this happening by moving
> >>>> clear_page_tag_ref() into init_page_alloc_tagging().
> >>>>
> >>>> I don't like the pageflag option you mentioned because it adds an
> >>>> extra condition check into __pgalloc_tag_sub() which will be executed
> >>>> even after the init stage is over.
> >>>> I'll look into this some more tomorrow as it's quite late now.
> >>
> >> Hi Suren
> >>
> >>
> >>> Just thought of something. Are all these pages allocated by slab? If
> >>> so, I think slab does not use page->lru (need to double-check) and we
> >>> could add all these pages allocated during early init into a list and
> >>> then set their page_ext reference to CODETAG_EMPTY in
> >>> init_page_alloc_tagging().
> >> Got your point.
> >>
> >>
> >> There will indeed be some non-SLAB memory allocations here, such as the
> >> following:
> >>
> >>
> >> CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted
> >> 7.0.0-rc4-00001-g6392c3a6119e-dirty #31 PREEMPT(lazy)
> >> [    0.326607] Hardware name: Red Hat KVM, BIOS
> >> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> >> [    0.326608] Call Trace:
> >> [    0.326608]  <TASK>
> >> [    0.326609]  dump_stack_lvl+0x53/0x70
> >> [    0.326611]  __pgalloc_tag_add+0x407/0x700
> >> [    0.326616]  get_page_from_freelist+0xa54/0x1310
> >> [    0.326618]  __alloc_frozen_pages_noprof+0x206/0x4c0
> >> [    0.326623]  alloc_pages_mpol+0x13a/0x3f0
> >> [    0.326627]  alloc_pages_noprof+0xf6/0x2b0
> >> [    0.326628]  __pmd_alloc+0x743/0x9c0
> >> [    0.326630]  vmap_range_noflush+0xac0/0x10a0
> >> [    0.326637]  ioremap_page_range+0x17c/0x250
> >> [    0.326639]  __ioremap_caller+0x437/0x5c0
> >> [    0.326645]  acpi_os_map_iomem+0x4c0/0x660
> >> [    0.326647]  acpi_tb_verify_temp_table+0x1c0/0x580
> >> [    0.326649]  acpi_reallocate_root_table+0x2ad/0x460
> >> [    0.326655]  acpi_early_init+0x111/0x460
> >> [    0.326657]  start_kernel+0x271/0x3c0
> >> [    0.326659]  x86_64_start_reservations+0x18/0x30
> >> [    0.326660]  x86_64_start_kernel+0xe2/0xf0
> >> [    0.326662]  common_startup_64+0x13e/0x141
> >> [    0.326663]  </TASK>
> >>
> >> CPU: 0 UID: 0 PID: 2 Comm: kthreadd Not tainted
> >> 7.0.0-rc4-00001-g6392c3a6119e-dirty #31 PREEMPT(lazy)
> >> [    0.329167] Hardware name: Red Hat KVM, BIOS
> >> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> >> [    0.329167] Call Trace:
> >> [    0.329167]  <TASK>
> >> [    0.329167]  dump_stack_lvl+0x53/0x70
> >> [    0.329167]  __pgalloc_tag_add+0x407/0x700
> >> [    0.329167]  get_page_from_freelist+0xa54/0x1310
> >> [    0.329167]  __alloc_frozen_pages_noprof+0x206/0x4c0
> >> [    0.329167]  __alloc_pages_noprof+0x10/0x1b0
> >> [    0.329167]  dup_task_struct+0x163/0x8c0
> >> [    0.329167]  copy_process+0x390/0x4a70
> >> [    0.329167]  kernel_clone+0xe1/0x830
> >> [    0.329167]  kernel_thread+0xcb/0x110
> >> [    0.329167]  kthreadd+0x8a2/0xc60
> >> [    0.329167]  ret_from_fork+0x551/0x720
> >> [    0.329167]  ret_from_fork_asm+0x1a/0x30
> >> [    0.329167]  </TASK>
> >>
> >> CPU: 0 UID: 0 PID: 2 Comm: kthreadd Not tainted
> >> 7.0.0-rc4-00001-g6392c3a6119e-dirty #31 PREEMPT(lazy)
> >> [    0.329167] Hardware name: Red Hat KVM, BIOS
> >> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> >> [    0.329167] Call Trace:
> >> [    0.329167]  <TASK>
> >> [    0.329167]  dump_stack_lvl+0x53/0x70
> >> [    0.329167]  __pgalloc_tag_add+0x407/0x700
> >> [    0.329167]  get_page_from_freelist+0xa54/0x1310
> >> [    0.329167]  __alloc_frozen_pages_noprof+0x206/0x4c0
> >> [    0.329167]  __alloc_pages_noprof+0x10/0x1b0
> >> [    0.329167]  dup_task_struct+0x163/0x8c0
> >> [    0.329167]  copy_process+0x390/0x4a70
> >> [    0.329167]  kernel_clone+0xe1/0x830
> >> [    0.329167]  kernel_thread+0xcb/0x110
> >> [    0.329167]  kthreadd+0x8a2/0xc60
> >> [    0.329167]  ret_from_fork+0x551/0x720
> >> [    0.329167]  ret_from_fork_asm+0x1a/0x30
> >> [    0.329167]  </TASK>
> >>
> >> CPU: 4 UID: 0 PID: 1 Comm: swapper/0 Not tainted
> >> 7.0.0-rc4-00001-g6392c3a6119e-dirty #31 PREEMPT(lazy)
> >> [    0.434265] Hardware name: Red Hat KVM, BIOS
> >> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> >> [    0.434266] Call Trace:
> >> [    0.434266]  <TASK>
> >> [    0.434266]  dump_stack_lvl+0x53/0x70
> >> [    0.434268]  __pgalloc_tag_add+0x407/0x700
> >> [    0.434272]  get_page_from_freelist+0xa54/0x1310
> >> [    0.434274]  __alloc_frozen_pages_noprof+0x206/0x4c0
> >> [    0.434279]  alloc_pages_exact_nid_noprof+0x10f/0x380
> >> [    0.434283]  init_section_page_ext+0x167/0x370
> >> [    0.434284]  page_ext_init+0x451/0x620
> >> [    0.434287]  page_alloc_init_late+0x553/0x630
> >> [    0.434290]  kernel_init_freeable+0x7be/0xd30
> >> [    0.434294]  kernel_init+0x1f/0x1f0
> >> [    0.434295]  ret_from_fork+0x551/0x720
> >> [    0.434301]  ret_from_fork_asm+0x1a/0x30
> >> [    0.434303]  </TASK>
> >>
> >> CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted
> >> 7.0.0-rc4-00001-g6392c3a6119e-dirty #31 PREEMPT(lazy)
> >> [    0.346712] Hardware name: Red Hat KVM, BIOS
> >> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> >> [    0.346713] Call Trace:
> >> [    0.346713]  <TASK>
> >> [    0.346714]  dump_stack_lvl+0x53/0x70
> >> [    0.346715]  __pgalloc_tag_add+0x407/0x700
> >> [    0.346720]  get_page_from_freelist+0xa54/0x1310
> >> [    0.346723]  __alloc_frozen_pages_noprof+0x206/0x4c0
> >> [    0.346729]  __alloc_pages_noprof+0x10/0x1b0
> >> [    0.346731]  alloc_cpu_data+0x96/0x210
> >> [    0.346732]  rb_allocate_cpu_buffer+0xb93/0x1500
> >> [    0.346739]  trace_rb_cpu_prepare+0x21a/0x4f0
> >> [    0.346753]  cpuhp_invoke_callback+0x6db/0x14b0
> >> [    0.346755]  __cpuhp_invoke_callback_range+0xde/0x1d0
> >> [    0.346759]  _cpu_up+0x395/0x880
> >> [    0.346761]  cpu_up+0x1bb/0x210
> >> [    0.346762]  cpuhp_bringup_mask+0xd2/0x150
> >> [    0.346763]  bringup_nonboot_cpus+0x12b/0x170
> >> [    0.346764]  smp_init+0x2f/0x100
> >> [    0.346766]  kernel_init_freeable+0x7a5/0xd30
> >> [    0.346769]  kernel_init+0x1f/0x1f0
> >> [    0.346771]  ret_from_fork+0x551/0x720
> >> [    0.346776]  ret_from_fork_asm+0x1a/0x30
> >> [    0.346778]  </TASK>
> >>
> >> and so on...
> >>
> >>
> >> In fact, I previously conducted extensive and prolonged stress testing
> >> on memory profiling. After our efforts to address several WARN cases,
> >> one remaining scenario we are addressing is the warning triggered during
> >> early slab cache reclaim, which is precisely the situation we are
> >> currently encountering (although I cannot guarantee that all edge cases
> >> have been covered by our stress testing). During the stress testing
> >> process, this warning did indeed manifest. However, the current
> >> environment triggers KASAN slab cache reclaim earlier than anticipated.
> >>
> >> Although the memory allocated prior to page_ext initialization has a
> >> relatively low probability of being released in subsequent operations
> >> (at least we have not encountered such cases up to now), I remain
> >> uncertain whether there are any overlooked edge cases when considering
> >> only slab-backed pages.
>
> Hi Suren
>
>
> > Ok, I guess a specialized solution for slab would not work then. I want
> > to check on my side and understand how the number of these early
> > allocations scales: is it higher for bigger machines or does it stay
> > constant? If the latter, I think your original simple solution with some
> > fixups can still work. I'll need to instrument my code to capture these
> > early allocations and see where they originate. If you have a patch
> > already doing that it would help speed it up for me.
> > Thanks,
> > Suren.
>
> OK, my V2 patch is as follows:

Thanks! I'll go over it but first I need to check if the number of
early allocations is constant or dependent on some factors like
machine size (as I mentioned before). I hope to carve out some time to
investigate that this Friday.
We should also probably start a separate thread for this v2 as this
email thread is getting painfully long.

>
>
> diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
> index d40ac39bfbe8..bf226c2be2ad 100644
> --- a/include/linux/alloc_tag.h
> +++ b/include/linux/alloc_tag.h
> @@ -74,6 +74,8 @@ static inline void set_codetag_empty(union codetag_ref *ref)
>
>   #ifdef CONFIG_MEM_ALLOC_PROFILING
>
> +void alloc_tag_add_early_pfn(unsigned long pfn);
> +
>   #define ALLOC_TAG_SECTION_NAME    "alloc_tags"
>
>   struct codetag_bytes {
> diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h
> index 38a82d65e58e..951d33362268 100644
> --- a/include/linux/pgalloc_tag.h
> +++ b/include/linux/pgalloc_tag.h
> @@ -181,7 +181,7 @@ static inline struct alloc_tag *__pgalloc_tag_get(struct page *page)
>
>       if (get_page_tag_ref(page, &ref, &handle)) {
>           alloc_tag_sub_check(&ref);
> -        if (ref.ct)
> +        if (ref.ct && !is_codetag_empty(&ref))
>               tag = ct_to_alloc_tag(ref.ct);
>           put_page_tag_ref(handle);
>       }
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index 58991ab09d84..55c134a71cd0 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -6,6 +6,7 @@
>   #include <linux/kallsyms.h>
>   #include <linux/module.h>
>   #include <linux/page_ext.h>
> +#include <linux/pgalloc_tag.h>
>   #include <linux/proc_fs.h>
>   #include <linux/seq_buf.h>
>   #include <linux/seq_file.h>
> @@ -26,6 +27,85 @@ static bool mem_profiling_support;
>
>   static struct codetag_type *alloc_tag_cttype;
>
> +#ifdef CONFIG_MEM_ALLOC_PROFILING_DEBUG
> +
> +/*
> + * page_ext is allocated and initialized relatively late during boot.
> + * Some pages are allocated before page_ext becomes available.
> + * Track these early PFNs and clear their codetag refs later to avoid
> + * warnings when they are freed.
> + */
> +
> +#define EARLY_ALLOC_PFN_MAX        256
> +
> +static unsigned long early_pfns[EARLY_ALLOC_PFN_MAX] __initdata;
> +static atomic_t early_pfn_count __initdata = ATOMIC_INIT(0);
> +
> +static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
> +{
> +    int old_idx, new_idx;
> +
> +    do {
> +        old_idx = atomic_read(&early_pfn_count);
> +        if (old_idx >= EARLY_ALLOC_PFN_MAX)
> +            return;
> +        new_idx = old_idx + 1;
> +    } while (!atomic_try_cmpxchg(&early_pfn_count, &old_idx, new_idx));
> +
> +    early_pfns[old_idx] = pfn;
> +}
> +
> +static void (*alloc_tag_add_early_pfn_ptr)(unsigned long pfn) __refdata =
> +        __alloc_tag_add_early_pfn;
> +
> +void alloc_tag_add_early_pfn(unsigned long pfn)
> +{
> +    if (static_key_enabled(&mem_profiling_compressed))
> +        return;
> +
> +    if (alloc_tag_add_early_pfn_ptr)
> +        alloc_tag_add_early_pfn_ptr(pfn);
> +}
> +
> +static void __init clear_early_alloc_pfn_tag_refs(void)
> +{
> +    unsigned int i;
> +
> +    for (i = 0; i < atomic_read(&early_pfn_count); i++) {
> +        unsigned long pfn = early_pfns[i];
> +
> +        if (pfn_valid(pfn)) {
> +            struct page *page = pfn_to_page(pfn);
> +            union pgtag_ref_handle handle;
> +            union codetag_ref ref;
> +
> +            if (get_page_tag_ref(page, &ref, &handle)) {
> +                /*
> +                 * An early-allocated page could be freed and reallocated
> +                 * after its page_ext is initialized but before we clear it.
> +                 * In that case, it already has a valid tag set.
> +                 * We should not overwrite that valid tag with CODETAG_EMPTY.
> +                 */
> +                if (ref.ct) {
> +                    put_page_tag_ref(handle);
> +                    continue;
> +                }
> +
> +                set_codetag_empty(&ref);
> +                update_page_tag_ref(handle, &ref);
> +                put_page_tag_ref(handle);
> +            }
> +    }
> +
> +    atomic_set(&early_pfn_count, 0);
> +
> +    alloc_tag_add_early_pfn_ptr = NULL;
> +}
> +#else /* !CONFIG_MEM_ALLOC_PROFILING_DEBUG */
> +inline void alloc_tag_add_early_pfn(unsigned long pfn) {}
> +static inline void __init clear_early_alloc_pfn_tag_refs(void) {}
> +#endif
> +
>   #ifdef CONFIG_ARCH_MODULE_NEEDS_WEAK_PER_CPU
>   DEFINE_PER_CPU(struct alloc_tag_counters, _shared_alloc_tag);
>   EXPORT_SYMBOL(_shared_alloc_tag);
> @@ -760,6 +840,7 @@ static __init bool need_page_alloc_tagging(void)
>
>   static __init void init_page_alloc_tagging(void)
>   {
> +    clear_early_alloc_pfn_tag_refs();
>   }
>
>   struct page_ext_operations page_alloc_tagging_ops = {
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2d4b6f1a554e..5ce5c4ba401f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1293,6 +1293,12 @@ void __pgalloc_tag_add(struct page *page, struct task_struct *task,
>           alloc_tag_add(&ref, task->alloc_tag, PAGE_SIZE * nr);
>           update_page_tag_ref(handle, &ref);
>           put_page_tag_ref(handle);
> +    } else {
> +        /*
> +         * page_ext is not available yet, record the pfn so we can
> +         * clear the tag ref later when page_ext is initialized.
> +         */
> +        alloc_tag_add_early_pfn(page_to_pfn(page));
>       }
>   }
>
> Although this 256-entry array remains unmodified for now, I will locally
> record the occurrence counts
>
> of these various early memory allocations. Hopefully this will be
> helpful to you.
>
>
> Thanks
>
> Hao
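
The slot reservation done by __alloc_tag_add_early_pfn() in the patch
above can be modeled in userspace with C11 atomics. This is a minimal
sketch, not the kernel code: EARLY_SLOT_MAX, early_slots, early_slot_count
and early_slot_add() are hypothetical names, and
atomic_compare_exchange_weak() stands in for atomic_try_cmpxchg().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define EARLY_SLOT_MAX 4	/* hypothetical small capacity for the demo */

static unsigned long early_slots[EARLY_SLOT_MAX];
static atomic_int early_slot_count;	/* zero-initialized */

/* Reserve one slot with a CAS loop, then fill it. Returns false when full. */
static bool early_slot_add(unsigned long pfn)
{
	int old_idx = atomic_load(&early_slot_count);

	do {
		if (old_idx >= EARLY_SLOT_MAX)
			return false;	/* array full: the entry is dropped */
		/* On CAS failure old_idx is refreshed with the current count. */
	} while (!atomic_compare_exchange_weak(&early_slot_count, &old_idx,
					       old_idx + 1));

	early_slots[old_idx] = pfn;	/* index old_idx is reserved for us */
	return true;
}
```

Two properties of this pattern are worth keeping in mind: entries past
the capacity are silently dropped, and the count is published before the
slot is written, so a reader running concurrently with a writer could in
principle see a reserved but not yet filled slot.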
>
> >
> >>
> >> Thanks
> >> Hao
> >>
> >>>> Thanks,
> >>>> Suren.
> >>>>
> >>>>> However, I'm not entirely certain whether SPARSEMEM can guarantee the
> >>>>> same behavior.
> >>>>>
> >>>>>
> >>>>>>> I would appreciate your valuable feedback and any better suggestions you
> >>>>>>> might have.
> >>>>>> Thanks for pursuing this! I'll help in any way I can.
> >>>>>> Suren.
> >>>>> Thank you so much for your patient guidance and assistance.
> >>>>>
> >>>>> I truly appreciate your willingness to share your knowledge and insights.
> >>>>>
> >>>>> Thanks,
> >>>>> Hao
> >>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>> Hao
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Suren.
> >>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>>
> >>>>>>>>> Hao
> >>>>>>>>>
> >>>>>>>>>>> Thanks.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>>> If the slab cache has no free objects, it falls back
> >>>>>>>>>>>>>>> to the buddy allocator to allocate memory. However, at this point page_ext
> >>>>>>>>>>>>>>> is not yet fully initialized, so these newly allocated pages have no
> >>>>>>>>>>>>>>> codetag set. These pages may later be reclaimed by KASAN, which causes
> >>>>>>>>>>>>>>> the warning to trigger when they are freed because their codetag ref is
> >>>>>>>>>>>>>>> still empty.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Use a global array to track pages allocated before page_ext is fully
> >>>>>>>>>>>>>>> initialized, similar to how kmemleak tracks early allocations.
> >>>>>>>>>>>>>>> When page_ext initialization completes, set their codetag
> >>>>>>>>>>>>>>> to empty to avoid warnings when they are freed later.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ...
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --- a/include/linux/alloc_tag.h
> >>>>>>>>>>>>>>> +++ b/include/linux/alloc_tag.h
> >>>>>>>>>>>>>>> @@ -74,6 +74,9 @@ static inline void set_codetag_empty(union codetag_ref *ref)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>        #ifdef CONFIG_MEM_ALLOC_PROFILING
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> +bool mem_profiling_is_available(void);
> >>>>>>>>>>>>>>> +void alloc_tag_add_early_pfn(unsigned long pfn);
> >>>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>>        #define ALLOC_TAG_SECTION_NAME       "alloc_tags"
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>        struct codetag_bytes {
> >>>>>>>>>>>>>>> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> >>>>>>>>>>>>>>> index 58991ab09d84..a5bf4e72c154 100644
> >>>>>>>>>>>>>>> --- a/lib/alloc_tag.c
> >>>>>>>>>>>>>>> +++ b/lib/alloc_tag.c
> >>>>>>>>>>>>>>> @@ -6,6 +6,7 @@
> >>>>>>>>>>>>>>>        #include <linux/kallsyms.h>
> >>>>>>>>>>>>>>>        #include <linux/module.h>
> >>>>>>>>>>>>>>>        #include <linux/page_ext.h>
> >>>>>>>>>>>>>>> +#include <linux/pgalloc_tag.h>
> >>>>>>>>>>>>>>>        #include <linux/proc_fs.h>
> >>>>>>>>>>>>>>>        #include <linux/seq_buf.h>
> >>>>>>>>>>>>>>>        #include <linux/seq_file.h>
> >>>>>>>>>>>>>>> @@ -26,6 +27,82 @@ static bool mem_profiling_support;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>        static struct codetag_type *alloc_tag_cttype;
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> +/*
> >>>>>>>>>>>>>>> + * State of the alloc_tag
> >>>>>>>>>>>>>>> + *
> >>>>>>>>>>>>>>> + * This is used to describe the states of the alloc_tag during bootup.
> >>>>>>>>>>>>>>> + *
> >>>>>>>>>>>>>>> + * When we need to allocate page_ext to store codetag, we face an
> >>>>>>>>>>>>>>> + * initialization timing problem:
> >>>>>>>>>>>>>>> + *
> >>>>>>>>>>>>>>> + * Due to initialization order, pages may be allocated via buddy system
> >>>>>>>>>>>>>>> + * before page_ext is fully allocated and initialized. Although these
> >>>>>>>>>>>>>>> + * pages call the allocation hooks, the codetag will not be set because
> >>>>>>>>>>>>>>> + * page_ext is not yet available.
> >>>>>>>>>>>>>>> + *
> >>>>>>>>>>>>>>> + * When these pages are later freed to the buddy system, it triggers
> >>>>>>>>>>>>>>> + * warnings because their codetag is actually empty if
> >>>>>>>>>>>>>>> + * CONFIG_MEM_ALLOC_PROFILING_DEBUG is enabled.
> >>>>>>>>>>>>>>> + *
> >>>>>>>>>>>>>>> + * Additionally, in this situation, we cannot record detailed allocation
> >>>>>>>>>>>>>>> + * information for these pages.
> >>>>>>>>>>>>>>> + */
> >>>>>>>>>>>>>>> +enum mem_profiling_state {
> >>>>>>>>>>>>>>> +     DOWN,                   /* No mem_profiling functionality yet */
> >>>>>>>>>>>>>>> +     UP                      /* Everything is working */
> >>>>>>>>>>>>>>> +};
> >>>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>> +static enum mem_profiling_state mem_profiling_state = DOWN;
> >>>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>> +bool mem_profiling_is_available(void)
> >>>>>>>>>>>>>>> +{
> >>>>>>>>>>>>>>> +     return mem_profiling_state == UP;
> >>>>>>>>>>>>>>> +}
> >>>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>> +#ifdef CONFIG_MEM_ALLOC_PROFILING_DEBUG
> >>>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>> +#define EARLY_ALLOC_PFN_MAX          256
> >>>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>> +static unsigned long early_pfns[EARLY_ALLOC_PFN_MAX];
> >>>>>>>>>>>>>> It's unfortunate that this isn't __initdata.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> +static unsigned int early_pfn_count;
> >>>>>>>>>>>>>>> +static DEFINE_SPINLOCK(early_pfn_lock);
> >>>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ...
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --- a/mm/page_alloc.c
> >>>>>>>>>>>>>>> +++ b/mm/page_alloc.c
> >>>>>>>>>>>>>>> @@ -1293,6 +1293,13 @@ void __pgalloc_tag_add(struct page *page, struct task_struct *task,
> >>>>>>>>>>>>>>>                     alloc_tag_add(&ref, task->alloc_tag, PAGE_SIZE * nr);
> >>>>>>>>>>>>>>>                     update_page_tag_ref(handle, &ref);
> >>>>>>>>>>>>>>>                     put_page_tag_ref(handle);
> >>>>>>>>>>>>>>> +     } else {
> >>>>>>>>>>>>> This branch can be marked as "unlikely".
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>> +             /*
> >>>>>>>>>>>>>>> +              * page_ext is not available yet, record the pfn so we can
> >>>>>>>>>>>>>>> +              * clear the tag ref later when page_ext is initialized.
> >>>>>>>>>>>>>>> +              */
> >>>>>>>>>>>>>>> +             if (!mem_profiling_is_available())
> >>>>>>>>>>>>>>> +                     alloc_tag_add_early_pfn(page_to_pfn(page));
> >>>>>>>>>>>>>>>             }
> >>>>>>>>>>>>>>>        }
> >>>>>>>>>>>>>> All because of this, I believe.  Is this fixable?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If we take that `else', we know we're running in __init code, yes?  I
> >>>>>>>>>>>>>> don't see how `__init pgalloc_tag_add_early()' could be made to work.
> >>>>>>>>>>>>>> hrm.  Something clever, please.
> >>>>>>>>>>>>> We can have a pointer to a function that is initialized to point to
> >>>>>>>>>>>>> alloc_tag_add_early_pfn, which is defined as __init and uses
> >>>>>>>>>>>>> early_pfns which now can be defined as __initdata. After
> >>>>>>>>>>>>> clear_early_alloc_pfn_tag_refs() is done we reset that pointer to
> >>>>>>>>>>>>> NULL. __pgalloc_tag_add() instead of calling alloc_tag_add_early_pfn()
> >>>>>>>>>>>>> directly checks that pointer and if it's not NULL then calls the
> >>>>>>>>>>>>> function that it points to. This way __pgalloc_tag_add() which is not
> >>>>>>>>>>>>> an __init function will be invoking alloc_tag_add_early_pfn() __init
> >>>>>>>>>>>>> function only until we are done with initialization. I haven't tried
> >>>>>>>>>>>>> this but I think that should work. This also eliminates the need for
> >>>>>>>>>>>>> mem_profiling_state variable since we can use this function pointer
> >>>>>>>>>>>>> instead.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
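
The function-pointer idea quoted above can be illustrated in plain C. The
sketch below is a userspace model under stated assumptions: record_early(),
record_early_ptr, alloc_hook() and finish_init() are hypothetical names,
and the __init/__initdata/__refdata section annotations from the actual
patch have no equivalent here.

```c
#include <stddef.h>

static unsigned long recorded[8];	/* stands in for the __initdata array */
static size_t nr_recorded;

/* Stands in for the __init recorder; only valid during "boot". */
static void record_early(unsigned long pfn)
{
	if (nr_recorded < 8)
		recorded[nr_recorded++] = pfn;
}

/* Non-NULL only while the init-time recorder may still be called. */
static void (*record_early_ptr)(unsigned long pfn) = record_early;

/* Called from the non-init allocation path. */
static void alloc_hook(unsigned long pfn)
{
	void (*fn)(unsigned long) = record_early_ptr;

	if (fn)
		fn(pfn);
}

/* Called once init is done; after this the recorder is never invoked. */
static void finish_init(void)
{
	record_early_ptr = NULL;
}
```

After finish_init() the hook reduces to a single NULL check, and nothing
refers to the recorder or its array any more, which is what allows both to
live in init sections in the kernel version.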



Thread overview: 21+ messages
2026-03-19  8:31 [PATCH] mm/alloc_tag: clear codetag for pages allocated before page_ext initialization Hao Ge
2026-03-19 22:28 ` Andrew Morton
2026-03-19 23:44   ` Suren Baghdasaryan
2026-03-19 23:48     ` Suren Baghdasaryan
2026-03-20  1:57       ` Hao Ge
2026-03-20  2:14         ` Suren Baghdasaryan
2026-03-23  9:15           ` Hao Ge
2026-03-23 22:47             ` Suren Baghdasaryan
2026-03-24  9:43               ` Hao Ge
2026-03-25  0:21                 ` Suren Baghdasaryan
2026-03-25  2:07                   ` Hao Ge
2026-03-25  6:25                     ` Suren Baghdasaryan
2026-03-25  7:35                       ` Suren Baghdasaryan
2026-03-25 11:20                         ` Hao Ge
2026-03-25 15:17                           ` Suren Baghdasaryan
2026-03-26  1:44                             ` Hao Ge
2026-03-26  5:04                               ` Suren Baghdasaryan [this message]
2026-03-26  5:33                                 ` Hao Ge
2026-03-26  8:23                                   ` Suren Baghdasaryan
2026-03-20  3:14 ` Andrew Morton
2026-03-20  4:18   ` Suren Baghdasaryan
