From: "David Wang" <00107082@163.com>
To: "kernel test robot" <oliver.sang@intel.com>,
"Suren Baghdasaryan" <surenb@google.com>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
linux-kernel@vger.kernel.org,
"Andrew Morton" <akpm@linux-foundation.org>,
"Shakeel Butt" <shakeel.butt@linux.dev>,
"Kent Overstreet" <kent.overstreet@linux.dev>,
"Minchan Kim" <minchan@google.com>,
"Pasha Tatashin" <pasha.tatashin@soleen.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Sourav Panda" <souravpanda@google.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Vlastimil Babka" <vbabka@suse.cz>, "Yu Zhao" <yuzhao@google.com>,
"Zhenhua Huang" <quic_zhenhuah@quicinc.com>,
linux-mm@kvack.org
Subject: Re:[linus:master] [alloc_tag] 93d5440ece: WARNING:at_include/linux/alloc_tag.h:#__pgalloc_tag_sub
Date: Sat, 3 May 2025 19:51:05 +0800 (CST)
Message-ID: <bcb2407.16c1.19695fc582d.Coremail.00107082@163.com>
In-Reply-To: <202504151659.9b09c785-lkp@intel.com>
Hi,
I have been running my system with CONFIG_MEM_ALLOC_PROFILING_DEBUG=y for a long while, trying to
reproduce this. I have not yet hit the exact call trace, but I got the same warning via a "simpler"
trace:
[Fri May 2 15:07:00 2025] alloc_tag was not set
[Fri May 2 15:07:00 2025] WARNING: CPU: 0 PID: 677 at
./include/linux/alloc_tag.h:156
./include/linux/alloc_tag.h:154
./include/linux/pgalloc_tag.h:182
mm/page_alloc.c:1163
___free_pages mm/page_alloc.c:5072 <---- code[1] below
drivers/net/wireless/intel/iwlwifi/pcie/rx.c:1417 <---code[2] below
iwl_pcie_rx_handle drivers/net/wireless/intel/iwlwifi/pcie/rx.c:1568 iwlwifi
<other traces are irrelevant...I think>
code[1]:
5063 static void ___free_pages(struct page *page, unsigned int order,
5064 fpi_t fpi_flags)
5065 {
5066 /* get PageHead before we drop reference */
5067 int head = PageHead(page);
5068
5069 if (put_page_testzero(page))
5070 __free_frozen_pages(page, order, fpi_flags);
5071 else if (!head) {
5072 pgalloc_tag_sub_pages(page, (1 << order) - 1); <-----at this point, page[0] may already be returned.
5073 while (order-- > 0)
5074 __free_frozen_pages(page + (1 << order), order,
5075 fpi_flags);
5076 }
5077 }
code[2]:
1415 /* page was stolen from us -- free our reference */
1416 if (page_stolen) {
1417 __free_pages(rxb->page, trans_pcie->rx_page_order);
1418 rxb->page = NULL;
1419 }
From the code above, my guess is:
Thread1                                   Thread2
iwlwifi allocates a page of order 0
                                          some callback uses get_page() to
                                          "steal" the memory
iwlwifi notices the page is "stolen"
and calls __free_pages() to release
its reference to the page;
put_page_testzero(page) fails
                                          calls put_page() to release its
                                          reference; the page's reference
                                          count drops to 0 and the page is
                                          released
pgalloc_tag_sub_pages() is called to
subtract 0 pages, but the page is
already gone, hence the warning.
It is potentially dangerous to call pgalloc_tag_sub_pages on a released page.
I feel the pgalloc_tag_sub_pages call here should be removed:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5669baf2a6fe..63c160537045 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5069,7 +5069,6 @@ static void ___free_pages(struct page *page, unsigned int order,
if (put_page_testzero(page))
__free_frozen_pages(page, order, fpi_flags);
else if (!head) {
- pgalloc_tag_sub_pages(page, (1 << order) - 1);
while (order-- > 0)
__free_frozen_pages(page + (1 << order), order,
fpi_flags);
As for the warning in the original report by "kernel test robot", it is not the same one.
I think there are places where high-order pages are released as several low-order pages, and my understanding
is that only the first page has the tag, but I am not quite sure; correct me if I am wrong...
(The whole MM is quite complicated to me.)
And I believe that when a lower-order page is released without a tag, a debug warning should follow.
The warning is kind of "benign" under those conditions, though.
Thanks
David
At 2025-04-15 16:35:47, "kernel test robot" <oliver.sang@intel.com> wrote:
>
>
>Hello,
>
>
>
>
>
>If you fix the issue in a separate patch/commit (i.e. not just a new version of
>the same patch/commit), kindly add following tags
>| Reported-by: kernel test robot <oliver.sang@intel.com>
>| Closes: https://lore.kernel.org/oe-lkp/202504151659.9b09c785-lkp@intel.com
>
>
>[ 147.337988][ T2016] ------------[ cut here ]------------
>[ 147.338915][ T2016] alloc_tag was not set
>[ 147.339502][ T2016] WARNING: CPU: 0 PID: 2016 at include/linux/alloc_tag.h:152 __pgalloc_tag_sub (include/linux/alloc_tag.h:152 include/linux/alloc_tag.h:195 mm/page_alloc.c:1089)
>[ 147.341127][ T2016] Modules linked in:
>[ 147.341672][ T2016] CPU: 0 UID: 0 PID: 2016 Comm: grep Tainted: G T 6.14.0-rc6-00062-g93d5440ece3c #1 c08622b3723459177a60d595773689e527750d0d
>[ 147.343295][ T2016] Tainted: [T]=RANDSTRUCT
>[ 147.343867][ T2016] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
>[ 147.345236][ T2016] RIP: 0010:__pgalloc_tag_sub (include/linux/alloc_tag.h:152 include/linux/alloc_tag.h:195 mm/page_alloc.c:1089)
>[ 147.345982][ T2016] Code: 41 5e 41 5f 5d c3 49 c7 47 e0 00 00 00 00 80 3d 25 ef 66 0a 00 75 a9 48 c7 c7 e0 91 39 8e c6 05 15 ef 66 0a 01 e8 3f bb ad ff <0f> 0b eb 92 48 c7 c0 20 2b e6 90 48 ba 00 00 00 00 00 fc ff df 48
>All code
>========
> 0: 41 5e pop %r14
> 2: 41 5f pop %r15
> 4: 5d pop %rbp
> 5: c3 ret
> 6: 49 c7 47 e0 00 00 00 movq $0x0,-0x20(%r15)
> d: 00
> e: 80 3d 25 ef 66 0a 00 cmpb $0x0,0xa66ef25(%rip) # 0xa66ef3a
> 15: 75 a9 jne 0xffffffffffffffc0
> 17: 48 c7 c7 e0 91 39 8e mov $0xffffffff8e3991e0,%rdi
> 1e: c6 05 15 ef 66 0a 01 movb $0x1,0xa66ef15(%rip) # 0xa66ef3a
> 25: e8 3f bb ad ff call 0xffffffffffadbb69
> 2a:* 0f 0b ud2 <-- trapping instruction
> 2c: eb 92 jmp 0xffffffffffffffc0
> 2e: 48 c7 c0 20 2b e6 90 mov $0xffffffff90e62b20,%rax
> 35: 48 ba 00 00 00 00 00 movabs $0xdffffc0000000000,%rdx
> 3c: fc ff df
> 3f: 48 rex.W
>
>Code starting with the faulting instruction
>===========================================
> 0: 0f 0b ud2
> 2: eb 92 jmp 0xffffffffffffff96
> 4: 48 c7 c0 20 2b e6 90 mov $0xffffffff90e62b20,%rax
> b: 48 ba 00 00 00 00 00 movabs $0xdffffc0000000000,%rdx
> 12: fc ff df
> 15: 48 rex.W
>[ 147.348298][ T2016] RSP: 0018:ffffc90001def730 EFLAGS: 00010282
>[ 147.349063][ T2016] RAX: dffffc0000000000 RBX: 1ffff920003bdee7 RCX: 0000000000000001
>[ 147.350021][ T2016] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffffff9059d5a8
>[ 147.354294][ T2016] RBP: ffffc90001def7a0 R08: ffffffff87fd99d0 R09: fffffbfff20b3ab5
>[ 147.355329][ T2016] R10: ffffffff9059d5ab R11: 0000000000000001 R12: ffff888106402c58
>[ 147.356345][ T2016] R13: 0000000000000000 R14: 0000000000000001 R15: ffffc90001def778
>[ 147.357317][ T2016] FS: 0000000000000000(0000) GS:ffffffff904be000(0000) knlGS:0000000000000000
>[ 147.358402][ T2016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[ 147.359199][ T2016] CR2: 00007fedda922200 CR3: 0000000155129000 CR4: 00000000000406f0
>[ 147.360209][ T2016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>[ 147.361208][ T2016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>[ 147.362220][ T2016] Call Trace:
>[ 147.362713][ T2016] <TASK>
>[ 147.363178][ T2016] ? show_regs (arch/x86/kernel/dumpstack.c:479)
>[ 147.363763][ T2016] ? __warn (kernel/panic.c:748)
>[ 147.364366][ T2016] ? __wake_up_klogd (arch/x86/include/asm/preempt.h:94 kernel/printk/printk.c:4525)
>[ 147.365015][ T2016] ? __pgalloc_tag_sub (include/linux/alloc_tag.h:152 include/linux/alloc_tag.h:195 mm/page_alloc.c:1089)
>[ 147.365689][ T2016] ? report_bug (lib/bug.c:201 lib/bug.c:219)
>[ 147.366349][ T2016] ? page_ext_get (include/linux/rcupdate.h:337 include/linux/rcupdate.h:849 mm/page_ext.c:525)
>[ 147.366988][ T2016] ? handle_bug (arch/x86/kernel/traps.c:285)
>[ 147.367583][ T2016] ? exc_invalid_op (arch/x86/kernel/traps.c:309 (discriminator 1))
>[ 147.368249][ T2016] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:574)
>[ 147.368921][ T2016] ? irq_work_queue (arch/x86/include/asm/atomic.h:23 arch/x86/include/asm/atomic.h:145 include/linux/atomic/atomic-arch-fallback.h:1690 include/linux/atomic/atomic-instrumented.h:954 kernel/irq_work.c:61 kernel/irq_work.c:119)
>[ 147.369562][ T2016] ? __pgalloc_tag_sub (include/linux/alloc_tag.h:152 include/linux/alloc_tag.h:195 mm/page_alloc.c:1089)
>[ 147.370244][ T2016] ? __alloc_contig_migrate_range (mm/page_alloc.c:1084)
>[ 147.371016][ T2016] free_frozen_pages (mm/page_alloc.c:1211 mm/page_alloc.c:2738)
>[ 147.371700][ T2016] __free_slab (mm/slub.c:2669)
>[ 147.372331][ T2016] free_slab (mm/slub.c:2692)
>[ 147.372900][ T2016] free_to_partial_list (mm/slub.c:4414)
>[ 147.373569][ T2016] ? qlist_free_all (mm/kasan/quarantine.c:163 mm/kasan/quarantine.c:179)
>[ 147.374213][ T2016] __slab_free (mm/slub.c:4534)
>[ 147.374819][ T2016] ? __kasan_check_read (mm/kasan/shadow.c:32)
>[ 147.375470][ T2016] ? mark_lock (arch/x86/include/asm/bitops.h:227 (discriminator 3) arch/x86/include/asm/bitops.h:239 (discriminator 3) include/asm-generic/bitops/instrumented-non-atomic.h:142 (discriminator 3) kernel/locking/lockdep.c:230 (discriminator 3) kernel/locking/lockdep.c:4729 (discriminator 3))
>[ 147.376062][ T2016] ? mark_held_locks (kernel/locking/lockdep.c:4323)
>[ 147.376701][ T2016] ___cache_free (mm/slub.c:4681)
>[ 147.377248][ T2016] qlist_free_all (mm/kasan/quarantine.c:174)
>[ 147.377795][ T2016] kasan_quarantine_reduce (include/linux/srcu.h:357 mm/kasan/quarantine.c:287)
>[ 147.378470][ T2016] __kasan_slab_alloc (mm/kasan/common.c:329)
>[ 147.379088][ T2016] kmem_cache_alloc_noprof (include/linux/kasan.h:250 mm/slub.c:4128 mm/slub.c:4177 mm/slub.c:4184)
>[ 147.379784][ T2016] getname_flags (include/linux/sched.h:2248 fs/namei.c:139)
>[ 147.380436][ T2016] getname (fs/namei.c:224)
>[ 147.380969][ T2016] do_sys_openat2 (fs/open.c:1422)
>[ 147.381587][ T2016] ? build_open_flags (fs/open.c:1414)
>[ 147.382258][ T2016] __x64_sys_openat (fs/open.c:1454)
>[ 147.382899][ T2016] ? __ia32_compat_sys_open (fs/open.c:1454)
>[ 147.383611][ T2016] x64_sys_call (arch/x86/entry/syscall_64.c:36)
>[ 147.384281][ T2016] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
>[ 147.384977][ T2016] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
>[ 147.385724][ T2016] RIP: 0033:0x7fedda9e895d
>
>The kernel config and materials to reproduce are available at:
>https://download.01.org/0day-ci/archive/20250415/202504151659.9b09c785-lkp@intel.com
>
>
>
>--
>0-DAY CI Kernel Test Service
>https://github.com/intel/lkp-tests/wiki
Thread overview: 3+ messages
2025-04-15 8:35 [linus:master] [alloc_tag] 93d5440ece: WARNING:at_include/linux/alloc_tag.h:#__pgalloc_tag_sub kernel test robot
2025-05-03 11:51 ` David Wang [this message]
2025-05-05 16:57 ` Suren Baghdasaryan