linux-mm.kvack.org archive mirror
From: Harry Yoo <harry.yoo@oracle.com>
To: David Wang <00107082@163.com>
Cc: akpm@linux-foundation.org, surenb@google.com,
	kent.overstreet@linux.dev, oliver.sang@intel.com,
	cachen@purestorage.com, linux-mm@kvack.org,
	oe-lkp@lists.linux.dev
Subject: Re: [PATCH] lib/alloc_tag: do not acquire nonexistent lock when mem profiling is disabled
Date: Fri, 20 Jun 2025 20:33:31 +0900
Message-ID: <aFVHCxM6M0-zUdgo@hyeyoo>
In-Reply-To: <aFU6lZxrB_3Wmywn@hyeyoo>

On Fri, Jun 20, 2025 at 07:40:21PM +0900, Harry Yoo wrote:
> On Fri, Jun 20, 2025 at 11:09:16AM +0800, David Wang wrote:
> > 
> > 
> > At 2025-06-20 08:40:32, "Harry Yoo" <harry.yoo@oracle.com> wrote:
> > >alloc_tag_top_users() attempts to acquire alloc_tag_cttype->mod_lock
> > >even when memory allocation profiling feature is disabled at runtime.
> > >If the feature is compiled in but not enabled at boot, alloc_tag_init()
> > >does not properly allocate and initialize the alloc_tag_cttype variable.
> > >
> > >This leads to a crash on memory allocation failure by attempting to
> > >acquire a semaphore that does not exist:
> > >
> > >  Oops: general protection fault, probably for non-canonical address 0xdffffc000000001b: 0000 [#3] SMP KASAN NOPTI
> > >  KASAN: null-ptr-deref in range [0x00000000000000d8-0x00000000000000df]
> > >  CPU: 2 UID: 0 PID: 1 Comm: systemd Tainted: G      D             6.16.0-rc2 #1 VOLUNTARY
> > >  Tainted: [D]=DIE
> > >  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > >  RIP: 0010:down_read_trylock+0xaa/0x3b0
> > >  Code: d0 7c 08 84 d2 0f 85 a0 02 00 00 8b 0d df 31 dd 04 85 c9 75 29 48 b8 00 00 00 00 00 fc ff df 48 8d 6b 68 48 89 ea 48 c1 ea 03 <80> 3c 02 00 0f 85 88 02 00 00 48 3b 5b 68 0f 85 53 01 00 00 65 ff
> > >  RSP: 0000:ffff8881002ce9b8 EFLAGS: 00010016
> > >  RAX: dffffc0000000000 RBX: 0000000000000070 RCX: 0000000000000000
> > >  RDX: 000000000000001b RSI: 000000000000000a RDI: 0000000000000070
> > >  RBP: 00000000000000d8 R08: 0000000000000001 R09: ffffed107dde49d1
> > >  R10: ffff8883eef24e8b R11: ffff8881002cec20 R12: 1ffff11020059d37
> > >  R13: 00000000003fff7b R14: ffff8881002cec20 R15: dffffc0000000000
> > >  FS:  00007f963f21d940(0000) GS:ffff888458ca6000(0000) knlGS:0000000000000000
> > >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >  CR2: 00007f963f5edf71 CR3: 000000010672c000 CR4: 0000000000350ef0
> > >  Call Trace:
> > >   <TASK>
> > >   codetag_trylock_module_list+0xd/0x20
> > >   alloc_tag_top_users+0x369/0x4b0
> > >   __show_mem+0x1cd/0x6e0
> > >   warn_alloc+0x2b1/0x390
> > >   __alloc_frozen_pages_noprof+0x12b9/0x21a0
> > >   alloc_pages_mpol+0x135/0x3e0
> > >   alloc_slab_page+0x82/0xe0
> > >   new_slab+0x212/0x240
> > >   ___slab_alloc+0x82a/0xe00
> > >   </TASK>
> > >
> > >As David Wang points out, this issue was introduced by commit
> > >780138b12381 ("alloc_tag: check mem_profiling_support in alloc_tag_init").
> > >Before the commit, the alloc tagging subsystem unconditionally allocated
> > >the semaphore.
> > >
> > >After the commit, alloc_tag_top_users() must check whether it was
> > >actually initialized. Fix it by adding the appropriate check in
> > >alloc_tag_top_users().
> > >
> > >Reported-by: kernel test robot <oliver.sang@intel.com>
> > >Closes: https://lore.kernel.org/oe-lkp/202506181351.bba867dd-lkp@intel.com
> > 
> > I am not quite sure this can be closed, according to the config file
> > https://download.01.org/0day-ci/archive/20250618/202506181351.bba867dd-lkp@intel.com/config-6.15.0-rc6-00142-g2d76e79315e4
> > 
> > CONFIG_MEM_ALLOC_PROFILING=y
> > CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y  <---
> > CONFIG_MEM_ALLOC_PROFILING_DEBUG=y
> > 
> > mem_profiling_support is true on boot, and alloc_tag_cttype is properly initialized.
> > 
> > Maybe there is another issue lurking somewhere...
> 
> Oops, I thought they were all the same issue.
> I should have been more thorough and checked the config.
> Thank you for pointing it out!
> 
> I think you're right. mem_profiling_support == true doesn't necessarily
> mean it's allocated and initialized, as you demonstrated in the other
> email.
> 
> I think it'd be more robust to set mem_profiling_support to false,
> disable mem_alloc_profiling_key at boot and enable it later when
> it is properly allocated.

Actually, we need something a bit more sophisticated than that.
IIUC memory allocations are accounted even before alloc_tag_init() runs,
and that accounting depends on mem_alloc_profiling_key being enabled.
If we disable the key at boot, some allocations during early boot won't
be accounted.

I think we need to introduce a separate variable to indicate whether
alloc_tag_init() has completed its initialization and check that in
alloc_tag_top_users().
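
To make the idea concrete, here is a rough userspace model of it, not a
patch. Everything in it is an assumption for illustration: the flag name
alloc_tag_init_done is hypothetical, the *_model names are stand-ins for
the real kernel symbols, and an int counter stands in for the semaphore.
It only shows the order of the checks; a real version would also have to
think about ordering against concurrent readers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct codetag_type; not the kernel layout. */
struct codetag_type_model {
	int mod_lock;	/* counter standing in for the rw_semaphore */
};

/* Stays NULL until init runs, like the real pointer in the crash. */
static struct codetag_type_model *alloc_tag_cttype_model;

/* Hypothetical flag: true only once init has fully completed. */
static bool alloc_tag_init_done;

/* Models alloc_tag_init(): publish the type first, then set the flag. */
static void alloc_tag_init_model(struct codetag_type_model *ct)
{
	alloc_tag_cttype_model = ct;
	alloc_tag_init_done = true;
}

/* Models alloc_tag_top_users(): bail out before touching the lock. */
static size_t alloc_tag_top_users_model(size_t count)
{
	if (!alloc_tag_init_done)
		return 0;

	assert(alloc_tag_cttype_model != NULL);
	alloc_tag_cttype_model->mod_lock++;	/* "take" the lock */
	/* ... the real function would walk the tags here ... */
	alloc_tag_cttype_model->mod_lock--;	/* "release" it */
	return count;
}
```

The point of a separate flag is that mem_alloc_profiling_key can stay
enabled for early-boot accounting, while the reporting path is gated on
initialization having actually finished.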

> > >Closes: https://lore.kernel.org/oe-lkp/202505071555.e757f1e0-lkp@intel.com
> > 
> > This one should not be closed, because "# CONFIG_MEM_ALLOC_PROFILING is not set".
> > https://download.01.org/0day-ci/archive/20250507/202505071555.e757f1e0-lkp@intel.com/config-6.15.0-rc2-00491-g7fc85b92db96
> 
> I assumed it was mem profiling that caused the crash, since it happened
> while printing memory info. Pretty weird coincidence...
> 
> I'll try to reproduce it and figure out why it crashed.
> 
> > >Fixes: 780138b12381 ("alloc_tag: check mem_profiling_support in alloc_tag_init")
> > >Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > >---
> > >
> > >I manually confirmed that the crash in the vmalloc test module no longer
> > >occurs with this patch when the memory profiling feature is compiled
> > >but not enabled at boot.
> > >
> > >No Cc: stable because the offending commit was introduced in v6.16-rc1.
> > >
> > > lib/alloc_tag.c | 4 +++-
> > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > >diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> > >index 66a4628185f7..20c627191d3e 100644
> > >--- a/lib/alloc_tag.c
> > >+++ b/lib/alloc_tag.c
> > >@@ -124,7 +124,9 @@ size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sl
> > > 	struct codetag_bytes n;
> > > 	unsigned int i, nr = 0;
> > > 
> > >-	if (can_sleep)
> > >+	if (!mem_profiling_support)
> > >+		return 0;
> > >+	else if (can_sleep)
> > > 		codetag_lock_module_list(alloc_tag_cttype, true);
> > > 	else if (!codetag_trylock_module_list(alloc_tag_cttype))
> > > 		return 0;
> > >-- 
> > >2.43.0
> 
> -- 
> Cheers,
> Harry / Hyeonggon
> 

-- 
Cheers,
Harry / Hyeonggon


Thread overview: 28+ messages
2025-06-18  6:25 [linus:master] [lib/test_vmalloc.c] 2d76e79315: Kernel_panic-not_syncing:Fatal_exception kernel test robot
2025-06-19 14:10 ` Kernel crash due to alloc_tag_top_users() being called when !mem_profiling_support? Harry Yoo
2025-06-19 15:04   ` Harry Yoo
2025-06-20  8:47     ` Uladzislau Rezki
2025-06-22 22:54       ` Suren Baghdasaryan
2025-06-23 11:29         ` Uladzislau Rezki
2025-06-19 15:08   ` David Wang
2025-06-20  1:14     ` Harry Yoo
2025-06-20  0:40 ` [PATCH] lib/alloc_tag: do not acquire nonexistent lock when mem profiling is disabled Harry Yoo
2025-06-20  3:09   ` David Wang
2025-06-20 10:40     ` [PATCH] " Harry Yoo
2025-06-20 11:33       ` Harry Yoo [this message]
2025-06-20 13:59         ` David Wang
2025-06-20 12:47       ` Harry Yoo
2025-06-20 10:02 ` CONFIG_TEST_VMALLOC=y conflict/race with alloc_tag_init David Wang
2025-06-22 22:50   ` Suren Baghdasaryan
2025-06-23  2:04     ` Harry Yoo
2025-06-23  2:45     ` David Wang
2025-06-23  3:16       ` David Wang
2025-06-23  4:39         ` David Wang
2025-06-23 11:36       ` Uladzislau Rezki
2025-06-23 13:20         ` David Wang
2025-06-20 14:24 ` [PATCH] lib/test_vmalloc.c: demote vmalloc_test_init to late_initcall David Wang
2025-06-20 19:59   ` Harry Yoo
2025-06-20 19:53 ` [PATCH v2] lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users() Harry Yoo
2025-06-21  3:43   ` David Wang
2025-06-22 22:24     ` [PATCH " Suren Baghdasaryan
2025-06-23  2:01       ` Harry Yoo
