From: Harry Yoo <harry.yoo@oracle.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: David Wang <00107082@163.com>,
	akpm@linux-foundation.org, kent.overstreet@linux.dev,
	oliver.sang@intel.com, cachen@purestorage.com,
	linux-mm@kvack.org, oe-lkp@lists.linux.dev,
	stable@vger.kernel.org
Subject: Re: [PATCH v2] lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users()
Date: Mon, 23 Jun 2025 11:01:41 +0900
Message-ID: <aFi1hcpzFs5EhPAb@harry>
In-Reply-To: <CAJuCfpG3=0MCac2jTVM9LiJWDwWdLE3vrcJp52x4ZX5XdSEv1A@mail.gmail.com>

On Sun, Jun 22, 2025 at 03:24:08PM -0700, Suren Baghdasaryan wrote:
> On Fri, Jun 20, 2025 at 8:43 PM David Wang <00107082@163.com> wrote:
> >
> >
> > At 2025-06-21 03:53:05, "Harry Yoo" <harry.yoo@oracle.com> wrote:
> > >alloc_tag_top_users() attempts to lock alloc_tag_cttype->mod_lock
> > >even when the alloc_tag_cttype is not allocated because:
> > >
> > >  1) alloc tagging is disabled because mem profiling is disabled
> > >     (!alloc_tag_cttype)
> > >  2) alloc tagging is enabled, but not yet initialized (!alloc_tag_cttype)
> > >  3) alloc tagging is enabled, but failed initialization
> > >     (!alloc_tag_cttype or IS_ERR(alloc_tag_cttype))
> > >
> > >In all cases, alloc_tag_cttype is not allocated, and therefore
> > >alloc_tag_top_users() should not attempt to acquire the semaphore.
> > >
> > >This leads to a crash on memory allocation failure by attempting to
> > >acquire a non-existent semaphore:
> > >
> > >  Oops: general protection fault, probably for non-canonical address 0xdffffc000000001b: 0000 [#3] SMP KASAN NOPTI
> > >  KASAN: null-ptr-deref in range [0x00000000000000d8-0x00000000000000df]
> > >  CPU: 2 UID: 0 PID: 1 Comm: systemd Tainted: G      D             6.16.0-rc2 #1 VOLUNTARY
> > >  Tainted: [D]=DIE
> > >  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > >  RIP: 0010:down_read_trylock+0xaa/0x3b0
> > >  Code: d0 7c 08 84 d2 0f 85 a0 02 00 00 8b 0d df 31 dd 04 85 c9 75 29 48 b8 00 00 00 00 00 fc ff df 48 8d 6b 68 48 89 ea 48 c1 ea 03 <80> 3c 02 00 0f 85 88 02 00 00 48 3b 5b 68 0f 85 53 01 00 00 65 ff
> > >  RSP: 0000:ffff8881002ce9b8 EFLAGS: 00010016
> > >  RAX: dffffc0000000000 RBX: 0000000000000070 RCX: 0000000000000000
> > >  RDX: 000000000000001b RSI: 000000000000000a RDI: 0000000000000070
> > >  RBP: 00000000000000d8 R08: 0000000000000001 R09: ffffed107dde49d1
> > >  R10: ffff8883eef24e8b R11: ffff8881002cec20 R12: 1ffff11020059d37
> > >  R13: 00000000003fff7b R14: ffff8881002cec20 R15: dffffc0000000000
> > >  FS:  00007f963f21d940(0000) GS:ffff888458ca6000(0000) knlGS:0000000000000000
> > >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >  CR2: 00007f963f5edf71 CR3: 000000010672c000 CR4: 0000000000350ef0
> > >  Call Trace:
> > >   <TASK>
> > >   codetag_trylock_module_list+0xd/0x20
> > >   alloc_tag_top_users+0x369/0x4b0
> > >   __show_mem+0x1cd/0x6e0
> > >   warn_alloc+0x2b1/0x390
> > >   __alloc_frozen_pages_noprof+0x12b9/0x21a0
> > >   alloc_pages_mpol+0x135/0x3e0
> > >   alloc_slab_page+0x82/0xe0
> > >   new_slab+0x212/0x240
> > >   ___slab_alloc+0x82a/0xe00
> > >   </TASK>
> > >
> > >As David Wang points out, this issue became easier to trigger after commit
> > >780138b12381 ("alloc_tag: check mem_profiling_support in alloc_tag_init").
> > >
> > >Before the commit, the issue occurred only when it failed to allocate
> > >and initialize alloc_tag_cttype or if a memory allocation fails before
> > >alloc_tag_init() is called. After the commit, it can be easily triggered
> > >when memory profiling is compiled but disabled at boot.
> 
> Thanks for the fix and sorry about the delay with reviewing it.

No problem ;)

> > >
> > >To properly determine whether alloc_tag_init() has been called and
> > >its data structures initialized, verify that alloc_tag_cttype is a valid
> > >pointer before acquiring the semaphore. If the variable is NULL or an error
> > >value, it has not been properly initialized. In such a case, just skip
> > >and do not attempt acquire the semaphore.
> 
> nit: s/attempt acquire/attempt to acquire

Will fix the typo.

> > >
> > >Reported-by: kernel test robot <oliver.sang@intel.com>
> > >Closes: https://lore.kernel.org/oe-lkp/202506181351.bba867dd-lkp@intel.com
> > >Fixes: 780138b12381 ("alloc_tag: check mem_profiling_support in alloc_tag_init")
> > >Fixes: 1438d349d16b ("lib: add memory allocations report in show_mem()")
> > >Cc: stable@vger.kernel.org
> > >Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> >
> > Just noticed that another thread can be closed as well:
> > https://lore.kernel.org/all/202506131711.5b41931c-lkp@intel.com/
> > This coincides with scenario #1, where an OOM happened with
> > CONFIG_MEM_ALLOC_PROFILING=y
> > # CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT is not set
> > # CONFIG_MEM_ALLOC_PROFILING_DEBUG is not set
> >
> > >---
> > >
> > >v1 -> v2:
> > >
> > >- v1 fixed the bug only when MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n.
> > >
> > >  v2 now fixes the bug even when MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y.
> > >  I didn't expect alloc_tag_cttype to be NULL when
> > >  mem_profiling_support is true, but as David points out (Thanks David!)
> > >  if a memory allocation fails before alloc_tag_init(), it can be NULL.
> > >
> > >  So instead of indirectly checking mem_profiling_support, just directly
> > >  check if alloc_tag_cttype is allocated.
> > >
> > >- Closes: https://lore.kernel.org/oe-lkp/202505071555.e757f1e0-lkp@intel.com
> > >  tag was removed because it was not a crash and not relevant to this
> > >  patch.
> > >
> > >- Added Cc: stable because, if an allocation fails before
> > >  alloc_tag_init(), it can be triggered even prior to 780138b12381.
> > >  I verified that the bug can be triggered in v6.12 and fixed by this
> > >  patch.
> > >
> > >  It should be quite difficult to trigger in practice, though.
> > >  Maybe I'm a bit paranoid?
> > >
> > > lib/alloc_tag.c | 4 +++-
> > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > >diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> > >index 66a4628185f7..d8ec4c03b7d2 100644
> > >--- a/lib/alloc_tag.c
> > >+++ b/lib/alloc_tag.c
> > >@@ -124,7 +124,9 @@ size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sl
> > >       struct codetag_bytes n;
> > >       unsigned int i, nr = 0;
> > >
> > >-      if (can_sleep)
> > >+      if (IS_ERR_OR_NULL(alloc_tag_cttype))
> > >+              return 0;
> 
> So, AFAICT alloc_tag_cttype will be NULL when memory profiling is
> disabled and it will be ENOMEM if codetag_register_type() fails.

Yes.

Or when memory profiling is enabled, but a memory allocation fails
before alloc_tag_init().

> I think it would be good to add a pr_warn() in alloc_tag_init() when
> codetag_register_type() fails so that the user can determine the
> reason why the show_mem() report is missing allocation tag information.

Will do.
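
To make the suggestion concrete, here is a rough userspace model of what
warning on a failed registration could look like. It is only a sketch:
model_alloc_tag_init(), mock_register_type() and warnings_emitted are
hypothetical names, fprintf(stderr, ...) stands in for pr_warn(), and the
ERR_PTR helpers are simplified copies of the ones in include/linux/err.h.
The message text is illustrative, not the wording of any actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified userspace copies of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the real struct codetag_type. */
struct codetag_type { int unused; };

static struct codetag_type *alloc_tag_cttype;
static int warnings_emitted;	/* lets a caller observe the warning */

/* Hypothetical mock of codetag_register_type(): fails on request. */
static struct codetag_type *mock_register_type(int fail)
{
	static struct codetag_type the_type;

	return fail ? ERR_PTR(-ENOMEM) : &the_type;
}

/* Model of the init path: warn when registration fails. */
static int model_alloc_tag_init(int fail)
{
	alloc_tag_cttype = mock_register_type(fail);
	if (IS_ERR(alloc_tag_cttype)) {
		/* Stand-in for pr_warn() in the kernel. */
		fprintf(stderr, "alloc_tag: type registration failed (%ld)\n",
			PTR_ERR(alloc_tag_cttype));
		warnings_emitted++;
		return (int)PTR_ERR(alloc_tag_cttype);
	}
	/* In this model, a non-error result is always a valid object. */
	assert(alloc_tag_cttype != NULL);
	return 0;
}
```

Note that the error pointer is left in alloc_tag_cttype after the failure,
which is why the reader side has to check for it as well as for NULL.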

> > >+      else if (can_sleep)
> 
> nit: the above extra "else" is not really needed. The following should
> work just fine, is more readable and produces less churn:
> 
> +      if (IS_ERR_OR_NULL(alloc_tag_cttype))
> +              return 0;
> +
>       if (can_sleep)
>                codetag_lock_module_list(alloc_tag_cttype, true);
>        else if (!codetag_trylock_module_list(alloc_tag_cttype))
>                return 0;

Will do, thanks!
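
As a sanity check of the control flow, the early-return form can be
modelled in userspace as below. This is a sketch under assumptions, not
the kernel code: model_top_users() and the mock_* helpers are
hypothetical, struct codetag_type is reduced to a counter, and the
assert() in each mock stands in for the crash that dereferencing a NULL
or error pointer causes in the real lock helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace copy of the kernel's IS_ERR_OR_NULL(). */
#define MAX_ERRNO 4095
static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in: the real type holds mod_lock, a semaphore. */
struct codetag_type { int lock_depth; };

static struct codetag_type *alloc_tag_cttype;

/* Mock of codetag_lock_module_list(): it dereferences cttype. */
static void mock_lock_module_list(struct codetag_type *cttype, bool lock)
{
	assert(!IS_ERR_OR_NULL(cttype));	/* models the oops */
	cttype->lock_depth += lock ? 1 : -1;
}

/* Mock of codetag_trylock_module_list(): always succeeds here. */
static bool mock_trylock_module_list(struct codetag_type *cttype)
{
	assert(!IS_ERR_OR_NULL(cttype));	/* models the oops */
	cttype->lock_depth++;
	return true;
}

/* Model of alloc_tag_top_users() with the guard placed first. */
static size_t model_top_users(size_t count, bool can_sleep)
{
	if (IS_ERR_OR_NULL(alloc_tag_cttype))
		return 0;

	if (can_sleep)
		mock_lock_module_list(alloc_tag_cttype, true);
	else if (!mock_trylock_module_list(alloc_tag_cttype))
		return 0;

	/* The real function walks the tags here; the model skips that. */
	mock_lock_module_list(alloc_tag_cttype, false);
	return count;
}
```

With the guard first, neither lock helper is reached for a NULL or error
pointer, so both the sleeping and the trylock path are covered by a
single check.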

> 
> > >               codetag_lock_module_list(alloc_tag_cttype, true);
> > >       else if (!codetag_trylock_module_list(alloc_tag_cttype))
> > >               return 0;
> > >--
> > >2.43.0

-- 
Cheers,
Harry / Hyeonggon
