public inbox for stable@vger.kernel.org
* [PATCH mm-hotfixes v2 0/2] mm/page_alloc,slab: return NULL early from *_nolock() memory allocation APIs in NMI on UP
@ 2026-04-27  7:09 Harry Yoo (Oracle)
  2026-04-27  7:09 ` [PATCH mm-hotfixes v2 1/2] mm/page_alloc: return NULL early from alloc_frozen_pages_nolock() " Harry Yoo (Oracle)
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Harry Yoo (Oracle) @ 2026-04-27  7:09 UTC
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Zi Yan, Harry Yoo, Shakeel Butt,
	Alexei Starovoitov, Hao Li, Christoph Lameter, David Rientjes,
	Roman Gushchin
  Cc: linux-mm, linux-kernel, stable

Due to my mistake, V1 was sent twice without a proper cover letter and
without Cc: stable. Please ignore V1. Apologies for the noise.

Changes since V1:
- used b4 to send patch series (w/ a proper cover letter) instead of
  my broken git send-email script (Thanks Vlastimil)
- added Cc: stable to patches 1 and 2

On UP kernels (!CONFIG_SMP), spin_trylock() is a no-op that
unconditionally succeeds even when the lock is already held.
As a result, alloc_frozen_pages_nolock() and kmalloc_nolock() called
from NMI context can "re-acquire" a lock that the page/slab allocator
is already holding. This does not deadlock because it is a trylock,
but it can lead to, e.g., the same page/object being allocated twice
and a subsequent use-after-free.
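
To illustrate the failure mode, here is a minimal userspace sketch. It is
not kernel code: the toy_* names, the array-based freelist, and the
callback that stands in for an NMI are all hypothetical and exist only
for illustration. It models a trylock that always reports success and
shows how an allocation nested inside the critical section ends up with
the same object as the interrupted one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a UP spinlock: it keeps no lock state at all. */
struct toy_up_spinlock {
	char unused;
};

/* Models the UP behaviour: trylock always reports success. */
static bool toy_up_spin_trylock(struct toy_up_spinlock *lock)
{
	(void)lock;
	return true;
}

static void toy_up_spin_unlock(struct toy_up_spinlock *lock)
{
	(void)lock;
}

#define TOY_NR_OBJECTS 4

/* Toy allocator: objects[0..nr_free-1] are free, protected by the lock. */
struct toy_allocator {
	struct toy_up_spinlock lock;
	int objects[TOY_NR_OBJECTS];
	int nr_free;
};

/* Callback that stands in for an NMI firing inside the critical section. */
typedef void (*toy_nmi_fn)(struct toy_allocator *a, int **nmi_obj);

static int *toy_alloc_nolock(struct toy_allocator *a, toy_nmi_fn nmi,
			     int **nmi_obj)
{
	int idx;

	assert(a != NULL);
	if (!toy_up_spin_trylock(&a->lock))
		return NULL;	/* never taken in this model */

	if (a->nr_free == 0) {
		toy_up_spin_unlock(&a->lock);
		return NULL;
	}

	idx = a->nr_free - 1;	/* pick the last free object */
	if (nmi)
		nmi(a, nmi_obj);	/* "NMI" arrives before the update below */
	a->nr_free = idx;	/* computed from the pre-NMI state */

	toy_up_spin_unlock(&a->lock);
	return &a->objects[idx];
}

/* The simulated NMI handler allocates from the same allocator. */
static void toy_nmi_handler(struct toy_allocator *a, int **nmi_obj)
{
	*nmi_obj = toy_alloc_nolock(a, NULL, NULL);
}
```

With nr_free initialised to TOY_NR_OBJECTS, calling
toy_alloc_nolock(&a, toy_nmi_handler, &nmi_obj) returns the same pointer
to the interrupted caller and to the simulated NMI handler, while nr_free
only drops by one.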

This was discovered while testing the new kmalloc_nolock()/kfree_nolock()
test case in the slub_kunit test module with CONFIG_DEBUG_SPINLOCK=y on a
UP kernel.

Patch 1 fixes alloc_frozen_pages_nolock() and
patch 2 fixes kmalloc_nolock().

Note: As pointed out by Vlastimil Babka [1], in theory a kprobe in a
locked section could trigger the same issue. However, fixing that
would either require a non-trivial rework (e.g., inventing a new
spinlock type) or introduce unnecessary overhead for all spinlocks on
UP (e.g., making every spinlock check its locked status on UP).

Given that BPF tracing on UP is rare, and that tracing a function
called from the memory allocator within a locked section is even less
likely, this patch series addresses the issue only in NMI context
(which is rare as well, but is now covered by the new test case).
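
The NMI-only guard can be modelled in userspace roughly as follows. This
is a sketch under stated assumptions, not the kernel implementation:
model_config_smp and model_in_nmi are hypothetical stand-ins for
IS_ENABLED(CONFIG_SMP) and in_nmi(), and the array-based freelist is
invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for IS_ENABLED(CONFIG_SMP) and in_nmi(). */
static const bool model_config_smp = false;	/* model a UP build */
static bool model_in_nmi;			/* true while "in NMI" */

/* Models the UP behaviour: trylock always reports success. */
static bool model_spin_trylock(void)
{
	return true;
}

#define MODEL_NR_OBJECTS 4

/* Toy freelist: model_objects[0..model_nr_free-1] are free. */
static int model_objects[MODEL_NR_OBJECTS];
static int model_nr_free = MODEL_NR_OBJECTS;

/* Result of the allocation attempted from the simulated NMI. */
static int *model_nmi_result;

static int *model_alloc_nolock(bool simulate_nmi);

/* Simulated NMI handler that tries to allocate. */
static void model_nmi(void)
{
	model_in_nmi = true;
	model_nmi_result = model_alloc_nolock(false);
	model_in_nmi = false;
}

static int *model_alloc_nolock(bool simulate_nmi)
{
	int idx;

	/* The guard: on UP, trylock cannot detect a held lock, so bail out. */
	if (!model_config_smp && model_in_nmi)
		return NULL;

	if (!model_spin_trylock())
		return NULL;

	assert(model_nr_free >= 0 && model_nr_free <= MODEL_NR_OBJECTS);
	if (model_nr_free == 0)
		return NULL;

	idx = model_nr_free - 1;
	if (simulate_nmi)
		model_nmi();	/* "NMI" arrives inside the critical section */
	model_nr_free = idx;

	return &model_objects[idx];
}
```

In this model, model_alloc_nolock(true) still hands an object to the
interrupted caller, while the nested attempt from the simulated NMI gets
NULL instead of a duplicate pointer. The cost is that NMI-context
allocations always fail on UP, even when no lock is actually held.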

[1] https://lore.kernel.org/linux-mm/af3a7fa9-b368-4ffd-964d-9e4fcba863a8@kernel.org

Cc: stable@vger.kernel.org
---
Harry Yoo (Oracle) (2):
      mm/page_alloc: return NULL early from alloc_frozen_pages_nolock() in NMI on UP
      mm/slab: return NULL early from kmalloc_nolock() in NMI on UP

 mm/page_alloc.c | 5 +++++
 mm/slub.c       | 4 ++++
 2 files changed, 9 insertions(+)
---
base-commit: ba24da38a519dfcff8cce3f3f2726d7b159a4d75
change-id: 20260427-nolock-api-fix-bd056911e68e

Best regards,
--  
Cheers,
Harry / Hyeonggon



* [PATCH mm-hotfixes v2 1/2] mm/page_alloc: return NULL early from alloc_frozen_pages_nolock() in NMI on UP
  2026-04-27  7:09 [PATCH mm-hotfixes v2 0/2] mm/page_alloc,slab: return NULL early from *_nolock() memory allocation APIs in NMI on UP Harry Yoo (Oracle)
@ 2026-04-27  7:09 ` Harry Yoo (Oracle)
  2026-04-27  7:09 ` [PATCH mm-hotfixes v2 2/2] mm/slab: return NULL early from kmalloc_nolock() " Harry Yoo (Oracle)
  2026-04-27  8:00 ` [PATCH mm-hotfixes v2 0/2] mm/page_alloc,slab: return NULL early from *_nolock() memory allocation APIs " Vlastimil Babka (SUSE)
  2 siblings, 0 replies; 5+ messages in thread
From: Harry Yoo (Oracle) @ 2026-04-27  7:09 UTC
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Zi Yan, Harry Yoo, Shakeel Butt,
	Alexei Starovoitov, Hao Li, Christoph Lameter, David Rientjes,
	Roman Gushchin
  Cc: linux-mm, linux-kernel, stable

On UP kernels (!CONFIG_SMP), spin_trylock() is a no-op that
unconditionally succeeds even when the lock is already held. As a
result, alloc_frozen_pages_nolock() called from NMI context can
re-enter rmqueue() and acquire the zone lock that the interrupted
context is already holding, corrupting the freelists.

With CONFIG_DEBUG_SPINLOCK on UP, the following BUG is triggered with
the slub_kunit test module:

  BUG: spinlock trylock failure on UP on CPU#0, kunit_try_catch/243
  [...]
  Call Trace:
   <NMI>
   dump_stack_lvl+0x3f/0x60
   do_raw_spin_trylock+0x41/0x50
   _raw_spin_trylock+0x24/0x50
   rmqueue.isra.0+0x2a9/0xa70
   get_page_from_freelist+0xeb/0x450
   alloc_frozen_pages_nolock_noprof+0x111/0x1e0
   allocate_slab+0x42a/0x500
   ___slab_alloc+0xa7/0x4c0
   kmalloc_nolock_noprof+0x164/0x310
   [...]
   </NMI>

Fix this by returning NULL early when invoked from NMI on a UP kernel.

Link: https://lore.kernel.org/linux-mm/ad_cqe51pvr1WaDg@hyeyoo
Cc: <stable@vger.kernel.org>
Fixes: d7242af86434 ("mm: Introduce alloc_frozen_pages_nolock()")
Signed-off-by: Harry Yoo (Oracle) <harry@kernel.org>
---
 mm/page_alloc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 71859993dd54..23c7298d3be2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7737,6 +7737,11 @@ struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned
 	 */
 	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
 		return NULL;
+
+	/* On UP, spin_trylock() always succeeds even when it is locked */
+	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
+		return NULL;
+
 	if (!pcp_allowed_order(order))
 		return NULL;
 

-- 
2.43.0



* [PATCH mm-hotfixes v2 2/2] mm/slab: return NULL early from kmalloc_nolock() in NMI on UP
  2026-04-27  7:09 [PATCH mm-hotfixes v2 0/2] mm/page_alloc,slab: return NULL early from *_nolock() memory allocation APIs in NMI on UP Harry Yoo (Oracle)
  2026-04-27  7:09 ` [PATCH mm-hotfixes v2 1/2] mm/page_alloc: return NULL early from alloc_frozen_pages_nolock() " Harry Yoo (Oracle)
@ 2026-04-27  7:09 ` Harry Yoo (Oracle)
  2026-04-27  8:00 ` [PATCH mm-hotfixes v2 0/2] mm/page_alloc,slab: return NULL early from *_nolock() memory allocation APIs " Vlastimil Babka (SUSE)
  2 siblings, 0 replies; 5+ messages in thread
From: Harry Yoo (Oracle) @ 2026-04-27  7:09 UTC
  To: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Zi Yan, Harry Yoo, Shakeel Butt,
	Alexei Starovoitov, Hao Li, Christoph Lameter, David Rientjes,
	Roman Gushchin
  Cc: linux-mm, linux-kernel, stable

On UP kernels (!CONFIG_SMP), spin_trylock() is a no-op that
unconditionally succeeds even when the lock is already held. As a
result, kmalloc_nolock() called from NMI context can re-enter the slab
allocator and acquire n->list_lock that the interrupted context is
already holding, corrupting slab state.

With CONFIG_DEBUG_SPINLOCK on UP, the following BUG is triggered with
the slub_kunit test module:

  BUG: spinlock trylock failure on UP on CPU#0, kunit_try_catch/243
  [...]
  Call Trace:
   <NMI>
   dump_stack_lvl+0x3f/0x60
   do_raw_spin_trylock+0x41/0x50
   _raw_spin_trylock+0x24/0x50
   get_from_partial_node+0x120/0x4d0
   ___slab_alloc+0x8a/0x4c0
   kmalloc_nolock_noprof+0x164/0x310
   [...]
   </NMI>

Fix this by returning NULL early when invoked from NMI on a UP kernel.

Link: https://lore.kernel.org/linux-mm/ad_cqe51pvr1WaDg@hyeyoo
Cc: <stable@vger.kernel.org>
Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Signed-off-by: Harry Yoo (Oracle) <harry@kernel.org>
---
 mm/slub.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index 92362eeb13e5..b4ec15df92f6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5339,6 +5339,10 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
 	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
 		return NULL;
 
+	/* On UP, spin_trylock() always succeeds even when it is locked */
+	if (!IS_ENABLED(CONFIG_SMP) && in_nmi())
+		return NULL;
+
 retry:
 	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
 		return NULL;

-- 
2.43.0



* Re: [PATCH mm-hotfixes v2 0/2] mm/page_alloc,slab: return NULL early from *_nolock() memory allocation APIs in NMI on UP
  2026-04-27  7:09 [PATCH mm-hotfixes v2 0/2] mm/page_alloc,slab: return NULL early from *_nolock() memory allocation APIs in NMI on UP Harry Yoo (Oracle)
  2026-04-27  7:09 ` [PATCH mm-hotfixes v2 1/2] mm/page_alloc: return NULL early from alloc_frozen_pages_nolock() " Harry Yoo (Oracle)
  2026-04-27  7:09 ` [PATCH mm-hotfixes v2 2/2] mm/slab: return NULL early from kmalloc_nolock() " Harry Yoo (Oracle)
@ 2026-04-27  8:00 ` Vlastimil Babka (SUSE)
  2026-04-27  8:13   ` Harry Yoo (Oracle)
  2 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-04-27  8:00 UTC
  To: Harry Yoo (Oracle), Andrew Morton, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan,
	Shakeel Butt, Alexei Starovoitov, Hao Li, Christoph Lameter,
	David Rientjes, Roman Gushchin
  Cc: linux-mm, linux-kernel, stable

On 4/27/26 09:09, Harry Yoo (Oracle) wrote:
> Due to my mistake, V1 was sent twice w/o proper cover letter and
> Cc: stable. Please ignore V1. Apologies for the noise. 
> 
> Changes since V1:
> - used b4 to send patch series (w/ a proper cover letter) instead of
>   my broken git send-email script (Thanks Vlastimil)
> - added Cc: stable to patches 1 and 2
> 
> On UP kernels (!CONFIG_SMP), spin_trylock() is a no-op that
> unconditionally succeeds even when the lock is already held.
> As a result, alloc_frozen_pages_nolock() and kmalloc_nolock() called
> from an NMI context can successfully re-acquire the lock that the
> page/slab allocators are already holding (no deadlock because it's
> trylock,  but leads to e.g., allocating the same page/object twice and
> causing use-after-free).
> 
> It was discovered while testing the new kmalloc/kfree_nolock() test case
> in the slub_kunit test module with CONFIG_DEBUG_SPINLOCK=y on a UP
> kernel.
> 
> Patch 1 fixes alloc_frozen_pages_nolock() and
> patch 2 fixes kmalloc_nolock().

Thanks. Given the problem exposed is in a slab kunit test I think it's
better to handle this in the slab tree. The page_alloc change is small and
should not cause conflicts. So I've merged both in slab/for-next.

> Note: As pointed out by Vlastimil Babka [1], in theory a kprobe in a
> locked section could trigger the same issue. However, fixing that
> involves a non-trivial rework (e.g., inventing a new spinlock type) or
> introduces unnecessary overhead for all spinlocks on UP (e.g., let all
> spinlocks check locked status on UP).
> 
> Given that BPF tracing on UP is rare, and it's even more unlikely to
> trace a function called from the memory allocator within the locked
> section, this patch series addresses the issue only on NMI contexts
> (which is rare as well but now covered by the new test case).
> 
> [1] https://lore.kernel.org/linux-mm/af3a7fa9-b368-4ffd-964d-9e4fcba863a8@kernel.org
> 
> Cc: stable.vger.kernel.org
> ---
> Harry Yoo (Oracle) (2):
>       mm/page_alloc: return NULL early from alloc_frozen_pages_nolock() in NMI on UP
>       mm/slab: return NULL early from kmalloc_nolock() in NMI on UP
> 
>  mm/page_alloc.c | 5 +++++
>  mm/slub.c       | 4 ++++
>  2 files changed, 9 insertions(+)
> ---
> base-commit: ba24da38a519dfcff8cce3f3f2726d7b159a4d75
> change-id: 20260427-nolock-api-fix-bd056911e68e
> 
> Best regards,
> --  
> Cheers,
> Harry / Hyeonggon
> 



* Re: [PATCH mm-hotfixes v2 0/2] mm/page_alloc,slab: return NULL early from *_nolock() memory allocation APIs in NMI on UP
  2026-04-27  8:00 ` [PATCH mm-hotfixes v2 0/2] mm/page_alloc,slab: return NULL early from *_nolock() memory allocation APIs " Vlastimil Babka (SUSE)
@ 2026-04-27  8:13   ` Harry Yoo (Oracle)
  0 siblings, 0 replies; 5+ messages in thread
From: Harry Yoo (Oracle) @ 2026-04-27  8:13 UTC
  To: Vlastimil Babka (SUSE)
  Cc: Andrew Morton, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Johannes Weiner, Zi Yan, Shakeel Butt, Alexei Starovoitov, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin, linux-mm,
	linux-kernel, stable

On Mon, Apr 27, 2026 at 10:00:16AM +0200, Vlastimil Babka (SUSE) wrote:
> On 4/27/26 09:09, Harry Yoo (Oracle) wrote:
> > Due to my mistake, V1 was sent twice w/o proper cover letter and
> > Cc: stable. Please ignore V1. Apologies for the noise. 
> > 
> > Changes since V1:
> > - used b4 to send patch series (w/ a proper cover letter) instead of
> >   my broken git send-email script (Thanks Vlastimil)
> > - added Cc: stable to patches 1 and 2
> > 
> > On UP kernels (!CONFIG_SMP), spin_trylock() is a no-op that
> > unconditionally succeeds even when the lock is already held.
> > As a result, alloc_frozen_pages_nolock() and kmalloc_nolock() called
> > from an NMI context can successfully re-acquire the lock that the
> > page/slab allocators are already holding (no deadlock because it's
> > trylock,  but leads to e.g., allocating the same page/object twice and
> > causing use-after-free).
> > 
> > It was discovered while testing the new kmalloc/kfree_nolock() test case
> > in the slub_kunit test module with CONFIG_DEBUG_SPINLOCK=y on a UP
> > kernel.
> > 
> > Patch 1 fixes alloc_frozen_pages_nolock() and
> > patch 2 fixes kmalloc_nolock().
> 
> Thanks. Given the problem exposed is in a slab kunit test I think it's
> better to handle this in the slab tree.

Ack, I'm fine either way.

> The page_alloc change is small and
> should not cause conflicts.

Right.

> So I've merged both in slab/for-next.

Thanks! (I see it's been added to slab/for-next-fixes)

-- 
Cheers,
Harry / Hyeonggon

