The Linux Kernel Mailing List
* [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug
@ 2026-06-30 21:40 Gregory Price
  2026-07-01  6:38 ` Mike Rapoport
  2026-07-01  8:35 ` David Hildenbrand (Arm)
  0 siblings, 2 replies; 7+ messages in thread
From: Gregory Price @ 2026-06-30 21:40 UTC
  To: linux-mm
  Cc: linux-kernel, linux-cxl, kernel-team, david, osalvador, akpm,
	rppt, mgorman, hannes, vbabka

We miss a failed allocation check for pgdat->per_cpu_nodestats, which
results in a NULL deref when we offset into the per-cpu area.

Propagate -ENOMEM up the stack and leave per_cpu_nodestats pointing
at boot_nodestats so a later online can retry the allocation.

hotadd_init_pgdat() returns NULL on failure, which __try_online_node()
already maps to -ENOMEM.

Assisted-by: Sashiko:unknown-model
Fixes: 75ef71840539 ("mm, vmstat: add infrastructure for per-node vmstats")
Signed-off-by: Gregory Price <gourry@gourry.net>
---
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            |  3 ++-
 mm/mm_init.c                   | 14 +++++++++++---
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7c9d66729c60..f04b915678db 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -289,7 +289,7 @@ static inline void __remove_memory(u64 start, u64 size) {}
 /* Default online_type (MMOP_*) when new memory blocks are added. */
 extern enum mmop mhp_get_default_online_type(void);
 extern void mhp_set_default_online_type(enum mmop online_type);
-extern void __ref free_area_init_core_hotplug(struct pglist_data *pgdat);
+extern int __ref free_area_init_core_hotplug(struct pglist_data *pgdat);
 extern int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
 extern int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
 extern int add_memory_resource(int nid, struct resource *resource,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7ac19fab2263..8b137328dcf0 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1263,7 +1263,8 @@ static pg_data_t *hotadd_init_pgdat(int nid)
 	pgdat = NODE_DATA(nid);
 
 	/* init node's zones as empty zones, we don't have any present pages.*/
-	free_area_init_core_hotplug(pgdat);
+	if (free_area_init_core_hotplug(pgdat))
+		return NULL;
 
 	/*
 	 * The node we allocated has no zone fallback lists. For avoiding
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 306ea5c13f54..37fd64ce144d 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1536,7 +1536,7 @@ void __init set_pageblock_order(void)
  * NOTE: this function is only called during memory hotplug
  */
 #ifdef CONFIG_MEMORY_HOTPLUG
-void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
+int __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
 {
 	int nid = pgdat->node_id;
 	enum zone_type z;
@@ -1544,8 +1544,14 @@ void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
 
 	pgdat_init_internals(pgdat);
 
-	if (pgdat->per_cpu_nodestats == &boot_nodestats)
-		pgdat->per_cpu_nodestats = alloc_percpu(struct per_cpu_nodestat);
+	if (pgdat->per_cpu_nodestats == &boot_nodestats) {
+		struct per_cpu_nodestat __percpu *p;
+
+		p = alloc_percpu(struct per_cpu_nodestat);
+		if (!p)
+			return -ENOMEM;
+		pgdat->per_cpu_nodestats = p;
+	}
 
 	/*
 	 * Reset the nr_zones, order and highest_zoneidx before reuse.
@@ -1583,6 +1589,8 @@ void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
 		zone->present_pages = 0;
 		zone_init_internals(zone, z, nid, 0);
 	}
+
+	return 0;
 }
 #endif
 
-- 
2.53.0-Meta


* Re: [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug
  2026-06-30 21:40 [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug Gregory Price
@ 2026-07-01  6:38 ` Mike Rapoport
  2026-07-01 15:20   ` Gregory Price
  2026-07-01  8:35 ` David Hildenbrand (Arm)
  1 sibling, 1 reply; 7+ messages in thread
From: Mike Rapoport @ 2026-07-01  6:38 UTC
  To: Gregory Price
  Cc: linux-mm, linux-kernel, linux-cxl, kernel-team, david, osalvador,
	akpm, mgorman, hannes, vbabka

On Tue, Jun 30, 2026 at 05:40:39PM -0400, Gregory Price wrote:
> We miss a failed allocation check for pgdat->per_cpu_nodestats, which
> results in a NULL deref when we offset into the per-cpu area.
> 
> Propagate -ENOMEM up the stack and leave per_cpu_nodestats pointing
> at boot_nodestats so a later online can retry the allocation.
> 
> hotadd_init_pgdat() returns NULL on failure, which __try_online_node()
> already maps to -ENOMEM.
> 
> Assisted-by: Sashiko:unknown-model

I suppose it's rather 

Reported-by: Sashiko <sashiko-bot@kernel.org>

> Fixes: 75ef71840539 ("mm, vmstat: add infrastructure for per-node vmstats")
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
>  include/linux/memory_hotplug.h |  2 +-
>  mm/memory_hotplug.c            |  3 ++-
>  mm/mm_init.c                   | 14 +++++++++++---
>  3 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 7c9d66729c60..f04b915678db 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -289,7 +289,7 @@ static inline void __remove_memory(u64 start, u64 size) {}
>  /* Default online_type (MMOP_*) when new memory blocks are added. */
>  extern enum mmop mhp_get_default_online_type(void);
>  extern void mhp_set_default_online_type(enum mmop online_type);
> -extern void __ref free_area_init_core_hotplug(struct pglist_data *pgdat);
> +extern int __ref free_area_init_core_hotplug(struct pglist_data *pgdat);

Nit: we are trying to get rid of 'extern's in the headers, even though it
makes the declarations inconsistent.
Can you please drop the extern since you are changing this anyway?

>  extern int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
>  extern int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
>  extern int add_memory_resource(int nid, struct resource *resource,
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 7ac19fab2263..8b137328dcf0 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1263,7 +1263,8 @@ static pg_data_t *hotadd_init_pgdat(int nid)
>  	pgdat = NODE_DATA(nid);
>  
>  	/* init node's zones as empty zones, we don't have any present pages.*/
> -	free_area_init_core_hotplug(pgdat);
> +	if (free_area_init_core_hotplug(pgdat))
> +		return NULL;
>  
>  	/*
>  	 * The node we allocated has no zone fallback lists. For avoiding
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 306ea5c13f54..37fd64ce144d 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1536,7 +1536,7 @@ void __init set_pageblock_order(void)
>   * NOTE: this function is only called during memory hotplug
>   */
>  #ifdef CONFIG_MEMORY_HOTPLUG
> -void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
> +int __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
>  {
>  	int nid = pgdat->node_id;
>  	enum zone_type z;
> @@ -1544,8 +1544,14 @@ void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
>  
>  	pgdat_init_internals(pgdat);
>  
> -	if (pgdat->per_cpu_nodestats == &boot_nodestats)
> -		pgdat->per_cpu_nodestats = alloc_percpu(struct per_cpu_nodestat);
> +	if (pgdat->per_cpu_nodestats == &boot_nodestats) {
> +		struct per_cpu_nodestat __percpu *p;
> +
> +		p = alloc_percpu(struct per_cpu_nodestat);
> +		if (!p)
> +			return -ENOMEM;
> +		pgdat->per_cpu_nodestats = p;
> +	}
>  
>  	/*
>  	 * Reset the nr_zones, order and highest_zoneidx before reuse.
> @@ -1583,6 +1589,8 @@ void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
>  		zone->present_pages = 0;
>  		zone_init_internals(zone, z, nid, 0);
>  	}
> +
> +	return 0;
>  }
>  #endif
>  
> -- 
> 2.53.0-Meta
> 

-- 
Sincerely yours,
Mike.

* Re: [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug
  2026-06-30 21:40 [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug Gregory Price
  2026-07-01  6:38 ` Mike Rapoport
@ 2026-07-01  8:35 ` David Hildenbrand (Arm)
  2026-07-01 15:05   ` Gregory Price
  1 sibling, 1 reply; 7+ messages in thread
From: David Hildenbrand (Arm) @ 2026-07-01  8:35 UTC
  To: Gregory Price, linux-mm
  Cc: linux-kernel, linux-cxl, kernel-team, osalvador, akpm, rppt,
	mgorman, hannes, vbabka


>  	 * The node we allocated has no zone fallback lists. For avoiding
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 306ea5c13f54..37fd64ce144d 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1536,7 +1536,7 @@ void __init set_pageblock_order(void)
>   * NOTE: this function is only called during memory hotplug
>   */
>  #ifdef CONFIG_MEMORY_HOTPLUG
> -void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
> +int __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
>  {
>  	int nid = pgdat->node_id;
>  	enum zone_type z;
> @@ -1544,8 +1544,14 @@ void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
>  
>  	pgdat_init_internals(pgdat);
>  
> -	if (pgdat->per_cpu_nodestats == &boot_nodestats)
> -		pgdat->per_cpu_nodestats = alloc_percpu(struct per_cpu_nodestat);
> +	if (pgdat->per_cpu_nodestats == &boot_nodestats) {
> +		struct per_cpu_nodestat __percpu *p;
> +
> +		p = alloc_percpu(struct per_cpu_nodestat);
> +		if (!p)
> +			return -ENOMEM;
> +		pgdat->per_cpu_nodestats = p;

Is there a need for the temporary variable?

Also how to handle cleanup on error? Or why can we skip cleanup? (what happens
if we get another call to __try_online_node()  later?)

-- 
Cheers,

David

* Re: [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug
  2026-07-01  8:35 ` David Hildenbrand (Arm)
@ 2026-07-01 15:05   ` Gregory Price
  2026-07-01 16:54     ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 7+ messages in thread
From: Gregory Price @ 2026-07-01 15:05 UTC
  To: David Hildenbrand (Arm)
  Cc: linux-mm, linux-kernel, linux-cxl, kernel-team, osalvador, akpm,
	rppt, mgorman, hannes, vbabka

On Wed, Jul 01, 2026 at 10:35:41AM +0200, David Hildenbrand (Arm) wrote:
> 
> >  	 * The node we allocated has no zone fallback lists. For avoiding
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index 306ea5c13f54..37fd64ce144d 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -1536,7 +1536,7 @@ void __init set_pageblock_order(void)
> >   * NOTE: this function is only called during memory hotplug
> >   */
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > -void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
> > +int __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
> >  {
> >  	int nid = pgdat->node_id;
> >  	enum zone_type z;
> > @@ -1544,8 +1544,14 @@ void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
> >  
> >  	pgdat_init_internals(pgdat);
> >  
> > -	if (pgdat->per_cpu_nodestats == &boot_nodestats)
> > -		pgdat->per_cpu_nodestats = alloc_percpu(struct per_cpu_nodestat);
> > +	if (pgdat->per_cpu_nodestats == &boot_nodestats) {
> > +		struct per_cpu_nodestat __percpu *p;
> > +
> > +		p = alloc_percpu(struct per_cpu_nodestat);
> > +		if (!p)
> > +			return -ENOMEM;
> > +		pgdat->per_cpu_nodestats = p;
> 
> Is there a need for the temporary variable?
> 

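At the start:
    pgdat->per_cpu_nodestats == &boot_nodestats

Assigning the alloc_percpu() result directly would overwrite that with NULL
on failure, so a tmp is needed to leave it pointing at boot_nodestats.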
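
For illustration only, here is a minimal userspace sketch of why the
temporary matters. The names struct node, boot_stats, alloc_stats() and
node_init_stats() are hypothetical stand-ins, not kernel APIs; alloc_stats()
merely imitates alloc_percpu() returning NULL on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct nodestat {
	long count;
};

struct node {
	struct nodestat *stats;
};

static struct nodestat boot_stats;	/* shared static fallback */
static bool fail_next_alloc;		/* hook: force the next allocation to fail */

/* Imitates alloc_percpu(): returns NULL on failure. */
static struct nodestat *alloc_stats(void)
{
	if (fail_next_alloc) {
		fail_next_alloc = false;
		return NULL;
	}
	return calloc(1, sizeof(struct nodestat));
}

/* Same shape as the patched hunk: publish the pointer only once it is valid. */
static int node_init_stats(struct node *n)
{
	if (n->stats == &boot_stats) {
		struct nodestat *p = alloc_stats();

		if (!p)
			return -ENOMEM;	/* n->stats still points at boot_stats */
		n->stats = p;
	}
	return 0;
}
```

In this sketch, a failed call leaves n->stats equal to &boot_stats, so the
next call takes the same branch and retries the allocation.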

> Also how to handle cleanup on error? Or why can we skip cleanup? (what happens
> if we get another call to __try_online_node()  later?)
> 

-ENOMEM ->
    hotadd_init_pgdat -> NULL
        __try_online_node -> -ENOMEM
            pr_err("Cannot online node %d due to NULL pgdat\n", nid);
            try_online_node -> -ENOMEM
                __add_memory_resource  ->  error_memblock_remove:

Basically we're left exactly where we were before we made the attempt.

Another call should work just fine after a bunch of dmesg spew.

If this happens during boot via add_memory(), then something else is
horribly wrong, and we'll at least get some debug info instead of a
NULL deref.
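
The propagation above can be modelled in plain C. This is a sketch under
stated assumptions: every *_model name is hypothetical and only mirrors the
shape of the real call chain (an int error from the init step, NULL from the
pgdat helper, -ENOMEM from the online step). It is not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a node descriptor. */
struct pgdat_model {
	bool initialized;
};

static struct pgdat_model node_model;
static int failures_left;	/* how many upcoming init attempts should fail */

/* Models the init step that can now report -ENOMEM. */
static int init_core_model(struct pgdat_model *p)
{
	if (failures_left > 0) {
		failures_left--;
		return -ENOMEM;
	}
	p->initialized = true;
	return 0;
}

/* Models the helper that turns any init error into a NULL return. */
static struct pgdat_model *hotadd_init_model(void)
{
	if (init_core_model(&node_model))
		return NULL;
	return &node_model;
}

/* Models the caller that maps a NULL descriptor back to -ENOMEM. */
static int try_online_model(void)
{
	if (!hotadd_init_model())
		return -ENOMEM;
	return 0;
}
```

With failures_left set to 1, the first try_online_model() call returns
-ENOMEM and leaves node_model untouched, and a second call succeeds.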

~Gregory

* Re: [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug
  2026-07-01  6:38 ` Mike Rapoport
@ 2026-07-01 15:20   ` Gregory Price
  2026-07-01 20:14     ` Mike Rapoport
  0 siblings, 1 reply; 7+ messages in thread
From: Gregory Price @ 2026-07-01 15:20 UTC
  To: Mike Rapoport
  Cc: linux-mm, linux-kernel, linux-cxl, kernel-team, david, osalvador,
	akpm, mgorman, hannes, vbabka

On Wed, Jul 01, 2026 at 09:38:57AM +0300, Mike Rapoport wrote:
> On Tue, Jun 30, 2026 at 05:40:39PM -0400, Gregory Price wrote:
> > We miss a failed allocation check for pgdat->per_cpu_nodestats, which
> > results in a NULL deref when we offset into the per-cpu area.
> > 
> > Propagate -ENOMEM up the stack and leave per_cpu_nodestats pointing
> > at boot_nodestats so a later online can retry the allocation.
> > 
> > hotadd_init_pgdat() returns NULL on failure, which __try_online_node()
> > already maps to -ENOMEM.
> > 
> > Assisted-by: Sashiko:unknown-model
> 
> I suppose it's rather 
> 
> Reported-by: Sashiko <sashiko-bot@kernel.org>

I honestly wasn't sure here; Assisted-by is the tag for LLMs, lol.
I can change it.

> 
> > Fixes: 75ef71840539 ("mm, vmstat: add infrastructure for per-node vmstats")
> > Signed-off-by: Gregory Price <gourry@gourry.net>
> > ---
> >  include/linux/memory_hotplug.h |  2 +-
> >  mm/memory_hotplug.c            |  3 ++-
> >  mm/mm_init.c                   | 14 +++++++++++---
> >  3 files changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> > index 7c9d66729c60..f04b915678db 100644
> > --- a/include/linux/memory_hotplug.h
> > +++ b/include/linux/memory_hotplug.h
> > @@ -289,7 +289,7 @@ static inline void __remove_memory(u64 start, u64 size) {}
> >  /* Default online_type (MMOP_*) when new memory blocks are added. */
> >  extern enum mmop mhp_get_default_online_type(void);
> >  extern void mhp_set_default_online_type(enum mmop online_type);
> > -extern void __ref free_area_init_core_hotplug(struct pglist_data *pgdat);
> > +extern int __ref free_area_init_core_hotplug(struct pglist_data *pgdat);
> 
> Nit: we are trying to get rid of 'extern's in the headers, even though it
> makes the declarations inconsistent.
> Can you please drop the extern since you are changing this anyway?

yup no problem


* Re: [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug
  2026-07-01 15:05   ` Gregory Price
@ 2026-07-01 16:54     ` David Hildenbrand (Arm)
  0 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand (Arm) @ 2026-07-01 16:54 UTC
  To: Gregory Price
  Cc: linux-mm, linux-kernel, linux-cxl, kernel-team, osalvador, akpm,
	rppt, mgorman, hannes, vbabka

On 7/1/26 17:05, Gregory Price wrote:
> On Wed, Jul 01, 2026 at 10:35:41AM +0200, David Hildenbrand (Arm) wrote:
>>
>>>  	 * The node we allocated has no zone fallback lists. For avoiding
>>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>>> index 306ea5c13f54..37fd64ce144d 100644
>>> --- a/mm/mm_init.c
>>> +++ b/mm/mm_init.c
>>> @@ -1536,7 +1536,7 @@ void __init set_pageblock_order(void)
>>>   * NOTE: this function is only called during memory hotplug
>>>   */
>>>  #ifdef CONFIG_MEMORY_HOTPLUG
>>> -void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
>>> +int __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
>>>  {
>>>  	int nid = pgdat->node_id;
>>>  	enum zone_type z;
>>> @@ -1544,8 +1544,14 @@ void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
>>>  
>>>  	pgdat_init_internals(pgdat);
>>>  
>>> -	if (pgdat->per_cpu_nodestats == &boot_nodestats)
>>> -		pgdat->per_cpu_nodestats = alloc_percpu(struct per_cpu_nodestat);
>>> +	if (pgdat->per_cpu_nodestats == &boot_nodestats) {
>>> +		struct per_cpu_nodestat __percpu *p;
>>> +
>>> +		p = alloc_percpu(struct per_cpu_nodestat);
>>> +		if (!p)
>>> +			return -ENOMEM;
>>> +		pgdat->per_cpu_nodestats = p;
>>
>> Is there a need for the temporary variable?
>>
> 
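> At the start:
>     pgdat->per_cpu_nodestats == &boot_nodestats
> 
> Assigning the alloc_percpu() result directly would overwrite that with NULL
> on failure, so a tmp is needed to leave it pointing at boot_nodestats.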

Reviewing too many patches today ...  makes sense.

> 
>> Also how to handle cleanup on error? Or why can we skip cleanup? (what happens
>> if we get another call to __try_online_node()  later?)
>>
> 
> -ENOMEM ->
>     hotadd_init_pgdat -> NULL
>         __try_online_node -> -ENOMEM
> 	    pr_err("Cannot online node %d due to NULL pgdat\n", nid);
> 	    try_online_node -> -ENOMEM
> 	        __add_memory_resource  ->  error_memblock_remove:
> 
> Basically we're left exactly where we were before we made the attempt.
> 
> Another call should work just fine after a bunch of dmesg spew.

Okay, so the additional pgdat_init_internals() call should be ok.

Makes sense to me, thanks.

-- 
Cheers,

David

* Re: [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug
  2026-07-01 15:20   ` Gregory Price
@ 2026-07-01 20:14     ` Mike Rapoport
  0 siblings, 0 replies; 7+ messages in thread
From: Mike Rapoport @ 2026-07-01 20:14 UTC
  To: Gregory Price
  Cc: linux-mm, linux-kernel, linux-cxl, kernel-team, david, osalvador,
	akpm, mgorman, hannes, vbabka

On Wed, Jul 01, 2026 at 11:20:50AM -0400, Gregory Price wrote:
> On Wed, Jul 01, 2026 at 09:38:57AM +0300, Mike Rapoport wrote:
> > On Tue, Jun 30, 2026 at 05:40:39PM -0400, Gregory Price wrote:
> > > We miss a failed allocation check for pgdat->per_cpu_nodestats, which
> > > results in a NULL deref when we offset into the per-cpu area.
> > > 
> > > Propagate -ENOMEM up the stack and leave per_cpu_nodestats pointing
> > > at boot_nodestats so a later online can retry the allocation.
> > > 
> > > hotadd_init_pgdat() returns NULL on failure, which __try_online_node()
> > > already maps to -ENOMEM.
> > > 
> > > Assisted-by: Sashiko:unknown-model
> > 
> > I suppose it's rather 
> > 
> > Reported-by: Sashiko <sashiko-bot@kernel.org>
> 
> I honestly wasn't sure here; Assisted-by is the tag for LLMs, lol.
> I can change it.

My understanding is that Assisted-by is for cases when you drive an LLM to
create the patch.

Sashiko reports issues, so I believe it should be credited with Reported-by,
just like the kernel build robot.

-- 
Sincerely yours,
Mike.

Thread overview: 7+ messages
2026-06-30 21:40 [PATCH] mm/mm_init: handle alloc_percpu failure in free_area_init_core_hotplug Gregory Price
2026-07-01  6:38 ` Mike Rapoport
2026-07-01 15:20   ` Gregory Price
2026-07-01 20:14     ` Mike Rapoport
2026-07-01  8:35 ` David Hildenbrand (Arm)
2026-07-01 15:05   ` Gregory Price
2026-07-01 16:54     ` David Hildenbrand (Arm)
