* [PATCH v3.1] sched/isolation: Defer freeing of cpumask memblock memory to initcall
From: Waiman Long @ 2026-06-04 18:24 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Frederic Weisbecker
Cc: linux-kernel, Waiman Long
When testing a linux-next kernel with commit 59bd1d914bb5 ("memblock:
warn when freeing reserved memory before memory map is initialized"),
the following warning was hit when there was a "nohz_full" kernel boot
parameter.
Cannot free reserved memory because of deferred initialization of the memory map
WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
:
Call Trace:
<TASK>
memblock_phys_free+0xcb/0x100
housekeeping_init+0x14c/0x170
start_kernel+0x207/0x450
x86_64_start_reservations+0x24/0x30
x86_64_start_kernel+0xda/0xe0
common_startup_64+0x13e/0x141
</TASK>
IOW, we shouldn't free memblock-allocated memory so early
in the boot process, when the memory map isn't yet fully initialized in
deferred_init_memmap().
Fix it by saving the housekeeping cpumask memblock memory to
be freed on a free list in housekeeping_init(), and add a new
housekeeping_late_init() helper to defer the actual freeing of memblock
memory until initcalls are being processed. The non-atomic versions
of the llist APIs are used as there is no contention.
This commit also depends on the presence of commit 7c2eee9c1367
("memblock: don't touch memblock arrays when memblock_free() is called
late") to prevent a KASAN UAF bug report [1].
[1] https://lore.kernel.org/lkml/20260505051821.1107133-1-longman@redhat.com/
Fixes: 27c3a5967f05 ("sched/isolation: Convert housekeeping cpumasks to rcu pointers")
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/sched/isolation.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
[v3.1] Add __initdata to memblock_freelist
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index ef152d401fe2..156025ef81b7 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -8,6 +8,7 @@
*
*/
#include <linux/sched/isolation.h>
+#include <linux/llist.h>
#include <linux/pci.h>
#include "sched.h"
@@ -27,6 +28,7 @@ struct housekeeping {
};
static struct housekeeping housekeeping;
+static __initdata LLIST_HEAD(memblock_freelist);
bool housekeeping_enabled(enum hk_type type)
{
@@ -189,10 +191,22 @@ void __init housekeeping_init(void)
WARN_ON_ONCE(cpumask_empty(omask));
cpumask_copy(nmask, omask);
RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
- memblock_free(omask, cpumask_size());
+ __llist_add((struct llist_node *)omask, &memblock_freelist);
}
}
+static int __init housekeeping_late_init(void)
+{
+ struct llist_node *llnode, *pos, *t;
+
+ /* Free allocated memblock memory, if any */
+ llnode = __llist_del_all(&memblock_freelist);
+ llist_for_each_safe(pos, t, llnode)
+ memblock_free(pos, cpumask_size());
+ return 0;
+}
+pure_initcall(housekeeping_late_init);
+
static void __init housekeeping_setup_type(enum hk_type type,
cpumask_var_t housekeeping_staging)
{
--
2.54.0
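
The deferral pattern described in the commit log above can be illustrated with a small userspace sketch. It is not the kernel code: `struct node`, `defer_free()`, `flush_deferred()` and `demo()` are hypothetical stand-ins for `struct llist_node`, `__llist_add()`, the loop in `housekeeping_late_init()` and a caller, and `malloc()`/`free()` stand in for the memblock allocator. The point it tries to show is that each buffer is reused as its own list node, so queueing a buffer needs no extra allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct llist_node. */
struct node {
	struct node *next;
};

/* Hypothetical stand-in for memblock_freelist. */
struct node *deferred_head;

/* Queue a buffer for later release; the buffer itself becomes the node. */
void defer_free(void *buf)
{
	struct node *n = buf;

	assert(buf != NULL);
	n->next = deferred_head;	/* overwrites the buffer's first word */
	deferred_head = n;
}

/* Detach the whole list, free every queued buffer, return the count. */
int flush_deferred(void)
{
	struct node *pos = deferred_head;
	int freed = 0;

	deferred_head = NULL;
	while (pos) {
		struct node *next = pos->next;	/* read before freeing */

		free(pos);
		freed++;
		pos = next;
	}
	return freed;
}

/* Allocate three buffers early, release them all at a later point. */
int demo(void)
{
	for (int i = 0; i < 3; i++) {
		void *buf = malloc(64);	/* 64 bytes >= sizeof(struct node) */

		if (!buf)
			return -1;
		defer_free(buf);
	}
	return flush_deferred();
}
```

Reading `next` before calling `free()` plays the role that `llist_for_each_safe()` plays in the patch: the node lives inside the memory being released, so it must not be dereferenced afterwards.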
* Re: [PATCH v3.1] sched/isolation: Defer freeing of cpumask memblock memory to initcall
From: Waiman Long @ 2026-06-30 21:36 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Frederic Weisbecker
Cc: linux-kernel
On 6/4/26 2:24 PM, Waiman Long wrote:
> When testing a linux-next kernel with commit 59bd1d914bb5 ("memblock:
> warn when freeing reserved memory before memory map is initialized"),
> the following warning was hit when there was a "nohz_full" kernel boot
> parameter.
>
> Cannot free reserved memory because of deferred initialization of the memory map
> WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
> :
> Call Trace:
> <TASK>
> memblock_phys_free+0xcb/0x100
> housekeeping_init+0x14c/0x170
> start_kernel+0x207/0x450
> x86_64_start_reservations+0x24/0x30
> x86_64_start_kernel+0xda/0xe0
> common_startup_64+0x13e/0x141
> </TASK>
>
> IOW, we shouldn't free memblock allocated memory so early
> in the boot process when memory map isn't fully initialized in
> deferred_init_memmap().
>
> Fix it by saving the housekeeping cpumask memblock memory to
> be freed into a free list in housekeeping_init() and add a new
> housekeeping_late_init() helper to defer the actual freeing of memblock
> memory to when initcall's are being processed. The non-atomic version
> of the llist APIs are used as there is no contention.
>
> This commit also depends on the presence of commit 7c2eee9c1367
> ("memblock: don't touch memblock arrays when memblock_free() is called
> late") to prevent a KASAN UAF bug report [1].
>
> [1] https://lore.kernel.org/lkml/20260505051821.1107133-1-longman@redhat.com/
>
> Fixes: 27c3a5967f05 ("sched/isolation: Convert housekeeping cpumasks to rcu pointers")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/sched/isolation.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> [v3.1] Add __initdata to memblock_freelist
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index ef152d401fe2..156025ef81b7 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -8,6 +8,7 @@
> *
> */
> #include <linux/sched/isolation.h>
> +#include <linux/llist.h>
> #include <linux/pci.h>
> #include "sched.h"
>
> @@ -27,6 +28,7 @@ struct housekeeping {
> };
>
> static struct housekeeping housekeeping;
> +static __initdata LLIST_HEAD(memblock_freelist);
>
> bool housekeeping_enabled(enum hk_type type)
> {
> @@ -189,10 +191,22 @@ void __init housekeeping_init(void)
> WARN_ON_ONCE(cpumask_empty(omask));
> cpumask_copy(nmask, omask);
> RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
> - memblock_free(omask, cpumask_size());
> + __llist_add((struct llist_node *)omask, &memblock_freelist);
> }
> }
>
> +static int __init housekeeping_late_init(void)
> +{
> + struct llist_node *llnode, *pos, *t;
> +
> + /* Free allocated memblock memory, if any */
> + llnode = __llist_del_all(&memblock_freelist);
> + llist_for_each_safe(pos, t, llnode)
> + memblock_free(pos, cpumask_size());
> + return 0;
> +}
> +pure_initcall(housekeeping_late_init);
> +
> static void __init housekeeping_setup_type(enum hk_type type,
> cpumask_var_t housekeeping_staging)
> {
Ping! Does anyone have any comments on this patch?
Thanks,
Longman
* Re: [PATCH v3.1] sched/isolation: Defer freeing of cpumask memblock memory to initcall
From: Frederic Weisbecker @ 2026-07-01 13:28 UTC (permalink / raw)
To: Waiman Long
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, linux-kernel
On Thu, Jun 04, 2026 at 02:24:40PM -0400, Waiman Long wrote:
> When testing a linux-next kernel with commit 59bd1d914bb5 ("memblock:
> warn when freeing reserved memory before memory map is initialized"),
> the following warning was hit when there was a "nohz_full" kernel boot
> parameter.
>
> Cannot free reserved memory because of deferred initialization of the memory map
> WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
> :
> Call Trace:
> <TASK>
> memblock_phys_free+0xcb/0x100
> housekeeping_init+0x14c/0x170
> start_kernel+0x207/0x450
> x86_64_start_reservations+0x24/0x30
> x86_64_start_kernel+0xda/0xe0
> common_startup_64+0x13e/0x141
> </TASK>
>
> IOW, we shouldn't free memblock allocated memory so early
> in the boot process when memory map isn't fully initialized in
> deferred_init_memmap().
>
> Fix it by saving the housekeeping cpumask memblock memory to
> be freed into a free list in housekeeping_init() and add a new
> housekeeping_late_init() helper to defer the actual freeing of memblock
> memory to when initcall's are being processed. The non-atomic version
> of the llist APIs are used as there is no contention.
>
> This commit also depends on the presence of commit 7c2eee9c1367
> ("memblock: don't touch memblock arrays when memblock_free() is called
> late") to prevent a KASAN UAF bug report [1].
>
> [1] https://lore.kernel.org/lkml/20260505051821.1107133-1-longman@redhat.com/
>
> Fixes: 27c3a5967f05 ("sched/isolation: Convert housekeeping cpumasks to rcu pointers")
> Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
--
Frederic Weisbecker
SUSE Labs
* Re: [PATCH v3.1] sched/isolation: Defer freeing of cpumask memblock memory to initcall
From: Phil Auld @ 2026-07-01 14:13 UTC (permalink / raw)
To: Waiman Long
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Frederic Weisbecker,
linux-kernel
Hi Waiman,
On Thu, Jun 04, 2026 at 02:24:40PM -0400 Waiman Long wrote:
> When testing a linux-next kernel with commit 59bd1d914bb5 ("memblock:
> warn when freeing reserved memory before memory map is initialized"),
> the following warning was hit when there was a "nohz_full" kernel boot
> parameter.
>
> Cannot free reserved memory because of deferred initialization of the memory map
> WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
> :
> Call Trace:
> <TASK>
> memblock_phys_free+0xcb/0x100
> housekeeping_init+0x14c/0x170
> start_kernel+0x207/0x450
> x86_64_start_reservations+0x24/0x30
> x86_64_start_kernel+0xda/0xe0
> common_startup_64+0x13e/0x141
> </TASK>
>
> IOW, we shouldn't free memblock allocated memory so early
> in the boot process when memory map isn't fully initialized in
> deferred_init_memmap().
>
> Fix it by saving the housekeeping cpumask memblock memory to
> be freed into a free list in housekeeping_init() and add a new
> housekeeping_late_init() helper to defer the actual freeing of memblock
> memory to when initcall's are being processed. The non-atomic version
> of the llist APIs are used as there is no contention.
>
> This commit also depends on the presence of commit 7c2eee9c1367
> ("memblock: don't touch memblock arrays when memblock_free() is called
> late") to prevent a KASAN UAF bug report [1].
>
> [1] https://lore.kernel.org/lkml/20260505051821.1107133-1-longman@redhat.com/
>
> Fixes: 27c3a5967f05 ("sched/isolation: Convert housekeeping cpumasks to rcu pointers")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/sched/isolation.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> [v3.1] Add __initdata to memblock_freelist
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index ef152d401fe2..156025ef81b7 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -8,6 +8,7 @@
> *
> */
> #include <linux/sched/isolation.h>
> +#include <linux/llist.h>
> #include <linux/pci.h>
> #include "sched.h"
>
> @@ -27,6 +28,7 @@ struct housekeeping {
> };
>
> static struct housekeeping housekeeping;
> +static __initdata LLIST_HEAD(memblock_freelist);
>
> bool housekeeping_enabled(enum hk_type type)
> {
> @@ -189,10 +191,22 @@ void __init housekeeping_init(void)
> WARN_ON_ONCE(cpumask_empty(omask));
> cpumask_copy(nmask, omask);
> RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
> - memblock_free(omask, cpumask_size());
> + __llist_add((struct llist_node *)omask, &memblock_freelist);
This cast is somewhat concerning. I think I see why it's needed. Wrapping
it in a proper struct would require more allocating and freeing and
make the problem worse. It should work though.
Reviewed-by: Phil Auld <pauld@dhat.com>
Cheers,
Phil
> }
> }
>
> +static int __init housekeeping_late_init(void)
> +{
> + struct llist_node *llnode, *pos, *t;
> +
> + /* Free allocated memblock memory, if any */
> + llnode = __llist_del_all(&memblock_freelist);
> + llist_for_each_safe(pos, t, llnode)
> + memblock_free(pos, cpumask_size());
> + return 0;
> +}
> +pure_initcall(housekeeping_late_init);
> +
> static void __init housekeeping_setup_type(enum hk_type type,
> cpumask_var_t housekeeping_staging)
> {
> --
> 2.54.0
>
>
--
* Re: [PATCH v3.1] sched/isolation: Defer freeing of cpumask memblock memory to initcall
From: Phil Auld @ 2026-07-01 14:25 UTC (permalink / raw)
To: Waiman Long
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Frederic Weisbecker,
linux-kernel
On Wed, Jul 01, 2026 at 10:13:57AM -0400 Phil Auld wrote:
> Hi Waiman,
>
> On Thu, Jun 04, 2026 at 02:24:40PM -0400 Waiman Long wrote:
> > When testing a linux-next kernel with commit 59bd1d914bb5 ("memblock:
> > warn when freeing reserved memory before memory map is initialized"),
> > the following warning was hit when there was a "nohz_full" kernel boot
> > parameter.
> >
> > Cannot free reserved memory because of deferred initialization of the memory map
> > WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
> > :
> > Call Trace:
> > <TASK>
> > memblock_phys_free+0xcb/0x100
> > housekeeping_init+0x14c/0x170
> > start_kernel+0x207/0x450
> > x86_64_start_reservations+0x24/0x30
> > x86_64_start_kernel+0xda/0xe0
> > common_startup_64+0x13e/0x141
> > </TASK>
> >
> > IOW, we shouldn't free memblock allocated memory so early
> > in the boot process when memory map isn't fully initialized in
> > deferred_init_memmap().
> >
> > Fix it by saving the housekeeping cpumask memblock memory to
> > be freed into a free list in housekeeping_init() and add a new
> > housekeeping_late_init() helper to defer the actual freeing of memblock
> > memory to when initcall's are being processed. The non-atomic version
> > of the llist APIs are used as there is no contention.
> >
> > This commit also depends on the presence of commit 7c2eee9c1367
> > ("memblock: don't touch memblock arrays when memblock_free() is called
> > late") to prevent a KASAN UAF bug report [1].
> >
> > [1] https://lore.kernel.org/lkml/20260505051821.1107133-1-longman@redhat.com/
> >
> > Fixes: 27c3a5967f05 ("sched/isolation: Convert housekeeping cpumasks to rcu pointers")
> > Signed-off-by: Waiman Long <longman@redhat.com>
> > ---
> > kernel/sched/isolation.c | 16 +++++++++++++++-
> > 1 file changed, 15 insertions(+), 1 deletion(-)
> >
> > [v3.1] Add __initdata to memblock_freelist
> >
> > diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> > index ef152d401fe2..156025ef81b7 100644
> > --- a/kernel/sched/isolation.c
> > +++ b/kernel/sched/isolation.c
> > @@ -8,6 +8,7 @@
> > *
> > */
> > #include <linux/sched/isolation.h>
> > +#include <linux/llist.h>
> > #include <linux/pci.h>
> > #include "sched.h"
> >
> > @@ -27,6 +28,7 @@ struct housekeeping {
> > };
> >
> > static struct housekeeping housekeeping;
> > +static __initdata LLIST_HEAD(memblock_freelist);
> >
> > bool housekeeping_enabled(enum hk_type type)
> > {
> > @@ -189,10 +191,22 @@ void __init housekeeping_init(void)
> > WARN_ON_ONCE(cpumask_empty(omask));
> > cpumask_copy(nmask, omask);
> > RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
> > - memblock_free(omask, cpumask_size());
> > + __llist_add((struct llist_node *)omask, &memblock_freelist);
>
> This cast is somewhat concerning. I think I see why it's needed. Wrapping
> it in a proper struct would require more allocating and freeing and
> make the problem worse. It should work though.
>
>
Fwiw, opencode/sonnet suggested a comment like this:
/*
* We can't allocate wrapper structs from memblock as they'd need
* deferred freeing too. Instead, reuse the cpumask memory itself
* as llist nodes. This is safe because:
* - cpumask_size() >= sizeof(struct llist_node)
* - Memory is properly aligned (SMP_CACHE_BYTES)
* - The cpumask is never accessed after being added to the list
*/
... which may be overkill :)
Cheers,
Phil
> Reviewed-by: Phil Auld <pauld@dhat.com>
>
>
>
>
> Cheers,
> Phil
>
>
> > }
> > }
> >
> > +static int __init housekeeping_late_init(void)
> > +{
> > + struct llist_node *llnode, *pos, *t;
> > +
> > + /* Free allocated memblock memory, if any */
> > + llnode = __llist_del_all(&memblock_freelist);
> > + llist_for_each_safe(pos, t, llnode)
> > + memblock_free(pos, cpumask_size());
> > + return 0;
> > +}
> > +pure_initcall(housekeeping_late_init);
> > +
> > static void __init housekeeping_setup_type(enum hk_type type,
> > cpumask_var_t housekeeping_staging)
> > {
> > --
> > 2.54.0
> >
> >
>
> --
>
>
--
* Re: [PATCH v3.1] sched/isolation: Defer freeing of cpumask memblock memory to initcall
From: Frederic Weisbecker @ 2026-07-01 14:56 UTC (permalink / raw)
To: Phil Auld
Cc: Waiman Long, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
Mel Gorman, Valentin Schneider, K Prateek Nayak, linux-kernel
On Wed, Jul 01, 2026 at 10:25:59AM -0400, Phil Auld wrote:
> On Wed, Jul 01, 2026 at 10:13:57AM -0400 Phil Auld wrote:
> > Hi Waiman,
> >
> > On Thu, Jun 04, 2026 at 02:24:40PM -0400 Waiman Long wrote:
> > > When testing a linux-next kernel with commit 59bd1d914bb5 ("memblock:
> > > warn when freeing reserved memory before memory map is initialized"),
> > > the following warning was hit when there was a "nohz_full" kernel boot
> > > parameter.
> > >
> > > Cannot free reserved memory because of deferred initialization of the memory map
> > > WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
> > > :
> > > Call Trace:
> > > <TASK>
> > > memblock_phys_free+0xcb/0x100
> > > housekeeping_init+0x14c/0x170
> > > start_kernel+0x207/0x450
> > > x86_64_start_reservations+0x24/0x30
> > > x86_64_start_kernel+0xda/0xe0
> > > common_startup_64+0x13e/0x141
> > > </TASK>
> > >
> > > IOW, we shouldn't free memblock allocated memory so early
> > > in the boot process when memory map isn't fully initialized in
> > > deferred_init_memmap().
> > >
> > > Fix it by saving the housekeeping cpumask memblock memory to
> > > be freed into a free list in housekeeping_init() and add a new
> > > housekeeping_late_init() helper to defer the actual freeing of memblock
> > > memory to when initcall's are being processed. The non-atomic version
> > > of the llist APIs are used as there is no contention.
> > >
> > > This commit also depends on the presence of commit 7c2eee9c1367
> > > ("memblock: don't touch memblock arrays when memblock_free() is called
> > > late") to prevent a KASAN UAF bug report [1].
> > >
> > > [1] https://lore.kernel.org/lkml/20260505051821.1107133-1-longman@redhat.com/
> > >
> > > Fixes: 27c3a5967f05 ("sched/isolation: Convert housekeeping cpumasks to rcu pointers")
> > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > ---
> > > kernel/sched/isolation.c | 16 +++++++++++++++-
> > > 1 file changed, 15 insertions(+), 1 deletion(-)
> > >
> > > [v3.1] Add __initdata to memblock_freelist
> > >
> > > diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> > > index ef152d401fe2..156025ef81b7 100644
> > > --- a/kernel/sched/isolation.c
> > > +++ b/kernel/sched/isolation.c
> > > @@ -8,6 +8,7 @@
> > > *
> > > */
> > > #include <linux/sched/isolation.h>
> > > +#include <linux/llist.h>
> > > #include <linux/pci.h>
> > > #include "sched.h"
> > >
> > > @@ -27,6 +28,7 @@ struct housekeeping {
> > > };
> > >
> > > static struct housekeeping housekeeping;
> > > +static __initdata LLIST_HEAD(memblock_freelist);
> > >
> > > bool housekeeping_enabled(enum hk_type type)
> > > {
> > > @@ -189,10 +191,22 @@ void __init housekeeping_init(void)
> > > WARN_ON_ONCE(cpumask_empty(omask));
> > > cpumask_copy(nmask, omask);
> > > RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
> > > - memblock_free(omask, cpumask_size());
> > > + __llist_add((struct llist_node *)omask, &memblock_freelist);
> >
> > This cast is somewhat concerning. I think I see why it's needed. Wrapping
> > it in a proper struct would require more allocating and freeing and
> > make the problem worse. It should work though.
> >
> >
>
> Fwiw, opencode/sonnet suggested a comment like this:
>
> /*
> * We can't allocate wrapper structs from memblock as they'd need
> * deferred freeing too. Instead, reuse the cpumask memory itself
> * as llist nodes. This is safe because:
> * - cpumask_size() >= sizeof(struct llist_node)
> * - Memory is properly aligned (SMP_CACHE_BYTES)
> * - The cpumask is never accessed after being added to the list
> */
>
> ... which may be overkill :)
It tells the truth, just a bit too much :-)
--
Frederic Weisbecker
SUSE Labs
* Re: [PATCH v3.1] sched/isolation: Defer freeing of cpumask memblock memory to initcall
From: Waiman Long @ 2026-07-01 19:03 UTC (permalink / raw)
To: Phil Auld
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Frederic Weisbecker,
linux-kernel
On 7/1/26 10:25 AM, Phil Auld wrote:
> On Wed, Jul 01, 2026 at 10:13:57AM -0400 Phil Auld wrote:
>> Hi Waiman,
>>
>> On Thu, Jun 04, 2026 at 02:24:40PM -0400 Waiman Long wrote:
>>> When testing a linux-next kernel with commit 59bd1d914bb5 ("memblock:
>>> warn when freeing reserved memory before memory map is initialized"),
>>> the following warning was hit when there was a "nohz_full" kernel boot
>>> parameter.
>>>
>>> Cannot free reserved memory because of deferred initialization of the memory map
>>> WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
>>> :
>>> Call Trace:
>>> <TASK>
>>> memblock_phys_free+0xcb/0x100
>>> housekeeping_init+0x14c/0x170
>>> start_kernel+0x207/0x450
>>> x86_64_start_reservations+0x24/0x30
>>> x86_64_start_kernel+0xda/0xe0
>>> common_startup_64+0x13e/0x141
>>> </TASK>
>>>
>>> IOW, we shouldn't free memblock allocated memory so early
>>> in the boot process when memory map isn't fully initialized in
>>> deferred_init_memmap().
>>>
>>> Fix it by saving the housekeeping cpumask memblock memory to
>>> be freed into a free list in housekeeping_init() and add a new
>>> housekeeping_late_init() helper to defer the actual freeing of memblock
>>> memory to when initcall's are being processed. The non-atomic version
>>> of the llist APIs are used as there is no contention.
>>>
>>> This commit also depends on the presence of commit 7c2eee9c1367
>>> ("memblock: don't touch memblock arrays when memblock_free() is called
>>> late") to prevent a KASAN UAF bug report [1].
>>>
>>> [1] https://lore.kernel.org/lkml/20260505051821.1107133-1-longman@redhat.com/
>>>
>>> Fixes: 27c3a5967f05 ("sched/isolation: Convert housekeeping cpumasks to rcu pointers")
>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>> ---
>>> kernel/sched/isolation.c | 16 +++++++++++++++-
>>> 1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> [v3.1] Add __initdata to memblock_freelist
>>>
>>> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>>> index ef152d401fe2..156025ef81b7 100644
>>> --- a/kernel/sched/isolation.c
>>> +++ b/kernel/sched/isolation.c
>>> @@ -8,6 +8,7 @@
>>> *
>>> */
>>> #include <linux/sched/isolation.h>
>>> +#include <linux/llist.h>
>>> #include <linux/pci.h>
>>> #include "sched.h"
>>>
>>> @@ -27,6 +28,7 @@ struct housekeeping {
>>> };
>>>
>>> static struct housekeeping housekeeping;
>>> +static __initdata LLIST_HEAD(memblock_freelist);
>>>
>>> bool housekeeping_enabled(enum hk_type type)
>>> {
>>> @@ -189,10 +191,22 @@ void __init housekeeping_init(void)
>>> WARN_ON_ONCE(cpumask_empty(omask));
>>> cpumask_copy(nmask, omask);
>>> RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
>>> - memblock_free(omask, cpumask_size());
>>> + __llist_add((struct llist_node *)omask, &memblock_freelist);
>> This cast is somewhat concerning. I think I see why it's needed. Wrapping
>> it in a proper struct would require more allocating and freeing and
>> make the problem worse. It should work though.
>>
>>
> Fwiw, opencode/sonnet suggested a comment like this:
>
> /*
> * We can't allocate wrapper structs from memblock as they'd need
> * deferred freeing too. Instead, reuse the cpumask memory itself
> * as llist nodes. This is safe because:
> * - cpumask_size() >= sizeof(struct llist_node)
I know that holds, as the smallest allocation size is sizeof(long), which
is the size of an llist_node. I should have mentioned that either in the
commit log or as a comment.
Cheers,
Longman
> * - Memory is properly aligned (SMP_CACHE_BYTES)
> * - The cpumask is never accessed after being added to the list
> */
>
> ... which may be overkill :)
>
>
>
> Cheers,
> Phil
>
>
>> Reviewed-by: Phil Auld <pauld@dhat.com>
>>
>>
>>
>>
>> Cheers,
>> Phil
>>
>>
>>> }
>>> }
>>>
>>> +static int __init housekeeping_late_init(void)
>>> +{
>>> + struct llist_node *llnode, *pos, *t;
>>> +
>>> + /* Free allocated memblock memory, if any */
>>> + llnode = __llist_del_all(&memblock_freelist);
>>> + llist_for_each_safe(pos, t, llnode)
>>> + memblock_free(pos, cpumask_size());
>>> + return 0;
>>> +}
>>> +pure_initcall(housekeeping_late_init);
>>> +
>>> static void __init housekeeping_setup_type(enum hk_type type,
>>> cpumask_var_t housekeeping_staging)
>>> {
>>> --
>>> 2.54.0
>>>
>>>
>> --
>>
>>
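
The size argument raised in this thread (a cpumask is never smaller than one list node) can be checked with a short userspace sketch. It rests on assumptions rather than kernel code: `mask_bytes()` is a hypothetical helper that rounds a bit count up to whole `unsigned long` words, in the spirit of `cpumask_size()`, and `struct node` stands in for `struct llist_node`. It also assumes an ABI where `sizeof(long) == sizeof(void *)`, which is true for Linux targets but not for every platform.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for struct llist_node: a single pointer. */
struct node {
	struct node *next;
};

/* A node is exactly one pointer wide. */
_Static_assert(sizeof(struct node) == sizeof(void *),
	       "node must be a single pointer");

/* Bytes needed to store nbits in whole unsigned long words (assumed rounding). */
size_t mask_bytes(size_t nbits)
{
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t longs = (nbits + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}

/* Return 1 if a mask covering nbits can hold a list node in place. */
int mask_can_hold_node(size_t nbits)
{
	return nbits > 0 && mask_bytes(nbits) >= sizeof(struct node);
}
```

Under the stated ABI assumption, even a one-bit mask occupies a full `unsigned long`, so it is already large enough to hold the `next` pointer.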