public inbox for linux-kernel@vger.kernel.org
* [BUG] x86/smpboot: WARN_ON in set_cpu_sibling_map triggered by numa=fake=2
@ 2026-04-29  3:41 w15303746062
  2026-05-04  7:48 ` Peter Zijlstra
  0 siblings, 1 reply; 3+ messages in thread
From: w15303746062 @ 2026-04-29  3:41 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, x86; +Cc: hpa, peterz, linux-kernel

Hi x86 maintainers,

While fuzzing the v7.0 kernel, I encountered a persistent WARNING in `set_cpu_sibling_map()` during boot when using the `numa=fake=2` command-line parameter.

The issue appears to be a logic gap in the topology consistency check. When `numa=fake=N` is used, it artificially divides a single physical package into multiple software NUMA nodes. However, the existing check in `set_cpu_sibling_map()` does not account for this fake NUMA state:

    if (match_pkg(c, o) && !topology_same_node(c, o))
        WARN_ON_ONCE(topology_num_nodes_per_package() == 1);

Since `numa=fake` forces `!topology_same_node(c, o)` to be true for CPUs on the same package, the WARN_ON_ONCE is falsely triggered. With `panic_on_warn=1` enabled in many fuzzing and testing environments, this leads to an early boot panic.
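
To make the failing condition concrete, here is a minimal userspace sketch of the check (not kernel code). `struct cpu_topo`, `same_pkg()`, `same_node()` and `check_would_warn()` are hypothetical stand-ins for the per-CPU data, `match_pkg()`, `topology_same_node()` and the `WARN_ON_ONCE()` condition. The real helpers look at more state than two integer IDs, so treat this only as an illustration of the logic quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU topology data (not struct cpuinfo_x86). */
struct cpu_topo {
	int pkg_id;	/* physical package the CPU belongs to */
	int node_id;	/* NUMA node as seen by software (may be a fake node) */
};

/* Assumed simplification of match_pkg(): same physical package. */
static bool same_pkg(const struct cpu_topo *c, const struct cpu_topo *o)
{
	assert(c && o);
	return c->pkg_id == o->pkg_id;
}

/* Assumed simplification of topology_same_node(): same NUMA node. */
static bool same_node(const struct cpu_topo *c, const struct cpu_topo *o)
{
	assert(c && o);
	return c->node_id == o->node_id;
}

/*
 * Models the quoted check: returns true when the WARN_ON_ONCE() condition
 * would be met. nodes_per_pkg stands in for the value returned by
 * topology_num_nodes_per_package().
 */
static bool check_would_warn(const struct cpu_topo *c,
			     const struct cpu_topo *o,
			     int nodes_per_pkg)
{
	if (same_pkg(c, o) && !same_node(c, o))
		return nodes_per_pkg == 1;
	return false;
}
```

In this model, two CPUs with the same `pkg_id` but `node_id` 0 and 1 (the `numa=fake=2` case on one package) make `check_would_warn()` return true when `nodes_per_pkg` is 1, and false when it is 2, as it would be if the hardware really exposed two nodes per package.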

Here is the relevant part of the crash log:

------------[ cut here ]------------
WARNING: arch/x86/kernel/smpboot.c:698 at set_cpu_sibling_map+0x1206/0x1f20
Modules linked in:
CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 7.0.0 #1 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
RIP: 0010:set_cpu_sibling_map+0x1206/0x1f20
Call Trace:
 <TASK>
 ap_starting arch/x86/kernel/smpboot.c:196 [inline]
 start_secondary+0xd8/0x2d0 arch/x86/kernel/smpboot.c:280
 common_startup_64+0x13e/0x148
 </TASK>
---[ end trace ]---

I am reporting this to bring it to your attention, as it might require a small adjustment to bypass this strict topology check when `numa=fake` is active.

Best regards,
Mingyu Wang



* Re: [BUG] x86/smpboot: WARN_ON in set_cpu_sibling_map triggered by numa=fake=2
  2026-04-29  3:41 [BUG] x86/smpboot: WARN_ON in set_cpu_sibling_map triggered by numa=fake=2 w15303746062
@ 2026-05-04  7:48 ` Peter Zijlstra
  2026-05-04 13:57   ` w15303746062
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2026-05-04  7:48 UTC (permalink / raw)
  To: w15303746062; +Cc: tglx, mingo, bp, dave.hansen, x86, hpa, linux-kernel


You forgot to Cc lkml.

On Wed, Apr 29, 2026 at 11:41:25AM +0800, w15303746062@163.com wrote:
> Hi x86 maintainers,
> 
> While fuzzing the v7.0 kernel, I encountered a persistent WARNING in
> `set_cpu_sibling_map()` during boot when using the `numa=fake=2`
> command-line parameter.
> 
> The issue appears to be a logic gap in the topology consistency check.
> When `numa=fake=N` is used, it artificially divides a single physical
> package into multiple software NUMA nodes. However, the existing check
> in `set_cpu_sibling_map()` does not account for this fake NUMA state:
> 
>     if (match_pkg(c, o) && !topology_same_node(c, o))
>         WARN_ON_ONCE(topology_num_nodes_per_package() == 1);
> 
> Since `numa=fake` forces `!topology_same_node(c, o)` to be true for
> CPUs on the same package, the WARN_ON_ONCE is falsely triggered. With
> `panic_on_warn=1` enabled in many fuzzing and testing environments,
> this leads to an early boot panic.

*groan*, so ideally we'd just rip out the whole fake numa stuff
entirely, and let people use VMs to fake topology in a consistent
manner.

I'm assuming something like so helps?

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 294a8ea60298..2447b0317f7b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -468,6 +468,8 @@ static int x86_cluster_flags(void)
 }
 #endif
 
+static bool x86_has_fake_numa = false;
+
 static struct sched_domain_topology_level x86_topology[] = {
 	SDTL_INIT(tl_smt_mask, cpu_smt_flags, SMT),
 #ifdef CONFIG_SCHED_CLUSTER
@@ -489,7 +491,7 @@ static void __init build_sched_topology(void)
 	 * PKG domain since the NUMA domains will auto-magically create the
 	 * right spanning domains based on the SLIT.
 	 */
-	if (topology_num_nodes_per_package() > 1) {
+	if (topology_num_nodes_per_package() > 1 || x86_has_fake_numa) {
 		unsigned int pkgdom = ARRAY_SIZE(x86_topology) - 2;
 
 		memset(&x86_topology[pkgdom], 0, sizeof(x86_topology[pkgdom]));
@@ -662,6 +664,13 @@ int arch_sched_node_distance(int from, int to)
 		    topology_num_nodes_per_package() < 3)
 			return d;
 
+		/*
+		 * XXX someone needs to figure out what, if anything can be
+		 * done here.
+		 */
+		if (WARN_ON_ONCE(x86_has_fake_numa))
+			return d;
+
 		/*
 		 * Handle SNC-3 asymmetries.
 		 */
@@ -694,8 +703,10 @@ void set_cpu_sibling_map(int cpu)
 	for_each_cpu(i, cpu_sibling_setup_mask) {
 		o = &cpu_data(i);
 
-		if (match_pkg(c, o) && !topology_same_node(c, o))
-			WARN_ON_ONCE(topology_num_nodes_per_package() == 1);
+		if (match_pkg(c, o) && !topology_same_node(c, o)) {
+			if (topology_num_nodes_per_package() == 1)
+				x86_has_fake_numa = true;
+		}
 
 		if ((i == cpu) || (has_smt && match_smt(c, o)))
 			link_mask(topology_sibling_cpumask, cpu, i);
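
For readers who want to follow the control flow of the diff above without the kernel context, here is a rough userspace model. `struct topo_state`, `note_sibling()` and `drop_pkg_domain()` are hypothetical names, and `nodes_per_pkg` stands in for `topology_num_nodes_per_package()`. The model assumes every sibling pair is recorded before the sched-topology decision is queried. It is a sketch of the idea, not the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU topology data. */
struct cpu_topo {
	int pkg_id;	/* physical package */
	int node_id;	/* NUMA node as seen by software */
};

/* Hypothetical model state; has_fake_numa mirrors x86_has_fake_numa. */
struct topo_state {
	int nodes_per_pkg;	/* models topology_num_nodes_per_package() */
	bool has_fake_numa;	/* latched once an inconsistency is seen */
};

/*
 * Models the changed hunk in set_cpu_sibling_map(): instead of warning,
 * remember that CPUs of one package were found on different nodes even
 * though only one node per package is expected.
 */
static void note_sibling(struct topo_state *st,
			 const struct cpu_topo *c,
			 const struct cpu_topo *o)
{
	assert(st && c && o);
	if (c->pkg_id == o->pkg_id && c->node_id != o->node_id) {
		if (st->nodes_per_pkg == 1)
			st->has_fake_numa = true;
	}
}

/*
 * Models the changed condition in build_sched_topology(): the PKG domain
 * is dropped when there are several nodes per package, or when fake NUMA
 * was detected.
 */
static bool drop_pkg_domain(const struct topo_state *st)
{
	assert(st);
	return st->nodes_per_pkg > 1 || st->has_fake_numa;
}
```

With `nodes_per_pkg == 1` and no mismatched pair recorded, `drop_pkg_domain()` returns false. After `note_sibling()` sees two CPUs in the same package on different nodes, the flag is latched and `drop_pkg_domain()` returns true.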


* Re:Re: [BUG] x86/smpboot: WARN_ON in set_cpu_sibling_map triggered by numa=fake=2
  2026-05-04  7:48 ` Peter Zijlstra
@ 2026-05-04 13:57   ` w15303746062
  0 siblings, 0 replies; 3+ messages in thread
From: w15303746062 @ 2026-05-04 13:57 UTC (permalink / raw)
  To: Peter Zijlstra, tglx, mingo, bp, dave.hansen, hpa
  Cc: x86, linux-kernel, 25181214217


From: Mingyu Wang <25181214217@stu.xidian.edu.cn>

Hi,

First, apologies for missing LKML in the initial report. I have added it to the Cc list for this reply.

Thank you for the patch.

I have applied it and run multiple test cycles in my fuzzing environment using the `numa=fake=2` parameter. I can confirm that the patch completely resolves the original boot panic. The system now boots reliably every time. 

Additionally, I carefully checked the logs and did not observe any new warnings being triggered, including the new `WARN_ON_ONCE(x86_has_fake_numa)` you added in `arch_sched_node_distance()`.

Please feel free to add:

Tested-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>

Thanks again for the quick and elegant fix!

Best regards,
Mingyu


