* [PATCH 6.18 0/2] Fix NUMA sched domain build errors for GNR and CWF
@ 2026-02-05 21:33 Tim Chen
2026-02-05 21:33 ` [PATCH 6.18 1/2] sched: Create architecture specific sched domain distances Tim Chen
2026-02-05 21:33 ` [PATCH 6.18 2/2] sched/topology: Fix sched domain build error for GNR, CWF in SNC-3 mode Tim Chen
0 siblings, 2 replies; 6+ messages in thread
From: Tim Chen @ 2026-02-05 21:33 UTC (permalink / raw)
To: stable
Cc: Tim Chen, Peter Zijlstra, Ingo Molnar, Juri Lelli,
Dietmar Eggemann, Ben Segall, Mel Gorman, Valentin Schneider,
Tim Chen, Vincent Guittot, Len Brown, linux-kernel, Chen Yu,
K Prateek Nayak, Gautham R . Shenoy, Zhao Liu,
Vinicius Costa Gomes, Arjan Van De Ven
While testing Granite Rapids (GNR) and Clearwater Forest (CWF) systems
in SNC-3 mode, we encountered sched domain build errors in dmesg.
The scheduler domain code did not expect asymmetric node distances
from a local node to multiple nodes in a remote package. As a result,
remote nodes ended up being partially grouped with local nodes in
asymmetric groupings, creating too many levels in the NUMA sched
domain hierarchy.
To address this, we simplify remote node distances for the purpose of
sched domain construction on GNR and CWF. Specifically, we replace the
individual distances to nodes within the same remote package with their
average distance. This resolves the domain build errors and reduces the
number of NUMA sched domain levels.
The actual SLIT NUMA node distances are still preserved separately,
as they are needed outside of sched domain construction. NUMA balancing
continues to use the true distances when selecting a closer remote node
for a task's numa_group.
These patches have been merged upstream.
Thanks,
Tim
Tim Chen (2):
sched: Create architecture specific sched domain distances
sched/topology: Fix sched domain build error for GNR, CWF in SNC-3
mode
arch/x86/kernel/smpboot.c | 70 ++++++++++++++++++++++++
kernel/sched/topology.c | 108 ++++++++++++++++++++++++++++++--------
2 files changed, 156 insertions(+), 22 deletions(-)
--
2.32.0
* [PATCH 6.18 1/2] sched: Create architecture specific sched domain distances
2026-02-05 21:33 [PATCH 6.18 0/2] Fix NUMA sched domain build errors for GNR and CWF Tim Chen
@ 2026-02-05 21:33 ` Tim Chen
2026-02-05 21:33 ` [PATCH 6.18 2/2] sched/topology: Fix sched domain build error for GNR, CWF in SNC-3 mode Tim Chen
1 sibling, 0 replies; 6+ messages in thread
From: Tim Chen @ 2026-02-05 21:33 UTC (permalink / raw)
To: stable
Cc: Tim Chen, Peter Zijlstra, Ingo Molnar, Juri Lelli,
Dietmar Eggemann, Ben Segall, Mel Gorman, Valentin Schneider,
Tim Chen, Vincent Guittot, Len Brown, linux-kernel, Chen Yu,
K Prateek Nayak, Gautham R . Shenoy, Zhao Liu,
Vinicius Costa Gomes, Arjan Van De Ven
[ Upstream commit 06f2c90885e92992d1ce55d3f35b65b44d5ecc25 ]
Allow architecture specific sched domain NUMA distances that are
modified from actual NUMA node distances for the purpose of building
NUMA sched domains.
Keep the actual NUMA distances separately if modified distances
are used for building sched domains. The actual distances
are still needed, as NUMA balancing benefits from finding the
NUMA nodes that are truly closer to a task's numa_group.
Consolidate the recording of unique NUMA distances in an array into
sched_record_numa_dist() so the function can be reused to record NUMA
distances when the NUMA distance metric is changed.
There is no functional change, and no additional distance array is
allocated, if no arch specific NUMA distances are defined.
Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
---
kernel/sched/topology.c | 108 ++++++++++++++++++++++++++++++++--------
1 file changed, 86 insertions(+), 22 deletions(-)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 444bdfdab731..711076aa4980 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1590,10 +1590,17 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
#ifdef CONFIG_NUMA
enum numa_topology_type sched_numa_topology_type;
+/*
+ * sched_domains_numa_distance is derived from sched_numa_node_distance
+ * and provides a simplified view of NUMA distances used specifically
+ * for building NUMA scheduling domains.
+ */
static int sched_domains_numa_levels;
+static int sched_numa_node_levels;
int sched_max_numa_distance;
static int *sched_domains_numa_distance;
+static int *sched_numa_node_distance;
static struct cpumask ***sched_domains_numa_masks;
#endif /* CONFIG_NUMA */
@@ -1845,10 +1852,10 @@ bool find_numa_distance(int distance)
return true;
rcu_read_lock();
- distances = rcu_dereference(sched_domains_numa_distance);
+ distances = rcu_dereference(sched_numa_node_distance);
if (!distances)
goto unlock;
- for (i = 0; i < sched_domains_numa_levels; i++) {
+ for (i = 0; i < sched_numa_node_levels; i++) {
if (distances[i] == distance) {
found = true;
break;
@@ -1924,14 +1931,34 @@ static void init_numa_topology_type(int offline_node)
#define NR_DISTANCE_VALUES (1 << DISTANCE_BITS)
-void sched_init_numa(int offline_node)
+/*
+ * An architecture could modify its NUMA distance, to change
+ * grouping of NUMA nodes and number of NUMA levels when creating
+ * NUMA level sched domains.
+ *
+ * A NUMA level is created for each unique
+ * arch_sched_node_distance.
+ */
+static int numa_node_dist(int i, int j)
{
- struct sched_domain_topology_level *tl;
- unsigned long *distance_map;
+ return node_distance(i, j);
+}
+
+int arch_sched_node_distance(int from, int to)
+ __weak __alias(numa_node_dist);
+
+static bool modified_sched_node_distance(void)
+{
+ return numa_node_dist != arch_sched_node_distance;
+}
+
+static int sched_record_numa_dist(int offline_node, int (*n_dist)(int, int),
+ int **dist, int *levels)
+{
+ unsigned long *distance_map __free(bitmap) = NULL;
int nr_levels = 0;
int i, j;
int *distances;
- struct cpumask ***masks;
/*
* O(nr_nodes^2) de-duplicating selection sort -- in order to find the
@@ -1939,17 +1966,16 @@ void sched_init_numa(int offline_node)
*/
distance_map = bitmap_alloc(NR_DISTANCE_VALUES, GFP_KERNEL);
if (!distance_map)
- return;
+ return -ENOMEM;
bitmap_zero(distance_map, NR_DISTANCE_VALUES);
for_each_cpu_node_but(i, offline_node) {
for_each_cpu_node_but(j, offline_node) {
- int distance = node_distance(i, j);
+ int distance = n_dist(i, j);
if (distance < LOCAL_DISTANCE || distance >= NR_DISTANCE_VALUES) {
sched_numa_warn("Invalid distance value range");
- bitmap_free(distance_map);
- return;
+ return -EINVAL;
}
bitmap_set(distance_map, distance, 1);
@@ -1962,18 +1988,46 @@ void sched_init_numa(int offline_node)
nr_levels = bitmap_weight(distance_map, NR_DISTANCE_VALUES);
distances = kcalloc(nr_levels, sizeof(int), GFP_KERNEL);
- if (!distances) {
- bitmap_free(distance_map);
- return;
- }
+ if (!distances)
+ return -ENOMEM;
for (i = 0, j = 0; i < nr_levels; i++, j++) {
j = find_next_bit(distance_map, NR_DISTANCE_VALUES, j);
distances[i] = j;
}
- rcu_assign_pointer(sched_domains_numa_distance, distances);
+ *dist = distances;
+ *levels = nr_levels;
+
+ return 0;
+}
+
+void sched_init_numa(int offline_node)
+{
+ struct sched_domain_topology_level *tl;
+ int nr_levels, nr_node_levels;
+ int i, j;
+ int *distances, *domain_distances;
+ struct cpumask ***masks;
- bitmap_free(distance_map);
+ /* Record the NUMA distances from SLIT table */
+ if (sched_record_numa_dist(offline_node, numa_node_dist, &distances,
+ &nr_node_levels))
+ return;
+
+ /* Record modified NUMA distances for building sched domains */
+ if (modified_sched_node_distance()) {
+ if (sched_record_numa_dist(offline_node, arch_sched_node_distance,
+ &domain_distances, &nr_levels)) {
+ kfree(distances);
+ return;
+ }
+ } else {
+ domain_distances = distances;
+ nr_levels = nr_node_levels;
+ }
+ rcu_assign_pointer(sched_numa_node_distance, distances);
+ WRITE_ONCE(sched_max_numa_distance, distances[nr_node_levels - 1]);
+ WRITE_ONCE(sched_numa_node_levels, nr_node_levels);
/*
* 'nr_levels' contains the number of unique distances
@@ -1991,6 +2045,8 @@ void sched_init_numa(int offline_node)
*
* We reset it to 'nr_levels' at the end of this function.
*/
+ rcu_assign_pointer(sched_domains_numa_distance, domain_distances);
+
sched_domains_numa_levels = 0;
masks = kzalloc(sizeof(void *) * nr_levels, GFP_KERNEL);
@@ -2016,10 +2072,13 @@ void sched_init_numa(int offline_node)
masks[i][j] = mask;
for_each_cpu_node_but(k, offline_node) {
- if (sched_debug() && (node_distance(j, k) != node_distance(k, j)))
+ if (sched_debug() &&
+ (arch_sched_node_distance(j, k) !=
+ arch_sched_node_distance(k, j)))
sched_numa_warn("Node-distance not symmetric");
- if (node_distance(j, k) > sched_domains_numa_distance[i])
+ if (arch_sched_node_distance(j, k) >
+ sched_domains_numa_distance[i])
continue;
cpumask_or(mask, mask, cpumask_of_node(k));
@@ -2059,7 +2118,6 @@ void sched_init_numa(int offline_node)
sched_domain_topology = tl;
sched_domains_numa_levels = nr_levels;
- WRITE_ONCE(sched_max_numa_distance, sched_domains_numa_distance[nr_levels - 1]);
init_numa_topology_type(offline_node);
}
@@ -2067,14 +2125,18 @@ void sched_init_numa(int offline_node)
static void sched_reset_numa(void)
{
- int nr_levels, *distances;
+ int nr_levels, *distances, *dom_distances = NULL;
struct cpumask ***masks;
nr_levels = sched_domains_numa_levels;
+ sched_numa_node_levels = 0;
sched_domains_numa_levels = 0;
sched_max_numa_distance = 0;
sched_numa_topology_type = NUMA_DIRECT;
- distances = sched_domains_numa_distance;
+ distances = sched_numa_node_distance;
+ if (sched_numa_node_distance != sched_domains_numa_distance)
+ dom_distances = sched_domains_numa_distance;
+ rcu_assign_pointer(sched_numa_node_distance, NULL);
rcu_assign_pointer(sched_domains_numa_distance, NULL);
masks = sched_domains_numa_masks;
rcu_assign_pointer(sched_domains_numa_masks, NULL);
@@ -2083,6 +2145,7 @@ static void sched_reset_numa(void)
synchronize_rcu();
kfree(distances);
+ kfree(dom_distances);
for (i = 0; i < nr_levels && masks; i++) {
if (!masks[i])
continue;
@@ -2129,7 +2192,8 @@ void sched_domains_numa_masks_set(unsigned int cpu)
continue;
/* Set ourselves in the remote node's masks */
- if (node_distance(j, node) <= sched_domains_numa_distance[i])
+ if (arch_sched_node_distance(j, node) <=
+ sched_domains_numa_distance[i])
cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
}
}
--
2.32.0
* [PATCH 6.18 2/2] sched/topology: Fix sched domain build error for GNR, CWF in SNC-3 mode
2026-02-05 21:33 [PATCH 6.18 0/2] Fix NUMA sched domain build errors for GNR and CWF Tim Chen
2026-02-05 21:33 ` [PATCH 6.18 1/2] sched: Create architecture specific sched domain distances Tim Chen
@ 2026-02-05 21:33 ` Tim Chen
2026-02-07 15:29 ` Greg KH
1 sibling, 1 reply; 6+ messages in thread
From: Tim Chen @ 2026-02-05 21:33 UTC (permalink / raw)
To: stable
Cc: Tim Chen, Peter Zijlstra, Ingo Molnar, Juri Lelli,
Dietmar Eggemann, Ben Segall, Mel Gorman, Valentin Schneider,
Tim Chen, Vincent Guittot, Len Brown, linux-kernel, Chen Yu,
K Prateek Nayak, Gautham R . Shenoy, Zhao Liu,
Vinicius Costa Gomes, Arjan Van De Ven
[ Upstream commit 4d6dd05d07d00bc3bd91183dab4d75caa8018db9 ]
It is possible for Granite Rapids (GNR) and Clearwater Forest
(CWF) to have up to 3 dies per package. When sub-NUMA clustering (SNC-3)
is enabled, each die becomes a separate NUMA node, with different
distances between dies within the same package.
For example, on GNR, we see the following NUMA distances for a 2-socket
system with 3 dies per socket:
package 1    package 2
----------------
| |
--------- ---------
| 0 | | 3 |
--------- ---------
| |
--------- ---------
| 1 | | 4 |
--------- ---------
| |
--------- ---------
| 2 | | 5 |
--------- ---------
| |
----------------
node distances:
node 0 1 2 3 4 5
0: 10 15 17 21 28 26
1: 15 10 15 23 26 23
2: 17 15 10 26 23 21
3: 21 28 26 10 15 17
4: 23 26 23 15 10 15
5: 26 23 21 17 15 10
The node distances above led to 2 problems:
1. Asymmetric routes taken between nodes in different packages led to
an asymmetric scheduler domain perspective depending on which node you
are on. The current scheduler code failed to build domains properly
with asymmetric distances.
2. Multiple remote distances to the respective tiles on the remote
package create too many levels of domain hierarchy, grouping different
nodes between remote packages.
For example, the above GNR topology led to the NUMA domains below.
Sched domains from the perspective of a CPU in node 0, where the numbers
in brackets represent node numbers:
NUMA-level 1 [0,1] [2]
NUMA-level 2 [0,1,2] [3]
NUMA-level 3 [0,1,2,3] [5]
NUMA-level 4 [0,1,2,3,5] [4]
Sched domains from the perspective of a CPU in node 4
NUMA-level 1 [4] [3,5]
NUMA-level 2 [3,4,5] [0,2]
NUMA-level 3 [0,2,3,4,5] [1]
The scheduler group peers for load balancing differ between the
perspectives of CPUs in nodes 0 and 4. An improper task could be chosen
for load balancing between groups such as [0,2,3,4,5] and [1]. Ideally,
nodes 0 or 2, which are in the same package as node 1, should be chosen
first. Instead, tasks in the remote package nodes 3, 4 and 5 could be
chosen with equal probability, which could lead to excessive remote
package migrations and load imbalance between packages. Partial remote
nodes and local nodes should not be grouped together.
Simplify the remote distances for CWF and GNR for the purpose of
building sched domains, which maintains symmetry and leads to a more
reasonable load balancing hierarchy.
The sched domains from the perspective of a CPU in node 0 are now
NUMA-level 1 [0,1] [2]
NUMA-level 2 [0,1,2] [3,4,5]
The sched domains from the perspective of a CPU in node 4 are now
NUMA-level 1 [4] [3,5]
NUMA-level 2 [3,4,5] [0,1,2]
We have the same balancing perspective from node 0 or node 4. Loads are
now balanced equally between packages.
Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
---
arch/x86/kernel/smpboot.c | 70 +++++++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index eb289abece23..5709c9cab195 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -515,6 +515,76 @@ static void __init build_sched_topology(void)
set_sched_topology(topology);
}
+#ifdef CONFIG_NUMA
+static int sched_avg_remote_distance;
+static int avg_remote_numa_distance(void)
+{
+ int i, j;
+ int distance, nr_remote, total_distance;
+
+ if (sched_avg_remote_distance > 0)
+ return sched_avg_remote_distance;
+
+ nr_remote = 0;
+ total_distance = 0;
+ for_each_node_state(i, N_CPU) {
+ for_each_node_state(j, N_CPU) {
+ distance = node_distance(i, j);
+
+ if (distance >= REMOTE_DISTANCE) {
+ nr_remote++;
+ total_distance += distance;
+ }
+ }
+ }
+ if (nr_remote)
+ sched_avg_remote_distance = total_distance / nr_remote;
+ else
+ sched_avg_remote_distance = REMOTE_DISTANCE;
+
+ return sched_avg_remote_distance;
+}
+
+int arch_sched_node_distance(int from, int to)
+{
+ int d = node_distance(from, to);
+
+ switch (boot_cpu_data.x86_vfm) {
+ case INTEL_GRANITERAPIDS_X:
+ case INTEL_ATOM_DARKMONT_X:
+
+ if (!x86_has_numa_in_package || topology_max_packages() == 1 ||
+ d < REMOTE_DISTANCE)
+ return d;
+
+ /*
+ * With SNC enabled, there could be too many levels of remote
+ * NUMA node distances, creating NUMA domain levels
+ * including local nodes and partial remote nodes.
+ *
+ * Trim finer distance tuning for NUMA nodes in remote package
+ * for the purpose of building sched domains. Group NUMA nodes
+ * in the remote package in the same sched group.
+ * Simplify NUMA domains and avoid extra NUMA levels including
+ * different remote NUMA nodes and local nodes.
+ *
+ * GNR and CWF don't expect systems with more than 2 packages
+ * and more than 2 hops between packages. Single average remote
+ * distance won't be appropriate if there are more than 2
+ * packages as average distance to different remote packages
+ * could be different.
+ */
+ WARN_ONCE(topology_max_packages() > 2,
+ "sched: Expect only up to 2 packages for GNR or CWF, "
+ "but saw %d packages when building sched domains.",
+ topology_max_packages());
+
+ d = avg_remote_numa_distance();
+ }
+ return d;
+}
+#endif /* CONFIG_NUMA */
+
void set_cpu_sibling_map(int cpu)
{
bool has_smt = __max_threads_per_core > 1;
--
2.32.0
* Re: [PATCH 6.18 2/2] sched/topology: Fix sched domain build error for GNR, CWF in SNC-3 mode
2026-02-05 21:33 ` [PATCH 6.18 2/2] sched/topology: Fix sched domain build error for GNR, CWF in SNC-3 mode Tim Chen
@ 2026-02-07 15:29 ` Greg KH
2026-02-09 3:59 ` K Prateek Nayak
0 siblings, 1 reply; 6+ messages in thread
From: Greg KH @ 2026-02-07 15:29 UTC (permalink / raw)
To: Tim Chen
Cc: stable, Peter Zijlstra, Ingo Molnar, Juri Lelli, Dietmar Eggemann,
Ben Segall, Mel Gorman, Valentin Schneider, Tim Chen,
Vincent Guittot, Len Brown, linux-kernel, Chen Yu,
K Prateek Nayak, Gautham R . Shenoy, Zhao Liu,
Vinicius Costa Gomes, Arjan Van De Ven
On Thu, Feb 05, 2026 at 01:33:34PM -0800, Tim Chen wrote:
> [ Upstream commit 4d6dd05d07d00bc3bd91183dab4d75caa8018db9 ]
>
> It is possible for Granite Rapids (GNR) and Clearwater Forest
> (CWF) to have up to 3 dies per package. When sub-numa cluster (SNC-3)
> is enabled, each die will become a separate NUMA node in the package
> with different distances between dies within the same package.
>
> For example, on GNR, we see the following numa distances for a 2 socket
> system with 3 dies per socket:
>
> package 1 package2
> ----------------
> | |
> --------- ---------
> | 0 | | 3 |
> --------- ---------
> | |
> --------- ---------
> | 1 | | 4 |
> --------- ---------
> | |
> --------- ---------
> | 2 | | 5 |
> --------- ---------
> | |
> ----------------
>
> node distances:
> node 0 1 2 3 4 5
> 0: 10 15 17 21 28 26
> 1: 15 10 15 23 26 23
> 2: 17 15 10 26 23 21
> 3: 21 28 26 10 15 17
> 4: 23 26 23 15 10 15
> 5: 26 23 21 17 15 10
>
> The node distances above led to 2 problems:
>
> 1. Asymmetric routes taken between nodes in different packages led to
> asymmetric scheduler domain perspective depending on which node you
> are on. Current scheduler code failed to build domains properly with
> asymmetric distances.
>
> 2. Multiple remote distances to respective tiles on remote package create
> too many levels of domain hierarchies grouping different nodes between
> remote packages.
>
> For example, the above GNR topology lead to NUMA domains below:
>
> Sched domains from the perspective of a CPU in node 0, where the number
> in bracket represent node number.
>
> NUMA-level 1 [0,1] [2]
> NUMA-level 2 [0,1,2] [3]
> NUMA-level 3 [0,1,2,3] [5]
> NUMA-level 4 [0,1,2,3,5] [4]
>
> Sched domains from the perspective of a CPU in node 4
> NUMA-level 1 [4] [3,5]
> NUMA-level 2 [3,4,5] [0,2]
> NUMA-level 3 [0,2,3,4,5] [1]
>
> Scheduler group peers for load balancing from the perspective of CPU 0
> and 4 are different. Improper task could be chosen for load balancing
> between groups such as [0,2,3,4,5] [1]. Ideally you should choose nodes
> in 0 or 2 that are in same package as node 1 first. But instead tasks
> in the remote package node 3, 4, 5 could be chosen with an equal chance
> and could lead to excessive remote package migrations and imbalance of
> load between packages. We should not group partial remote nodes and
> local nodes together.
> Simplify the remote distances for CWF and GNR for the purpose of
> sched domains building, which maintains symmetry and leads to a more
> reasonable load balance hierarchy.
>
> The sched domains from the perspective of a CPU in node 0 NUMA-level 1
> is now
> NUMA-level 1 [0,1] [2]
> NUMA-level 2 [0,1,2] [3,4,5]
>
> The sched domains from the perspective of a CPU in node 4 NUMA-level 1
> is now
> NUMA-level 1 [4] [3,5]
> NUMA-level 2 [3,4,5] [0,1,2]
>
> We have the same balancing perspective from node 0 or node 4. Loads are
> now balanced equally between packages.
>
> Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Chen Yu <yu.c.chen@intel.com>
> Tested-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> arch/x86/kernel/smpboot.c | 70 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 70 insertions(+)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index eb289abece23..5709c9cab195 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -515,6 +515,76 @@ static void __init build_sched_topology(void)
> set_sched_topology(topology);
> }
>
> +#ifdef CONFIG_NUMA
> +static int sched_avg_remote_distance;
> +static int avg_remote_numa_distance(void)
> +{
> + int i, j;
> + int distance, nr_remote, total_distance;
> +
> + if (sched_avg_remote_distance > 0)
> + return sched_avg_remote_distance;
> +
> + nr_remote = 0;
> + total_distance = 0;
> + for_each_node_state(i, N_CPU) {
> + for_each_node_state(j, N_CPU) {
> + distance = node_distance(i, j);
> +
> + if (distance >= REMOTE_DISTANCE) {
> + nr_remote++;
> + total_distance += distance;
> + }
> + }
> + }
> + if (nr_remote)
> + sched_avg_remote_distance = total_distance / nr_remote;
> + else
> + sched_avg_remote_distance = REMOTE_DISTANCE;
> +
> + return sched_avg_remote_distance;
> +}
> +
> +int arch_sched_node_distance(int from, int to)
> +{
> + int d = node_distance(from, to);
> +
> + switch (boot_cpu_data.x86_vfm) {
> + case INTEL_GRANITERAPIDS_X:
> + case INTEL_ATOM_DARKMONT_X:
> +
> + if (!x86_has_numa_in_package || topology_max_packages() == 1 ||
> + d < REMOTE_DISTANCE)
> + return d;
> +
> + /*
> + * With SNC enabled, there could be too many levels of remote
> + * NUMA node distances, creating NUMA domain levels
> + * including local nodes and partial remote nodes.
> + *
> + * Trim finer distance tuning for NUMA nodes in remote package
> + * for the purpose of building sched domains. Group NUMA nodes
> + * in the remote package in the same sched group.
> + * Simplify NUMA domains and avoid extra NUMA levels including
> + * different remote NUMA nodes and local nodes.
> + *
> + * GNR and CWF don't expect systems with more than 2 packages
> + * and more than 2 hops between packages. Single average remote
> + * distance won't be appropriate if there are more than 2
> + * packages as average distance to different remote packages
> + * could be different.
> + */
> + WARN_ONCE(topology_max_packages() > 2,
> + "sched: Expect only up to 2 packages for GNR or CWF, "
> + "but saw %d packages when building sched domains.",
> + topology_max_packages());
> +
> + d = avg_remote_numa_distance();
> + }
> + return d;
> +}
> +#endif /* CONFIG_NUMA */
> +
> void set_cpu_sibling_map(int cpu)
> {
> bool has_smt = __max_threads_per_core > 1;
> --
> 2.32.0
>
>
This breaks the build:
CC arch/x86/kernel/smpboot.o
arch/x86/kernel/smpboot.c:548:5: error: no previous prototype for ‘arch_sched_node_distance’ [-Werror=missing-prototypes]
548 | int arch_sched_node_distance(int from, int to)
| ^~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
How was it tested?
thanks,
greg k-h
* Re: [PATCH 6.18 2/2] sched/topology: Fix sched domain build error for GNR, CWF in SNC-3 mode
2026-02-07 15:29 ` Greg KH
@ 2026-02-09 3:59 ` K Prateek Nayak
2026-02-09 19:21 ` Tim Chen
0 siblings, 1 reply; 6+ messages in thread
From: K Prateek Nayak @ 2026-02-09 3:59 UTC (permalink / raw)
To: Greg KH, Tim Chen
Cc: stable, Peter Zijlstra, Ingo Molnar, Juri Lelli, Dietmar Eggemann,
Ben Segall, Mel Gorman, Valentin Schneider, Tim Chen,
Vincent Guittot, Len Brown, linux-kernel, Chen Yu,
Gautham R . Shenoy, Zhao Liu, Vinicius Costa Gomes,
Arjan Van De Ven
Hello Greg,
On 2/7/2026 8:59 PM, Greg KH wrote:
> This breaks the build:
> CC arch/x86/kernel/smpboot.o
> arch/x86/kernel/smpboot.c:548:5: error: no previous prototype for ‘arch_sched_node_distance’ [-Werror=missing-prototypes]
> 548 | int arch_sched_node_distance(int from, int to)
> | ^~~~~~~~~~~~~~~~~~~~~~~~
> cc1: all warnings being treated as errors
>
> How was it tested?
I believe this build issue was fixed by upstream commit 73cbcfe255f7
("sched/topology,x86: Fix build warning")
(Full upstream SHA: 73cbcfe255f7edca915d978a7d1b0a11f2d62812)
P.S. It cherry-picks cleanly on top of "Linux 6.18.9".
--
Thanks and Regards,
Prateek
* Re: [PATCH 6.18 2/2] sched/topology: Fix sched domain build error for GNR, CWF in SNC-3 mode
2026-02-09 3:59 ` K Prateek Nayak
@ 2026-02-09 19:21 ` Tim Chen
0 siblings, 0 replies; 6+ messages in thread
From: Tim Chen @ 2026-02-09 19:21 UTC (permalink / raw)
To: K Prateek Nayak, Greg KH
Cc: stable, Peter Zijlstra, Ingo Molnar, Juri Lelli, Dietmar Eggemann,
Ben Segall, Mel Gorman, Valentin Schneider, Tim Chen,
Vincent Guittot, Len Brown, linux-kernel, Chen Yu,
Gautham R . Shenoy, Zhao Liu, Vinicius Costa Gomes,
Arjan Van De Ven
On Mon, 2026-02-09 at 09:29 +0530, K Prateek Nayak wrote:
> Hello Greg,
>
> On 2/7/2026 8:59 PM, Greg KH wrote:
> > This breaks the build:
> > CC arch/x86/kernel/smpboot.o
> > arch/x86/kernel/smpboot.c:548:5: error: no previous prototype for ‘arch_sched_node_distance’ [-Werror=missing-prototypes]
> > 548 | int arch_sched_node_distance(int from, int to)
> > | ^~~~~~~~~~~~~~~~~~~~~~~~
> > cc1: all warnings being treated as errors
> >
> > How was it tested?
>
> I believe this build issue was fixed by upstream commit 73cbcfe255f7
> ("sched/topology,x86: Fix build warning")
>
> (Full upstream SHA: 73cbcfe255f7edca915d978a7d1b0a11f2d62812)
>
> P.S. It cherry-picks cleanly on top of "Linux 6.18.9".
Prateek,
Thanks for pointing to the patch.
Tim